Running Bulk Extraction
The bulk extraction mechanism makes direct Azure API calls to fetch all blobs and containers in the ADLS Gen2 storage account.
If a failure occurs while extracting the complete Azure metadata, the bulk extraction must be resumed from the last checkpoint by setting atlas.adls.extraction.resume.from.progress.file=true in adls.conf.
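As an illustration of the configuration change described above, the following is a minimal sketch of a helper that sets the resume property in a configuration file. The function name set_resume_flag is hypothetical, the file is assumed to contain plain key=value lines ending with a newline, and the location of adls.conf is not stated here, so the path is passed in by the caller.

```shell
#!/usr/bin/env bash
# Sketch only: enable resume-from-checkpoint in an adls.conf file.
# ASSUMPTIONS (not from the product documentation):
#   - the file holds plain key=value lines and ends with a newline
#   - set_resume_flag is a hypothetical helper name
#   - the caller knows where adls.conf lives and passes the path
set_resume_flag() {
  local conf_file="$1"
  local key="atlas.adls.extraction.resume.from.progress.file"
  # Escape the dots so they match literally in the regular expressions.
  local key_re="${key//./\\.}"

  if grep -q "^${key_re}=" "$conf_file"; then
    # Property already present: rewrite its value, keeping a .bak copy.
    sed -i.bak "s|^${key_re}=.*|${key}=true|" "$conf_file"
  else
    # Property missing: append it as a new line.
    printf '%s=true\n' "$key" >> "$conf_file"
  fi
}
```

For example, set_resume_flag /path/to/adls.conf would update the file in place; the path shown is a placeholder, not the actual install location.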
The following command line examples run the bulk extraction. Assuming the mandatory properties are set in the default configuration file, the only parameter you may need to pass is the one that enables bulk mode:
/opt/cloudera/parcels/CDH/lib/atlas/extractors/bin/adls-extractor.sh
Or
/opt/cloudera/parcels/CDH/lib/atlas/extractors/bin/adls-extractor.sh -e BULK
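If the extraction is launched from a script, a small wrapper can point the operator to the resume setting when the run fails. The sketch below is illustrative only: the function name run_bulk_extraction and the EXTRACTOR variable are hypothetical, and it assumes the extractor script returns a non-zero exit status on failure, which is not confirmed here.

```shell
#!/usr/bin/env bash
# Sketch only: run the bulk extraction and report how to resume on failure.
# ASSUMPTIONS (not from the product documentation):
#   - the extractor exits with a non-zero status when extraction fails
#   - run_bulk_extraction and EXTRACTOR are hypothetical names
EXTRACTOR="${EXTRACTOR:-/opt/cloudera/parcels/CDH/lib/atlas/extractors/bin/adls-extractor.sh}"

run_bulk_extraction() {
  if "$EXTRACTOR" -e BULK; then
    echo "Bulk extraction finished."
  else
    # In the else branch, $? still holds the extractor's exit status.
    local status=$?
    echo "Bulk extraction failed (exit status ${status})." >&2
    echo "Set atlas.adls.extraction.resume.from.progress.file=true in adls.conf, then rerun." >&2
    return "$status"
  fi
}
```

Calling run_bulk_extraction then either prints a completion message or returns the extractor's exit status along with a reminder about the resume property.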
Refer to Extraction Configuration for more details on the optional configurations.