- 1. About This Guide
- 2. The Cloud Storage Connectors
- 3. Working with Amazon S3
- Limitations of Amazon S3
- Configuring Access to S3
- Defining Authentication Providers
- IAM Role Permissions for Working with S3
- Referencing S3 Data in Applications
- Configuring Per-Bucket Settings
- Using S3Guard for Consistent S3 Metadata
- Introduction to S3Guard
- Configuring S3Guard
- Monitoring and Maintaining S3Guard
- Disabling S3Guard and Destroying an S3Guard Database
- Pruning Old Data from S3Guard Tables
- Importing a Bucket into S3Guard
- Verifying That S3Guard Is Enabled on a Bucket
- Using the S3Guard CLI
- S3Guard: Operational Issues
- S3Guard: Known Issues
- Safely Writing to S3 Through the S3A Committers
- Introducing the S3A Committers
- Enabling the Directory Committer in Hadoop
- Configuring Directories for Intermediate Data
- Using the Directory Committer in MapReduce
- Enabling the Directory Committer in Spark
- Verifying That an S3A Committer Was Used
- Cleaning Up After Failed Jobs
- Using the S3Guard Command to List and Delete Uploads
- Advanced Committer Configuration
- Securing the S3A Committers
- The S3A Committers and Third-Party Object Stores
- Limitations of the S3A Committers
- Troubleshooting the S3A Committers
- Security Model and Operations on S3
- S3A and Checksums (Advanced Feature)
- A List of S3A Configuration Properties
- Encrypting Data on S3
- Improving Performance for S3A
- Working with Third-Party S3-Compatible Object Stores
- Troubleshooting S3
- 4. Working with ADLS
- 5. Working with WASB
- Configuring Access to WASB
- Protecting the Azure Credentials for WASB with Credential Providers
- Protecting the Azure Credentials for WASB within an Encrypted File
- Referencing WASB in URLs
- Configuring Page Blob Support
- Configuring Atomic Folder Rename
- Configuring Support for Append API
- Configuring Multithread Support
- Configuring WASB Secure Mode
- Configuring Authorization Support in WASB
- 6. Working with Google Cloud Storage
- 7. Accessing Cloud Data in Hive
- 8. Accessing Cloud Data in Spark
- 9. Copying Cloud Data with Hadoop