Data Movement and Integration
- 1. What's New in the Data Movement and Integration Guide
- 2. HDP Data Movement and Integration
- 3. Data Management and Falcon Overview
- 4. Prerequisite to Installing or Upgrading Falcon
- 5. Considerations for Using Falcon
- 6. Configuring for High Availability
- 7. Creating Falcon Entity Definitions
- 8. Mirroring Data with Falcon
- 9. Replicating Data with Falcon
- 10. Mirroring Data with HiveDR in a Secure Environment
- 11. Enabling Mirroring and Replication with Azure Cloud Services
- 12. Using Advanced Falcon Features
  - Locating and Managing Entities
  - Accessing File Properties from Ambari
  - Enabling Transparent Data Encryption
  - Putting Falcon in Safe Mode
  - Viewing Alerts in Falcon
  - Late Data Handling
  - Setting a Retention Policy
  - Setting a Retry Policy
  - Enabling Email Notifications
  - Understanding Dependencies in Falcon
  - Viewing Dependencies
- 13. Using Apache Sqoop to Transfer Bulk Data
  - Apache Sqoop Connectors
  - Storing Protected Passwords in Sqoop
  - Sqoop Import Table Commands
  - Sqoop Import Jobs Using --as-avrodatafile
  - Netezza Connector
  - Sqoop-HCatalog Integration
    - Controlling Transaction Isolation
    - Automatic Table Creation
    - Delimited Text Formats and Field and Line Delimiter Characters
    - HCatalog Table Requirements
    - Support for Partitioning
    - Schema Mapping
    - Support for HCatalog Data Types
    - Providing Hive and HCatalog Libraries for the Sqoop Job
    - Examples
  - Configuring a Sqoop Action to Use Tez to Load Data into a Hive Table
  - Troubleshooting Sqoop
- 14. Using HDP for Workflow and Scheduling With Oozie
- 15. Using Apache Flume for Streaming
- 16. Troubleshooting
- 17. Appendix