Data Movement and Integration
- 1. HDP Data Movement and Integration
- 2. Data Management and Falcon Overview
- 3. Considerations for Using Falcon
- 4. Prerequisites for Installing or Upgrading Falcon
- 5. Configuring for High Availability
- 6. Creating Falcon Entity Definitions
- 7. Mirroring Data with Falcon
- 8. Replicating Data with Falcon
- 9. Mirroring Data with HiveDR in a Secure Environment
- 10. Enabling Mirroring and Replication with Azure Cloud Services
- 11. Using Advanced Falcon Features
- Locating and Managing Entities
- Accessing File Properties from Ambari
- Enabling Transparent Data Encryption
- Putting Falcon in Safe Mode
- Viewing Alerts in Falcon
- Late Data Handling
- Setting a Retention Policy
- Setting a Retry Policy
- Enabling Email Notifications
- Understanding Dependencies in Falcon
- Viewing Dependencies
- 12. Using Apache Sqoop to Transfer Bulk Data
- Apache Sqoop Connectors
- Sqoop Import Table Commands
- Netezza Connector
- Sqoop-HCatalog Integration
- Controlling Transaction Isolation
- Automatic Table Creation
- Delimited Text Formats and Field and Line Delimiter Characters
- HCatalog Table Requirements
- Support for Partitioning
- Schema Mapping
- Support for HCatalog Data Types
- Providing Hive and HCatalog Libraries for the Sqoop Job
- Examples
- Configuring a Sqoop Action to Use Tez to Load Data into a Hive Table
- Troubleshooting Sqoop
- 13. Using HDP for Workflow and Scheduling with Oozie
- 14. Using Apache Flume for Streaming
- 15. Troubleshooting
- 16. Appendix