Data Movement and Integration

Chapter 12. Using Apache Sqoop to Transfer Bulk Data

Hortonworks Data Platform deploys Apache Sqoop for your Hadoop cluster. Sqoop is a tool designed to transfer bulk data between Hadoop and relational databases. You can use Sqoop to import data from a relational database management system (RDBMS) such as MySQL or Oracle into the Hadoop Distributed File System (HDFS), transform the data with Hadoop MapReduce, and then export the data back into an RDBMS. Sqoop automates most of this process, relying on the database to describe the schema of the data to be imported. Sqoop runs imports and exports as MapReduce jobs, which provides parallel operation as well as fault tolerance.
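The import and export round trip described above can be sketched as a pair of Sqoop command lines. The following Python helper is only an illustration: it builds the argument lists for `sqoop import` and `sqoop export` and prints them without running anything. The JDBC URL, user name, password file, table names, and HDFS paths are hypothetical placeholders, and the helper function names are invented for this sketch. The options used (`--connect`, `--username`, `--password-file`, `--table`, `--target-dir`, `--export-dir`, `--num-mappers`) are those documented in the Sqoop User Guide; check the guide for your Sqoop version before relying on them.

```python
import shlex


def sqoop_import_cmd(jdbc_url, username, password_file, table, target_dir, num_mappers=4):
    """Build a `sqoop import` command that copies one RDBMS table into HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,             # JDBC connection string for the source database
        "--username", username,
        "--password-file", password_file,  # file holding the password, instead of passing it inline
        "--table", table,                  # source table; Sqoop reads its schema from the database
        "--target-dir", target_dir,        # HDFS directory that receives the imported files
        "--num-mappers", str(num_mappers), # number of parallel map tasks
    ]


def sqoop_export_cmd(jdbc_url, username, password_file, table, export_dir, num_mappers=4):
    """Build a `sqoop export` command that writes HDFS files back into an RDBMS table."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,                  # destination table; it must already exist in the database
        "--export-dir", export_dir,        # HDFS directory containing the data to export
        "--num-mappers", str(num_mappers),
    ]


def render(cmd):
    """Quote each argument so the command can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    # All values below are hypothetical placeholders, not defaults of Sqoop or HDP.
    url = "jdbc:mysql://db.example.com/sales"
    print(render(sqoop_import_cmd(url, "etl_user", "/user/etl/.db-password",
                                  "orders", "/data/raw/orders")))
    print(render(sqoop_export_cmd(url, "etl_user", "/user/etl/.db-password",
                                  "order_summary", "/data/out/order_summary")))
```

Building the command as a list keeps each option and its value separate, so the same list could be passed to `subprocess.run` on a host where the Sqoop client is installed.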

For additional information, see the Apache Sqoop documentation, including these sections in the Sqoop User Guide: