Moving Data into Apache Hive
There are several ways to move data into Hive. Which method you use depends on the source format of the data and on the target data format that you need. ORC is generally the preferred target data format because of the performance enhancements that it provides.
The following methods are most commonly used:
Table 2.11. Most Common Methods to Move Data into Hive
Source of Data | Target Data Format in Hive | Method Description |
---|---|---|
ETL for legacy systems | ORC file format | |
Operational SQL database | ORC file format | |
Streaming source that is "append only" | ORC file format | |
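
As an illustration of how data that has already been staged in HDFS might be converted to the ORC target format, the following minimal sketch builds two HiveQL statements: one that defines an external table over delimited text files, and one that copies those rows into a new ORC table with CREATE TABLE AS SELECT. This is one common pattern, not necessarily the method the table above prescribes. The table names, column list, and HDFS directory are hypothetical, and the helper functions are written for this example only.

```python
"""Sketch: build HiveQL that stages delimited files and rewrites them as ORC."""


def staging_table_ddl(table, columns, hdfs_dir, delimiter=","):
    """Return DDL for an external text table over files already in HDFS."""
    cols = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{table}` ({cols}) "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}' "
        f"STORED AS TEXTFILE LOCATION '{hdfs_dir}'"
    )


def orc_conversion_ctas(orc_table, staging_table):
    """Return a CREATE TABLE AS SELECT that copies the staged rows into ORC."""
    return (
        f"CREATE TABLE `{orc_table}` STORED AS ORC "
        f"AS SELECT * FROM `{staging_table}`"
    )


# Hypothetical schema, table names, and HDFS path, used for illustration only.
COLUMNS = [("id", "INT"), ("name", "STRING"), ("created", "TIMESTAMP")]
STATEMENTS = [
    staging_table_ddl("customers_staging", COLUMNS, "/data/staging/customers"),
    orc_conversion_ctas("customers_orc", "customers_staging"),
]

if __name__ == "__main__":
    # Print the statements so they can be reviewed before being run in Hive.
    for statement in STATEMENTS:
        print(statement + ";")
```

The script only generates the statements. They would still have to be submitted through a Hive client, and the delimiter and column types must match the actual source files.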