Incrementally update an imported table
In CDP Private Cloud Base, you update an imported table by using Sqoop to import the incremental changes made to the original table, and then merging those changes with the table already imported into Hive.
You can automate the steps to incrementally update data in Hive by using Oozie.
- The first time the data was ingested into Hive, you stored the entire base table in Hive in ORC format.
- The base table definition, after you move it from the external table to a Hive-managed table, has the following schema:
CREATE TABLE base_table (
  id STRING,
  field1 STRING,
  modified_date DATE)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
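
To make the merge step concrete, the following Python sketch shows one plausible way to reconcile a base table with an incremental batch: keep a single row per id, choosing the row with the latest modified_date. It illustrates the merge semantics only and is not the Hive or Sqoop implementation. The function name merge_incremental and the sample rows are hypothetical; the column names come from the base_table schema above.

```python
from datetime import date


def merge_incremental(base_rows, incremental_rows):
    """Return one row per id, keeping the row with the latest modified_date."""
    latest = {}
    # Base rows are visited first, incremental rows second.
    for row in list(base_rows) + list(incremental_rows):
        current = latest.get(row["id"])
        # ">=" lets an incremental row win a tie on modified_date.
        if current is None or row["modified_date"] >= current["modified_date"]:
            latest[row["id"]] = row
    return sorted(latest.values(), key=lambda r: r["id"])


# Hypothetical sample data using the base_table columns.
base = [
    {"id": "1", "field1": "abcd", "modified_date": date(2014, 2, 1)},
    {"id": "2", "field1": "efgh", "modified_date": date(2014, 2, 1)},
]
incremental = [
    {"id": "2", "field1": "EFGH", "modified_date": date(2014, 2, 5)},  # updated row
    {"id": "3", "field1": "ijkl", "modified_date": date(2014, 2, 5)},  # new row
]

merged = merge_incremental(base, incremental)
for row in merged:
    print(row["id"], row["field1"], row["modified_date"])
```

In this sketch, the row with id 2 is replaced by its newer version and the row with id 3 is appended, while the row with id 1 is carried over unchanged.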