Automatic Table Creation
One of the key features of Sqoop is its ability to manage and create table metadata when importing into Hadoop. HCatalog import jobs also provide this capability through the option --create-hcatalog-table.
Furthermore, one of the important benefits of the HCatalog integration is that it makes Sqoop data movement jobs storage-agnostic. To support this, HCatalog import jobs offer an option that lets the user specify the storage format of the created table.
The option --create-hcatalog-table is used as an indicator that a table has to be created as part of the HCatalog import job. The option --hcatalog-storage-stanza can be used to specify the storage format of the newly created table. The default value for this option is "stored as rcfile". The value specified for this option is assumed to be a valid Hive storage format expression. It will be appended to the CREATE TABLE command generated by the HCatalog import job as part of automatic table creation. Any error in the storage stanza will cause the table creation to fail and the import job will be aborted.
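For example, a job of the following form (the connection string and table names are illustrative, not taken from this guide) would create the HCatalog table automatically and store it as ORC rather than the default RCFile:

    $ sqoop import --connect jdbc:mysql://db.example.com/corp \
        --table EMPLOYEES \
        --hcatalog-database default --hcatalog-table employees \
        --create-hcatalog-table \
        --hcatalog-storage-stanza "stored as orcfile"

Here "stored as orcfile" is passed through verbatim as the storage clause of the generated CREATE TABLE statement.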
Any additional resources needed to support the storage format referenced in the option --hcatalog-storage-stanza should be provided to the job either by placing them in $HIVE_HOME/lib or by providing them in HADOOP_CLASSPATH and LIBJAR files.
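As a sketch of the latter approach (the jar path and storage handler class are hypothetical placeholders), the extra jar can be exposed to both the client and the MapReduce tasks like this:

    $ export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/opt/serdes/custom-serde.jar
    $ sqoop import -libjars /opt/serdes/custom-serde.jar \
        --connect jdbc:mysql://db.example.com/corp --table EMPLOYEES \
        --hcatalog-table employees --create-hcatalog-table \
        --hcatalog-storage-stanza "stored by 'com.example.hive.CustomStorageHandler'"

Note that generic Hadoop arguments such as -libjars must appear before the Sqoop-specific arguments.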
If the option --hive-partition-key is specified, then the value of this option is used as the partitioning key for the newly created table. Only one partitioning key can be specified with this option.
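For instance, the following sketch (names and values are illustrative) creates a table partitioned on a country column and imports the data into the static partition country=US by also supplying --hive-partition-value:

    $ sqoop import --connect jdbc:mysql://db.example.com/corp \
        --table EMPLOYEES \
        --hcatalog-table employees --create-hcatalog-table \
        --hcatalog-storage-stanza "stored as orcfile" \
        --hive-partition-key country --hive-partition-value US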
When mapped to an HCatalog table, object names are converted to their lowercase equivalents, as specified below. This includes the table name (which is the same as the external store table name converted to lower case) and field names.
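For example (hypothetical names), importing a database table EMPLOYEES with columns EMP_ID and DEPT_NAME and letting the job create the table results in an HCatalog table named employees with fields emp_id and dept_name.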