Default Managed Tables
In CDP, managed tables are transactional tables with the insert_only
property by default. This changes what happens when you modify files directly on the file system
under a managed table. Review the new default behavior and the methods to switch to the old behavior.
New Default Behavior
- You can no longer perform file system modifications (adding or removing files) on a managed table in CDP. The directory structure for transactional tables differs from that of non-transactional tables, and files added out of band may or may not be picked up by Hive and Impala.
- The insert_only transactional tables cannot currently be altered in Impala. The ALTER TABLE statement on a transactional table currently displays an error.
- Impala does not currently support compaction on transactional tables. You should use Hive to compact the tables.
- The SELECT, INSERT, INSERT OVERWRITE, and TRUNCATE statements are supported on the insert-only transactional tables.
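The supported statements above can be sketched as follows. This is a minimal illustration, not a definitive workflow: the table name sales_demo and its columns are hypothetical, and the example assumes the default settings, under which a newly created managed table is an insert-only transactional table. The final statement is Hive syntax and is meant to be run from Hive (for example, through Beeline), since Impala does not compact transactional tables.

```sql
-- Hypothetical table; with CDP defaults this managed table is
-- created as an insert-only transactional table.
CREATE TABLE sales_demo (id INT, amount DOUBLE) STORED AS PARQUET;

-- Statements supported in Impala on insert-only transactional tables.
INSERT INTO sales_demo VALUES (1, 10.5), (2, 20.0);
SELECT id, amount FROM sales_demo;
INSERT OVERWRITE TABLE sales_demo VALUES (3, 30.0);
TRUNCATE TABLE sales_demo;

-- Run in Hive, not Impala: request a major compaction of the table.
ALTER TABLE sales_demo COMPACT 'major';
```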
Steps to switch to the CDH behavior:
- If you do not want transactional tables, set the DEFAULT_TRANSACTIONAL_TYPE query option to NONE so that any newly created managed tables are not transactional by default.
- External tables do not drop the data files when the table is dropped. To purge the data along with the table when the table is dropped, add external.table.purge = true in the table properties. When external.table.purge is set to true, the data is removed when the DROP TABLE statement is executed.
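As a rough sketch of the two steps above, the following Impala statements show one way they might be applied. The table names legacy_demo, ext_demo, and existing_ext are hypothetical, and the SET statement is assumed to affect only the current session.

```sql
-- Make newly created managed tables non-transactional for this session.
SET DEFAULT_TRANSACTIONAL_TYPE=NONE;

-- Hypothetical managed table; created without transactional properties
-- because of the query option set above.
CREATE TABLE legacy_demo (id INT) STORED AS PARQUET;

-- Hypothetical external table whose data files are purged on DROP TABLE.
CREATE EXTERNAL TABLE ext_demo (id INT)
STORED AS PARQUET
TBLPROPERTIES ('external.table.purge'='true');

-- The same property can be added to an existing external table
-- (existing_ext is a placeholder name).
ALTER TABLE existing_ext SET TBLPROPERTIES ('external.table.purge'='true');

-- With external.table.purge set to true, this removes the data files too.
DROP TABLE ext_demo;
```

To confirm which properties a table actually carries before dropping it, DESCRIBE FORMATTED on the table lists its table parameters.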