Default Managed Tables

In CDP, managed tables are created as transactional tables with the insert_only property by default. This changes what happens when you modify files directly on the file system under a managed table. The sections below describe the new default behavior and how to switch back to the earlier CDH behavior.

New Default Behavior

  • You can no longer perform file system modifications (adding or removing files) on a managed table in CDP. The directory structure of transactional tables differs from that of non-transactional tables, and files added out-of-band may or may not be picked up by Hive and Impala.
  • Impala cannot currently alter insert_only transactional tables. An ALTER TABLE statement on a transactional table currently returns an error.
  • Impala does not currently support compaction on transactional tables. Use Hive to compact the tables.
  • The SELECT, INSERT, INSERT OVERWRITE, and TRUNCATE statements are supported on insert_only transactional tables.
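
The statements below are a minimal sketch of the behavior listed above, not an excerpt from the product documentation. The table name sales_txn is hypothetical, and the sketch assumes a CDP cluster with the default settings, so the managed table is created as an insert_only transactional table.

```sql
-- Run in Impala. sales_txn is a hypothetical table name.
-- With CDP defaults this managed table is created as insert_only transactional.
CREATE TABLE sales_txn (id INT, note STRING);

-- Statements the list above names as supported on insert_only tables.
INSERT INTO sales_txn VALUES (1, 'first'), (2, 'second');
SELECT id, note FROM sales_txn;
INSERT OVERWRITE sales_txn VALUES (3, 'replaced');
TRUNCATE TABLE sales_txn;

-- Run in Hive (for example through Beeline), not in Impala:
-- Impala does not compact transactional tables, so compaction is requested from Hive.
ALTER TABLE sales_txn COMPACT 'major';
```

An ALTER TABLE statement issued from Impala against sales_txn would be expected to fail, per the second item in the list.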

Steps to switch to the CDH behavior:
  • If you do not want transactional tables, set the DEFAULT_TRANSACTIONAL_TYPE query option to NONE so that any newly created managed tables are not transactional by default.
  • Dropping an external table does not delete its data files. To purge the data along with the table, set the external.table.purge table property to true. With this property set, the DROP TABLE statement removes the data files as well as the table.
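
As an illustration of the two steps above, the following sketch shows one way they might look in an Impala session. The table names events_plain and events_ext are hypothetical, and the storage format is an arbitrary choice for the example.

```sql
-- Run in impala-shell. The option applies to the current session.
SET DEFAULT_TRANSACTIONAL_TYPE=NONE;

-- Hypothetical managed table; with the option above it is created
-- as a non-transactional table.
CREATE TABLE events_plain (id INT, note STRING);

-- Hypothetical external table whose data files are removed on DROP TABLE.
CREATE EXTERNAL TABLE events_ext (id INT, note STRING)
STORED AS PARQUET
TBLPROPERTIES ('external.table.purge'='true');

-- The same property can be set on an existing external table.
ALTER TABLE events_ext SET TBLPROPERTIES ('external.table.purge'='true');

-- With the property set, this removes the data files along with the table.
DROP TABLE events_ext;
```

Setting the query option per session affects only tables created in that session. Making it the default for all sessions is a cluster configuration matter not covered by this sketch.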