Dropping an external table and data

When you run DROP TABLE on an external table, by default Hive and Impala drop only the metadata (schema). If you want the DROP TABLE command to also remove the actual data in the external table, as DROP TABLE does on a managed table, you need to configure the table properties accordingly.

Set up Hadoop SQL policies in Ranger to include S3 URLs.

  1. Create a CSV file of data you want to query.
  2. Start Hive or Impala.
  3. Create an external table to store the CSV data, configuring the table so you can drop it along with the data.
    Hive and Impala example:
    CREATE EXTERNAL TABLE IF NOT EXISTS names_text2(
      a INT, b STRING)
      ROW FORMAT DELIMITED
      FIELDS TERMINATED BY ','
      STORED AS TEXTFILE
      LOCATION 's3a://cdpsaasdemo-cdp-private-default-1ogvplm/user/andrena'
      TBLPROPERTIES ('external.table.purge'='true');
  4. Run DROP TABLE on the external table.
    Hive and Impala example:
    DROP TABLE names_text2;
    The table is removed from Hive Metastore and the data stored externally is also removed. For example, names_text is removed from the Hive Metastore and the CSV file that stored the data is also deleted from the file system.
  5. Prevent data in the external table from being deleted by a DROP TABLE statement.
    Hive and Impala example:
    ALTER TABLE addresses_text SET TBLPROPERTIES ('external.table.purge'='false');