Creating an S3-based external table

You use the LOCATION clause in the CREATE EXTERNAL TABLE statement to create an external data having source data on S3.

In this task, you create a partitioned, external table and load data from the source on S3. You can use the LOCATION clause in the CREATE TABLE to specify the location of external table data. The metadata is stored in the Hive warehouse.
  • Set up Hive policies in Ranger to include S3 URLs.
  1. Put data source files on S3.
  2. Create an external table based on the data source files.
    CREATE EXTERNAL TABLE `inventory`(
      `inv_item_sk` int,
      `inv_warehouse_sk` int,
      `inv_quantity_on_hand` int)
      PARTITIONED BY (
      `inv_date_sk` int) STORED AS ORC
      LOCATION
      's3a://BUCKET_NAME/tpcds_bin_partitioned_orc_200.db/inventory';