Catalog operations

Short descriptions and the syntax of catalog operations, which include creating, dropping, and describing an Apache Hive database and table from Apache Spark, helps you write HWC API apps.

Methods

.sql is the recommended method for executing catalog operations.

  • Set the current database for unqualified Hive table references

    hive.setDatabase(<database>)

  • Run a catalog operation and return a DataFrame

    hive.sql("describe extended web_sales").show()

  • Show databases

    hive.showDatabases().show(100)

  • Show tables for the current database

    hive.showTables().show(100)

  • Describe a table

    hive.describeTable(<table_name>).show(100)

  • Create a database

    hive.createDatabase(<database_name>,<ifNotExists>)

  • Create an ORC table

    hive.createTable("web_sales").ifNotExists().column("sold_time_sk", "bigint").column("ws_ship_date_sk", "bigint").create()

    See the CreateTableBuilder interface section below for additional table creation options. You can also create Hive tables using hive.executeUpdate.

  • Drop a database

    hive.dropDatabase(<databaseName>, <ifExists>, <useCascade>)

  • Drop a table

    hive.dropTable(<tableName>, <ifExists>, <usePurge>)