Catalog operations

Catalog operations include creating, dropping, and describing a Hive database and table from Spark. Three methods of executing catalog operations are supported: .sql() (recommended), .execute() (when spark.datasource.hive.warehouse.read.jdbc.mode = client), or .executeQuery() (for backward compatibility in LLAP mode).
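
The examples that follow assume a HiveWarehouseSession handle named hive. A minimal Scala sketch of building that handle and running one catalog statement through each of the three methods (the builder call follows the standard HWC session API; the SparkSession variable name spark is an assumption):

    import com.hortonworks.hwc.HiveWarehouseSession

    // Build the HWC session handle from an existing SparkSession (assumed here to be named spark).
    val hive = HiveWarehouseSession.session(spark).build()

    // Recommended: run the catalog statement through .sql().
    hive.sql("describe extended web_sales").show()

    // Alternatives, per the modes noted above:
    // hive.execute("describe extended web_sales").show()      // JDBC client mode
    // hive.executeQuery("describe extended web_sales").show() // backward compatibility (LLAP)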

  • Set the current database for unqualified Hive table references

    hive.setDatabase(<database>)

  • Execute a catalog operation and return a DataFrame

    hive.execute("describe extended web_sales").show()

  • Show databases

    hive.showDatabases().show(100)

  • Show tables for the current database

    hive.showTables().show(100)

  • Describe a table

    hive.describeTable(<table_name>).show(100)

  • Create a database

    hive.createDatabase(<database_name>, <ifNotExists>)

  • Create an ORC table

    hive.createTable("web_sales").ifNotExists().column("sold_time_sk", "bigint").column("ws_ship_date_sk", "bigint").create()

    See the CreateTableBuilder interface section below for additional table creation options. You can also create Hive tables using hive.executeUpdate; both are sketched after this list.

  • Drop a database

    hive.dropDatabase(<databaseName>, <ifExists>, <useCascade>)

  • Drop a table

    hive.dropTable(<tableName>, <ifExists>, <usePurge>)
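
As referenced in the table-creation item above, a rough Scala sketch of the additional CreateTableBuilder options and of creating a table with hive.executeUpdate, plus concrete values for the boolean flags of the create and drop calls (the partition, clusterBy, and prop builder methods follow the CreateTableBuilder interface; table, column, and database names are illustrative):

    // Fuller table creation through the CreateTableBuilder interface.
    hive.createTable("web_sales")
      .ifNotExists()
      .column("sold_time_sk", "bigint")
      .column("ws_ship_date_sk", "bigint")
      .partition("ws_sold_date_sk", "bigint")   // partition column
      .clusterBy(100, "sold_time_sk")           // bucketing
      .prop("transactional", "true")            // table property
      .create()

    // Equivalent creation through raw DDL.
    hive.executeUpdate("CREATE TABLE IF NOT EXISTS web_sales (sold_time_sk bigint, ws_ship_date_sk bigint) STORED AS ORC")

    // Database and drop calls with the boolean arguments spelled out:
    hive.createDatabase("tpcds", true)        // ifNotExists = true
    hive.dropDatabase("tpcds", true, true)    // ifExists = true, cascade = true
    hive.dropTable("web_sales", true, false)  // ifExists = true, purge = false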