Catalog operations
Short descriptions and the syntax of catalog operations, which include creating, dropping, and describing an Apache Hive database and table from Apache Spark, help you write Hive Warehouse Connector (HWC) API apps.
Methods
Three methods of executing catalog operations are supported: .sql (recommended), .execute() (spark.datasource.hive.warehouse.read.mode=JDBC_CLIENT), or .executeQuery() for backward compatibility for LLAP reads.
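As a minimal sketch, the three methods could be called as follows. It assumes a spark-shell session (so `spark` is already defined), the HWC jar on the classpath, and an existing `web_sales` table. The session builder call is not shown in this section and is taken from the general HWC API. Whether a given statement is accepted by each method depends on the HWC version and the configured read mode, so treat the statements as placeholders.

```scala
import com.hortonworks.hwc.HiveWarehouseSession

// Assumption: `spark` is the SparkSession provided by spark-shell.
val hive = HiveWarehouseSession.session(spark).build()

// Recommended method: .sql()
hive.sql("describe extended web_sales").show()

// .execute(): this section associates it with
// spark.datasource.hive.warehouse.read.mode=JDBC_CLIENT
hive.execute("describe extended web_sales").show()

// .executeQuery(): kept for backward compatibility with LLAP reads
hive.executeQuery("describe extended web_sales").show()
```

Each call returns a DataFrame, so `.show()` can be replaced by any other DataFrame operation.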
Set the current database for unqualified Hive table references
hive.setDatabase(<database>)
Execute a catalog operation and return a DataFrame
hive.execute("describe extended web_sales").show()
Show databases
hive.showDatabases().show(100)
Show tables for the current database
hive.showTables().show(100)
Describe a table
hive.describeTable(<table_name>).show(100)
Create a database
hive.createDatabase(<database_name>, <ifNotExists>)
Create an ORC table
hive.createTable("web_sales").ifNotExists().column("sold_time_sk", "bigint").column("ws_ship_date_sk", "bigint").create()
See the CreateTableBuilder interface section below for additional table creation options. You can also create Hive tables using hive.executeUpdate.
Drop a database
hive.dropDatabase(<databaseName>, <ifExists>, <useCascade>)
Drop a table
hive.dropTable(<tableName>, <ifExists>, <usePurge>)
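The operations above can be combined into one short lifecycle, sketched here under a few assumptions: the code runs in spark-shell (so `spark` exists), the HWC jar is on the classpath, and the user may create and drop databases. The database name `hwc_demo` and the table `web_returns` are hypothetical names chosen for illustration. The boolean arguments follow the parameter order listed above.

```scala
import com.hortonworks.hwc.HiveWarehouseSession

// Assumption: `spark` is the SparkSession provided by spark-shell.
val hive = HiveWarehouseSession.session(spark).build()

// Hypothetical database name used only for this example.
val db = "hwc_demo"

// Create the database if it does not exist, then make it current
// so that unqualified table names resolve against it.
hive.createDatabase(db, true) // ifNotExists = true
hive.setDatabase(db)

// Create an ORC table with the builder shown in this section.
hive.createTable("web_sales")
  .ifNotExists()
  .column("sold_time_sk", "bigint")
  .column("ws_ship_date_sk", "bigint")
  .create()

// Alternative: create a table with a DDL statement via executeUpdate.
// `web_returns` and its column are illustrative only.
hive.executeUpdate(
  "CREATE TABLE IF NOT EXISTS web_returns (wr_item_sk bigint) STORED AS ORC")

// Inspect the catalog.
hive.showTables().show(100)
hive.describeTable("web_sales").show(100)

// Clean up: drop the tables, then the database.
hive.dropTable("web_sales", true, false)   // ifExists = true, usePurge = false
hive.dropTable("web_returns", true, false)
hive.dropDatabase(db, true, false)         // ifExists = true, useCascade = false
```

Passing `true` for the `ifExists` and `ifNotExists` arguments keeps the script safe to run more than once. Setting `useCascade` to `true` would instead drop the database together with any tables it still contains.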