Configuring Catalog
When using Spark SQL to query an Iceberg table from Spark, you refer to a table using the following dot notation:
<catalog_name>.<database_name>.<table_name>
The default catalog used by Spark is named spark_catalog.
When referring to a table in a database known to spark_catalog, you can
omit <catalog_name>. .
SparkCatalog property that understands Iceberg tables, and a
SparkSessionCatalog property that understands both Iceberg and
non-Iceberg tables. The following are configured by
default:spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog
spark.sql.catalog.spark_catalog.type=hive This replaces
Spark’s default catalog by Iceberg’s SparkSessionCatalog and allows you
to use both Iceberg and non-Iceberg tables out of the box.
There is one caveat when
using SparkSessionCatalog. Iceberg supports CREATE TABLE … AS
SELECT (CTAS) and REPLACE TABLE … AS SELECT (RTAS) as atomic
operations when using SparkCatalog. Whereas, the CTAS and RTAS are
supported but are not atomic when using SparkSessionCatalog. As a
workaround, you can configure another catalog that uses SparkCatalog. For
example, to create the catalog named iceberg_catalog, set the following:
spark.sql.catalog.iceberg_catalog=org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.iceberg_catalog.type=hive
