Write to Hive bucketed tables
Type of change: Property/Spark SQL changes
Spark 1.6
By default, you can write to Hive bucketed tables.
Spark 2.4
By default, you cannot write to Hive bucketed tables.
For example,
the following code snippet writes the data into a bucketed Hive table:
newPartitionsDF.write.mode(SaveMode.Append).format("hive").insertInto(hive_test_db.test_bucketing)
The
code above will throw the following
error:org.apache.spark.sql.AnalysisException: Output Hive table `hive_test_db`.`test_bucketing` is bucketed but Spark currently does NOT populate bucketed output which is compatible with Hive.
Action Required
To write to a Hive bucketed table, you must use hive.enforce.bucketing=false
and hive.enforce.sorting=false
to forego bucketing guarantees.