Configuring HiveServer2 for Transactions (ACID Support)
Hive supports transactions that adhere to traditional relational database ACID characteristics: atomicity, consistency, isolation, and durability. See the article about ACID characteristics on Wikipedia for more information.
Limitations
Currently, ACID support in Hive has the following limitations:
BEGIN, COMMIT, and ROLLBACK are not yet supported.
Only the ORC file format is supported.
Transactions are configured to be off by default.
Tables that use transactions must be bucketed. For a discussion of bucketed tables, see the Apache site.
Hive ACID only supports Snapshot Isolation. Transactions only support auto-commit mode and may include exactly one SQL statement.
ZooKeeper and in-memory lock managers are not compatible with transactions. See the Apache site for a discussion of how locks are stored for transactions.
Schema changes made by using ALTER TABLE are not supported. HIVE-11421 is tracking this issue.
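To make these constraints concrete, the following HiveQL is a minimal sketch of a table that satisfies them: it is bucketed, stored as ORC, and marked transactional through the `transactional` table property described in the Apache Hive Transactions documentation. The table name `acid_demo`, its columns, and the bucket count are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table: the name, columns, and bucket count are illustrative.
CREATE TABLE acid_demo (
  id   INT,
  name STRING
)
CLUSTERED BY (id) INTO 4 BUCKETS          -- transactional tables must be bucketed
STORED AS ORC                             -- only the ORC file format is supported
TBLPROPERTIES ('transactional' = 'true'); -- transactions are off by default

-- Each statement below runs as its own auto-committed transaction;
-- BEGIN, COMMIT, and ROLLBACK are not available.
INSERT INTO TABLE acid_demo VALUES (1, 'alpha'), (2, 'beta');
UPDATE acid_demo SET name = 'gamma' WHERE id = 2;
DELETE FROM acid_demo WHERE id = 1;
```

The UPDATE changes a non-bucketing column on purpose, since Hive does not allow updates to the column a table is bucketed on.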
To configure HiveServer2 for transactions:
Set the following parameters in the hive-site.xml file:
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.enforce.bucketing</name>
  <value>true</value>
</property>
<property>
  <name>hive.exec.dynamic.partition.mode</name>
  <value>nostrict</value>
</property>
Ensure that a standalone Hive metastore is running with the following parameters set in its hive-site.xml file:
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value><positive_number></value>
</property>
Important: These are the minimum properties required to enable transactions in the standalone Hive metastore. See Hive Transactions on the Apache website for information about configuring Hive for transactions and additional configuration parameters.
Even though HiveServer2 runs with an embedded metastore, a standalone Hive metastore is required for ACID support to function properly. If you are not using ACID support with HiveServer2, you do not need a standalone metastore.
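Once HiveServer2 is running with the new configuration, one way to confirm that a session sees the HiveServer2 properties listed above is to print them from a client session such as Beeline. This is only a sketch: it shows the values as seen by the HiveServer2 session, and it does not show the compactor settings, which are read from the standalone metastore's own hive-site.xml.

```sql
-- Issuing SET with a property name and no value prints the current setting.
SET hive.support.concurrency;
SET hive.txn.manager;
SET hive.enforce.bucketing;
SET hive.exec.dynamic.partition.mode;
```

Each command should report the value configured in hive-site.xml, for example org.apache.hadoop.hive.ql.lockmgr.DbTxnManager for hive.txn.manager.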
The default value for hive.compactor.worker.threads is 0. Set this to a positive number to enable Hive transactions. Worker threads spawn MapReduce jobs to perform compactions, but they do not perform the compactions themselves. Increasing the number of worker threads decreases the time that it takes tables or partitions to be compacted. However, it also increases the background load on the Hadoop cluster, because more MapReduce jobs run in the background.
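When tuning the number of worker threads, it can help to watch what the compactor is doing. The statements below are a minimal sketch that uses HiveQL commands documented by Apache Hive; the table name `acid_demo` is hypothetical.

```sql
-- List compaction requests and their current state.
SHOW COMPACTIONS;

-- List open and aborted transactions.
SHOW TRANSACTIONS;

-- Queue a major compaction manually for a hypothetical table.
-- This requests a compaction; it does not change the table schema.
ALTER TABLE acid_demo COMPACT 'major';
```

If requests stay queued in the SHOW COMPACTIONS output for a long time, that may indicate too few worker threads for the workload, although the right number depends on the capacity of the cluster.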