Support for transactions in Hive 0.13 enables SQL atomicity of operations at the row level rather than at the level of a table or partition. This allows a Hive client to read from a partition at the same time that another Hive client is adding rows to the same partition. In addition, transactions provide a mechanism for streaming clients to rapidly update Hive tables and partitions. Hive transactions differ from RDBMS transactions in that each transaction has an identifier, and multiple transactions can be grouped into a single transaction batch. After connecting to Hive, a streaming client requests a set of transaction IDs and then uses them one at a time as it initializes new transaction batches. The client writes one or more records for each transaction and commits or aborts that transaction before moving to the next one.
ACID is an acronym for four required traits of database transactions: atomicity, consistency, isolation, and durability.
| Transaction Attribute | Description |
|---|---|
| Atomicity | An operation either succeeds completely or fails; it does not leave partial data. |
| Consistency | Once an application performs an operation, the results of that operation are visible to the application in every subsequent operation. |
| Isolation | Operations by one user do not cause unexpected side effects for other users. |
| Durability | Once an operation is complete, it is preserved in case of machine or system failure. |
By default, transactions are disabled in Hive. To use ACID-based transactions, administrators must configure a transaction manager that supports ACID and store table data in the ORC file format. See Configuring the Hive Transaction Manager later in this section for instructions on configuring a transaction manager for Hive.
Note: See the Hive wiki for more information about Hive's support of ACID semantics for transactions.
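As a point of reference, the following is a minimal sketch of a table definition that satisfies these requirements; the table name and columns are hypothetical. The table is stored as ORC, bucketed, and marked transactional:

```sql
-- Hypothetical table that meets the ACID requirements:
-- ORC storage, bucketing, and the transactional table property.
CREATE TABLE employees (
    id     INT,
    name   STRING,
    salary DOUBLE
)
CLUSTERED BY (id) INTO 2 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');
```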
Understanding Compactions
Hive stores data in base files that cannot be updated in place in HDFS. Instead, Hive creates a set of delta files for each transaction that alters a table or partition and stores them in a separate delta directory. Occasionally, Hive compacts, or merges, the base and delta files (see the directory sketch after Table 2.3). Hive performs all compactions in the background without affecting concurrent reads and writes by Hive clients. There are two types of compactions:
Table 2.3. Hive Compaction Types
| Compaction Type | Description |
|---|---|
| Minor | Rewrites a set of delta files to a single delta file for a bucket. |
| Major | Rewrites one or more delta files and the base file as a new base file for a bucket. |
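To make the two compaction types concrete, the sketch below shows a hypothetical layout for one bucket of a partition. The base_<txnID> and delta_<minTxnID>_<maxTxnID> directory naming follows the convention used by Hive's ACID storage, but the path and transaction IDs are illustrative only:

```
/apps/hive/warehouse/t/ds=2014-04-01/
    base_0000020/bucket_00000            <- base file
    delta_0000021_0000021/bucket_00000   <- one delta directory per transaction
    delta_0000022_0000022/bucket_00000

After a minor compaction:  delta_0000021_0000022/bucket_00000  (deltas merged)
After a major compaction:  base_0000022/bucket_00000           (base and deltas merged)
```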
By default, Hive automatically compacts delta and base files at regular intervals. Hadoop administrators can tune automatic compactions, as well as perform manual compactions of base and delta files, using the following configuration parameters in hive-site.xml. An illustrative hive-site.xml fragment follows the table.
Table 2.4. Hive Transaction Configuration Parameters
| Configuration Parameter | Description |
|---|---|
| hive.txn.manager | Specifies the class name of the transaction manager used by Hive. Set this property to org.apache.hadoop.hive.ql.lockmgr.DbTxnManager to enable transactions. |
| hive.compactor.initiator.on | Specifies whether to run the initiator and cleaner threads on this metastore instance. The default value is false. Must be set to true for exactly one instance of the Hive metastore service. |
| hive.compactor.worker.threads | Specifies the number of worker threads to run on this metastore instance. The default value is 0; set this to a value greater than 0 to enable compactions. Worker threads initialize MapReduce jobs to perform compactions. Increasing the number of worker threads decreases the time required to compact tables after they cross the threshold that triggers compactions, but also increases the background load on the Hadoop cluster. |
| hive.compactor.worker.timeout | Specifies the time period, in seconds, after which a compaction job is failed and requeued. The default value is 86400 seconds (24 hours). |
| hive.compactor.check.interval | Specifies the time period, in seconds, between checks to see if any partitions require compacting. The default value is 300 seconds. Decreasing this value reduces the time required to start a compaction for a table or partition. However, it also increases the background load on the NameNode, since each check requires several calls to the NameNode. |
| hive.compactor.delta.num.threshold | Specifies the number of delta directories in a partition that triggers an automatic minor compaction. The default value is 10. |
| hive.compactor.delta.pct.threshold | Specifies the percentage size of delta files relative to the corresponding base files that triggers an automatic major compaction. The default value is 0.1, which is 10 percent. |
| hive.compactor.abortedtxn.threshold | Specifies the number of aborted transactions on a single partition that triggers an automatic major compaction. The default value is 1000. |
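The following hive-site.xml fragment is a sketch of how the compaction-trigger parameters above might be set; the values shown simply restate the defaults and are not tuning recommendations:

```xml
<!-- Illustrative compaction settings; values shown are the defaults. -->
<property>
    <name>hive.compactor.check.interval</name>
    <value>300</value>
</property>
<property>
    <name>hive.compactor.delta.num.threshold</name>
    <value>10</value>
</property>
<property>
    <name>hive.compactor.delta.pct.threshold</name>
    <value>0.1</value>
</property>
```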
Configuring the Hive Transaction Manager
Configure the following Hive properties to enable transactions; a minimal hive-site.xml sketch follows the list:
hive.txn.manager
hive.compactor.initiator.on
hive.compactor.worker.threads
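The fragment below shows one way to set these properties in hive-site.xml; the worker thread count is an arbitrary example value:

```xml
<property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
    <!-- Set to true on exactly one metastore instance. -->
    <name>hive.compactor.initiator.on</name>
    <value>true</value>
</property>
<property>
    <!-- Example value; must be greater than 0 to enable compactions. -->
    <name>hive.compactor.worker.threads</name>
    <value>2</value>
</property>
```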
Tip: To disable automatic compactions for individual tables, set the NO_AUTO_COMPACTION table property to true, as shown in the example below.
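For example, assuming a hypothetical table named employees:

```sql
-- Disable automatic compactions for a single table.
ALTER TABLE employees SET TBLPROPERTIES ('NO_AUTO_COMPACTION'='true');
```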
If you experience problems while enabling Hive transactions, check the Hive log file at /tmp/hive/hive.log.
Performing Manual Compactions
Hive administrators use the ALTER TABLE DDL command to queue requests that compact base and delta files for a table or partition:
ALTER TABLE tablename [PARTITION (partition_key='partition_value' [,...])] COMPACT 'compaction_type'
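For example, the following statements queue a minor compaction for one partition and a major compaction for an unpartitioned table; the table and partition names are hypothetical:

```sql
-- Queue a minor compaction for a single partition.
ALTER TABLE employees PARTITION (ds='2014-04-01') COMPACT 'minor';

-- Queue a major compaction for an unpartitioned table.
ALTER TABLE payroll COMPACT 'major';
```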
Use the SHOW COMPACTIONS command to monitor the progress of the compactions:
SHOW COMPACTIONS
Note: ALTER TABLE compacts tables even if the NO_AUTO_COMPACTION table property is set.
The SHOW COMPACTIONS command provides the following output for each compaction:
- Database name
- Table name
- Partition name
- Major or minor compaction
- Compaction state:
  - Initiated - waiting in queue
  - Working - currently compacting
  - Ready for cleaning - compaction completed and old files scheduled for removal
- Thread ID
- Start time of compaction
Hive administrators can also view a list of currently open and aborted transactions with the SHOW TRANSACTIONS command. This command provides the following output for each transaction:
- Transaction ID
- Transaction state
- Hive user who initiated the transaction
- Host machine where the transaction was initiated
New Lock Manager
Hive 0.13 introduced a new lock manager, DbLockManager, which stores all transaction and lock information in the Hive metastore. Lock holders and transaction initiators send regular heartbeats to the metastore to prevent stale locks and transactions: a lock or transaction is aborted if the metastore does not receive a heartbeat within the amount of time specified by the hive.txn.timeout configuration property. Hive administrators use the SHOW LOCKS DDL command to view information about locks associated with transactions. This command provides the following output for each lock:
- Database name
- Table name
- Partition, if the table is partitioned
- Lock state:
  - Acquired - the transaction initiator holds the lock
  - Waiting - the transaction initiator is waiting for the lock
  - Aborted - the lock has timed out but has not yet been cleaned up
- Lock type:
  - Exclusive - the lock may not be shared
  - Shared_read - the lock may be shared with any number of other shared_read locks
  - Shared_write - the lock may be shared by any number of other shared_read locks, but not with other shared_write locks
- Transaction ID associated with the lock, if one exists
- Last time the lock holder sent a heartbeat
- Time the lock was acquired, if it has been acquired
- Hive user who requested the lock
- Host machine on which the Hive user is running a Hive client
Note: The output of the command reverts to behavior prior to Hive 0.13 if administrators use the ZooKeeper or in-memory lock managers.
Transaction Limitations
HDP currently has the following limitations for ACID-based transactions in Hive:
- The BEGIN, COMMIT, and ROLLBACK SQL statements are not yet supported. All operations are automatically committed as transactions.
- The user initiating the Hive session must have write permission for the destination partition or table.
- ZooKeeper and in-memory locks are not compatible with transactions.
- Only ORC files are supported.
- Destination tables must be bucketed and not sorted.
- Transactions provide snapshot-level isolation, similar to READ COMMITTED: a query sees a consistent snapshot of the data for the duration of its execution.