Data Access

Transactions in Hive

Support for transactions in Hive 0.13 and later enables SQL atomicity of operations at the row level rather than at the level of a table or partition. This allows a Hive client to read from a partition at the same time that another Hive client is adding rows to the same partition. In addition, transactions provide a mechanism for streaming clients to rapidly update Hive tables and partitions.

Hive transactions differ from RDBMS transactions in that each transaction has an identifier, and multiple transactions are grouped into a single transaction batch. A streaming client requests a set of transaction IDs after connecting to Hive and then uses these transaction IDs one at a time as it initializes new transaction batches. Clients write one or more records for each transaction and either commit or abort that transaction before moving to the next one.
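The transaction-batch flow described above can be illustrated with a small toy model in Python. This is not the Hive streaming API: the class `ToyTransactionBatch`, its method names, and the transaction IDs are hypothetical and exist only to show the ordering rules (begin a transaction, write records, then commit or abort before moving to the next ID).

```python
from dataclasses import dataclass, field
from enum import Enum


class TxnState(Enum):
    OPEN = "open"
    COMMITTED = "committed"
    ABORTED = "aborted"


@dataclass
class ToyTransactionBatch:
    """Hypothetical in-memory stand-in for a batch of pre-allocated transaction IDs."""
    txn_ids: list
    committed_rows: list = field(default_factory=list)
    states: dict = field(default_factory=dict)
    _pos: int = -1
    _pending: list = field(default_factory=list)

    def begin_next_transaction(self):
        # The previous transaction must be finished before the next ID is used.
        if self._pos >= 0 and self.states[self.txn_ids[self._pos]] is TxnState.OPEN:
            raise RuntimeError("commit or abort the current transaction first")
        if self._pos + 1 >= len(self.txn_ids):
            raise RuntimeError("transaction batch exhausted")
        self._pos += 1
        self._pending = []
        self.states[self.txn_ids[self._pos]] = TxnState.OPEN
        return self.txn_ids[self._pos]

    def write(self, record):
        # Records are buffered and stay invisible until commit.
        self._require_open()
        self._pending.append(record)

    def commit(self):
        # Publish every buffered record under the current transaction ID.
        self._require_open()
        txn_id = self.txn_ids[self._pos]
        self.committed_rows.extend((txn_id, r) for r in self._pending)
        self.states[txn_id] = TxnState.COMMITTED
        self._pending = []

    def abort(self):
        # Discard every buffered record; nothing partial is left behind.
        self._require_open()
        self.states[self.txn_ids[self._pos]] = TxnState.ABORTED
        self._pending = []

    def _require_open(self):
        if self._pos < 0 or self.states.get(self.txn_ids[self._pos]) is not TxnState.OPEN:
            raise RuntimeError("no open transaction")


# Usage: three pre-allocated IDs, consumed one at a time.
batch = ToyTransactionBatch(txn_ids=[101, 102, 103])

batch.begin_next_transaction()
batch.write("a")
batch.write("b")
batch.commit()

batch.begin_next_transaction()
batch.write("c")
batch.abort()  # record "c" is dropped

batch.begin_next_transaction()
batch.write("d")
batch.commit()

print(batch.committed_rows)  # [(101, 'a'), (101, 'b'), (103, 'd')]
```

In this sketch the aborted transaction 102 contributes no rows, which mirrors the all-or-nothing behavior a reader of the table would expect.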

ACID is an acronym for four required traits of database transactions: atomicity, consistency, isolation, and durability.

Transaction Attribute: Description

Atomicity: An operation either succeeds completely or fails; it does not leave partial data.

Consistency: Once an application performs an operation, the results of that operation are visible to the application in every subsequent operation.

Isolation: Operations by one user do not cause unexpected side effects for other users.

Durability: Once an operation is complete, it is preserved in case of machine or system failure.

Administrators:

To use ACID-based transactions, administrators must configure a transaction manager that supports ACID, and the tables must use the ORC file format. See Understanding and Administering Hive Compactions for instructions on configuring a transaction manager for Hive.

Developers:

Developers and others can create ACID tables by either of the following methods:

Creating Hive ACID Transaction Tables in Ambari
Creating the tables with SQL outside the Ambari framework
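For the second method, the following Python sketch renders the kind of HiveQL statement typically used to create a transactional table. It assumes the requirements of the Hive releases this section covers: the table is bucketed, stored as ORC, and marked with the `transactional` table property. The helper `acid_table_ddl`, the table name `web_events`, and the column names are hypothetical. Check the Hive wiki for the exact requirements of your Hive version.

```python
def acid_table_ddl(table, columns, bucket_column, num_buckets):
    """Render a HiveQL CREATE TABLE statement for a transactional (ACID) table.

    columns is a list of (name, hive_type) pairs. All names passed in are
    illustrative; this helper only builds a string and does not contact Hive.
    """
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE TABLE {table} ({cols})\n"
        f"CLUSTERED BY ({bucket_column}) INTO {num_buckets} BUCKETS\n"
        "STORED AS ORC\n"
        "TBLPROPERTIES ('transactional'='true')"
    )


# Hypothetical table with two columns, bucketed on id.
ddl = acid_table_ddl("web_events", [("id", "INT"), ("msg", "STRING")], "id", 4)
print(ddl)
```

The generated statement can then be run from any Hive SQL client once a transaction manager has been configured.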
Note: See the Hive wiki for more information about Hive's support of ACID semantics for transactions.