Using Apache Hive

Apache Hive query basics

Using Apache Hive, you can query distributed data storage, including Hadoop data. You need to know ANSI SQL to view, maintain, or analyze Hive data. Examples of the basics, such as how to insert, update, and delete data from a table, help you get started with Hive.

Hive supports ANSI SQL and atomic, consistent, isolated, and durable (ACID) transactions. For updating data, you can use the MERGE statement, which now also meets ACID standards. Materialized views optimize queries based on access patterns. Hive supports tables up to 300PB in Optimized Row Columnar (ORC) format. Other file formats are also supported.

You can create tables that resemble those in a traditional relational database. You use familiar insert, update, delete, and merge SQL statements to write and modify table data. The insert statement writes data to tables. Update and delete statements modify and delete values already written to Hive. The merge statement streamlines updates, deletes, and change data capture operations by drawing on co-existing tables. These statements support auto-commit, which treats each statement as a separate transaction and commits it after the SQL statement is executed.
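The insert, update, delete, and merge statements described above can be sketched in HiveQL as follows. This is a minimal illustration, not a definitive recipe: the table names `students` and `students_staging`, their columns, and the sample values are all hypothetical. It also assumes a Hive version and configuration in which ACID transactions are enabled.

```sql
-- Hypothetical target table. ACID operations require a transactional
-- table; ORC is the format the text above names for large tables.
CREATE TABLE students (id INT, name STRING, gpa DECIMAL(3,2))
  STORED AS ORC
  TBLPROPERTIES ('transactional'='true');

-- Hypothetical staging table holding incoming changes for the merge.
CREATE TABLE students_staging (id INT, name STRING, gpa DECIMAL(3,2), is_deleted BOOLEAN)
  STORED AS ORC
  TBLPROPERTIES ('transactional'='true');

-- INSERT writes new rows to the table.
INSERT INTO TABLE students VALUES
  (1, 'fred flintstone', 1.28),
  (2, 'barney rubble', 2.32);

-- UPDATE modifies values already written to Hive.
UPDATE students SET gpa = 3.00 WHERE id = 2;

-- DELETE removes rows that match the predicate.
DELETE FROM students WHERE gpa < 1.50;

-- MERGE applies deletes, updates, and inserts from the staging table
-- in a single statement.
MERGE INTO students AS t
USING students_staging AS s
ON t.id = s.id
WHEN MATCHED AND s.is_deleted THEN DELETE
WHEN MATCHED THEN UPDATE SET name = s.name, gpa = s.gpa
WHEN NOT MATCHED THEN INSERT VALUES (s.id, s.name, s.gpa);
```

Because of auto-commit, each statement in this sketch runs as its own transaction and is committed when it finishes.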