Apache Hive Metastore Overview

Introduction to Hive metastore

Hive metastore (HMS) is a service that stores metadata for Apache Hive and other services in a backend RDBMS, such as MySQL or PostgreSQL. Impala, Spark, Hive, and other services share the metastore. HMS has connections to and from HiveServer, Ranger, and the NameNode, which represents HDFS.

Beeline, Hue, JDBC, and Impala shell clients make requests to HiveServer through Thrift or JDBC. The HiveServer instance reads data from and writes data to HMS. By default, redundant HMS instances operate in active/active mode. The physical data resides in a backend RDBMS that is dedicated to HMS, and you must configure all HMS instances to use the same backend database. A separate RDBMS supports the security service, for example Ranger. All connections are routed to a single RDBMS service at any given time. HMS talks to the NameNode over Thrift and functions as a client to HDFS.
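
The requirement that every HMS instance uses the same backend database can be checked with a short script. The following Python sketch is illustrative only: it assumes each HMS host has a hive-site.xml file that sets the standard javax.jdo.option.ConnectionURL property, and the host names, JDBC URL, and helper functions are hypothetical.

```python
from typing import Dict, Optional
from xml.etree import ElementTree

# Hive property that holds the JDBC URL of the metastore backend database.
CONNECTION_URL_KEY = "javax.jdo.option.ConnectionURL"


def connection_url(hive_site_xml: str) -> Optional[str]:
    """Return the backend JDBC URL found in the text of a hive-site.xml file."""
    root = ElementTree.fromstring(hive_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == CONNECTION_URL_KEY:
            return prop.findtext("value", default="").strip()
    return None  # the property is not set in this file


def share_one_backend(configs: Dict[str, str]) -> bool:
    """True if every HMS instance points at the same backend database."""
    urls = {connection_url(xml) for xml in configs.values()}
    return len(urls) == 1 and None not in urls


def hive_site(url: str) -> str:
    """Build a minimal hive-site.xml document (hypothetical sample data)."""
    return (
        "<configuration><property>"
        "<name>" + CONNECTION_URL_KEY + "</name>"
        "<value>" + url + "</value>"
        "</property></configuration>"
    )


if __name__ == "__main__":
    # Hypothetical hosts and database URL, for illustration only.
    configs = {
        "hms-1.example.com": hive_site("jdbc:mysql://db.example.com:3306/metastore"),
        "hms-2.example.com": hive_site("jdbc:mysql://db.example.com:3306/metastore"),
    }
    print(share_one_backend(configs))
```

In practice, you would read the configuration text from each HMS host instead of building it in memory. The sketch compares the URLs as plain strings, so two URLs that reach the same database through different host aliases are reported as different.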

Both HMS and HiveServer connect directly to Ranger and the NameNode (HDFS). For simplicity, the diagram does not show these connections. One or more HMS instances on the backend can talk to other services, such as Ranger.