Data Services

Chapter 4. Using HDP for Metadata Services (HCatalog)

Hortonworks Data Platform (HDP) deploys Apache HCatalog to manage the metadata services for your Hadoop cluster.

Apache HCatalog is a table and storage management service for data created using Apache Hadoop. Its features include:

  • Providing a shared schema and data type mechanism.

  • Providing a table abstraction so that users need not be concerned with where or how their data is stored.

  • Providing interoperability across data processing tools such as Pig, MapReduce, and Hive.

Start the HCatalog CLI with the following command:
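
As a minimal sketch, assuming the `hcat` launcher script is on your PATH (its install location varies by HDP version and platform, so the `HCAT_BIN` variable below is a hypothetical override, not an HDP setting), the CLI can be invoked with a single DDL statement like this:

```shell
#!/bin/sh
# Sketch only: assumes the hcat launcher is on PATH. HCAT_BIN is a
# hypothetical override for installations that place it elsewhere.
HCAT_BIN="${HCAT_BIN:-hcat}"

if command -v "$HCAT_BIN" >/dev/null 2>&1; then
    # -e passes a single DDL statement to HCatalog and exits.
    "$HCAT_BIN" -e "SHOW TABLES;"
    HCAT_STATUS=$?
else
    echo "hcat not found; set HCAT_BIN to the launcher's full path" >&2
    HCAT_STATUS=127
fi
echo "hcat exit status: $HCAT_STATUS"
```

To run several statements at once, the `-f` option takes the path of a file containing DDL commands instead of an inline statement.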


HCatalog 0.5.0 was the final version released from the Apache Incubator. In March 2013, HCatalog graduated from the Apache Incubator and became part of the Apache Hive project. New releases of Hive include HCatalog, starting with Hive 0.11.0.

HCatalog includes two documentation sets:

  1. General information about HCatalog

    This documentation covers installation and user features. The next section, Using HCatalog, provides links to individual documents in the HCatalog documentation set.

  2. WebHCat information

    WebHCat is a web API for HCatalog and related Hadoop components. The section Using WebHCat provides links to user and reference documents, and includes a technical update about standard WebHCat parameters.
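
Because WebHCat is exposed over HTTP, a quick way to confirm that the service is reachable is to request its status resource. The sketch below assumes a WebHCat server on `localhost` at port 50111 (the WebHCat default); the `WEBHCAT_HOST` and `WEBHCAT_PORT` variables are illustrative names, not HDP settings, so substitute the values for your cluster.

```shell
#!/bin/sh
# Sketch only: host and port are assumptions. 50111 is WebHCat's default
# port; adjust both for your cluster.
WEBHCAT_HOST="${WEBHCAT_HOST:-localhost}"
WEBHCAT_PORT="${WEBHCAT_PORT:-50111}"
STATUS_URL="http://${WEBHCAT_HOST}:${WEBHCAT_PORT}/templeton/v1/status"

if command -v curl >/dev/null 2>&1; then
    # GET the status resource; give up after 5 seconds if unreachable.
    curl -s --max-time 5 "$STATUS_URL" \
        || echo "WebHCat not reachable at $STATUS_URL" >&2
else
    echo "curl not installed; request $STATUS_URL with any HTTP client" >&2
fi
```

The `templeton/v1` path prefix reflects the project's earlier name, Templeton, which the WebHCat REST resources still use.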

For more details on the Apache Hive project, including HCatalog and WebHCat, see "Using Apache Hive" and the following resources: