Non-Ambari Cluster Installation Guide

Setting Up the Hive/HCatalog Configuration Files

Note

When using HiveServer2 in HTTP mode, you must configure the mapping from Kerberos principals to short names in the "hadoop.security.auth_to_local" property in the core-site.xml file.
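
As an illustration only, a mapping of this kind in core-site.xml might look like the following sketch. The realm EXAMPLE.COM is a hypothetical placeholder, and the rule shown simply strips the realm from single-component principals; the rules you need depend on your own Kerberos principals.

```xml
<!-- Sketch only: EXAMPLE.COM is a placeholder realm, not a value from this guide. -->
<property>
     <name>hadoop.security.auth_to_local</name>
     <value>
          RULE:[1:$1@$0](.*@EXAMPLE.COM)s/@.*//
          DEFAULT
     </value>
</property>
```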

Use the following instructions to set up the Hive/HCatalog configuration files. Hortonworks provides a set of configuration files that represent a working Hive/HCatalog configuration. (See Download Companion Files.) You can use these files as a reference point, but you will need to modify them to match your own cluster environment.

If you choose to use the provided configuration files to set up your Hive/HCatalog environment, complete the following steps:

  1. Extract the configuration files to a temporary directory.

    The files are located in the configuration_files/hive directory, under the location where you decompressed the companion files.

  2. Modify the configuration files.

    In the configuration_files/hive directory, edit the hive-site.xml file and modify the properties based on your environment.

    Edit the connection properties for your Hive metastore database in hive-site.xml to match your own cluster environment.
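
    The metastore connection settings are typically the javax.jdo.option.* properties. The following is a minimal sketch that assumes a MySQL metastore database; the $-prefixed values are placeholders in the style used elsewhere in this guide, not values taken from it, and the JDBC URL and driver class will differ for other databases.

    ```xml
    <!-- Sketch only: assumes MySQL; replace every $placeholder with your own value. -->
    <property>
         <name>javax.jdo.option.ConnectionURL</name>
         <value>jdbc:mysql://$mysql.full.hostname:3306/$database.name?createDatabaseIfNotExist=true</value>
    </property>

    <property>
         <name>javax.jdo.option.ConnectionDriverName</name>
         <value>com.mysql.jdbc.Driver</value>
    </property>

    <property>
         <name>javax.jdo.option.ConnectionUserName</name>
         <value>$dbusername</value>
    </property>

    <property>
         <name>javax.jdo.option.ConnectionPassword</name>
         <value>$dbuserpassword</value>
    </property>
    ```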

    Warning

    To prevent memory leaks in unsecure mode, disable file system caches by setting the following parameters to true in hive-site.xml:

    • fs.hdfs.impl.disable.cache

    • fs.file.impl.disable.cache
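
    Expressed as hive-site.xml entries, the two parameters named in the warning above would look like this sketch:

    ```xml
    <!-- Disable the file system caches, as described in the warning above. -->
    <property>
         <name>fs.hdfs.impl.disable.cache</name>
         <value>true</value>
    </property>

    <property>
         <name>fs.file.impl.disable.cache</name>
         <value>true</value>
    </property>
    ```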

  3. (Optional) If you want storage-based authorization for Hive, set the following Hive authorization parameters in the hive-site.xml file:

    <property>
         <name>hive.metastore.pre-event.listeners</name>
         <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
    </property>
    
    <property>
         <name>hive.security.metastore.authorization.manager</name>
         <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
    </property>
    
    <property>
         <name>hive.security.authenticator.manager</name>
         <value>org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator</value>
    </property>

    Hive also supports SQL standard authorization. See "Hive Authorization" for more information about Hive authorization models.

  4. For a remote Hive metastore database, use the following hive-site.xml property value to set the IP address (or fully-qualified domain name) and port of the metastore host.

    <property>
         <name>hive.metastore.uris</name>
         <value>thrift://$metastore.server.full.hostname:9083</value>
         <description>URI for client to contact metastore server. To enable HiveServer2, leave the property value empty.</description>
    </property>

    To enable HiveServer2 for remote Hive clients, assign a value of a single empty space to this property. Hortonworks recommends using an embedded instance of the Hive Metastore with HiveServer2. An embedded metastore runs in the same process as HiveServer2 rather than as a separate daemon. You can also configure HiveServer2 to use an embedded metastore instance from the command line:

    hive --service hiveserver2 -hiveconf hive.metastore.uris=""

  5. (Optional) By default, Hive ensures that column names are unique in query results returned for SELECT statements by prefixing each column name with a table alias. Administrators who do not want the table alias prefix on column names can disable this behavior by setting the following configuration property:

    <property>
         <name>hive.resultset.use.unique.column.names</name>
         <value>false</value>
    </property>

    Important

    Hortonworks recommends that deployments disable the DataNucleus cache by setting the value of the datanucleus.cache.level2.type configuration parameter to none. Note that the datanucleus.cache.level2 configuration parameter is ignored, and assigning a value of none to this parameter will not have the desired effect.
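
    As a hive-site.xml entry, the recommended setting would look like this sketch. Note that the property name ends in .type:

    ```xml
    <!-- Disable the DataNucleus level 2 cache; use the .type parameter, not datanucleus.cache.level2. -->
    <property>
         <name>datanucleus.cache.level2.type</name>
         <value>none</value>
    </property>
    ```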