Command Line Upgrade

Configure and Start Apache Hive and Apache HCatalog

[Note]Note

The su commands in this section use "hive" to represent the Apache Hive Service user. If you are using another name for your Hive Service user, you need to substitute your Hive Service user name for "hive" in each of the su commands.

  1. Before you start the upgrade, set the following property in your Hive configuration file:

    datanucleus.autoCreateSchema=false
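
To confirm the setting from step 1 before upgrading, a small check like the one below may help. It is only a sketch: it assumes the property lives in hive-site.xml in the usual <property>/<name>/<value> layout, with each tag pair on a single line, and the function name get_hive_prop is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the <value> of a named property from a Hadoop-style XML file.
# Assumes <name>...</name> and <value>...</value> each sit on a single line.

# get_hive_prop FILE NAME
get_hive_prop() {
  awk -v name="$2" '
    /<name>/ {
      n = $0
      gsub(/.*<name>|<\/name>.*/, "", n)   # strip everything around the name
      found = (n == name)
    }
    found && /<value>/ {
      v = $0
      gsub(/.*<value>|<\/value>.*/, "", v) # strip everything around the value
      print v
      exit
    }
  ' "$1"
}

# Example (the path is an assumption; use your own configuration directory):
# get_hive_prop /etc/hive/conf/hive-site.xml datanucleus.autoCreateSchema
```

If the function prints nothing, the property is not present in the file in the form this sketch expects.
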
  2. Copy the JDBC connector JAR from OLD_HIVE_HOME/lib to CURRENT_HIVE_HOME/lib.

  3. Restore the backed-up JDBC JAR files into the $HIVE_HOME/lib directory. Also restore all Metastore-related properties (such as ConnectionURL and user) from your older Hive installation.
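
The JAR copy in steps 2 and 3 could be scripted roughly as follows. This is a sketch under assumptions: the function name copy_jdbc_jars and the default file pattern (a MySQL connector) are illustrative only, so pass the pattern that matches your own driver.

```shell
#!/usr/bin/env bash
# Sketch: copy JDBC connector JARs from an old Hive lib directory to the current one.
# The default glob is an assumption; pass the pattern that matches your driver.

# copy_jdbc_jars OLD_LIB NEW_LIB [PATTERN]
copy_jdbc_jars() {
  local old_lib="$1" new_lib="$2" pattern="${3:-mysql-connector-java*.jar}"
  local jar copied=0
  for jar in "$old_lib"/$pattern; do
    [ -f "$jar" ] || continue            # the glob matched nothing
    cp -p "$jar" "$new_lib"/ || return 1 # stop on the first failed copy
    echo "copied $(basename "$jar")"
    copied=$((copied + 1))
  done
  [ "$copied" -gt 0 ]                    # non-zero exit status if no JAR was found
}

# Example, using the placeholder directories from step 2:
# copy_jdbc_jars "$OLD_HIVE_HOME/lib" "$CURRENT_HIVE_HOME/lib"
```

The function returns a non-zero status when nothing matches, which makes a missing driver visible before the Metastore is started.
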

  4. Upgrade the Hive Metastore database schema. Restart the Hive Metastore database and run:

    su - hive -c "/usr/hdp/current/hive-metastore/bin/schematool -upgradeSchema -dbType <$databaseType>"

    The value for $databaseType can be derby, mysql, oracle, or postgres.

    [Important]Important

    When you use MySQL as your Hive metastore, you must use the mysql-connector-java-5.1.35.zip JDBC driver or later.

    [Note]Note

    If you are using PostgreSQL, you should reset the Hive Metastore database owner to <HIVE_USER>:

    sudo su - <POSTGRES_USER>

    Start the PostgreSQL CLI using the psql command.

    Execute: ALTER DATABASE <HIVE-METASTORE-DB-NAME> OWNER TO <HIVE_USER>;

    [Note]Note

    If you are using Oracle 11, you might see the following error message:

    14/11/17 14:11:38 WARN conf.HiveConf: HiveConf of name hive.optimize.mapjoin.mapreduce does not exist
    14/11/17 14:11:38 WARN conf.HiveConf: HiveConf of name hive.heapsize does not exist
    14/11/17 14:11:38 WARN conf.HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist
    14/11/17 14:11:38 WARN conf.HiveConf: HiveConf of name hive.semantic.analyzer.factory.impl does not exist
    14/11/17 14:11:38 WARN conf.HiveConf: HiveConf of name hive.auto.convert.sortmerge.join.noconditionaltask does not exist
    Metastore connection URL: jdbc:oracle:thin:@//ip-172-31-42-1.ec2.internal:1521/XE
    Metastore Connection Driver : oracle.jdbc.driver.OracleDriver
    Metastore connection User: hiveuser
    Starting upgrade metastore schema from version 0.13.0 to 0.14.0
    Upgrade script upgrade-0.13.0-to-0.14.0.oracle.sql
    Error: ORA-00955: name is already used by an existing object (state=42000,code=955)
    Warning in pre-upgrade script pre-0-upgrade-0.13.0-to-0.14.0.oracle.sql: Schema script failed, errorcode 2
    Completed upgrade-0.13.0-to-0.14.0.oracle.sql
    schemaTool completed

    You can safely ignore this error. It comes from the pre-upgrade script; the schematool upgrade itself completed successfully.

    [Note]Note

    Copy only the necessary configuration files. Do not copy the env.sh files, for example, hadoop-env.sh, hive-env.sh, and so forth. Additionally, all env.sh files must be properly configured.
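
Because step 4 accepts only four database types, it can be useful to validate the value before running schematool. The sketch below only prints the command rather than executing it; the function name upgrade_cmd is hypothetical, and "hive" stands for your Hive Service user as described in the note at the top of this section.

```shell
#!/usr/bin/env bash
# Sketch: validate the database type, then print the schematool upgrade command.
# Nothing is executed here; review the printed command and run it yourself.

SCHEMATOOL=/usr/hdp/current/hive-metastore/bin/schematool

# upgrade_cmd DBTYPE -> prints the command for a supported type, returns 1 otherwise
upgrade_cmd() {
  case "$1" in
    derby|mysql|oracle|postgres)
      echo "su - hive -c \"${SCHEMATOOL} -upgradeSchema -dbType $1\""
      ;;
    *)
      echo "unsupported database type: '$1'" >&2
      return 1
      ;;
  esac
}

# Example:
# upgrade_cmd mysql
```

Printing instead of executing keeps the sketch safe to try on a host where the Metastore database has not yet been backed up.
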

  5. Edit the hive-site.xml file and modify the properties based on your environment.

    1. Edit the following properties in the hive-site.xml file:

      <property>
       <name>fs.file.impl.disable.cache</name>
       <value>false</value>
       <description>Set to false or remove fs.file.impl.disable.cache</description> 
      </property>
       
      <property>
       <name>fs.hdfs.impl.disable.cache</name>
       <value>false</value>
       <description>Set to false or remove fs.hdfs.impl.disable.cache
       </description>
      </property>
    2. Optional: To enable the Hive built-in authorization mode, make the following changes. If you want to use the advanced authorization provided by Ranger, refer to the Ranger instructions.

      Set the following Hive authorization parameters in the hive-site.xml file:

      <property>
       <name>hive.server2.enable.doAs</name>
       <value>false</value>
      </property>
       
      <property>
       <name>hive.security.metastore.authorization.manager</name>
       <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider,org.apache.hadoop.hive.ql.security.authorization.MetaStoreAuthzAPIAuthorizerEmbedOnly</value>
      </property>
       
      <property>
       <name>hive.security.authorization.manager</name>
       <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory</value>
      </property>

      Also set hive.users.in.admin.role to a comma-separated list of the users to add to the admin role. A user who belongs to the admin role must run the "set role" command to obtain the admin privileges, because this role is not among the current roles by default.

      Set the following in the hiveserver2-site.xml file.

      <property>
       <name>hive.security.authenticator.manager</name>
       <value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
      </property>
       
      <property>
       <name>hive.security.authorization.enabled</name>
       <value>true</value>
      </property>
       
      <property>
       <name>hive.security.authorization.manager</name>
       <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
      </property>
    3. For a remote Hive metastore database, set the IP address (or fully-qualified domain name) and port of the metastore host using the following hive-site.xml property value.

      <property> 
       <name>hive.metastore.uris</name> 
       <value>thrift://$metastore.server.full.hostname:9083</value> 
       <description>URI for client to contact metastore server. 
         To enable HiveServer2, leave the property value empty. 
         </description>
      </property>

      You can further fine-tune your configuration settings based on node hardware specifications, using the HDP utility script.

  6. Start Hive Metastore.

    On the Hive Metastore host machine, run the following command:

    su - hive -c "nohup /usr/hdp/current/hive-metastore/bin/hive --service metastore -hiveconf hive.log.file=hivemetastore.log >/var/log/hive/hivemetastore.out 2>/var/log/hive/hivemetastoreerr.log &"

  7. Start HiveServer2.

    On the HiveServer2 host machine, run the following command:

    su - hive

    nohup /usr/hdp/current/hive-server2/bin/hiveserver2 -hiveconf hive.metastore.uris=" " -hiveconf hive.log.file=hiveserver2.log >/var/log/hive/hiveserver2.out 2> /var/log/hive/hiveserver2err.log &
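
After steps 6 and 7, one way to check that both services came up is to probe their TCP ports. The following is a minimal sketch, not part of the upgrade procedure itself: it assumes bash (for /dev/tcp) and the coreutils timeout command, the function names are hypothetical, 9083 is the metastore port shown in hive.metastore.uris above, and 10000 is assumed to be the HiveServer2 Thrift port in your configuration.

```shell
#!/usr/bin/env bash
# Sketch: report whether a TCP port accepts connections.
# Assumes bash (for /dev/tcp) and the coreutils `timeout` command.

# port_open HOST PORT -> exit status 0 if a connection succeeds within 3 seconds
port_open() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# report_service NAME HOST PORT -> prints one status line
report_service() {
  if port_open "$2" "$3"; then
    echo "$1: listening on $2:$3"
  else
    echo "$1: not reachable on $2:$3"
  fi
}

# Example (ports are assumptions; adjust them to your configuration):
# report_service "Hive Metastore" localhost 9083
# report_service "HiveServer2" localhost 10000
```

The services can take some time to open their ports after startup, so a "not reachable" result immediately after launch is not necessarily a failure; the log files named in the start commands are the place to look next.
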