Upgrade Troubleshooting
In the event of a problem during or after the upgrade, contacting Hortonworks Support is highly recommended. Alternatively, you can perform the following troubleshooting procedures.
Restoring the Hive Metastore
On the node where the Hive Metastore database resides, create the databases you want to restore. For example:
$ mysql -u hiveuser -p -e "create database <hive_db_schema_name>;"
Restore each Metastore database from the dump you created. For example:
$ mysql -u hiveuser -p <hive_db_schema_name> < </path/to/dump_file>
Reconfigure Hive Metastore if necessary; reconfiguration might be needed if the upgrade fails. Contacting Hortonworks Support for help with reconfiguration is recommended. Alternatively, in HDP 3.x, you can configure Hive Metastore by setting key=value properties on the command line.
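For example, when using the Hive client, a Metastore property can be overridden with the --hiveconf option; the property and value below are illustrative only:

```shell
# Illustrative sketch: override a Hive Metastore property on the
# command line when starting the Hive client. Substitute your own
# key=value pairs for the example shown here.
hive --hiveconf hive.metastore.warehouse.dir=/warehouse/tablespace/managed/hive
```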
Solving Problems Using Spark and Hive
Use the Hive Warehouse Connector (HWC) and low-latency analytical processing (LLAP) to access Spark data after upgrading. HWC is a Spark library/plugin that is launched with the Spark app. HWC and LLAP are required for certain tasks, as shown in the following table:
Table 4.2. Spark Compatibility
Tasks | HWC Required | LLAP Required | Other Requirement/Comments
---|---|---|---
Read Hive managed tables from Spark | Yes | Yes | Ranger ACLs enforced.
Write Hive managed tables from Spark | Yes | No | Ranger ACLs enforced.
Read Hive external tables from Spark | No | Only if HWC is used | Table must be defined in the Spark catalog. Ranger ACLs not enforced.
Write Hive external tables from Spark | No | No | Ranger ACLs enforced.
Spark-submit and pyspark are supported. The Spark Thrift Server is not supported.
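For example, a spark-shell session using HWC might be launched as follows; the assembly jar version, host names, and ports are placeholders you must adapt to your cluster:

```shell
# Sketch: start spark-shell with the Hive Warehouse Connector.
# <version>, <llap_host>, and <metastore_host> are placeholders.
spark-shell \
  --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-<version>.jar \
  --conf spark.sql.hive.hiveserver2.jdbc.url="jdbc:hive2://<llap_host>:10500/" \
  --conf spark.datasource.hive.warehouse.metastoreUri="thrift://<metastore_host>:9083" \
  --conf spark.datasource.hive.warehouse.load.staging.dir=/tmp
```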
Accessing Hive tables using SparkSQL
To use SparkSQL to access tables that were converted to ACID tables during the upgrade, create a new external table using Hive 3 and migrate the data from the managed table to the new table.
Rename the managed table to *_old.
Migrate the data from *_old to the new external table, using the original table name, in the historical or the default location (/warehouse/tablespace/external/hive/<?>.db/<tablename>). For example:
CREATE EXTERNAL TABLE new_t AS SELECT * FROM old_t;
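Put together, the rename and migration can be run through Beeline; a sketch, with an illustrative table name t and a placeholder HiveServer2 URL:

```shell
# Sketch: rename the upgraded managed table, then copy its data into a
# new external table that reuses the original name. "t" and the JDBC
# URL are placeholders; adapt them to your cluster.
beeline -u "jdbc:hive2://<hiveserver2_host>:10000/default" -e "
ALTER TABLE t RENAME TO t_old;
CREATE EXTERNAL TABLE t AS SELECT * FROM t_old;
"
```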
Recovering missing Hive tables
Log in as the HDFS superuser. For example:
$ sudo su - hdfs
Read the snapshots on HDFS that back up your table data. For example:
$ hdfs dfs -cat /apps/hive/warehouse/.snapshot/s20181204-164645.898/students/000000_0
Example output for a trivial table having two rows and three columns might look something like this:
fred flintstone  35  1.28
barney rubble    32  2.32
In Hive, insert the data into the table if the schema exists in the Hive warehouse; otherwise, restore the Hive Metastore, which includes the schemas, from the database dump you created in the pre-upgrade process.
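If the schema exists, the snapshot data can first be copied back out of the read-only snapshot directory; a sketch, assuming the example paths above (the staging destination is illustrative):

```shell
# Sketch: copy the table data out of the read-only HDFS snapshot into
# a staging directory so it can then be inserted into the Hive table
# as described above. Paths follow the example; adapt as needed.
hdfs dfs -cp \
  /apps/hive/warehouse/.snapshot/s20181204-164645.898/students \
  /tmp/students_restore
```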
YARN Registry DNS instance fails to start
The YARN Registry DNS instance will fail to start if another process on the host is bound to port 53. Ensure no other services that are binding to port 53 are on the host where the YARN Registry DNS instance is deployed.
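To find out whether something is already bound to port 53 on the host, you can inspect listening sockets; a minimal check, assuming the ss utility is available:

```shell
# Report whether any TCP or UDP socket is already listening on port 53.
if ss -lntu 2>/dev/null | grep -q ':53 '; then
  echo "port 53 in use"
else
  echo "port 53 free"
fi
```

If port 53 is in use, stop or move the conflicting service (often a local DNS resolver) before starting the YARN Registry DNS instance.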
Class Loading Issue When Starting Solr
If you do not perform the upgrade steps in the documented order, the Infra Solr instance may fail to start with the following exception:
null:org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.security.InfraRuleBasedAuthorizationPlugin'
If you see this exception, follow the steps in the related HCC article to work around the issue.
Ambari Metrics System (AMS) does not start
If the Ambari Metrics System (AMS) does not start after the upgrade, you might see the following snippet in the HBase Master log:
master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1543610616273, server=regionserver1.domain.com,41213,1543389145213}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined
The workaround is to manually clean up the znode from ZooKeeper.
If AMS mode = embedded, remove the znode data from the local filesystem path. For example:
rm -f /var/lib/ambari-metrics-collector/hbase-tmp/zookeeper/zookeeper_0/version-2/*
If AMS mode = distributed, connect to the cluster ZooKeeper instance and delete the following znode before restarting:
/usr/hdp/current/zookeeper-client/bin/zkCli.sh -server localhost:2181
[zk: localhost:2181(CONNECTED) 0] rmr /ams-hbase-unsecure/meta-region-server