5. Resolving General Problems
5.1. Problem: When installing HDP 2.3.0 or 2.3.2, YARN ATS fails to start.
If you install an HDP cluster using HDP 2.3.0 or HDP 2.3.2, the YARN ATS server will fail to start with the following error in the yarn log:
2015-12-09 22:56:41,816 FATAL applicationhistoryservice.ApplicationHistoryServer (ApplicationHistoryServer.java:launchAppHistoryServer(161)) - Error starting ApplicationHistoryServer
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2227)
5.1.1. Solution:
Update the YARN configuration to use the LevelDB store:
1. In Ambari Web, browse to Services > YARN > Configs.
2. Filter for the yarn.timeline-service.store-class property and set its value to org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore.
3. Save the configuration change and restart YARN.
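For reference, the resulting yarn-site.xml entry should look like the following (shown here as a sketch; the property is normally managed through Ambari rather than edited by hand):
<property>
  <name>yarn.timeline-service.store-class</name>
  <value>org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore</value>
</property>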
5.2. Problem: After upgrading to Ambari 2.2, you receive File Does Not Exist alerts.
After upgrading to Ambari 2.2, you receive "DataNode Unmounted Data Dir" alerts indicating that the /var/lib/ambari-agent/data/datanode/dfs_data_dir_mount.hist file does not exist. The hadoop-env/dfs.datanode.data.dir.mount.file configuration property is no longer customizable from Ambari. The original default value of /etc/hadoop/conf/dfs_data_dir_mount.hist is now /var/lib/ambari-agent/data/datanode/dfs_data_dir_mount.hist, which is not customizable. On Ambari Agent upgrade, Ambari automatically moves the file from /etc/hadoop/conf/dfs_data_dir_mount.hist to /var/lib/ambari-agent/data/datanode/dfs_data_dir_mount.hist. If you have not modified this configuration property, no action is required.
5.2.1. Solution:
If you had previously modified the hadoop-env/dfs.datanode.data.dir.mount.file value to point to a custom location, after upgrading to Ambari 2.2 you must restart your DataNodes for the file to be written to the new location.
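To confirm the file was written after the restart, you can check on each DataNode host (a quick sanity check, not part of the documented procedure):
ls -l /var/lib/ambari-agent/data/datanode/dfs_data_dir_mount.hist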
5.3. Problem: During Enable Kerberos, the Check Kerberos operation fails.
When enabling Kerberos using the wizard, the Check Kerberos operation fails. In /var/log/ambari-server/ambari-server.log, you see a message: 02:45:44,490 WARN [qtp567239306-238] MITKerberosOperationHandler:384 - Failed to execute kadmin:
5.3.1. Solution 1:
Check that NTP is running and confirm that your hosts and the KDC times are in sync. A time skew of as little as 5 minutes can cause Kerberos authentication to fail.
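For example, on each host you can check synchronization status with the standard NTP utilities (here <kdc-host> is a placeholder for your KDC host):
ntpstat                    # reports whether the local clock is synchronized
ntpq -p                    # lists NTP peers and their offsets
date; ssh <kdc-host> date  # rough comparison of host and KDC clocks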
5.3.2. Solution 2: (on RHEL/CentOS/Oracle Linux)
Check that the Kerberos Admin principal being used has the necessary KDC ACL rights as set in /var/kerberos/krb5kdc/kadm5.acl.
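For example, a typical kadm5.acl entry that grants full privileges to all /admin principals in the realm looks like the following (EXAMPLE.COM is a placeholder for your realm):
*/admin@EXAMPLE.COM    *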
5.4. Problem: Hive developers may encounter an exception error message during Hive Service Check
MySQL is the default database used by the Hive metastore. Depending on several factors, such as the version and configuration of MySQL, a Hive developer may see an exception message similar to the following one:
An exception was thrown while adding/validating classes) : Specified key was too long; max key length is 767 bytes
5.4.1. Solution:
Administrators can resolve this issue by altering the Hive metastore database to use the Latin1 character set, as shown in the following example:
mysql> ALTER DATABASE <metastore.database.name> character set latin1;
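To verify the change, you can query the database's default character set; for example (using the standard information_schema view, with <metastore.database.name> as a placeholder):
mysql> SELECT default_character_set_name FROM information_schema.SCHEMATA WHERE schema_name = '<metastore.database.name>';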
5.5. Problem: API calls for PUT, POST, DELETE respond with a "400 - Bad Request"
When attempting to perform a REST API call, you receive a 400 error response. REST API calls require the "X-Requested-By" header.
5.5.1. Solution:
Starting with Ambari 1.4.2, you must include the "X-Requested-By" header with the REST API calls.
For example, if using curl, include the -H "X-Requested-By: ambari" option:
curl -u admin:admin -H "X-Requested-By: ambari" -X DELETE http://<ambari-host>:8080/api/v1/hosts/host1
5.6. Problem: Ambari checks disk usage on non-local disks, causing a high number of auto-mounted home directories
When Ambari issues its check to detect local disk capacity and usage for each Ambari Agent, it uses df by default instead of df -l, which checks only local disks. If you are using NFS auto-mounted home directories, this can lead to a large number of home directories being mounted on each host, causing shutdown delays and disk capacity check delays.
5.6.1. Solution:
On the Ambari Server, edit /etc/ambari-server/conf/ambari.properties and add the following property to check only locally mounted devices:
agent.check.remote.mounts=false
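For example, assuming a default installation, you can append the property and then restart Ambari Server for it to take effect:
echo "agent.check.remote.mounts=false" >> /etc/ambari-server/conf/ambari.properties
ambari-server restart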
5.7. Problem: Ambari Web shows Storm summary values as N/A in a Kerberized cluster
With a Kerberos-enabled cluster that includes Storm, in Ambari Web > Services > Storm, the Summary values for Slots, Tasks, Executors and Topologies show as "n/a". Ambari Server log also includes the following ERROR:
24 Mar 2015 13:32:41,288 ERROR [pool-2-thread-362] AppCookieManager:122 - SPNego authentication failed, cannot get hadoop.auth cookie for URL: http://c6402.ambari.apache.org:8744/api/v1/topology/summary?field=topologies
5.7.1. Solution:
When Kerberos is enabled, the Storm API requires SPNEGO authentication. Refer to Set Up Ambari for Kerberos in the Ambari Security Guide to enable Ambari to authenticate against the Storm API via SPNEGO.
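As a quick check, assuming you have obtained a Kerberos ticket with kinit and that <storm-ui-host> is a placeholder for your Storm UI Server host, you can test SPNEGO access to the Storm API with curl:
curl --negotiate -u : "http://<storm-ui-host>:8744/api/v1/topology/summary"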
5.8. Problem: When running Ambari Server as non-root, kadmin cannot open its log file.
When running Ambari Server as non-root and enabling Kerberos, if Ambari cannot access kadmind.log, kadmin fails to authenticate and you will see the following error in ambari-server.log:
STDERR: Couldn't open log file /var/log/kadmind.log: Permission denied kadmin: GSS-API (or Kerberos) error while initializing kadmin interface
5.8.1. Solution:
Be sure that the user the Ambari Server is configured to run as has permission to write to kadmind.log.
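For example, one way to grant access (where <ambari-user> is a placeholder for the account Ambari Server runs as):
chown <ambari-user> /var/log/kadmind.log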
5.9. Problem: After changing NameNode RPC port, Ambari shows both NameNodes as standby.
If you have enabled NameNode HA and change the NameNode RPC ports (by customizing the dfs.namenode.servicerpc-address property), Ambari will show both NameNodes as standby.
5.9.1. Solution:
When modifying the NameNode RPC port (dfs.namenode.servicerpc-address) after enabling NameNode HA, you must format ZKFC to make sure that the configuration data in ZooKeeper is refreshed. Run the following command to format the ZKFC znode:
su - <hdfs-user> -c 'hdfs zkfc -formatZK'
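Run this command on one of the NameNode hosts; <hdfs-user> is typically hdfs. Note that if the znode already exists, hdfs zkfc -formatZK prompts for confirmation before reformatting it.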