5. Known Issues

Ambari 2.2.1.0 has the following known issues, scheduled for resolution in a future release. Also, refer to the Ambari Troubleshooting Guide for additional information.

Table 1.4. Ambari 2.2.1 Known Issues

Each entry below lists the Apache Jira (where applicable), the Hortonworks (HWX) Jira, the problem, and the solution or workaround.

  BUG-55082

Ambari 2.2.1.0 changed how Ambari performs Agent registration when using the SSH option. Because of this change, after upgrading from Ambari 2.1.2 to Ambari 2.2.1.0 and then attempting an HDP upgrade (Express or Rolling), import errors appear in the upgrade operations. For example:

from ambari_commons.constants import UPGRADE_TYPE_NON_ROLLING,
UPGRADE_TYPE_ROLLING
ImportError: cannot import name UPGRADE_TYPE_NON_ROLLING

These errors leave the HDP upgrade in an indeterminate state, and you cannot move forward to complete it.

Users at risk of being affected by this bug meet one or both of the following conditions:

  • Performed an Ambari upgrade from Ambari 2.1.2 to 2.2.1.0.

  • Used the SSH option to register the Agents with Ambari.

Workaround:

If you are affected by this bug, after upgrading to Ambari 2.2.1.0 and BEFORE performing any HDP upgrade (Express or Rolling), stop the Ambari Agent on every host in the cluster, back up the /var/lib/ambari-agent/tmp/ambari_commons directory, remove it, and then restart the Ambari Agent. For example (a scripted variant for many hosts follows these steps):

  1. On each host, stop the Ambari Agent:

    ambari-agent stop 
  2. Make a backup of the ambari_commons directory:

    tar zcvf /var/lib/ambari-agent/ambari_commons.tgz /var/lib/ambari-agent/tmp/ambari_commons 
  3. Remove the ambari_commons directory:

    rm -rf /var/lib/ambari-agent/tmp/ambari_commons 
  4. Restart the Ambari Agent:

    ambari-agent start 
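
On a large cluster, you can run the same steps from a central host. The following is a minimal sketch, assuming passwordless SSH as root and a hypothetical hosts.txt file listing every Agent hostname:

# Run the workaround on every Agent host listed in hosts.txt (hypothetical file).
# ssh -n prevents ssh from consuming the loop's stdin.
while read -r host; do
  ssh -n "root@${host}" 'ambari-agent stop &&
    tar zcvf /var/lib/ambari-agent/ambari_commons.tgz /var/lib/ambari-agent/tmp/ambari_commons &&
    rm -rf /var/lib/ambari-agent/tmp/ambari_commons &&
    ambari-agent start'
done < hosts.txt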

Solution:

A fix for this issue is included with Ambari 2.2.2.0.

  BUG-52221

When using HDP 2.4 and adding the Atlas service to a cluster that includes Hive, the Atlas Metadata Server fails to start if Ambari was upgraded from a previous release and the HDP stack was then upgraded to version 2.4.

When adding Atlas to a cluster that includes Hive, the Atlas Metadata Server fails to start with the following error:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/ATLAS/0.1.0.2.3/package/scripts/metadata_server.py", line 132, in <module>
    MetadataServer().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/ATLAS/0.1.0.2.3/package/scripts/metadata_server.py", line 53, in start
    self.configure(env)
  ...
  File "/usr/lib/python2.6/site-packages/resource_management/core/source.py", line 51, in __call__
    return self.get_content()
  File "/usr/lib/python2.6/site-packages/resource_management/core/source.py", line 75, in get_content
    raise Fail("{0} Source file {1} is not found".format(repr(self), path))
resource_management.core.exceptions.Fail: StaticFile('/usr/hdp/current/atlas-server/server/webapp/atlas.war')
Source file /usr/hdp/current/atlas-server/server/webapp/atlas.war is not found

This occurs because Atlas was not actually installed on the host, even though Ambari reported the installation as successful, so the service fails to start. The workaround is to manually install the following package on the host:

yum install atlas-metadata_2_4_* 

and then copy the client.properties file from the /usr/hdp/<2.4.x.y>/etc/atlas/conf.dist/ directory to the /etc/atlas/conf/ directory, where <2.4.x.y> is the HDP version, for example, 2.4.0.0-160.
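
For example, with HDP version 2.4.0.0-160:

cp /usr/hdp/2.4.0.0-160/etc/atlas/conf.dist/client.properties /etc/atlas/conf/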

Then, start Atlas from Ambari Web.

  BUG-51515

When selecting a time range that does not include any data, the metric widget graph will not be updated.

When viewing a metric widget graph, if you select a time range that does not include any metrics data, the graph is not updated in the Ambari Web UI. This can occur if you select a time range prior to when metrics collection started on the cluster. Select a different time range.

  BUG-51254

When using RHEL/CentOS 7.2 or above, systemd does not recognize the ambari-server service.

The following message appears when you run systemctl for ambari-server on RHEL/CentOS 7.2 or above:

[root@c7201 ~]# systemctl status ambari-server
ambari-server.service
   Loaded: not-found (Reason: No such file or directory)
   Active: inactive (dead)

To correct the problem, run the following on the ambari-server host:

unlink /etc/rc.d/init.d/ambari-server && cp -a /usr/sbin/ambari-server /etc/rc.d/init.d/ambari-server && systemctl daemon-reload
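
You can then verify that systemd recognizes the service:

systemctl status ambari-server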

  BUG-50992

Ambari is not showing active/standby status for the NameNodes. This occurs when NameNode HA is enabled and the NameNode ports specified through dfs.namenode.http(s)-address.{nameservice}.{namenode} (for example, dfs.namenode.http(s)-address.nameservice.nn1 and dfs.namenode.http(s)-address.nameservice.nn2) differ from the port specified in dfs.namenode.http-address.

Modify the port number specified in dfs.namenode.http(s)-address to match the port numbers specified in dfs.namenode.http(s)-address.{nameservice}.{namenode}.
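
For example, assuming a hypothetical nameservice named mycluster and the default NameNode HTTP port, the ports should agree across these hdfs-site properties (hostnames are illustrative):

dfs.namenode.http-address=c6401.example.com:50070
dfs.namenode.http-address.mycluster.nn1=c6401.example.com:50070
dfs.namenode.http-address.mycluster.nn2=c6402.example.com:50070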

  BUG-50791

If you have configured your cluster for LZO, the MapReduce and Oozie service checks might fail after an HDP upgrade.

  1. If you are using LZO, make sure, before starting the upgrade, that the LZO codec path does not contain a hard-coded HDP version (a verification sketch follows these steps). The LZO codec path is specified as part of the mapreduce.application.classpath property value in the mapred-site config for YARN. The correct LZO codec path should look like the following:

    /usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar
  2. Also, if you configured your cluster for Tez and LZO, make sure, before starting the upgrade, that the tez.cluster.additional.classpath.prefix property in the tez-site config does not contain a hard-coded HDP version. The proper classpath value should look like the following:

    /usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure
  3. In addition, if you have the Oozie service installed, after removing the hard-coded HDP version from the LZO path, update oozie-env with the following:

    export HADOOP_OPTS="-Dhdp.version=$HDP_VERSION $HADOOP_OPTS"

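Before upgrading, you can check the live configs for hard-coded versions from the Ambari Server host. A minimal sketch, assuming a cluster named MyCluster and the configs.sh script that ships with Ambari Server (adjust host, credentials, and cluster name as needed):

# Dump mapred-site and tez-site and look for hard-coded hadoop-lzo paths.
/var/lib/ambari-server/resources/scripts/configs.sh get localhost MyCluster mapred-site | grep hadoop-lzo
/var/lib/ambari-server/resources/scripts/configs.sh get localhost MyCluster tez-site | grep hadoop-lzo
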
  BUG-49728

When adding a ZooKeeper server, the Kafka zookeeper.connect property is not updated.

If you are running Kafka and add an additional ZooKeeper server to your cluster, the zookeeper.connect property is not automatically updated to include the newly added ZooKeeper server.

You must manually add the ZooKeeper server to the zookeeper.connect property.
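
For example, if a third ZooKeeper server is added, zookeeper.connect should list all three (hostnames here are illustrative; 2181 is the default ZooKeeper port):

zookeeper.connect=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181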

AMBARI-14012 BUG-41044

After upgrading from HDP 2.1 and restarting Ambari, the Admin > Stack and Versions > Versions tab does not show in Ambari Web.

After performing an upgrade from HDP 2.1 and restarting Ambari Server and the Agents, if you browse to Admin > Stack and Versions in Ambari Web, the Versions tab does not display. Give all the Agent hosts in the cluster a chance to connect to Ambari Server by waiting for Ambari to show the Agent heartbeats as green, and then refresh your browser.

AMBARI-12389 BUG-41040

After adding Falcon to your cluster, the Oozie configuration is not properly updated.

After adding Falcon to your cluster using "Add Service", the Oozie configuration is not properly updated. After completing the Add Service wizard, add properties under Services > Oozie > Configs > Advanced > Custom oozie-site. The list of properties can be found here: https://github.com/apache/ambari/blob/branch-2.1/ambari-server/src/main/resources/common-services/FALCON/0.5.0.2.1/configuration/oozie-site.xml. Once added, restart Oozie and execute this command on the Oozie Server host:

su oozie -c '/usr/hdp/current/oozie-server/bin/oozie-setup.sh prepare-war'

Start Oozie.

AMBARI-12412 BUG-41016

Storm has no metrics if the service is installed via a Blueprint.

The following properties need to be added to storm-site. Browse to Services > Storm > Configs, add the properties, and then restart the Storm service.

topology.metrics.consumer.register=[{'class': 'org.apache.hadoop.metrics2.sink.storm.StormTimelineMetricsSink', 'parallelism.hint': 1}]
metrics.reporter.register=org.apache.hadoop.metrics2.sink.storm.StormTimelineMetricsReporter

  BUG-40773

Kafka broker fails to start after disabling Kerberos security.

When enabling Kerberos, the Kafka security configuration is set, and all of the ZooKeeper nodes used by Kafka have ACLs set so that only Kafka brokers can modify entries in ZooKeeper. If you plan to disable Kerberos, you must first set all the Kafka ZooKeeper entries to world readable/writable. To do so, before disabling Kerberos, log in as user "kafka" on one of the Kafka nodes:

kinit -k -t /etc/security/keytabs/kafka.service.keytab kafka/_HOST

where _HOST is replaced by the hostname of that node. Run the following command to open the ZooKeeper shell:

/usr/hdp/current/kafka-broker/bin/zookeeper-shell.sh hostname:2181

where hostname is replaced by one of the ZooKeeper nodes, and then run the following commands:

setAcl /brokers world:anyone:crdwa 
setAcl /config world:anyone:crdwa 
setAcl /controller world:anyone:crdwa 
setAcl /admin world:anyone:crdwa

If the above commands are not run prior to disabling Kerberos, the only option is to point the zookeeper.connect property at a new ZooKeeper root. This can be done by appending "/newroot" to the zookeeper.connect string, for example "host1:port1,host2:port2,host3:port3/newroot".

  BUG-40694

The Slider view is not supported on a cluster with SSL (wire encryption) enabled.

Only use the Slider view on clusters without wire encryption enabled. If you must run Slider on a cluster with wire encryption enabled, contact Hortonworks support for further help.
  BUG-40541

If there is a trailing slash in the Ranger External URL, the NameNode will fail to start up.

Remove the trailing slash from the External URL and start up the NameNode.
AMBARI-12436 BUG-40481

Falcon Service Check may fail when performing a Rolling Upgrade, with the following error:

2015-06-25 18:09:07,235 ERROR - [main:]
 ~ Failed to start ActiveMQ JMS Message Broker.
 Reason: java.io.IOException: Invalid location: 1:6763311, :
 java.lang.NegativeArraySizeException (BrokerService:528) 
 java.io.IOException: Invalid location: 1:6763311, :
 java.lang.NegativeArraySizeException
 at org.apache.kahadb.journal.DataFileAccessor.readRecord(DataFileAccessor.java:94)

This condition is rare.

When performing a Rolling Upgrade from HDP 2.2 to HDP 2.3, if the Falcon Service Check fails with the above error, browse to the Falcon ActiveMQ data directory (specified in the Falcon properties file), remove the corrupted queues, and then stop and start the Falcon Server:

cd <ACTIVEMQ_DATA_DIR>
rm -rf ./localhost
cd /usr/hdp/current/falcon-server 
su -l <FALCON_USER> 
./bin/falcon-stop
./bin/falcon-start

  BUG-40323

After switching RegionServer ports, Ambari will report the RegionServers as both live and dead.

HBase maintains lists of dead servers and live servers according to its own semantics. Normally, a new server coming up again on the same port causes the old server to be removed from the dead server list. But because of the port change, the old server stays in that list for approximately two hours. If the server does not come back at all, it is still removed from the list after two hours. Ambari alerts based on that list until the RegionServers are removed from it by HBase.

AMBARI-12283 BUG-40300

After adding or deleting ZooKeeper Servers in an existing cluster, Service Check fails.

After adding or deleting ZooKeeper Servers in an existing cluster, the Service Check fails due to conflicting zk ids. Restart the ZooKeeper service to clear the ids.

AMBARI-12005 BUG-24902

Setting cluster names hangs Ambari.

If you attempt to rename a cluster to a string longer than 100 characters, Ambari Server will hang. Restart Ambari Server to clear the hang.
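
For example:

ambari-server restart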