Troubleshooting Installation and Upgrade Problems
For information on known issues, see Known Issues and Workarounds in Cloudera Manager 5.
Symptom | Reason | Solution |
---|---|---|
The Cloudera Manager Server fails to start after upgrade. | Active commands were running before the upgrade. These include commands a user ran and commands that Cloudera Manager triggers automatically, either in response to a state change or on a schedule. | Downgrade the Cloudera Manager Server, stop the commands, and reapply the upgrade. If you must proceed without downgrading, you can stop active commands by starting the Cloudera Manager Server with the following command:
|
"Failed to start server" reported by cloudera-manager-installer.bin. /var/log/cloudera-scm-server/cloudera-scm-server.log contains a message beginning Caused by: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver... | You may have SELinux enabled. | Disable SELinux by running sudo setenforce 0 on the Cloudera Manager Server host. To disable it permanently, edit /etc/selinux/config. For more information, see Disabling SELinux. |
Installation interrupted and installer won't restart. | You need to do some manual cleanup. | See Uninstalling Cloudera Manager and Managed Software. |
Cloudera Manager Server fails to start and the Server is configured to use a MySQL database to store information about service configuration. | Tables may be configured with the MyISAM engine. The Server does not start if its tables use the MyISAM engine, and an error such as the following appears in the log file: Tables ... have unsupported engine type ... . InnoDB is required. | Make sure that the InnoDB engine is configured, not the MyISAM engine. To check which engine your tables use, run the following command from the MySQL shell: mysql> show table status; For more information, see MySQL Database. |
Agents fail to connect to Server. Error 113 ('No route to host') in /var/log/cloudera-scm-agent/cloudera-scm-agent.log. | You may have SELinux or iptables enabled. | Check /var/log/cloudera-scm-server/cloudera-scm-server.log on the Server host and /var/log/cloudera-scm-agent/cloudera-scm-agent.log on the Agent hosts. Disable SELinux and iptables. For more information, see Disabling SELinux and Disabling the Firewall. |
Some cluster hosts do not appear when you click Find Hosts in install or update wizard. | You may have network connectivity problems. |
|
"Access denied" in install or update wizard during database configuration for Activity Monitor or Reports Manager. | Hostname mapping or permissions are incorrectly set up. |
|
Activity Monitor, Reports Manager, or Service Monitor databases fail to start. | MySQL binlog format problem. | Set binlog_format=mixed in /etc/my.cnf. For more information, see this MySQL bug report. See also Cloudera Manager and Managed Service Datastores. |
You have upgraded the Cloudera Manager Server, but now cannot start services. | You may have mismatched versions of the Cloudera Manager Server and Agents. | Make sure you have upgraded the Cloudera Manager Agents on all hosts. (The previous version of the Agents will heartbeat with the new version of the Server, but you cannot start HDFS and MapReduce with this combination.) |
Cloudera services fail to start. | Java may not be installed or may be installed at a custom location. | See Configuring a Custom Java Home Location for more information on resolving this issue. |
The Activity Monitor displays a status of BAD in the Cloudera Manager Admin Console. The log file contains the following message: ERROR 1436 (HY000): Thread stack overrun: 7808 bytes used of a 131072 byte stack, and 128000 bytes needed. Use 'mysqld -O thread_stack=#' to specify a bigger stack. | The MySQL thread stack is too small. |
|
The Activity Monitor fails to start. Logs contain the error read-committed isolation not safe for the statement binlog format. | The binlog_format is not set to mixed. | Modify the mysql.cnf file to include the entry for binlog format as specified in MySQL Database. |
Attempts to reinstall lower versions of CDH or Cloudera Manager using yum fail. | You can install, uninstall, and reinstall CDH and Cloudera Manager, but in certain cases reinstallation does not complete as expected. If you install Cloudera Manager 5 and CDH 5, uninstall both, and then attempt to install CDH 4 and Cloudera Manager 4, incorrect cached information may result in the installation of an incompatible version of the Oracle JDK. | Clear information in the yum cache:
|
The Create Hive Metastore Database Tables command fails due to a problem with an escape string. | PostgreSQL versions 9.1 and higher require special configuration for Hive because of a backward-incompatible change in the default value of the standard_conforming_strings property. Versions through PostgreSQL 9.0 defaulted to off, but starting with version 9.1 the default is on. | As the administrator user, use the following command to turn standard_conforming_strings off: ALTER DATABASE <hive_db_name> SET standard_conforming_strings = off; |
After upgrading to CDH 5, HDFS DataNodes fail to start with exception: Exception in secureMain java.lang.RuntimeException: Cannot start datanode because the configured max locked memory size (dfs.datanode.max.locked.memory) of 4294967296 bytes is more than the datanode's available RLIMIT_MEMLOCK ulimit of 65536 bytes. | HDFS caching, which is enabled by default in CDH 5, requires new memlock functionality from Cloudera Manager Agents. | Do the following:
|
You see the following error in the NameNode log: 2014-10-16 18:36:29,112 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage java.io.IOException: File system image contains an old layout version -55. An upgrade to version -59 is required. Please restart NameNode with the "-rollingUpgrade started" option if a rolling upgrade is already started; or restart NameNode with the "-upgrade" option to start a new upgrade. at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:231) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:994) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:726) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:529) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:585) at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:751) at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:735) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1410) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1476) 2014-10-16 18:36:29,126 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:50070 2014-10-16 18:36:29,127 WARN org.apache.hadoop.http.HttpServer2: HttpServer Acceptor: isRunning is false. Rechecking. 2014-10-16 18:36:29,127 WARN org.apache.hadoop.http.HttpServer2: HttpServer Acceptor: isRunning is false 2014-10-16 18:36:29,127 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system... 2014-10-16 18:36:29,128 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped. 2014-10-16 18:36:29,128 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete. 2014-10-16 18:36:29,128 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join java.io.IOException: File system image contains an old layout version -55. An upgrade to version -59 is required. Please restart NameNode with the "-rollingUpgrade started" option if a rolling upgrade is already started; or restart NameNode with the "-upgrade" option to start a new upgrade. at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:231) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:994) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:726) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:529) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:585) at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:751) at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:735) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1410) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1476) 2014-10-16 18:36:29,130 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1 2014-10-16 18:36:29,132 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG: | You upgraded CDH to 5.2 using Cloudera Manager and did not run the HDFS Metadata Upgrade command. | Stop the HDFS service in Cloudera Manager and follow the steps for upgrade (depending on whether you are using packages or parcels) described in Upgrading to CDH 5.2. |
You are using an Oracle database, the Cloudera Navigator Analytics > Audit > Activity tab displays "No data available", and the log contains an Oracle "invalid identifier" error for a query that references dbms_crypto. | You have not granted execute permission to sys.dbms_crypto. | Run GRANT EXECUTE ON sys.dbms_crypto TO nav;, where nav is the user of the Navigator Audit Server database. |
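
The DataNode memlock failure in the table above comes down to a comparison between two numbers: the configured dfs.datanode.max.locked.memory and the process's RLIMIT_MEMLOCK. The following Python sketch illustrates that comparison. It is not part of Cloudera Manager; the function names parse_memlock_error and memlock_allows are hypothetical, and the regular expression assumes the error text appears exactly as quoted in the table. The script reads the limit of the shell that runs it, which may differ from the limit the Agent applies to the DataNode process, so treat the result as a hint only.

```python
import re
import resource

# Matches the DataNode startup error quoted in the table above (assumed wording).
MEMLOCK_ERROR = re.compile(
    r"\(dfs\.datanode\.max\.locked\.memory\) of (\d+) bytes is more than "
    r"the datanode's available RLIMIT_MEMLOCK ulimit of (\d+) bytes"
)


def parse_memlock_error(log_text):
    """Return (configured_bytes, ulimit_bytes), or None if the error is absent."""
    match = MEMLOCK_ERROR.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def memlock_allows(configured_bytes, soft_limit):
    """True if a soft RLIMIT_MEMLOCK of soft_limit bytes permits locking configured_bytes."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return configured_bytes <= soft_limit


if __name__ == "__main__":
    # Limit of the current shell, not necessarily that of the DataNode process.
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    configured = 4294967296  # value taken from the error message in the table
    shown = "unlimited" if soft == resource.RLIM_INFINITY else soft
    print("soft RLIMIT_MEMLOCK:", shown)
    print("sufficient for configured value:", memlock_allows(configured, soft))
```

With the values from the quoted error, 4294967296 bytes is requested against a limit of 65536 bytes, so the check reports that the limit is too low. That matches the exception the DataNode throws at startup.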