Solutions to Common Problems
The table below describes solutions to common cluster configuration problems.
| Symptom | Reason | Solution | 
|---|---|---|
| Cloudera Manager | ||
| The Cloudera Manager service is not running because it exited abnormally, as shown by the service status and the Cloudera Manager Server log file. | Out of memory. | Examine the heap dump that the Cloudera Manager Server creates when it runs out of memory. The heap dump file is created in the /tmp directory, has the file extension .hprof, and has file permission 600. Its owner and group are the owner and group of the Cloudera Manager Server process, normally cloudera-scm:cloudera-scm. |
| You are unable to start the service on the Cloudera Manager Server; that is, service cloudera-scm-server start does not work, and there are errors in the log file located at /var/log/cloudera-scm-server/cloudera-scm-server.log. | The server has been disconnected from the database, or the database has stopped responding or has shut down. | Open /etc/cloudera-scm-server/db.properties and make sure the database you are trying to connect to is listed there and has been started. |
| Logs include APPARENT DEADLOCK entries for c3p0. | These deadlock messages are caused by the c3p0 process not making progress at the expected rate. This can indicate either that c3p0 is deadlocked or that its progress is slow enough to trigger these messages. In many cases, progress is occurring and these messages should not be seen as catastrophic. | There are a variety of ways to react to these log entries. |
| Starting Services | ||
| After you click the Start button to start a service, the Finished status does not display. This may be more than a display problem: possible causes include network connectivity issues and subcommand failures. | The host is disconnected from the Server, as indicated by missing heartbeats on the Hosts tab. | |
| | Subcommands failed, resulting in errors in the log file indicating that either the command timed out or the target port was already occupied. | |
| After you click Start to start a service, the Finished status displays but there are error messages. The subcommands to start service components (such as JobTracker and one or more TaskTrackers) do not start. | A port specified in the Configuration tab of the service is already being used in your cluster. For example, the JobTracker port is in use by another process. | Enter an available port number in the port property (such as JobTracker port) in the Configuration tab of the service. | 
| | There are incorrect directories specified in the Configuration tab of the service (such as the log directory). | Enter correct directories in the Configuration tab of the service. |
| Job is Failing | No space left on device. | One approach is to use a system monitoring tool such as Nagios to alert on disk space or to quickly check disk space across all systems. If you do not have Nagios or an equivalent, you can do the following to determine the source of the space issue. In the JobTracker Web UI, drill down from the job, to the map or reduce, to the task attempt details to see which TaskTracker the task executed and failed on due to disk space. In the NameNode Web UI, inspect the % used column on the NameNode Live Nodes page. |
| Send Test Alert and Diagnose SMTP Errors | ||
| You have enabled sending alerts from the Cloudera Manager Admin Console; however, Cloudera Manager does not seem to be sending any alerts. Using the Send Test Alert link shows success even though you do not receive an alert email. | There may be a mismatch of protocol or port numbers between your mail server and the Alert Publisher. For example, if the Alert Publisher is sending alerts to SMTPS on port 465 and your mail servers are not configured for SMTPS, you will not receive any alerts. | Use the following steps to make changes to the Alert Publisher configuration: |
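
For the Starting Services rows where a configured port is already occupied, it can help to test a candidate port on the affected host before entering it in the Configuration tab. The following is a minimal sketch, not a Cloudera Manager feature: the function name `port_is_free` and the example port number are illustrative assumptions.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can bind to host:port right now.

    Assumption: run this on the host where the service role will start.
    Binding to ports below 1024 usually needs root, so a permission
    error is also reported as "not free" here.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Address already in use, permission denied, or invalid address.
            return False
        return True


# Example (the port number is a placeholder; use the value from the
# service's Configuration tab):
#   print(port_is_free(8021))
```

A successful bind only shows that the port is free at that moment; another process can still claim it before the service starts.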
               
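For the "No space left on device" case, if no monitoring tool such as Nagios is available, a short script run on each suspect host can report which local directories are nearly full. This is a sketch under stated assumptions: the helper names, the directory list, and the 90% threshold are illustrative and do not come from Cloudera Manager.

```python
import shutil


def percent_used(path: str) -> float:
    """Return the percentage of the filesystem holding `path` that is used."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        return 0.0
    # Note: this ratio can differ slightly from the df "Use%" column,
    # which accounts for blocks reserved for root.
    return 100.0 * usage.used / usage.total


def paths_over_threshold(paths, threshold_pct: float = 90.0):
    """Return (path, percent_used) pairs at or above the threshold."""
    results = []
    for path in paths:
        pct = percent_used(path)
        if pct >= threshold_pct:
            results.append((path, pct))
    return results


# Example (the directories are placeholders; substitute the data and log
# directories configured for your TaskTrackers and DataNodes):
#   for path, pct in paths_over_threshold(["/", "/tmp"]):
#       print(f"{path}: {pct:.1f}% used")
```

The script checks only the host it runs on, so it complements rather than replaces the JobTracker and NameNode Web UI checks described in the table.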

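To diagnose a protocol or port mismatch between the Alert Publisher and the mail server, one option is to probe the mail server directly and see which handshake it accepts on a given port. The sketch below uses Python's standard `smtplib`; the function name `probe_smtp`, its return labels, and the example host name are assumptions made for illustration.

```python
import smtplib
import ssl


def probe_smtp(host: str, port: int, timeout: float = 5.0) -> str:
    """Classify how an SMTP endpoint responds.

    Returns "smtps" (implicit TLS accepted), "starttls" (plain SMTP that
    advertises STARTTLS), "plain" (plain SMTP only), or "unreachable".
    Assumption: certificate verification uses the default trust store, so a
    server with an untrusted certificate is not reported as "smtps".
    """
    # Attempt 1: implicit TLS, the mode conventionally used on port 465.
    try:
        context = ssl.create_default_context()
        with smtplib.SMTP_SSL(host, port, timeout=timeout, context=context) as server:
            server.noop()
        return "smtps"
    except (OSError, smtplib.SMTPException):
        pass

    # Attempt 2: plain SMTP, then check whether STARTTLS is advertised.
    try:
        with smtplib.SMTP(host, port, timeout=timeout) as server:
            server.ehlo()
            return "starttls" if server.has_extn("starttls") else "plain"
    except (OSError, smtplib.SMTPException):
        return "unreachable"


# Example (hypothetical mail server name):
#   print(probe_smtp("mail.example.com", 465))
```

If the probe reports "plain" or "starttls" on the port where the Alert Publisher expects SMTPS, that matches the mismatch described in the table and points to the Alert Publisher protocol and port settings.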