Issues Fixed in Cloudera Director

Issues Fixed in Cloudera Director 2.6.1

Proxy settings do not work with the Cloudera Director CLI

The Cloudera Director command line client does not recognize proxy settings.

Cloudera Bug: DIR-7405

SSH heartbeat threads leaking

SSH heartbeat threads can be leaked if connectivity to the underlying instance fails.

Cloudera Bug: DIR-7675

Bootstrap of clusters for "All Services" or "Real Time Ingest" fails when using web UI

When using the Cloudera Director web UI to bootstrap a cluster, there are several choices for the set of services to install in the cluster. The selections All Services and Real Time Ingest include Kafka as a cluster service. The default parcel repositories for cluster services in Cloudera Director 2.6.0 include Kafka 3.0, but the Cloudera Director UI erroneously requests Kafka version 2 by default, causing bootstrap for clusters including Kafka to fail.

Cloudera Bug: DIR-7437

Update failure when replacing instances with reused private IP addresses

When Cloudera Director is used to replace bad or failed instances, the update can fail if the new host has the same private IP address as the host being replaced.

Cloudera Bug: DIR-7332

Solr servers treated as master roles by Cloudera Director for migrations

Starting in Director 2.5.0, the Solr server cluster role was treated as a master role and not available for automatic role migration. It should instead be treated as a worker role. This miscategorization resulted in Cloudera Director invoking the manual migration workflow when replacing instances with the Solr server role.

Workaround: Follow the manual migration workflow to transfer roles to the new instances.

Cloudera Bug: DIR-7329

NumberFormatException during cluster bootstrap

In Cloudera Director 2.6.0, near the end of cluster bootstrap or update, there may be a parsing error expressed as a NumberFormatException in the server or client log:
java.lang.NumberFormatException: null
   at java.lang.Long.parseLong(Long.java:552)
   at java.lang.Long.parseLong(Long.java:631)
   at com.cloudera.launchpad.bootstrap.cluster.AdjustClouderaManagementServices.makeAdjustments(AdjustClouderaManagementServices.java:155)
While the cluster is still usable after this error, the cluster may end up in an UPDATE_FAILED state.
Workaround: Set the following configuration property in Cloudera Director's application.properties file. This workaround only works if you are not using the new auto-TLS feature of Cloudera Director 2.6.
lp.bootstrap.cluster.adjustcm.useAutoConfigure: true
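
As a rough illustration of verifying the workaround, the following Python sketch checks whether the text of a properties file sets the flag shown above. The parser is a simplified assumption: it handles only "key: value" and "key=value" lines plus "#" and "!" comments, not the full Java properties syntax. The function names are hypothetical and are not part of Cloudera Director.

```python
PROPERTY = "lp.bootstrap.cluster.adjustcm.useAutoConfigure"


def parse_properties(text):
    """Parse a simplified subset of Java .properties syntax into a dict."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "#!":
            continue
        # Split on the first '=' or ':' separator, whichever comes first.
        positions = [i for i in (line.find("="), line.find(":")) if i != -1]
        if not positions:
            continue
        sep = min(positions)
        props[line[:sep].strip()] = line[sep + 1:].strip()
    return props


def workaround_applied(text):
    """Return True only if the property is present and set to true."""
    value = parse_properties(text).get(PROPERTY)
    return value is not None and value.lower() == "true"
```

A missing property is reported as not applied; the sketch makes no claim about the product's default value.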

Cloudera Bug: DIR-7655

Issues Fixed in Cloudera Director 2.6.0

Upgrade from Director 2.4 to 2.5 will fail if a cluster has a heterogeneous instance group

Cloudera Director 2.5 will fail to start after upgrading from 2.4 if a cluster contains a heterogeneous instance group.

Cloudera Bug: DIR-7037

Cluster refresh fails on clusters with incompletely allocated instance groups

Users request some number of instances for each instance group. Cloudera Director allows the user to specify a minimum count required for each instance group, which may be less than the requested number of instances. The cluster refresh process fails if an instance group does not contain the requested number of instances.
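
The relationship between the requested count and the minimum count can be sketched as follows. This is an illustrative Python model only; the class and field names are hypothetical and do not reflect Cloudera Director's internal data model.

```python
from dataclasses import dataclass


@dataclass
class InstanceGroup:
    name: str
    requested: int  # number of instances the user asked for
    min_count: int  # minimum number required for the group
    allocated: int  # number of instances actually provisioned


def meets_minimum(group):
    """A group is usable once it has at least its minimum count."""
    return group.allocated >= group.min_count


def is_incompletely_allocated(group):
    """A group is incomplete when it holds fewer instances than requested."""
    return group.allocated < group.requested
```

A group with 10 requested, a minimum of 5, and 7 allocated meets its minimum but is incompletely allocated, which is the situation in which the refresh failed.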

Cloudera Bug: DIR-7036

Upgrade from Cloudera Director 2.4 to 2.5 may fail

The database models for clusters with a heterogeneous instance group are not properly updated during upgrade. This will cause Cloudera Director to fail to start.

Workaround: Ensure all instance groups are consistent prior to upgrading to Cloudera Director 2.5. See Ensuring Consistency of Virtual Instance Groups for more information.

Cloudera Bug: DIR-7037

AWS plugin has broken disk preparation script

Due to a bug in the AWS plugin's disk preparation script, prepare_unmounted_volumes, the /data0 directory may be symlinked to the root volume.

Cloudera Bug: DIR-7268

Usernames should support user@domain.com format

When "." is used in a username, the Cloudera Director UI fails to create the user. Cloudera Director client does create a user, but the get user and update user commands return 404.

Cloudera Bug: DIR-6896

Cloudera Director may orphan AWS Spot instances

Cloudera Director tags AWS instances with an instance id. Due to eventual consistency in the AWS API, Cloudera Director may not find the tagged instances during the bootstrap or update process.
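
A common way to tolerate eventually consistent reads is to retry the lookup with increasing delays. The Python sketch below shows that general technique only; it is not a description of how Cloudera Director fixed this issue. The lookup callable and all parameter names are hypothetical.

```python
import time


def find_with_retry(lookup, attempts=5, initial_delay=0.5, sleep=time.sleep):
    """Call lookup() until it returns a non-empty result or attempts run out.

    The delay doubles after each empty result (exponential backoff).
    """
    delay = initial_delay
    for attempt in range(attempts):
        result = lookup()
        if result:
            return result
        # Do not sleep after the final attempt.
        if attempt < attempts - 1:
            sleep(delay)
            delay *= 2
    return []
```

The sleep function is injectable so that the retry schedule can be exercised without real waiting.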

Cloudera Bug: DIR-92

Issues Fixed in Cloudera Director 2.5.1

Custom tag mapping renames user-supplied tag names

The custom tag mapping functionality in the AWS plugin inadvertently renames user-supplied tags as well. For example: if a custom tag mapping is given to rename the "Name" field to "Director_Name," a user-defined tag of "Name" will also be renamed to "Director_Name."

Workaround: Utilize instanceNamePrefixes, which can simulate a custom name, albeit with a UUID at the end of it.
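
The intended behavior, in which the mapping renames only the tags that Cloudera Director generates and leaves user-supplied tags alone, can be sketched in Python as follows. The function and argument names are hypothetical, and the sketch is not the AWS plugin's implementation.

```python
def build_tags(director_tags, user_tags, custom_mapping):
    """Combine generated and user-supplied tags.

    Only the generated tags are renamed through the custom mapping;
    user-supplied tags keep the names the user gave them.
    """
    renamed = {custom_mapping.get(key, key): value
               for key, value in director_tags.items()}
    # User tags are added afterwards and are never passed through the mapping.
    return {**renamed, **user_tags}
```

With a mapping of "Name" to "Director_Name", a generated "Name" tag becomes "Director_Name" while a user-defined "Name" tag is preserved.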

Cloudera Bug: DIR-6704

iptables modules blacklisted during normalization

The iptables.ko and iptables_filter.ko modules are blacklisted after instance normalization. This can conflict with applications like Docker, which utilize iptables for their own networking.

Workaround: Delete the /etc/modprobe.d/iptables-blacklist.conf file on each instance using an instance post create script.

Cloudera Bug: DIR-6824

Unable to grow cluster when upgrading Cloudera Manager configured with custom repository URLs

If Cloudera Manager in Cloudera Director was upgraded using a non-archive.cloudera.com URL and the cluster is grown, agent installation can fail on the new instances.

Cloudera Bug: DIR-6811

Failure of AWS Spot instance allocation may cause bootstrap or update to fail

Cloudera Director deletes Spot instance requests after attempting to allocate Spot instances. A Spot instance request may fail due to AWS API eventual consistency, causing the overall bootstrap or grow process to fail.

Cloudera Bug: DIR-6700

Incorrect retrieval of SSH credentials

If the SSH username is overridden in the instance template, it can lead to an SSH authentication exception.

Cloudera Bug: DIR-6848

The setting preemptiveBasicProxyAuth did not work as expected

Before this fix, the lp.proxy.http.preemptiveBasicProxyAuth setting had no effect.

Cloudera Bug: DIR-6868

Issues Fixed in Cloudera Director 2.5.0

AWS IAM permission for RDS required even when RDS not in use

When Cloudera Director validates an environment definition, it performs a call to AWS that requires the rds:DescribeDBSecurityGroups IAM permission. This is true whether or not RDS is to be used for any deployments or clusters in the environment.

Workaround: Include the rds:DescribeDBSecurityGroups permission in the IAM permissions for the user account defined in the environment. If no user is defined, include the permission in the permissions for the IAM role associated with the instance profile of the Director instance.

Cloudera Bug: DIR-2165

Cloudera Director masks real reason for cluster bootstrap or update failure during host installation

When Cloudera Director tries to add a host to Cloudera Manager, and errors occur that trigger retry, Cloudera Director can produce an exception that includes the following message while retrying:
'JdbcSQLException: Value too long for column "CALL_STACK' 
This is an internal error that masks the root cause, which may be found earlier in the Cloudera Director log, or in the Cloudera Manager log.

Cloudera Bug: DIR-3866

Preemptive proxy authentication not working as expected

Preemptive proxy authentication does not work as intended and currently has no effect.

Workaround: Use the proxy without preemptive proxy authentication.

Cloudera Bug: DIR-5354

Restrictive umask prevents bootstrap

Cloudera Director inappropriately sets the file ownership when creating database scripts for use by Cloudera Manager. When using a more restrictive umask with Cloudera Director, this can prevent deployment bootstrap from succeeding properly.

Workaround: Use the default umask supplied with your filesystem, or use a more permissive umask.

Cloudera Bug: DIR-5394

First run fails for cluster with HA enabled for HDFS cloned through Cloudera Director

If you create a non-HA cluster through Cloudera Director with a Cloudera Manager version prior to 5.12, and then enable HA using the Cloudera Manager wizard, Cloudera Director cannot clone the cluster.

Workaround: In order to be able to clone such a cluster, update Cloudera Manager to 5.12 or later, and wait for Cloudera Director to refresh its internal cluster model, before enabling HA through the Cloudera Manager wizard.

Cloudera Bug: DIR-5555

Presence of external databases prevents deletion of environments

When you delete a database server, Cloudera Director checks whether it is an external database or a registered database. For an external database, the delete action was shown as "Terminate Database Server." It should instead be shown as "Unregister Database Server." Once the external database is unregistered, the environment can be safely deleted.

Cloudera Bug: DIR-5749

Refreshing a deployment may overwrite the deployment template

User updates to the deployment template for a deployment may be inadvertently overwritten by refreshing the deployment if they are performed at the same time. This currently only impacts Cloudera Manager credentials.

Workaround: Re-update the Cloudera Manager credentials.

Cloudera Bug: DIR-5851

Cloudera Director may fail to repair master roles for certain services when repairing an instance

When repairing an instance with master roles, Cloudera Director may incorrectly attempt to automatically configure those roles instead of running the role migration. This may lead to update failure.

Cloudera Bug: DIR-5862

Use numberOfHealthCacheExecutorThreads when provided

Previous versions of Cloudera Director ignored numberOfHealthCacheExecutorThreads when specified in the server configuration file.

Cloudera Bug: DIR-6104

Use numberOfCacheExecutorThreads when provided

Previous versions of Cloudera Director ignored numberOfCacheExecutorThreads when specified in the server configuration file.

Cloudera Bug: DIR-6105

Terminating cluster creation at the wrong time can leak instances

EC2 instances can be leaked when termination is requested during deployment or cluster bootstrapping.

Workaround: Delete the leaked instances manually.

Cloudera Bug: DIR-6107

Unhealthy host(s) causes 'apply host template' failure when growing the Cluster

When growing an existing cluster, the update operation may fail to add the instances. If the server log shows "API call to Cloudera Manager failed. Method=HostTemplatesResource.applyHostTemplate," check the Cloudera Manager API debug logs. One possible cause is that the CDH parcel has not been activated by the time Cloudera Director attempts to apply the host template. This scenario is likely if newly added instances show up as unhealthy in Cloudera Manager, which can cause parcel distribution errors.

Workaround: First try to determine why the newly added instances come up as unhealthy. This can sometimes be fixed by using a different AMI or instance type. If that does not work, increase Cloudera Director's lp.update.sleepTimeForAddInstanceSeconds server property (added in Cloudera Director 2.4.1) to give the host more time to become healthy, so that the parcel is distributed and activated before the API call that applies the host template.

Cloudera Bug: DIR-6163

Shrink operation fails to complete

Host decommissioning can hang in Cloudera Manager, causing a cluster shrink operation to fail to complete successfully.

Cloudera Bug: DIR-6179

Cluster termination fails during backup of Cloudera Manager configuration files

When a cluster is terminated while Cloudera Director is backing up Cloudera Manager configuration files, it is possible for Cloudera Director to hang attempting to clean up the associated pipelines.

Cloudera Bug: DIR-6221

Erroneous error message during cluster creation with Spark 2

When creating a cluster that includes the Spark 2 service with bootstrap-remote, the Cloudera Director client will display the following warning:
Found warnings in cluster configuration: Unknown role type: GATEWAY for service type: SPARK2_ON_YARN in instance group
The warning is a false positive, but it does not stop the cluster creation.

Cloudera Bug: DIR-6305

Failure to update IP addresses in Cloudera Manager for repaired instances

When repairing instances, Cloudera Director tries to correlate instances known to Cloudera Director with instances known to Cloudera Manager. The correlation is done using the instance's IP address. However, if the instance is terminated, the IP address known to Cloudera Director is a placeholder while Cloudera Manager keeps the original IP address, so the attempt to establish the mapping fails.

Cloudera Bug: DIR-6380

Cloudera Director continues trying to update a cluster even if the cluster was terminated in the middle of the update

When updating a cluster, Cloudera Director first checks the cluster status, then computes the update steps if the check passes, and finally starts pipelines to apply those steps. However, the status check and the pipeline launch are not performed as a single transaction, so if a terminate cluster request is fulfilled between them, the update still starts its pipelines.

Cloudera Bug: DIR-6391

Update fails if 'Redeploy Client Configuration' is checked unnecessarily

Update fails when the Redeploy Client Configuration checkbox on the Modify Cluster page is checked and redeployment of client configurations is not needed.

Workaround: Do not check the Redeploy Client Configuration checkbox if redeployment of the client configuration is not needed.

Cloudera Bug: DIR-6464

Cluster bootstrap fails when Spot instance capacity is exceeded

If the number of Spot instances requested exceeds the user's Spot capacity, then the allocation of the Spot instances will fail and cause cluster bootstrap to fail.

Workaround: Ensure your Spot instance limit on your EC2 account is sufficient for the number of instances you request in the specified region.

Cloudera Bug: DIR-6473

Cloudera Director may leak instances in AWS

Cloudera Director may retry a failed instance allocation, resulting in two instances tagged with the same ID. Due to the tagging, Cloudera Director may terminate only one of the instances.
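
Detecting this situation amounts to grouping cloud instances by the ID tag and looking for tags that map to more than one instance. The Python sketch below illustrates that idea under the assumption that each instance is represented as a (cloud instance ID, tag ID) pair; the representation and function name are hypothetical.

```python
from collections import defaultdict


def find_duplicate_tag_ids(instances):
    """Return tag IDs that are attached to more than one cloud instance.

    instances is an iterable of (cloud_instance_id, tag_id) pairs.
    """
    by_tag = defaultdict(list)
    for cloud_instance_id, tag_id in instances:
        by_tag[tag_id].append(cloud_instance_id)
    return {tag: ids for tag, ids in by_tag.items() if len(ids) > 1}
```

Any tag ID returned by this function identifies a set of instances of which only one might be terminated, leaving the others leaked.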

Cloudera Bug: DIR-6551

Cloudera Director client does not support unicode

HOCON substitution in Cloudera Director configurations is not supported.

Workaround: Write configurations without substitutions.

Cloudera Bug: DIR-5274

Block volume limit error reporting

If the EBS volume limit is reached when creating a cluster, the Cloudera Director log might not reflect this root cause, though it might mention creating the cluster failed because it cannot satisfy the minimum threshold limit for specific roles in the cluster.

Cloudera Bug: DIR-5459

Issues Fixed in Cloudera Director 2.4.1

Errors when using MySQL 5.7 for the Cloudera Director database

The defaults related to TIMESTAMP field handling changed drastically in MySQL 5.7.4 and later, as documented in SQL Mode Changes in MySQL 5.7 in the MySQL documentation. One of the tables created by Cloudera Director, SERVER_CONFIGS, has a definition that was valid in previous versions of MySQL but conflicts with these new defaults. This is further complicated by the fact that MySQL 5.7.x will allow upgrades from MySQL 5.6 with tables that violate these defaults.

The result is that any modifications attempted on the SERVER_CONFIGS table in MySQL 5.7.x will fail. Cloudera Director 2.4 introduced a change to this table, triggering this problem. Additionally, new installs have been observed failing on MySQL 5.7.x due to the SERVER_CONFIGS table violating the expected defaults.

This issue has been fixed in Cloudera Director 2.4.1 with database changes that:
  • Adjust the creation of the SERVER_CONFIGS table on new installations
  • Correct SERVER_CONFIGS for users upgrading to Cloudera Director 2.4
  • Correct SERVER_CONFIGS for users who have already upgraded to Cloudera Director 2.4
For fresh installations on MySQL 5.7.x, this may affect any version of Cloudera Director starting with version 2.0. For existing installations that are now running on MySQL 5.7.x, this may affect users attempting to upgrade to Cloudera Director 2.4 from Cloudera Director versions 2.0 to 2.3. Running on MySQL 5.5.x or 5.6.x will behave as expected without any database failures.
Workarounds:
  • Cloudera recommends contacting Cloudera Support in order to fix this issue. However, if that is not an option, the following steps can be used to address the issue.
  • For a fresh install of Cloudera Director, the simplest workaround is to disable strict mode on MySQL. For more information about strict mode and how to disable it, see SQL Server Modes in the MySQL documentation. Using Cloudera Director 2.4.1 will avoid this issue.
  • For existing installs, manually modify the MySQL database to avoid this issue:
    • Upgrading from versions lower than 2.0.0: In this case, Cloudera Director will fail when trying to create the SERVER_CONFIGS table. In the database housing the Cloudera Director tables, examine the core_schema_version table and remove the line with the script value V3_2.0.0_1__init_serverconfig.sql.
      delete from core_schema_version where script = 'V3_2.0.0_1__init_serverconfig.sql';
      You should see a response like the following:
      Query OK, 1 row affected (0.02 sec)
      After this, retry the upgrade using Cloudera Director 2.4.1, or disable strict mode.
    • Upgrading from versions 2.0.0 to 2.4.0: In this case, Cloudera Director will fail when trying to modify several tables to remove the VERSION column. You must complete the migration manually and fix the TIMESTAMP issue.
      ALTER TABLE SERVER_CONFIGS MODIFY UPDATED_AT TIMESTAMP NULL, MODIFY CREATED_AT TIMESTAMP NULL;
      
      ALTER TABLE AUTHORITIES DROP COLUMN VERSION;
      ALTER TABLE CLUSTERS DROP COLUMN VERSION;
      ALTER TABLE DEPLOYMENTS DROP COLUMN VERSION; 
      ALTER TABLE ENVIRONMENTS DROP COLUMN VERSION;
      ALTER TABLE EXTERNAL_DATABASE_SERVERS DROP COLUMN VERSION;
      ALTER TABLE INSTANCE_TEMPLATES DROP COLUMN VERSION;
      ALTER TABLE SERVER_CONFIGS DROP COLUMN VERSION;
      ALTER TABLE USERS DROP COLUMN VERSION;
      
      UPDATE core_schema_version SET success = 1 WHERE script = 'V3_2.4.0_1__remove_versions.sql';
      One or more of the ALTER TABLE statements may fail with an error that looks like the following:
      ERROR 1091 (42000): Can't DROP 'VERSION'; check that column/key exists
      This can be ignored because it was correctly deleted as part of the initial attempt to upgrade to Cloudera Director 2.4.

      After this, retry the migration. Cloudera recommends upgrading to Cloudera Director 2.4.1 as soon as possible, although these manual corrections should alleviate the issue.
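
To decide whether a given server falls in the affected range, the version boundary described above (MySQL 5.7.4 and later) can be checked with a short Python sketch. The function names are hypothetical, and the sketch assumes a version string that begins with major.minor.patch, such as the value returned by SELECT VERSION() on MySQL. It does not account for other database products that report different version schemes.

```python
import re


def mysql_version_tuple(version):
    """Extract (major, minor, patch) from a string such as '5.7.18-log'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return tuple(int(part) for part in match.groups())


def has_changed_timestamp_defaults(version):
    """True for MySQL 5.7.4 and later, where the TIMESTAMP defaults changed."""
    return mysql_version_tuple(version) >= (5, 7, 4)
```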

Cloudera Bug: DIR-5921

Bootstrap fails because of empty parcel list

Cloudera Director fails in the middle of bootstrap with IllegalArgumentException: Parcel validation failed. This can happen when Cloudera Manager instances take longer than usual to refresh the list of parcels.

Cloudera Bug: DIR-6131

Unhealthy host causes "apply host template" to fail when growing the cluster

When growing an existing cluster the update operation may fail to add the instances. If the server log indicates "API call to Cloudera Manager failed. Method=HostTemplatesResource.applyHostTemplate," the user should enable Cloudera Manager API Debugging and check the server logs in Cloudera Manager to get more information on the failure. See Cloudera Manager API Call Fails in Troubleshooting Cloudera Director for information about checking Cloudera Manager logs.

One of the reasons for failure could be that the CDH parcel wasn't activated by the time Cloudera Director attempted to apply the host template. This specific scenario is likely to happen if newly added instances show up as unhealthy in Cloudera Manager.

Workaround: First try to determine why the newly added instances come up as unhealthy. This can sometimes be fixed by using a different AMI or instance type. If that does not work, increase Cloudera Director's lp.update.sleepTimeForAddInstanceSeconds server property to give the host more time to become healthy, so that the parcel is distributed and activated before the API call that applies the host template.

Cloudera Bug: None

Azure VMs with manually attached Public IPs from different Resource Groups are marked as "not found"

An Azure VM with a manually attached Public IP from a different Resource Group will no longer be marked as "not found" and excluded from the list of active cluster nodes. As of 2.4.1, the VMs will not report Public IPs from different Resource Groups, but they will otherwise function as expected.

Workaround: Create the Public IP for manually attaching to the VM in the same Resource Group as the VM itself.

Cloudera Bug: PARTNER-3927

NullPointerException when Cloudera Director retrieves the private FQDN of a VM instance

In some rare cases, the OS Profile metadata of an Azure VM can be empty. This can be confirmed by inspecting the VM metadata in Azure Resource Explorer (the "osProfile" JSON block will be missing from the VM properties block). The OS Profile contains information such as the VM's private FQDN. An empty OS Profile can be related to Azure VM agents not running correctly on the VM. Cloudera recommends contacting Microsoft Azure support to resolve an empty OS Profile on an Azure VM. As of Cloudera Director 2.4.1, a VM with a missing OS Profile no longer causes a NullPointerException.

Cloudera Bug: PARTNER-3992

Add D series instances to Azure instance type list

The following instance types are added to the Azure instance type list:

  • Standard_D15_v2
  • Standard_D14_v2
  • Standard_D13_v2
  • Standard_D12_v2

Cloudera Bug: PARTNER-3824

Expand Error Log for Unsupported Azure VMs

When deploying an unsupported Azure VM type the error message now contains actionable information for how to get and use the latest supported VM types.

Cloudera Bug: PARTNER-3928

Shorten Azure VMs Instance ID field in Cloudera Director UI

Azure VMs in Cloudera Director reported their instance IDs as a full Resource ID with Subscription ID and Resource Group name included. As of 2.4.1 the instance ID field is shortened to just the VM name.

Cloudera Bug: PARTNER-3924

Use static private IP address assignment option instead of dynamic

To guarantee the private IP address does not change after the VM is deallocated and restarted, the private IP allocation method must be Static. As of 2.4.1 the default private IP address allocation method is changed to Static.

Workaround: Manually change the private IP address assignment option to "Static" for each VM in the cluster via Azure portal.

Cloudera Bug: PARTNER-3914

Issues Fixed in Cloudera Director 2.4.0

Root password for external database server emitted in log

Cloudera Director logs the command line it runs to create new databases for Cloudera Manager and for cluster services. As of version 2.3, the password for the database being created was redacted in these log messages, but the password for the root account of the database server was not. This is fixed in 2.4, and the root password is now also redacted.

Cloudera Bug: DIR-4724

Cloudera Director may show the status of a cluster as TERMINATE_FAILED even when it has successfully terminated

If a cluster is terminated while in the process of bootstrapping, it is possible for the cluster to show TERMINATE_FAILED even though it has successfully terminated.

Cloudera Bug: DIR-5545

Cloudera Director does not sync with changes made in Cloudera Manager

Modifying a cluster in Cloudera Manager after it is bootstrapped does not cause the cluster state to be synchronized with Cloudera Director. Services that have been added or removed in Cloudera Manager do not show up in Cloudera Director when growing the cluster. For more information on keeping Cloudera Director and Cloudera Manager in sync, see CDH Cluster Management Tasks.

Cloudera Bug: DIR-1260

Old pipeline records not evicted from the Cloudera Director database

Cloudera Director records data about its internal workflow pipelines in its own database. Persisting this information allows Cloudera Director to track pipeline progress across restarts and to resume pipelines that were running or suspended. Pipeline data for old pipelines, such as those that have completed or failed, is automatically evicted from this database. However, under some circumstances, old pipeline data would fail to be evicted, resulting in logged errors. One cause is a Cloudera Director restart, which destroys in-memory pipeline data that was erroneously expected to remain. Cloudera Director 2.4 is more robust and eliminates this cause of pipeline eviction failure.

Workaround: In Cloudera Director 2.3 and below, the inability to evict old pipeline data does not harm Cloudera Director functioning in the short term, but over time the database could grow unacceptably large. Contact Cloudera Support for assistance deleting pipelines that cannot be evicted normally. To prevent build-up of old pipeline data, do not stop Cloudera Director until a round of database eviction is complete.

Cloudera Bug: DIR-5451

Delete deployment may orphan underlying clusters

When deleting deployments, it is possible that Cloudera Director deletes a deployment successfully, but leaves the cluster in an undeleted state. Retrying deployment deletion will not help, and the clusters will be orphaned. This is fixed in 2.4 such that a deployment deletion will also check for any orphaned clusters, even if the deployment itself is deleted.

Workaround: In Cloudera Director 2.3 and below, individually delete orphaned clusters if their IDs are known.

Cloudera Bug: DIR-5282

Bootstrap fails with non-default password-protected parcel repository

Bootstrap fails when using a password-protected CDH parcel repository with Cloudera Director 2.3 and below. This has been corrected in Cloudera Director 2.4.

Cloudera Bug: DIR-5225

Cloudera Director bootstrap hangs if EC2 spot instances terminate immediately after fulfillment

With Cloudera Director 2.3 and below, bootstrap can hang if spot instances terminate immediately after fulfillment, making it necessary to cancel the cluster bootstrap, terminate the cluster, and try again. This has been corrected in Cloudera Director 2.4 such that bootstrap fails immediately.

Cloudera Bug: DIR-5383

NullPointerException thrown when creating an invalid environment on Azure

In Cloudera Director 2.3.0 and below, a NullPointerException is thrown when invalid Microsoft Azure environment information (Subscription ID, Tenant ID, Client ID or Client Secret) is used in creating a new Azure Environment. For Cloudera Director 2.4.0 and higher, an error message is shown indicating that invalid Azure environment information was used to create the new Azure environment.

Cloudera Bug: DIR-5187

Terminated host not properly cleaned up during shrink or repair

When shrinking or repairing an instance that has been terminated outside of Cloudera Director, Cloudera Director may fail to decommission and delete the host from Cloudera Manager.

Cloudera Bug: DIR-5207

Terminated EC2 instances report 127.0.0.1 as private IP

AWS instances that were terminated outside of Cloudera Director may have reported an IP address of 127.0.0.1. This has been changed in Cloudera Director 2.4 so that the IP address 192.0.2.1 is reported (an IP address reserved for documentation).
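
The address 192.0.2.1 belongs to 192.0.2.0/24, the TEST-NET-1 block that RFC 5737 reserves for documentation. A script that consumes instance data can recognize both the old and new placeholder values with Python's standard ipaddress module. The function name below is hypothetical, and treating every address in that block as a placeholder is an assumption of this sketch.

```python
import ipaddress

# TEST-NET-1, reserved for documentation by RFC 5737.
TEST_NET_1 = ipaddress.ip_network("192.0.2.0/24")


def is_placeholder_address(address):
    """True for loopback addresses and for addresses in TEST-NET-1."""
    ip = ipaddress.ip_address(address)
    return ip.is_loopback or ip in TEST_NET_1
```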

Cloudera Bug: DIR-5386

Cloudera Director client infinitely tries to create services if you specify duplicate services

If duplicate services are specified for a cluster (for example, two Hive services or two Impala services), Cloudera Director will infinitely retry to create services during cluster creation.
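
A cluster definition can be checked for duplicate services before it is submitted. The Python sketch below assumes the services are available as a flat list of service type names, which is a simplification of a real cluster configuration; the function name is hypothetical.

```python
from collections import Counter


def duplicate_services(services):
    """Return the sorted service names that appear more than once."""
    counts = Counter(name.upper() for name in services)
    return sorted(name for name, count in counts.items() if count > 1)
```

An empty result means no service type is listed twice.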

Workaround: Cancel the cluster bootstrap, terminate the cluster, and recreate without duplicate services.

Cloudera Bug: DIR-4668

Cloudera Director may not apply custom configuration to all instances

Cloudera Director requests that Cloudera Manager perform automatic configuration for a cluster prior to applying any custom configurations. Automatic configuration may sometimes create multiple groups of instances within Cloudera Manager for a single corresponding group requested by Cloudera Director. When this occurs, custom configurations for the instances will only be applied to instances in one of the Cloudera Manager groups.

Cloudera Bug: DIR-5030

Creation of a cluster where instance groups have no roles is not possible using the web UI

Cloudera Director's web UI does not allow creation of clusters with instance groups that should not have CDH roles deployed on them.

Cloudera Bug: DIR-3991

Modification of a cluster where instance groups have no roles is not possible using the web UI

Cloudera Director's web UI does not allow modification of clusters with instance groups that should not have CDH roles deployed on them, even if they were created using the API.

Cloudera Bug: DIR-3955

Cluster launch fails using the development version of Cloudera Manager 5.10 and CDH 5.10 with Kudu

Cloudera Director 2.3 does not support deployment and management of Kudu.

Cloudera Bug: DIR-4854

If a cluster is terminated while it is bootstrapping, the cluster must be terminated again to complete the termination process

Terminating a cluster that is bootstrapping stops ongoing processes but keeps the cluster in the bootstrapping phase.

Cloudera Bug: None

Issues Fixed in Cloudera Director 2.3.0

Deployment bootstrap process may fail to complete

The process of bootstrapping a deployment can hang indefinitely waiting for Cloudera Manager to start, even after Cloudera Manager is up and reachable.

Cloudera Bug: DIR-5104

Cloudera Director does not install the JDBC driver for an existing MySQL database

Cloudera Director automatically installs JDBC drivers on an instance for Cloudera Manager and the CDH clusters it provisions. However, when you use an existing MySQL database with Cloudera Manager, Cloudera Director does not install the JDBC driver, which can result in database connection failures.

Cloudera Bug: DIR-3867

External databases are not configured for Hue and Oozie

External databases are not configured for Hue and Oozie in clusters created through the Cloudera Director web UI.

Cloudera Bug: DIR-3984

Normalization process does not set swappiness correctly on RHEL 7.2

On CentOS/RHEL 7 operating systems, the tuned service overwrites the swappiness settings that Cloudera Director configures on instances.

Cloudera Bug: DIR-3993

Stale service configs

Cloudera Director sometimes fails to detect stale services properly when restarting a cluster.

Cloudera Bug: DIR-4417

The nscd tool is installed but not enabled during normalization

nscd, a tool which caches common name service requests, is installed on Cloudera Director-managed instances, but is not enabled on CentOS and RHEL. This can reduce the performance of the bootstrapping process.

Cloudera Bug: DIR-4627

Cluster update or termination during instance metadata refresh fails to complete

If a deployment or cluster is terminated or updated at the same time that a refresh of instance metadata is running, on rare occasions the refresh will prevent the terminate or update operation from completing properly.

Cloudera Bug: DIR-5145

Cloudera Director detects SR-IOV incorrectly

For AWS instances, Cloudera Director always reports Enhanced Networking (SR-IOV) as false (for example, on the instance properties page), even when it is enabled. The fix in Cloudera Director 2.3 requires IAM permissions for the EC2 method DescribeInstanceAttribute.

Cloudera Bug: DIR-4997

After Cloudera Manager bootstrap failure, termination leads to renewed bootstrap attempt

In Cloudera Director 2.2, if you attempt to terminate a cluster or deployment in the BOOTSTRAP_FAILED stage, it may return to the BOOTSTRAPPING stage and report the following exception message: java.util.concurrent.TimeoutException: Pipeline did not complete in 10 SECONDS. In this situation, terminating the deployment or cluster a second time should work as expected. This can also happen in Cloudera Director 2.1, where the exception message is the more generic 500 internal server error.

Cloudera Bug: DIR-4263

Warning when adding Hue Load Balancer role

When you bootstrap or validate a cluster that has the HUE_LOAD_BALANCER role, Cloudera Director generates an unknown role type warning for the role.

Cloudera Bug: DIR-4709

Bootstrap failure with Kafka and Sentry on Cloudera Manager 5.9

Cluster bootstrap fails when using Cloudera Manager 5.9 with both Kafka 2.0 and Sentry.

If Kafka and Sentry are required on the same cluster, use one of the following combinations:
  • Kafka 2.1 with Cloudera Manager 5.9 or 5.10
  • Kafka 2.0 with Cloudera Manager 5.8 or lower

Cloudera Bug: DIR-4634

Lack of support for newer AWS regions

When you select certain AWS regions, such as ap-northeast-2, an error message can appear stating Unable to find the region ap-northeast-2. In this case, manually set the KMS region endpoint (under Advanced Options) to the endpoint listed for that region in AWS Regions and Endpoints in the AWS documentation.

Cloudera Bug: DIR-4978

Cloudera Manager repository URL validation failure

The validation of the Cloudera Manager repository can fail during the bootstrap process if the URL uses a host like localhost, a single-word hostname, or one with an internal or non-standard domain name. Use an IP address for the host, or use a hostname with a common domain like .com.
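
As a rough pre-check, the host part of a repository URL can be classified before bootstrap. The sketch below is only a heuristic under stated assumptions: the function name may_fail_validation and the COMMON_TLDS set are hypothetical, and the check does not reproduce the validation logic in Cloudera Director.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical, non-exhaustive set of "common" top-level domains.
COMMON_TLDS = {"com", "org", "net"}


def may_fail_validation(url: str) -> bool:
    """Return True if the URL host resembles the cases described above."""
    host = urlparse(url).hostname or ""
    try:
        ipaddress.ip_address(host)
        return False  # an IP address is one of the suggested safe choices
    except ValueError:
        pass  # not an IP address, so inspect the hostname labels
    labels = host.lower().rstrip(".").split(".")
    if len(labels) < 2:
        return True  # localhost or another single-word hostname
    return labels[-1] not in COMMON_TLDS  # internal or non-standard domain


for candidate in ("http://localhost/cm5/", "http://10.0.0.5/cm5/"):
    print(candidate, may_fail_validation(candidate))
```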

Cloudera Bug: DIR-4962

Cloudera Director configures Hue to use SQLite

CDH 5.8 and higher install Postgres drivers along with Hue. When a cluster is configured to use Cloudera Manager's embedded Postgres database, Cloudera Director configures Hue to use its own embedded SQLite database rather than Cloudera Manager's embedded Postgres database.

Cloudera Bug: DIR-4952

MySQL database creation fails with insufficiently strong password

When using MySQL 5.7 as an external database server for a Cloudera Director deployment or cluster, database creation may fail with an error: "Your password does not satisfy the current policy requirements." This is due to Cloudera Director generating random UUIDs for passwords, which do not satisfy the MEDIUM level of password validation in MySQL 5.7. Disable password validation in MySQL, or adjust the validation level to LOW.
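
The mismatch can be illustrated with a short check. This is a minimal sketch, assuming the default MEDIUM requirements of the MySQL 5.7 validate_password plugin (at least eight characters, with at least one digit, one lowercase letter, one uppercase letter, and one special character) and assuming the generated passwords use the canonical lowercase UUID string form. The function name meets_medium_policy is hypothetical.

```python
import uuid


def meets_medium_policy(password: str, min_length: int = 8) -> bool:
    """Approximate the default MEDIUM checks of validate_password."""
    return (
        len(password) >= min_length
        and any(c.isdigit() for c in password)
        and any(c.islower() for c in password)
        and any(c.isupper() for c in password)
        and any(not c.isalnum() for c in password)  # special character
    )


# A canonical UUID string contains only lowercase hex digits and hyphens,
# so it never supplies the uppercase letter that MEDIUM asks for.
candidate = str(uuid.uuid4())
print(candidate, meets_medium_policy(candidate))
```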

Cloudera Bug: DIR-4936

RDS instance creation fails with password length violation

AWS RDS requires a master user password of at least eight characters. If a password is supplied that is too short, Cloudera Director fails to validate it, leading to a failure from RDS. Ensure that the master user password is at least eight characters long.

Cloudera Bug: DIR-4916

Cloudera Manager server logs in diagnostic data may be empty

Cloudera Director automatically attempts to collect diagnostic data after cluster bootstrap failure. If cluster bootstrap fails before or just after the cluster is created in Cloudera Manager, the scm-server-logs inside the diagnostic data may be empty. In this case, trigger diagnostic data collection on the deployment.

Cloudera Bug: DIR-4877

High Azure Standard Storage disk usage

Azure Standard Disks are billed for used space plus transactions (see Azure Storage Standard Disk Pricing). In Cloudera Director 2.2, Standard Storage Virtual Hard Disks (VHDs) are mounted without the "discard" option. As a result, deleting a file on a VHD does not release its space back to Azure Standard Storage, and the space continues to be billed as used. Note: This issue does not cause disk space leakage; space occupied by deleted files can still be used by new files.

To address this problem, edit the prepare_unmounted_volumes file to add the discard mount option. prepare_unmounted_volumes is located at /var/lib/cloudera-director-plugins/azure-provider-1.1.0/etc/.

Change line 78 from:

echo "UUID=${blockid} $mount $FS defaults,noatime 0 0" >> /etc/fstab

to:

echo "UUID=${blockid} $mount $FS defaults,noatime,discard 0 0" >> /etc/fstab

Restart the Cloudera Director server service after making this change.

Cloudera Bug: DIR-4719

Java Clients return null for a 404 ("Not Found") response

The Java client currently can return null values for both 204 and 404 response codes from the collectDiagnosticData service endpoint. Therefore, it is difficult to tell if a collection call fails because a deployment or cluster is missing. In this case, poll for the status for a finite amount of time. If the poll times out, consider the collection attempt failed.
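
The bounded polling described above might look like the following. This is a minimal sketch, not client code: get_status stands in for whatever call returns the collection status, and the state name READY is a hypothetical placeholder rather than a value confirmed by the API.

```python
import time
from typing import Callable, Optional


def wait_for_collection(
    get_status: Callable[[], Optional[str]],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    done_states=("READY",),  # hypothetical status name
    sleep=time.sleep,
    clock=time.monotonic,
) -> bool:
    """Poll until a done state is seen; return False once the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in done_states:
            return True
        if clock() >= deadline:
            return False  # treat the collection attempt as failed
        sleep(interval_s)
```

Passing sleep and clock as parameters keeps the loop easy to exercise in a test without real delays.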

Cloudera Bug: DIR-4628

Incorrect choice of response code for cluster update failure

An API request to update a cluster fails if the cluster is in transition, for example, if it is already being updated. The response code for the failure, however, is 204, which indicates success.

Cloudera Bug: DIR-4496

Environments temporarily cannot be deleted

Even when an environment is empty, that is, all of its deployments and external databases have been deleted, it can take five to ten minutes before it is possible to delete the environment. This is due to remaining data structures that have not yet been automatically cleaned up.

Cloudera Bug: DIR-4438

SELinux remains enabled on instances allocated by Director

Depending on the operating system, Cloudera Director may misread the state of SELinux on instances it allocates and determine that it is disabled, when it is actually still enabled. This can lead to errors running Cloudera Manager or cluster services.

Cloudera Bug: DIR-4425

Security group validation should be configurable

This change provides a new capability for end users to enforce network requirements. It allows users to define the network rules configuration and validates AWS security groups against the pre-defined rules. When writing rules, users can not only define allowed networking traffic, but also deny traffic against specific ports from a list of IP ranges.

Cloudera Bug: DIR-4339

Time daemons do not run properly on RHEL and CentOS 7.x instances

The standard time daemon for RHEL and CentOS 7.x releases has changed from ntpd to chronyd. However, Cloudera Director does not run the correct commands to set up chronyd when normalizing instances. Instances may end up with ntpd running, or with no time daemon running at all. To avoid this, rely on ntpd for time synchronization, and use an instance bootstrap script to disable chronyd and enable ntpd. For more information, see Configuring NTP Using NTPD in the Red Hat Enterprise Linux 7 System Administrator's Guide.

Cloudera Bug: DIR-3994

Issues Fixed in Cloudera Director 2.2.0

Storage encryption for AWS RDS instances

Before Cloudera Director 2.2, storage encryption for AWS RDS instances was not supported, despite the presence of a KMS key ID field in the web UI form for describing RDS instances. The web UI field was ignored. In Cloudera Director 2.2, storage encryption is supported, using the default key ID associated with RDS for the AWS account. Use of a non-default KMS key is not supported, and the KMS key ID field has been removed from the web UI. See Defining External Database Servers for information on enabling storage encryption for a new RDS instance.

Cloudera Bug: DIR-1407

Cannot update environment credentials of environments deployed on Microsoft Azure

With Cloudera Director on Microsoft Azure, the Update Environment Credentials web UI displays only some properties, and does not display all the properties required for the update.

Cloudera Bug: DIR-4072

Azure operation timeout

Some Azure operations, such as VM creation and deletion, can take longer to complete than the default timeout value of 20 minutes. When this occurs, the Cloudera Director Azure plugin times out the Azure operation, and the operation fails. Adjusting the Cloudera Director server timeout does not help.

Workaround: Wait until Azure operation times drop back to the normal range (less than 20 minutes).

Affected Versions: Cloudera Director 2.1.0, 2.1.1. Beginning in Cloudera Director 2.2.0, the user can change the timeout value for Azure if the default value of 20 minutes is not long enough.

Cloudera Bug: PARTNER-2860

Deployment fails on Azure due to incompatible instance type existing in an Availability Set

VM creation fails if a VM of one series (for example, DS13) is deployed into an Azure Availability Set that already contains one or more VMs from a different series (for example, DS13_V2). This is an Azure platform restriction.

Affected Versions: Cloudera Director 2.1.0, 2.1.1. Beginning in Cloudera Director 2.2.0, an error is reported when an instance template is created that will cause a VM to be deployed into an incompatible Availability Set.

Cloudera Bug: PARTNER-2941

Add check to make sure resources are in the same region

VM creation fails when using resources from one region (for example, a VNET in EastUS) to deploy a VM in another region (for example, WestUS). This is an invalid configuration, but that may not be obvious when configuring an instance template.

Affected Versions: Cloudera Director 2.1.0, 2.1.1. Beginning in Cloudera Director 2.2.0, an error will be shown if the user tries to configure an instance template with resources from a different region than what is defined at the environment level.

Cloudera Bug: PARTNER-2952

Some valid host FQDN suffixes are not allowed in the Azure instance template

The regex check for the host FQDN suffix (DNS domain on the private cluster network) rejects some valid suffixes that contain a label with fewer than three characters. For example, company.us is not allowed.

Affected Versions: Cloudera Director 2.1.0, 2.1.1. Beginning in Cloudera Director 2.2.0, the check for host FQDN has been relaxed to allow names like company.us or company.1.us.
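
A relaxed check of this kind can be sketched as follows. The pattern is an illustration only and is not the regex used by the Azure plugin; it assumes conventional DNS label rules (1 to 63 characters, alphanumeric at both ends, hyphens allowed inside) and assumes that a suffix has at least two labels. The names LABEL, FQDN_SUFFIX, and is_valid_suffix are hypothetical.

```python
import re

# One DNS label: 1-63 characters, alphanumeric at both ends, hyphens inside.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"

# Assumption: a suffix consists of two or more dot-separated labels.
FQDN_SUFFIX = re.compile(rf"{LABEL}(?:\.{LABEL})+")


def is_valid_suffix(suffix: str) -> bool:
    """Accept suffixes such as company.us or company.1.us."""
    return len(suffix) <= 253 and FQDN_SUFFIX.fullmatch(suffix) is not None


for name in ("company.us", "company.1.us", "-bad.us"):
    print(name, is_valid_suffix(name))
```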

Cloudera Bug: PARTNER-3106

Merge user-provided image configuration files with internal ones

Updating a Cloudera Director Azure plugin configuration file (images.conf) requires replacing the entire configuration file, even if only part of the configuration file needs to be updated.

Affected Versions: Cloudera Director 2.1.0, 2.1.1. Beginning in Cloudera Director 2.2.0, the user can provide partial Azure plugin configuration files containing only the portions to be updated.

Cloudera Bug: PARTNER-2567

Issues Fixed in Cloudera Director 2.1.1

Cloudera Director cannot connect to restarted VMs on Azure

Restarted VMs on Microsoft Azure are sometimes assigned a new IP address. This causes the cached IP address in Cloudera Director to become stale, so that Cloudera Director is unable to connect to the VMs.

Affected Version: Cloudera Director 2.1.0.

Cloudera Bug: PARTNER-2607

Public IP attached to a VM on Azure is deleted when the VM is deleted

Any public IP attached to a VM is deleted when the VM is deleted, even if that public IP was not created by the plugin.

Affected Version: Cloudera Director 2.1.0.

Cloudera Bug: PARTNER-2617

Cloudera Director web UI handles errors incorrectly with failed instance template validation on Azure

When the Microsoft Azure subscription permissions are not properly set up, an unexpected error can occur, causing instance template validation to exit. This error is not properly displayed in the Cloudera Director web UI.

Affected Version: Cloudera Director 2.1.0.

Cloudera Bug: PARTNER-2805

Resource name cannot contain special characters

A deployment may fail if the compute resource group used for Azure deployment contains special characters such as an underscore (_). Resource group names are sometimes used in the construction of resource names, causing deployments to fail if the resource group names contain special characters, because the naming restrictions are different for resource group names and resource names.

Affected Version: Cloudera Director 2.1.0.

Cloudera Bug: PARTNER-2848

Bootstrapping of clusters may fail if configured to not associate public IP addresses with EC2 instances

When using AWS, if the user deselects the Associate public IP addresses checkbox, instructing Cloudera Director to not assign public IP addresses to the EC2 instances it creates, Cloudera Director incorrectly interprets the missing public IP address of each instance as localhost (the Cloudera Director instance itself). Under certain conditions, this can lead to a variety of errors, including bootstrap failures and corruption of the Cloudera Director instance.

Affected Version: Cloudera Director 2.1.0.

Cloudera Bug: DIR-3713

Database server password fails if it contains special characters

Cloudera Director server does not handle special characters properly in database server admin/root passwords.

Cloudera Bug: DIR-3622

Update Cloudera Manager Credentials fails in certain scenarios

Cloudera Director erroneously rejects the credentials update as an unsupported modification if sensitive fields are configured on the deployment. The sensitive fields include license, billingId, and krbAdminPassword.

Cloudera Bug: DIR-3879

Cloudera Director server fails to start after upgrade under some circumstances

During an upgrade, Cloudera Director expects the Cloudera Manager instances it has deployed to match the instance template that was used while bootstrapping those instances. If an instance was modified outside of Cloudera Director, the server fails to start. An example of a mismatch is a Cloudera Manager instance whose instance type was changed from within the cloud provider console.

Cloudera Bug: DIR-3956

Cluster bootstrap fails with high task parallelism

The lp.bootstrap.parallelBatchSize property controls how many operations Cloudera Director performs in parallel while configuring a cluster. Its default value is 20. For high values of lp.bootstrap.parallelBatchSize, Cloudera Director fails to bootstrap clusters and throws an exception indicating that it failed to write intermediate state to the database.

Cloudera Bug: DIR-3771

Modifying a cluster can leave some roles marked as stale in Cloudera Manager

When growing or shrinking a cluster, you are presented with the option of restarting the cluster. The restart operation should only restart roles that are marked stale by Cloudera Manager, that is, only roles that need to be restarted. This optimization serves to minimize cluster downtime. However, with Cloudera Director 2.1.x, some stale roles might not be restarted, even though the Restart Cluster option is selected.

Cloudera Bug: DIR-3466

Default memory autoconfiguration for monitoring services may be suboptimal

Depending on the size of your cluster and your instance types, you may need to manually increase the memory limits for the Host Monitor and Service Monitor. Cloudera Manager displays a configuration validation warning or error if the memory limits are insufficient.

Cloudera Bug: DIR-2205

Issues Fixed in Cloudera Director 2.1.0

Validation error after initial setup with high availability

When you set up HDFS high availability using Cloudera Director, the secondary NameNode is not configured, because it is not required for high availability. Because of a Cloudera Manager bug, the absence of a secondary NameNode causes an erroneous validation error to appear in Cloudera Manager in HDFS > Configuration > HDFS Checkpoint Directories.

Cloudera Bug: DIR-1893

Repository or parcel URLs with internal domain names fail validation

Repository or parcel URLs fail validation in Cloudera Director when they are specified with internal domain names.

Cloudera Bug: DIR-2794

Database-related error when running Cloudera Director CLI after upgrade

When run after upgrade, the Cloudera Director CLI performs steps to upgrade its local database from the previous version. It can report an error:
Referential integrity management for DEFAULT not implemented.

Cloudera Bug: DIR-3587

Cloudera Director does not recognize Cloudera Manager password changes

Cloudera Director does not recognize changes in the admin password in Cloudera Manager unless the username associated with the new password is also changed.

Cloudera Bug: DIR-2868

Incorrect yum repo definitions for Google Compute Engine RHEL images

The default RHEL 6 image defined in director-google-plugin version 1.0.1 and lower has an incorrect yum repo definition. This causes yum commands to fail after yum caches are cleared. See the Google Compute Engine issue tracker for issue details.

Cloudera Bug: DIR-2669

Long version string required for Kafka

Kafka requires a nonintuitive version string to be specified in the configuration file or web UI.

Cloudera Bug: DIR-2298

Issues Fixed in Cloudera Director 2.0.0

Cloning and growing a Kerberos-enabled cluster fails

Cloning of a cluster that uses Kerberos authentication fails, whether it is cloned manually or by using the kerberize-cluster.py script. Growing a cluster that uses Kerberos authentication fails.

Cloudera Bug: DIR-1614

Kafka with Cloudera Manager 5.4 and lower causes failure

Kafka installed with Cloudera Manager 5.4 and lower causes the Cloudera Manager installation wizard, and therefore the bootstrap process, to fail, unless you override the configuration setting broker_max_heap_size.

Cloudera Bug: DIR-2240

Cloudera Director does not set up external databases for Oozie and Hue

Cloudera Director cannot set up external databases for Oozie and Hue.

Cloudera Bug: DIR-996

Issues Fixed in Cloudera Director 1.5.2

Apache Commons Collections deserialization vulnerability

Cloudera has learned of a potential security vulnerability in a third-party library called the Apache Commons Collections. This library is used in products distributed and supported by Cloudera (“Cloudera Products”), including Cloudera Director. At this time, no specific attack vector for this vulnerability has been identified as present in Cloudera Products.

The Apache Commons Collections potential security vulnerability is titled “Arbitrary remote code execution with InvokerTransformer” and is tracked by COLLECTIONS-580. MITRE has not issued a CVE, but related CVE-2015-4852 has been filed for the vulnerability. CERT has issued Vulnerability Note #576313 for this issue.

Releases affected: Cloudera Director 1.5.1 and lower, CDH 5.5.0, CDH 5.4.8 and lower, Cloudera Manager 5.5.0, Cloudera Manager 5.4.8 and lower, Cloudera Navigator 2.4.0, and Cloudera Navigator 2.3.8 and lower

Users affected: All

Severity (Low/Medium/High): High

Impact: This potential vulnerability may enable an attacker to run arbitrary code from a remote machine without requiring authentication.

Immediate action required: Upgrade to Cloudera Director 1.5.2, Cloudera Manager 5.5.1, and CDH 5.5.1.

Serialization for complex nested types in Python API client

Serialization for complex nested types has been fixed in the Python API client.

Issues Fixed in Cloudera Director 1.5.1

Support for configuration keys containing special characters

Configuration file parsing has been updated to correctly support quoted configuration keys containing special characters such as colons and periods. This enables the usage of special characters in service and role type configurations, and in instance tag keys.

Issues Fixed in Cloudera Director 1.5.0

Growing clusters may fail when using a repository URL that only specifies major and minor versions

When using a Cloudera Manager package repository or CDH/parcel repository URL that only specifies the major or minor versions, Cloudera Director may incorrectly use the latest available version when trying to grow a cluster. The following are examples of URLs that specify a full version:

For Cloudera Manager: https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.3.3/

For CDH: https://archive.cloudera.com/cdh5/parcels/5.3.3/
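
The two example URLs above end in a three-part version number. A quick way to tell such URLs apart from ones that stop at the major or minor version is sketched below. The function name pins_full_version is hypothetical, and the check assumes the version is the last path segment, as in the examples.

```python
import re
from urllib.parse import urlparse

# Three dot-separated numeric parts, for example 5.3.3.
FULL_VERSION = re.compile(r"\d+\.\d+\.\d+")


def pins_full_version(url: str) -> bool:
    """Return True if the last path segment is a three-part version."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return bool(segments) and FULL_VERSION.fullmatch(segments[-1]) is not None


print(pins_full_version("https://archive.cloudera.com/cdh5/parcels/5.3.3/"))
print(pins_full_version("https://archive.cloudera.com/cdh5/parcels/5.3/"))
```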

Cloudera Bug: DIR-1482

Flume does not start automatically after first run

Although you can deploy Flume through Cloudera Director, you must start it manually using Cloudera Manager after Cloudera Director bootstraps the cluster.

Cloudera Bug: DIR-779

Impala daemons attempt to connect over IPv6

Impala daemons attempt to connect over IPv6.

Cloudera Bug: DIR-939

DNS queries occasionally time out with AWS VPN

DNS queries occasionally time out with AWS VPN.

Cloudera Bug: DIR-972

Issues Fixed in Cloudera Director 1.1.3

Ensure accurate time on startup

Instance normalization has been improved to ensure that time is synchronized by Network Time Protocol (NTP) before bootstrapping, which improves cluster reliability and consistency.

Cloudera Bug: DIR-1424

Speed up ephemeral drive preparation

Instance drive preparation during the bootstrapping process was slow, especially for instances with many large ephemeral drives. Time required for this process has been reduced.

Cloudera Bug: DIR-1265

Fix typographical error in the virtualizationmappings.properties file

The d2 instance type d2.4xlarge was incorrectly entered into Cloudera Director as d3.4xlarge in virtualizationmappings.properties. This has been corrected.

Cloudera Bug: DIR-1326

Avoid upgrading preinstalled Cloudera Manager packages

Cloudera Director no longer upgrades preinstalled Cloudera Manager packages.

Cloudera Bug: DIR-1370

Issues Fixed in Cloudera Director 1.1.2

Parcel validation fails when using HTTP proxy

Parcel validation now works when configuring an HTTP proxy for Cloudera Director server, allowing correctly configured parcel repository URLs to be used as expected.

Cloudera Bug: DIR-1251

Unable to grow a cluster after upgrading Cloudera Director 1.0 to 1.1.0 or 1.1.1

Cloudera Director now sets up parcel repository URLs correctly when a cluster is modified.

Cloudera Bug: DIR-1247

Add support for d2 and c4 AWS instance types

Cloudera Director now includes support for new AWS instance types d2 and c4. Cloudera Director can be configured to use additional instance types at any point as they become available in AWS.

Cloudera Bug: DIR-1070

Issues Fixed in Cloudera Director 1.1.1

Service-level custom configurations are ignored

Due to internal refactoring changes, it was no longer possible to override service-level configurations. The ability to set service-level custom configurations has been restored.

Cloudera Bug: DIR-1198

The property customBannerText is ignored and not handled as a deprecated property

Restored the customBannerText configuration file property, which was removed during the internal refactoring work.

Cloudera Bug: DIR-1199

Fixed progress bar issues when a job fails

The web UI showed a progress bar even when a job had failed.

Cloudera Bug: DIR-1073

Updated IAM Help text on Add Environment page

The help text on the Add Environment page for Role-based keys should refer to AWS Identity and Access Management (IAM), not to AMI.

Cloudera Bug: DIR-1122

Add eu-central-1 to the region dropdown

The eu-central-1 region has been added to the region dropdown on the Add Environment page.

Cloudera Bug: DIR-1146

Gateway roles should assign YARN, HDFS, and Spark gateway roles

All available gateway roles, including YARN, HDFS, and Spark, should be deployed by default on the instance.

Cloudera Bug: DIR-1114

Spark on YARN should be shown on the Modify Cluster page

Spark on YARN did not appear in the list of services on the Modify Cluster page.

Cloudera Bug: DIR-1115