Known issues

This section lists known issues that you might run into while using the Management Console service.

CB-6924 Workaround for ZooKeeper external volume bug

In the current version of CDP, ZooKeeper might be configured to write to the root disk, which is too small to accommodate the ZooKeeper data. To correct this issue, you need to reconfigure ZooKeeper to write to an external volume and move any existing ZooKeeper data to that volume.

To check if ZooKeeper is configured to use an external volume, complete the following:

  1. In Cloudera Manager, open the ZooKeeper service and navigate to: Configuration tab -> Filter to Server.
  2. If the dataDir and dataLogDir fields contain /hadoopfs/fs1/zookeeper, you do not need to do anything.
  3. If the fields contain any other values, you must reconfigure ZooKeeper as described below. A command-line spot check is also sketched after this list.
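
If you have shell access to the ZooKeeper host, you can also spot-check the directories that the running server reports. This is a minimal sketch, assuming the default client port 2181 and that the conf four-letter-word command is permitted (on newer ZooKeeper releases it must be listed in 4lw.commands.whitelist):

    # Ask the local ZooKeeper server for its effective configuration and
    # print the data directories it is actually using (requires nc/netcat)
    echo conf | nc localhost 2181 | grep -iE 'dataDir|dataLogDir'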

To reconfigure ZooKeeper, complete the following:

  1. SSH into the machine where the ZooKeeper server is running.
  2. Run the following command to change the user:

    sudo -su zookeeper

  3. Run the following command to copy the existing ZooKeeper data to the external volume:

    cp -R /var/lib/zookeeper/ /hadoopfs/fs1/zookeeper

  4. Open the cluster from the Cloudbreak user interface.
  5. Log into the Cloudera Manager user interface.
  6. Find ZooKeeper in Cloudera Manager and navigate to its configuration, either by using the Search box or through the side menu: ZooKeeper menu item -> Configuration tab -> Filter to Server.
  7. Change the following properties:
    dataDir: /hadoopfs/fs1/zookeeper
    dataLogDir: /hadoopfs/fs1/zookeeper
  8. Save your changes.
  9. Restart the stale services to apply the configuration change.

    You do not need to redeploy ZooKeeper. A quick check that the new directories are in use is sketched after these steps.
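
After the restart, you can confirm that ZooKeeper is writing to the external volume. This is a minimal sketch, assuming the default client port 2181; snapshot and transaction log files are normally kept in a version-2 subdirectory:

    # Confirm the copied data is owned by the zookeeper user and that new
    # snapshot/transaction-log files appear under the external volume
    ls -l /hadoopfs/fs1/zookeeper /hadoopfs/fs1/zookeeper/version-2
    # Optional liveness check (assumes the "ruok" command is whitelisted)
    echo ruok | nc localhost 2181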

CB-3876 Data Warehouse and Machine Learning create security groups

Problem: If you choose to use your own security groups during environment registration, the Data Warehouse and Machine Learning services do not use those security groups; instead, they create their own.

Workaround: There is no workaround.

CRB-971 Data Warehouse creates IAM, S3, and DynamoDB resources

Problem: The Data Warehouse service creates its own S3 buckets, DynamoDB tables, and IAM roles and policies. It does not use the environment's S3 bucket(s), DynamoDB table, and IAM roles and policies.

Workaround: There is no workaround.

CB-4176 Data Lake cluster repair fails after manual stop

Problem: Data Lake cluster repair fails after an instance has been stopped manually via the AWS console or the AWS CLI.

Workaround: If you have stopped a cluster instance manually, restart it via the AWS console or the AWS CLI, and then use the Sync option in CDP to synchronize the instance state.
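
For example, if you know the EC2 instance ID of the stopped node, you can restart it from the AWS CLI before syncing in CDP. This is a minimal sketch; the instance ID is a placeholder, and it assumes your AWS CLI credentials and region are already configured:

    # Start the manually stopped instance (placeholder instance ID)
    aws ec2 start-instances --instance-ids i-0123456789abcdef0
    # Optionally wait until the instance reports the running state before using Sync in CDP
    aws ec2 wait instance-running --instance-ids i-0123456789abcdef0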

CB-2813 Environment with ML workspaces in it can be deleted

Problem: When deleting an environment that uses a customer-created VPC and subnets, there is no mechanism in place to check for any existing ML workspaces running within the environment. As a result, an environment can be deleted when ML workspaces are currently running in it.

Workaround: If your environment was created within your existing VPC and subnets, ensure that no ML workspaces are running within the environment before you delete it.
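
If you use the CDP CLI, one way to check is to list the Machine Learning workspaces and confirm that none of them reference the environment you are about to delete. This is a minimal sketch; it assumes the CDP CLI is installed and configured, and the environment name is a placeholder:

    # List all ML workspaces and look for any that reference the environment
    # you plan to delete (replace my-environment with your environment name)
    cdp ml list-workspaces | grep -i "my-environment"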

CB-3459 Subnet dependency error when deleting an environment

Problem: When deleting an environment that uses a VPC and subnets created by CDP, the environment deletion fails with an error similar to:

    com.sequenceiq.cloudbreak.cloud.exception.CloudConnectorException: AWS CloudFormation stack reached an error state: DELETE_FAILED reason: The subnet 'subnet-05606fd72fda58c8c' has dependencies and cannot be deleted. (Service: AmazonEC2; Status Code: 400; Error Code: DependencyViolation; Request ID: da9a7fe0-ac43-467e-9942-94f10e6bd2b7)

This error occurs if resources such as Data Warehouse instances or Machine Learning cluster nodes were not deleted prior to environment termination.

Workaround: Prior to terminating an environment, you must terminate all clusters running within that environment.
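
If the deletion has already failed, you can identify what still depends on the reported subnet before retrying. This is a minimal sketch using the AWS CLI; the subnet ID is taken from the error message above, and it assumes your AWS CLI credentials and region are already configured:

    # List the network interfaces still attached to the subnet named in the
    # DependencyViolation error, along with their descriptions and status
    aws ec2 describe-network-interfaces \
        --filters Name=subnet-id,Values=subnet-05606fd72fda58c8c \
        --query 'NetworkInterfaces[].{Id:NetworkInterfaceId,Description:Description,Status:Status}' \
        --output table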

CB-4248 Expired certificate causes untrusted connection warning

Problem: CDP automatically generates an SSL certificate for every Data Lake and Data Hub cluster. There are two possibilities:
  • By default, CDP generates a trusted certificate valid for 3 months.
  • If generating a trusted certificate fails, CDP generates a self-signed certificate valid for 2 years.

In the first case, if your cluster stays active for over 3 months, the trusted certificate will expire and you will see an "untrusted connection" warning when trying to access cluster UIs from your browser.

Workaround: To fix this, you should generate a new certificate by using the following steps:
  1. Use the Renew certificate UI option:
    • For Data Lake - Click the Renew certificate button on the Data Lake details page.
    • For Data Hub - Click Actions > Renew certificate on the Data Hub cluster details page.

    During certificate renewal, several related messages will be written to Event History. Once the certificate renewal has been completed, the following message appears: "Renewal of the cluster's certificate finished."

  2. Additionally, if your cluster was created prior to December 19, you need to perform the following manual steps:
    1. SSH to the Knox gateway host on your cluster.
    2. Run the hostname command to get your domain name.
    3. Run the following commands, replacing the domain name test-master.dev.cldr.work with your fully qualified domain name (a certificate check is sketched after these steps):
      # Write the user-facing certificate from the Salt pillar to the file served by nginx
      sudo sh -c "/opt/salt_2017.7.5/bin/salt --out=newline_values_only 'test-master.dev.cldr.work' pillar.get gateway:userfacingcert > /etc/certs-user-facing/server.pem"
      # Reload nginx so that it picks up the renewed certificate
      sudo systemctl reload nginx.service
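
Once nginx has been reloaded, you can confirm that the gateway is serving the renewed certificate and check its new expiry date. This is a minimal sketch, assuming the gateway serves HTTPS on port 443 and that openssl is available; replace the domain name with your own:

    # Print the validity dates of the certificate currently served by the gateway
    echo | openssl s_client -connect test-master.dev.cldr.work:443 \
        -servername test-master.dev.cldr.work 2>/dev/null | openssl x509 -noout -dates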