Monitoring a Data Lake

You can monitor the status of your Data Lake from the CDP web UI or CLI.

Monitoring Data Lake cluster via UI

To access information related to your Data Lake cluster from the CDP web UI, navigate to the Management Console service > Data Lakes. Each Data Lake cluster is represented by an entry on the Data Lakes page. To get more information about a specific Data Lake cluster, click on the tile representing your cluster. When a Data Lake cluster is healthy, its status should be Running.

To check health of specific hosts and services, navigate to Cloudera Manager.

Monitoring Data Lake cluster via CLI

You can view your available Data Lake clusters via CDP CLI using the following commands:

cdp datalake list-datalakes
cdp datalake describe-datalake
cdp datalake get-cluster-host-status
cdp datalake get-cluster-service-status
The cdp datalake list-datalakes command allows you to view a list of all available Data Lakes. For example:
cdp environments list-datalakes
{
    "datalakes": [
        {
            "datalakeName": "zookeeper-190920-144828-vg7",
            "crn": "crn:cdp:datalake:us-west-1:9d74eee4-1cad-45d7-b645-7ccf9edbb73d:datalake:4529591f-53ea-4196-90fc-5d780d7063a8",
            "status": "RUNNING",
            "environmentCrn": "crn:cdp:environments:us-west-1:9d74eee4-1cad-45d7-b645-7ccf9edbb73d:environment:b1935d68-85d5-4f50-a023-56fa96d01c45",
            "creationDate": "2019-09-20T12:49:55.669000+00:00",
            "statusReason": "Datalake is running"
        },
        {
            "datalakeName": "zookeeper-sqqsx",
            "crn": "crn:cdp:datalake:us-west-1:9d74eee4-1cad-45d7-b645-7ccf9edbb73d:datalake:92d66fed-c5d2-437c-a6eb-a54e40d36287",
            "status": "RUNNING",
            "environmentCrn": "crn:cdp:environments:us-west-1:9d74eee4-1cad-45d7-b645-7ccf9edbb73d:environment:1eb291b3-dd23-4bdd-a3e8-09579afdf5a8",
            "creationDate": "2019-09-25T09:24:08.017000+00:00",
            "statusReason": "Datalake is running"
        }
    ]
}
The cdp datalake describe-datalake command allows you to obtain basic information about a specific Data Lake cluster. For example:
cdp datalake describe-datalake --datalake-name test-data-lake
{
    "datalake": {
        "crn": "crn:cdp:datalake:us-west-1:9d74eee4-1cad-45d7-b645-7ccf9edbb73d:datalake:aa2e8e3e-2d6f-410b-bf3c-a3e02112bfc8",
        "datalakeName": "test-data-lake",
        "status": "RUNNING",
        "environmentCrn": "crn:cdp:environments:us-west-1:9d74eee4-1cad-45d7-b645-7ccf9edbb73d:environment:574aa1cb-7a51-45a2-97ae-dead97072145",
        "credentialCrn": "crn:altus:environments:us-west-1:9d74eee4-1cad-45d7-b645-7ccf9edbb73d:credential:83c861b6-5f62-4b83-a466-06de751a3964",
        "cloudPlatform": "AWS",
        "creationDate": "2019-09-20T22:09:22.422000+00:00",
        "clouderaManager": {
            "version": "7.x.0",
            "clouderaManagerRepositoryURL": "http://cloudera-build-us-west-1.vpc.cloudera.com/s3/build/1445641/cm7/7.0.1/redhat7/yum/",
            "clouderaManagerServerURL": "https://adar-test-data-lake.adar-tes.xcu2-8y8x.workload-dev.cloudera.com:8443/test-data-lake/cdp-proxy/cmf/home/"
        },
        "productVersions": [
            {
                "name": "CDH",
                "version": "7.0.1-1.cdh7.0.1.p0.1443705"
            }
        ],
        "statusReason": "Datalake is running",
        "awsConfiguration": {
            "instanceProfile": "arn:aws:iam::069336058373:instance-profile/idbroker-assume-role"
        }
    }
}
The cdp datalake get-cluster-host-status command allows you to obtain information about the health of each of your Data Lake hosts. For example:
cdp datalake get-cluster-host-status --cluster-name test-data-lake
{
    "hosts": [
        {
            "hostid": "5c8fb276620f0aa54bdd111e33ba5f58",
            "hostname": "idbroker1.cloudera.site",
            "healthSummary": "GOOD"
        },
        {
            "hostid": "30f27ab8472c9677985f04efc2b800c4",
            "hostname": "master0.cloudera.site",
            "healthSummary": "GOOD"
        }
    ]
}
The cdp datalake get-cluster-service-status command allows you to obtain information about the health of each service running on the Data Lake cluster. For example:
cdp datalake get-cluster-service-status --cluster-name test-data-lake
{
    "services": [
        {
            "type": "ZOOKEEPER",
            "state": "STARTED",
            "healthSummary": "GOOD",
            "healthChecks": [
                {
                    "name": "ZOOKEEPER_SERVERS_HEALTHY",
                    "summary": "GOOD"
                }
            ]
        },
        {
            "type": "HDFS",
            "state": "STARTED",
            "healthSummary": "GOOD",
            "healthChecks": [
                {
                    "name": "HDFS_DATA_NODES_HEALTHY",
                    "summary": "GOOD"
                },
                {
                    "name": "HDFS_VERIFY_EC_WITH_TOPOLOGY",
                    "summary": "DISABLED"
                }
            ]
        },
        {
            "type": "SOLR",
            "state": "STARTED",
            "healthSummary": "GOOD",
            "healthChecks": [
                {
                    "name": "SOLR_SOLR_SERVERS_HEALTHY",
                    "summary": "GOOD"
                }
            ]
        },
        {
            "type": "HIVE",
            "state": "STARTED",
            "healthSummary": "GOOD",
            "healthChecks": [
                {
                    "name": "HIVE_HIVEMETASTORES_HEALTHY",
                    "summary": "GOOD"
                }
            ]
        },
        {
            "type": "RANGER",
            "state": "STARTED",
            "healthSummary": "GOOD",
            "healthChecks": [
                {
                    "name": "RANGER_RANGER_ADMIN_HEALTHY",
                    "summary": "GOOD"
                },
                {
                    "name": "RANGER_RANGER_RANGER_TAGSYNC_HEALTH",
                    "summary": "GOOD"
                },
                {
                    "name": "RANGER_RANGER_RANGER_USERSYNC_HEALTH",
                    "summary": "GOOD"
                }
            ]
        },
        {
            "type": "HBASE",
            "state": "STARTED",
            "healthSummary": "GOOD",
            "healthChecks": [
                {
                    "name": "HBASE_REGION_SERVERS_HEALTHY",
                    "summary": "GOOD"
                }
            ]
        },
        {
            "type": "KAFKA",
            "state": "STARTED",
            "healthSummary": "GOOD",
            "healthChecks": [
                {
                    "name": "KAFKA_KAFKA_BROKER_HEALTHY",
                    "summary": "GOOD"
                }
            ]
        },
        {
            "type": "ATLAS",
            "state": "STARTED",
            "healthSummary": "GOOD",
            "healthChecks": [
                {
                    "name": "ATLAS_ATLAS_SERVER_HEALTHY",
                    "summary": "GOOD"
                }
            ]
        },
        {
            "type": "KNOX",
            "state": "STARTED",
            "healthSummary": "GOOD",
            "healthChecks": [
                {
                    "name": "KNOX_IDBROKER_HEALTHY",
                    "summary": "GOOD"
                },
                {
                    "name": "KNOX_KNOX_GATEWAY_HEALTHY",
                    "summary": "GOOD"
                }
            ]
        }
    ]
}