Understanding Data Lake details
To access information related to your Data Lake cluster and access cluster actions, navigate to the Management Console service > Data Lakes.
Each Data Lake cluster is represented by an entry on the Data Lakes page. To get more information about a specific Data Lake cluster, click on the tile representing your cluster.
Environment Details
This section includes information related to the CDP environment in which the Data Lake cluster is running:
Item | Description |
---|---|
Cloud Provider | The logo of the cloud provider where the cluster is running. |
Name | The name of the environment used to create the cluster. |
Credential | The name of the credential used to create the cluster. |
Region | The region in which the cluster is running in the cloud provider infrastructure. |
Availability Zone | The availability zone within the region in which the cluster is running. |
Services
Click logos in the Services section to open the user interface for the components that are running in the Data Lake cluster.
Cloudera Manager Info
The Cloudera Manager Info section provides the following information:
Item | Description |
---|---|
CM URL | Link to the Cloudera Manager web UI. |
CM Version | The Cloudera Manager version which the cluster is currently running. |
Platform Version | The Cloudera Runtime platform version which the cluster is currently running. |
Event History and other tabs
The Data Lake page provides additional details organized in tabs, starting with the Event History tab:
Item | Description |
---|---|
Event History | Events logged for the cluster, with the most recent event at the top. The Download option allows you to download the event history to a local file. The events are formatted in JSON and compressed. |
Hardware | Information about your cluster instances: instance names, instance IDs, instance types, their status, fully qualified domain names (FQDNs), and private and public IP addresses. Click to access information about the instance, storage, image, and packages installed on the image. |
Cloud Storage | External storage locations for database and files used by Data Lake services, such as HMS database, Ranger audit database, HBase files (storage for Atlas metadata). |
Tags | User-defined tags, listed in the order they were added. |
Endpoints | Endpoints for various cluster services, such as the URL for the Ranger user interface for defining data access policies. |
Recipes | Future home of a list of custom scripts attached to this Data Lake. Each "recipe" lists its name, type, and the host group on which it was executed. |
Attached clusters | The workload clusters using the services of this Data Lake; this information repeats the list of clusters found in the other tabs of this CDP Environment. |
Repository Details | Cloudera Manager and Cloudera Runtime repository information, in more detail than shown in the Cloudera Manager Info section. |
Image Details | Cluster node base image details. |
Network | Names of the network and subnet in which the cluster is running and the links to the related cloud provider console |
Telemetry | The instance profile and cloud storage location specified during environment setup under Log Storage and Audits for service logs. |
Show CLI Command
You can click the Show CLI Command button to review the CLI command used to create the Data Lake, and copy it if you want to create a similar Data Lake through the CDP CLI. Ensure that any Data Lake that you create has a unique name. For more information on the CDP CLI commands to create a Data Lake, review the CLI documentation for Data Lakes.