Exploring a Data Lake
In Cloudera Data Warehouse (CDW) Private Cloud, a default Database Catalog is linked to the Hive Metastore (HMS) on the CDP Private Cloud Base cluster. As soon as you activate an environment in CDW, you can explore and query the databases and tables that are present on your base cluster. If you are new to CDP, you can also import data using Hue and run queries.
- You must have the DWAdmin role.
- You must have activated an environment in CDW.
- You must have a default Database Catalog running.
- Log in to the Data Warehouse service as DWAdmin.
- Click .
Set up the Virtual Warehouse:
- Specify a Name for the Virtual Warehouse.
- In Type, click the SQL engine you prefer: Hive or Impala.
- Select your Database Catalog and User Group if you have been assigned a user group.
- In Size, select the number of executors, for example, xsmall-2Executors.
- Accept default values for other settings.
- Click Create.
After your Virtual Warehouse starts running, click Hue and expand Tables to explore the available data.
Explore data lake contents by running queries.
For example, select all data from the airlines table.
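As a rough illustration of that query, the sketch below runs a plain SELECT against a table named airlines. It uses Python's built-in sqlite3 module as a local stand-in so that the example is self-contained; it does not connect to a Virtual Warehouse. The column names and rows are hypothetical sample data, and the LIMIT clause is an added assumption to keep the result small while exploring.

```python
import sqlite3

# Local stand-in for a Virtual Warehouse: an in-memory SQLite database.
# The columns and rows are hypothetical; the real airlines table on your
# base cluster may have a different schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE airlines (code TEXT, description TEXT)")
conn.executemany(
    "INSERT INTO airlines VALUES (?, ?)",
    [
        ("AA", "American Airlines"),
        ("DL", "Delta Air Lines"),
        ("UA", "United Airlines"),
    ],
)

# The statement you would type into the Hue editor.
# LIMIT (an assumption, not in the original steps) caps the rows returned.
QUERY = "SELECT * FROM airlines LIMIT 10"

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)

conn.close()
```

In Hue, only the SELECT statement itself is needed; the surrounding Python exists solely to make the sketch runnable on its own.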