Exploring a Data Lake
In Cloudera Data Warehouse, you explore sample airline database tables in your data lake from a Virtual Warehouse. You learn how to load the airline data. You view and query the tables.
- You obtained permissions to access a running environment for creating a Database Catalog and Virtual Warehouse.
- You obtained the DWAdmin role to perform Data Warehouse tasks.
- You logged into the CDP web interface.
- Your activated the environment from Cloudera Data Warehouse.
- You set up a Hadoop SQL policy in Ranger to access data in the data lake.
- Navigate to .
- In New Database Catalog, in Name, specify a Database Catalog name.
- In Environments, select the name of your environment activated from Cloudera Data Warehouse .
Accept default values for the image version and data lake type (CDW).
Turn on Load Demo Data to explore sample airline data
from Hue, and click Create.
- Click .
Set up the Virtual Warehouse:
- Specify a Name for the Virtual Warehouse.
- In Type, click the SQL engine you prefer: Hive or Impala.
- Select your Database Catalog and User
Group if you have been assigned a user group.
- In Size, select the number of executors, for example xsmall-2Executors.
- Accept default values for other settings.
- Click Create.
After your Virtual Warehouse starts running, click Hue,
and expand Tables to explore available data.
Explore data lake contents by running queries.
For example, select all data from the airlines table.