In Cloudera Data Warehouse, you explore sample airline database tables in your data
lake from a Virtual Warehouse. You learn how to load the airline data. You view and query
the tables.
You obtained permissions to access a running environment for creating a Database
Catalog and Virtual Warehouse.
You obtained the DWAdmin role to perform Data Warehouse tasks.
You logged into the CDP web interface.
You activated the environment from Cloudera Data Warehouse.
You set up a Hadoop SQL policy in Ranger to access data in the data lake.
For more information about meeting prerequisites, see Getting started in CDW. In this task, you set up a minimal Virtual Warehouse for learning how to explore a
data lake. You do not need to configure auto-scaling and optional features to explore a
data lake. Plan to delete the Virtual Warehouse after your exploration.
Navigate to Data Warehouse > Database Catalogs > New Database Catalog.
In New Database Catalog, in Name, specify a Database Catalog name.
In Environments, select the name of your environment activated from Cloudera
Data Warehouse.
Accept default values for the image version and data lake type (SDX).
Turn on Load Demo Data to explore sample airline data from Hue, and click
Create.
Click Virtual Warehouses > New Virtual Warehouse.
Set up the Virtual Warehouse:
Specify a Name for the Virtual Warehouse.
In Type, click the SQL engine you prefer: Hive or
Impala.
Select your Database Catalog and User
Group if you have been assigned a user group.
In Size, select the number of executors, for example
xsmall-2Executors.
Accept default values for other settings.
Click Create.
After your Virtual Warehouse starts running, click Hue,
and expand Tables to explore available data.
Explore data lake contents by running queries.
For example, select all data from the airlines table.