In Cloudera Data Warehouse, you explore sample airline database tables in your data
lake from a Virtual Warehouse. You learn how to load the airline data. You view and query
the tables.
You obtained permissions to access a running environment for creating a Database
Catalog and Virtual Warehouse.
You obtained the DWAdmin role to perform Data Warehouse tasks.
You logged into the CDP web interface.
You activated the environment from Cloudera Data Warehouse.
You set up a Hadoop SQL policy in Ranger to access data in the data lake.
For more information about meeting prerequisites, see Getting started in CDW. In this task, you set up a minimal Virtual Warehouse for learning how to explore a
data lake. You do not need to configure Virtual Warehouse executors for auto-scaling.
You can skip optional configurations to explore a data lake. Plan to delete the Virtual
Warehouse after your exploration.
Navigate to Data Warehouse > Database Catalogs > Add New.
In Create Database Catalog, in Name, specify a Database Catalog name.
In Environments, select the name of your environment activated from Cloudera
Data Warehouse.
In Select Size, accept the default (small) for this exploration of demo data.
Accept default value for the Database Catalog Image Version.
Turn on Load Demo Data to explore sample airline data from Hue, and click
Create Database Catalog.
Click Virtual Warehouses > Add
New.
Set up the Virtual Warehouse:
Specify a Name for the Virtual Warehouse.
In Type, click the SQL engine you prefer: Hive or
Impala.
Select your Database Catalog and User
Group if you have been assigned a user group.
In Size, select the number of executors, for example
xsmall-2Executors.
Accept default values for other settings.
Click Create.
After your Virtual Warehouse starts running, click Hue,
and expand Tables to explore available data.
Explore data lake contents by running queries.
For example, select all data from the airlines table.