Creating a join
Learn how you can create new data joins in a dataset in Cloudera Data Visualization.
This example shows you how to create new data joins using the Flight Delays dataset.
-
On the main navigation bar, click DATA.
The Data view opens, displaying the Datasets tab.
- Create a new dataset based on the sample datafile.
- Find the dataset in the list of datasets, either by scrolling or by using search,
and click it.The dataset side navigation panel appears, open on Dataset Detail view.
- In the side navigation menu, click Data Model.The Data Model view appears, and shows the name of the only table in the dataset. You may click SHOW DATA to display the data of that table.
-
Click EDIT DATA MODEL.
- Click the plus sign on the table representation.
The Table Browser modal window opens.
- In the Table Browser modal window, make the following
selections:
-
In the Database Name selector, choose the database documentation.
-
In the Table Name selector, choose the table name
airline_id
.This value is pre-populated to match the existing table of the dataset, but it may be changed.
- Click SELECT.
The Edit Join modal window opens.
-
- In the Edit Join modal window, the following options are
available:
- [Optional] Click Clear Fields to clear all already defined joins between the two tables.
- [Optional] Click sample data to preview the data. Click again to hide sample data.
- [Optional] Click Add Join Pair to add another column connection between the same two tables.
- [Optional] Click Add Join Expression to add a join between the two tables based on a custom SQL expression.
- [Optional] Click icon (minus) to remove an existing join pair or an existing join expression.
- [Optional] Under Join Expressions, click the text box to open the Join Expression interface and specify or update a custom SQL expression that defines the join conditions.
- Click APPLY to save the changes.
-
In the Edit Join modal window, do the following:
- Select the matching columns for both tables. On the left side, select the field
airlineid
. On the right side, select the fieldcode
. - Click Sample Data to view some data in both columns, and verify that the join makes sense. Click again to hide sample data.
- Select the matching columns for both tables. On the left side, select the field
- Click APPLY.
- Repeat the previous two steps to create seven more joins as follows:
-
The table
airport_codes
has two joins to the main table, and you must create each join separately as follows:- Left column
origin
= right columncode
. - Left column
dest
= right columncode
.
- Left column
- The table
cancellation_code
has a join for left columncancellationcode
= right columncode
. - The table
airport_lat_long
has two joins to the main table, and you must create each join separately as follows:- Left column
origin
= right columnlocationid
. - Left column
dest
= right columnlocationid
.
- Left column
- The table
state_abbreviations
has two joins to the main table, and you must create each join separately:- Left column
deststate
= right columnabbreviation
. - Left column
originstate
= right columnabbreviation
.
This step is optional, and depends on whether your
flights_*
table has fully extended state names. - Left column
-
-
Click SAVE.
- Optional: Click the (link) icon to edit joins or to change join type.