Restricting columns in datasets based on SQL query

In Cloudera Data Visualization, you can restrict the table columns in a SQL-defined dataset by changing the SQL definition of that dataset so that the query selects only the columns you need.

  1. Switch to the Data Model interface, and click Show Data.

    The query result contains a large number of columns, and many of them are not needed to answer the most common questions.

  2. Find the fields that you would like to keep in the dataset definition.
  3. Switch back to the Dataset Detail interface, and edit the query in the SQL text window so that it selects only the columns you want to keep, for example:
    select county, stname, ctyname, tot_pop, tot_male, tot_female from main.us_counties

    In this example, we keep the columns county, stname, ctyname, tot_pop, tot_male, and tot_female.

  4. Click Save.
  5. In the Refresh dataset table column information modal window, click Close.
  6. Switch back to the Data Model interface, click Show Data, and check that the dataset contains only the explicitly specified columns.

    In this example, the dataset now contains only the columns county, stname, ctyname, tot_pop, tot_male, and tot_female.
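
The effect of the statement used in the steps above can be checked outside the product with a minimal sketch. It assumes an in-memory SQLite database as a stand-in for the real connection. The table layout, the extra columns (sumlev, region, division), the sample row, and the helper column_names are hypothetical and exist only for illustration; this is not how Cloudera Data Visualization runs the query internally.

```python
import sqlite3

# Hypothetical stand-in for the real us_counties table. The extra columns
# (sumlev, region, division) and the sample row are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE us_counties (
        sumlev INTEGER, region INTEGER, division INTEGER,
        county INTEGER, stname TEXT, ctyname TEXT,
        tot_pop INTEGER, tot_male INTEGER, tot_female INTEGER
    )
    """
)
conn.execute(
    "INSERT INTO us_counties "
    "VALUES (50, 3, 6, 1, 'Example State', 'Example County', 1000, 480, 520)"
)


def column_names(connection, query):
    """Return the names of the columns that a query produces, in order."""
    cursor = connection.execute(query)
    return [description[0] for description in cursor.description]


# Unrestricted definition: every column of the table is exposed.
all_columns = column_names(conn, "select * from main.us_counties")

# Restricted definition: the same statement as in step 3.
kept_columns = column_names(
    conn,
    "select county, stname, ctyname, tot_pop, tot_male, tot_female "
    "from main.us_counties",
)

print("before:", all_columns)
print("after: ", kept_columns)
```

Listing the columns explicitly, rather than using select *, also means that columns added to the underlying table later would not appear in the query result unless the SQL definition is edited again.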