Cloudera Data Visualization enables you to create Correlation Heatmap visuals.
Correlation Heatmaps use colored cells, typically in a monochromatic scale, to show a 2D
correlation matrix (table) between two discrete dimensions or event types. The values of the
first dimensions appear as rows of the table, while the values of the second dimension are
represented by the columns of the table. The color value of the cells is proportional to the
number of measurements that match the dimensional values. This enables you to quickly identify
incidence patterns, and to recognize anomalies.
Correlation Heatmap visuals are similar to Chords because they both compare exactly two
dimensions. Correlation heatmaps are ideal for comparing the measurement for each pair of
dimension values.
The following steps demonstrate how to create a new correlation visual on the SFPD
Incidents dataset, which is based on data previously imported into Cloudera Data Visualization from the sfpd_incidents.csv datafile
[data source default.sfpd_incidents].
For an overview of shelves that specify this visual, see Shelves for correlation
heatmaps.
Start a new visual based on the SFPD Incidents dataset.
For instructions, see Creating a visual.
In the VISUALS menu, find and click Correlation
Heatmap.
The shelves of the visual changed. They are now Dimensions,
Measures, Tooltips, X
Trellis, Y Trellis, and
Filters. Both Dimensions and
Measures are mandatory.
To show specific items, populate the shelves from the available fields
(Dimensions and Measures) in the
DATA menu.
Under Dimensions, select pddistrict and
drag it to the Dimensions shelf.
Under Dimensions, select descript and drag
it to the Dimensions shelf.
Under Measures, select Record Count and
drag it to the Measures shelf.
Record Count is defined as a sum of events. If
you hover over it with your mouse, you can see a black detail bubble with
sum(1) contents.
Click REFRESH VISUAL.
The default correlation heatmap visual appears.You can see that this dataset has a very large number of possible values that
represent the columns of the table. If you scroll to the right, you will see some cells
rendered in dark shades of green.
To examine a shorter list of categories, you can add some filtering to the
visual.
Under Dimensions, select datetime and drag
it to the Filters shelf .
Repeat it with category and descript.
On the Filters shelf, click the (down
arrow) on the descript field, and then click
Pick values from a list.
Select a number of values.
In this example, 7 distinct options were picked.
Click REFRESH VISUAL.
You can see that this smaller matrix also shows the entire range of color values.
Click the pencil/edit icon next to the title of the
visualization to enter a name for the visual.
In this example, the title is changed to 'SFPD Incidents - Correlation Heat Map'. You
can also add a brief description of the visual as a subtitle below the title of the
visualization.
At the top left corner of the Dashboard Designer, click
SAVE.