Correlation flow

Cloudera Data Visualization enables you to create Correlation Flow visuals. Correlation Flows enable quick visual analysis of co-occurrence of two or more distinct dimensions. In simple form, a Correlation Flow visual is a linear representations of a two-dimensional Correlation Heatmap matrix. Where a Correlation Heatmap represents correlation values through color intensity, the Correlation Flow plots thicker lines and taller dimension boxes to represent greater correlation.

The following steps demonstrate how to create a new Correlation Flow visual on the Customer Value Analysis dataset. This dataset is based on data previously imported into Cloudera Data Visualization from the customer-value-analysis.csv datafile.

For an overview of shelves that specify this visual, see Shelves for correlation flows.

  1. Start a new visual based on the Customer Value Analysis dataset.
    For instructions, see Creating a visual.
  2. In the VISUALS menu, find and click Correlation Flow.
    The shelves of the visual changed. They are now Dimensions, Measures, Tooltips, and Filters. Both Dimensions and Measures are mandatory.
  3. Populate the shelves from the available fields:
    1. Under Dimensions, select Vehicle Class and drag it to the Dimension shelf.
    2. Repeat it for Vehicle Size, Employment Status, Policy Type, and Policy.
    3. Under Measures, select Record Count and drag it to the Measures shelf.
  4. Click REFRESH VISUAL.
    The correlation flow visual appears.
    Figure 1. Correlation Flow Visual
    You can see that each distinct line segment between two dimensional events reports its own correlation data, as demonstrated by the Tooltips in the following chart.
    • Tooltip 1 demonstrates the correlation between Four-Door and Medsize vehicles. The dataset has 3,237 Four-Door cars, and they represent 36% of the total. Additionally, 70% of these cars are considered Medsize. Viewing this from the other direction of the correlation relationship, 50% of all Medsize cars are four-door cars.
    • Tooltip 2 demonstrates the correlation between Medsize vehicles and Unemployed owners. It shows that 1,597 people are in both categories, and this segment represents 17% of all records. 25% of Medsize cars are owned by people who are unemployed, while 69% of unemployed people own a medsize vehicle.
    • Tooltip 3 demonstrates the correlation between Employed owners and the ones who have a Corporate Auto vehicle policy. It shows that 1,238 people are in these two categories, which is 14% of the total. 22% of employed owners have a corporate auto policy, while 63% of people with a corporate auto policy are actively employed.
    Figure 2. Correlation Flow Visual
  5. Click the pencil/edit icon next to the title of the visualization to enter a name for the visual.

    In this example, the title is changed to 'Customer Value Analysis - Correlation Flow'. You can also add a brief description of the visual as a subtitle below the title of the visualization.

  6. At the top left corner of the Dashboard Designer, click SAVE.