In the Superset UI, you connect to Druid data by filling out a dialog with the
fully qualified domain names (FQDNs) of the nodes that run Druid components. You then
specify a slice of data to visualize and query Druid. The visualization appears in the
Superset UI.
This task introduces you to the Superset Web UI, which appears after you sign in.
From the UI, you can navigate to the Apache documentation to obtain information not
covered in this documentation, such as defining a list of users who can access views,
opening Superset functions to certain groups of users, setting up permissions, and
viewing user statistics. For more information about authentication in Superset, see the
Flask documentation (link below).
You are running the Druid and Superset services in Ambari.
You ingested data, such as the Wikipedia data from the Wikiticker example, into
Druid.
The data consists of records of edits to Wikipedia pages.
In Ambari, in Services > Superset > Summary > Quick Links, click
Superset.
In Superset Sign In, enter the Superset Admin name admin and
enter the Superset Admin password that you set up.
The Superset Web UI appears.
Select Sources > Druid Clusters.
Select Sources > Refresh Druid Metadata.
In List Druid Data Source, the wikipedia data source appears.
Click the data source wikipedia.
The Data Source & Chart Type pane appears on the left. The canvas for
query results appears on the right.
At the top of the canvas, the UI includes controls for viewing the query in
JSON and downloading the query in JSON or CSV format.
In Data Source & Chart Type, build a query that slices your Wikipedia data.
For example, get the top 10 most-edited articles between September 12 and 13,
2015 by setting the following values.
Option: Description
Visualization Type: Distribution - Bar Chart
Time Granularity: All
Time - Since: 9/12/2015
Time - Until: 9/13/2015
Query - Metrics: COUNT(*)
Query - Series: page
Query - Row limit: 10
In Since and Until, click Free form and enter a date in the format shown
above.
Click Run Query.
A bar chart appears showing the top 10 articles for the time frame you specified.
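For readers who want to see what such a slice looks like as a query, the settings above can be approximated as a Druid native topN query. The following Python snippet is a minimal sketch, not the exact JSON that Superset generates. It assumes that COUNT(*) maps to a Druid count aggregator named count and that the dates typed into Since and Until are interpreted as midnight boundaries. The helper names to_iso_day and build_top_pages_query are hypothetical.

```python
import json
from datetime import datetime


def to_iso_day(us_date: str) -> str:
    """Convert an M/D/YYYY date, as typed into Since/Until, to an ISO 8601 timestamp."""
    return datetime.strptime(us_date, "%m/%d/%Y").strftime("%Y-%m-%dT%H:%M:%S")


def build_top_pages_query(since: str, until: str, limit: int = 10) -> dict:
    """Build a Druid native topN query for the most-edited pages in a time range."""
    return {
        "queryType": "topN",
        "dataSource": "wikipedia",
        # Druid intervals are ISO 8601 "start/end" strings.
        "intervals": [f"{to_iso_day(since)}/{to_iso_day(until)}"],
        # Time Granularity: All puts the whole interval into one bucket.
        "granularity": "all",
        # Query - Series
        "dimension": "page",
        # Query - Metrics: assumed to correspond to a plain count aggregator.
        "aggregations": [{"type": "count", "name": "count"}],
        "metric": "count",
        # Query - Row limit
        "threshold": limit,
    }


query = build_top_pages_query("9/12/2015", "9/13/2015")
print(json.dumps(query, indent=2))
```

You can compare this sketch with the JSON that the query viewer at the top of the canvas shows. A native query of this shape is normally sent as an HTTP POST to the /druid/v2 endpoint of a Druid Broker.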
On the canvas, change the default title of the visualization,
undefined - untitled, to a descriptive title such as Most Edits by Page
Name.
Click Save a Slice, specify a
file name, and click OK.
In Data Source & Chart Type, create a table view that aggregates edits per
channel by changing the following values, run the query, and save the
slice:
Option: Description
Visualization Type: Table View
Time Granularity: 1 hour
Time - Since: 9/12/2015
Time - Until: 9/13/2015
Group by: channel
Metrics: SUM(added)
Sort By: SUM(added)
The resulting table shows the aggregated edits per channel.
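The per-channel aggregation can likewise be approximated as a Druid native groupBy query. The following Python snippet is a sketch under stated assumptions, not the query that Superset actually issues. It assumes that SUM(added) corresponds to a longSum aggregator over the added column, that sorting is descending, and that a row limit applies. The names build_channel_query and sum_added and the limit of 1000 are hypothetical placeholders.

```python
import json


def build_channel_query(interval: str, row_limit: int = 1000) -> dict:
    """Build a Druid native groupBy query that sums 'added' per channel and hour."""
    return {
        "queryType": "groupBy",
        "dataSource": "wikipedia",
        "intervals": [interval],
        # Time Granularity: 1 hour
        "granularity": "hour",
        # Group by
        "dimensions": ["channel"],
        # Metrics: SUM(added), assumed to be a longSum over the 'added' column.
        "aggregations": [
            {"type": "longSum", "name": "sum_added", "fieldName": "added"}
        ],
        # Sort By: SUM(added). Descending order and the limit are assumptions.
        "limitSpec": {
            "type": "default",
            "limit": row_limit,
            "columns": [{"dimension": "sum_added", "direction": "descending"}],
        },
    }


channel_query = build_channel_query("2015-09-12T00:00:00/2015-09-13T00:00:00")
print(json.dumps(channel_query, indent=2))
```

Because the granularity is one hour, Druid returns one result row per channel for each hour in the interval rather than a single total per channel.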