Example of Data API usage
To use the Data API in Cloudera Data Visualization, you must enable the Data API, obtain an API key, and then invoke the API.
The API provides no discovery interfaces for inspecting the structure and format of the data. The invoker must already be familiar with the target dataset and know which dimensions, aggregates, and filters are appropriate for each use case.
To construct the request payload, see Data API request payload.
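As a minimal sketch, the payload can also be assembled from Python data structures and serialized with json.dumps, rather than written by hand as a string. The build_dsreq helper and its parameters are hypothetical names introduced here for illustration; the field names are only those that appear in the sample request on this page, so treat Data API request payload as the authoritative description of the format.

```python
import json


def build_dsreq(dataset_id, dimensions, aggregates, filters=(), limit=100):
    """Assemble a data request string (hypothetical helper, not part of the product).

    Only the fields seen in the sample request on this page are emitted.
    """
    request = {
        "version": 1,
        "type": "SQL",
        "limit": limit,
        "dimensions": [{"type": "SIMPLE", "expr": expr} for expr in dimensions],
        "aggregates": [{"expr": expr} for expr in aggregates],
        "filters": list(filters),
        "dataset_id": dataset_id,
    }
    return json.dumps(request)


# Rebuild part of the sample Cereals request; 11 is the sample dataset ID.
dsreq = build_dsreq(
    dataset_id=11,
    dimensions=["[manufacturer] as 'manufacturer'"],
    aggregates=[
        "sum([sodium_mg]) as 'sum(sodium_mg)'",
        "sum(1) as 'Record Count'",
    ],
    filters=["[cold_or_hot] in ('C')"],
)
print(dsreq)
```

Building the request from a dictionary avoids quoting mistakes in hand-written JSON, and the dataset ID becomes an ordinary argument.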
The following Python example uses the Data API to query the Cereals dataset, which ships as a sample with most Cloudera Data Visualization installations.
Data API Programmatic Access
This example supplies the API key that authorizes the user to access the dataset. The host and port values in url identify the running Cloudera Data Visualization instance, and the api_key string is the key obtained earlier, as described in Enabling Data API.
Note that you must replace the sample data request, dsreq, with your own request. The dataset ID in this example is 11; it may be different on your system, so edit the last line of the dsreq specification to use the dataset ID of your target dataset.
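The substitutions described above can be sketched as two small helpers. The names build_endpoint and with_dataset_id, and the host, port, key, and dataset ID values, are hypothetical and chosen only for illustration; the URL path and the Authorization header format are taken from the example on this page.

```python
import json


def build_endpoint(host, port, api_key):
    """Return the Data API URL and request headers for one instance."""
    url = "http://{}:{}/arc/api/data".format(host, port)
    headers = {"Authorization": "apikey {}".format(api_key)}
    return url, headers


def with_dataset_id(dsreq, dataset_id):
    """Return a copy of the request string with dataset_id replaced."""
    request = json.loads(dsreq)
    request["dataset_id"] = dataset_id
    return json.dumps(request)


# Illustrative values only; substitute those of your own installation.
url, headers = build_endpoint("viz.example.com", 8080, "my-secret-key")
updated = with_dataset_id('{"version":1,"type":"SQL","dataset_id":11}', 42)
print(url)
print(updated)
```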
import json

import requests

# Replace host and port with the address of your running
# Cloudera Data Visualization instance.
url = 'http://host:port/arc/api/data'


def _fetch_data(dsreq):
    # Replace api_key with the key obtained in Enabling Data API.
    headers = {
        'Authorization': 'apikey api_key'
    }
    params = {
        'version': 1,
        'dsreq': dsreq,
    }
    # The request is sent as a form-encoded POST.
    r = requests.post(url, headers=headers, data=params)
    if r.status_code != 200:
        print('Error', r.status_code, r.content)
        return
    raw = r.content
    d = json.loads(raw)
    print('\nData Request being sent is:\n',
          json.dumps(json.loads(dsreq), indent=2))
    print('\nData Response returned is:\n', json.dumps(d, indent=2))


def main():
    # CHANGE the following dsreq to a data request of your choice.
    dsreq = """{"version":1,"type":"SQL","limit":100,
    "dimensions":[{"type":"SIMPLE","expr":"[manufacturer] as 'manufacturer'"}],
    "aggregates":[{"expr":"sum([sodium_mg]) as 'sum(sodium_mg)'"},
    {"expr":"avg([fat_grams]) as 'avg(fat_grams)'"},
    {"expr":"sum(1) as 'Record Count'"}],
    "filters":["[cold_or_hot] in ('C')"],
    "dataset_id":11}"""
    _fetch_data(dsreq)


if __name__ == '__main__':
    main()
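Where the requests package is not available, the same form-encoded POST can be prepared with the Python standard library. This is a sketch under that assumption, not a documented alternative: prepare_request is a hypothetical helper, and the host, port, and key values are placeholders. The block only builds and inspects the request; it does not contact a server.

```python
import json
from urllib.parse import parse_qs, urlencode
from urllib.request import Request


def prepare_request(url, api_key, dsreq):
    """Build (but do not send) a form-encoded POST carrying the data request."""
    body = urlencode({"version": 1, "dsreq": dsreq}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Authorization": "apikey " + api_key},
        method="POST",
    )


# Illustrative values only; substitute those of your own installation.
req = prepare_request(
    "http://viz.example.com:8080/arc/api/data",
    "my-secret-key",
    '{"version":1,"type":"SQL","dataset_id":11}',
)

# Decode the body again to confirm which form fields would be sent.
fields = parse_qs(req.data.decode("utf-8"))
print(req.get_method(), sorted(fields))
```

To send the prepared request, pass it to urllib.request.urlopen and decode the response body with json.loads, as the example above does with r.content.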