Example of Data API usage

To use the Data API in Cloudera Data Visualization, you must enable the Data API, obtain an API key, and then invoke the Data API.

The API provides no discovery interface for the structure and format of the data. The caller must be familiar with the target dataset and know which dimensions, aggregates, and filters are appropriate for each use case.

To construct the request payload, see Data API request payload.

The following Python example uses the Data API to query the Cereals dataset, which ships as a sample with most Cloudera Data Visualization installations.

Data API Programmatic Access

This example supplies the API key that authorizes the user to access the dataset.

The host and port placeholders in the URL identify the running Cloudera Data Visualization instance, and the api_key string is the key obtained earlier, as described in Enabling Data API.
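
The following minimal sketch shows one way to keep the host, port, and API key out of the string literals. The helper names build_endpoint and build_headers, the scheme parameter, and the sample host, port, and key values are hypothetical; the /arc/api/data path and the "apikey <key>" header form are taken from the example on this page.

```python
def build_endpoint(host, port, scheme="http"):
    """Return the Data API URL for an instance (hypothetical helper)."""
    # The /arc/api/data path is the one used by the example on this page.
    return "{}://{}:{}/arc/api/data".format(scheme, host, port)


def build_headers(api_key):
    """Return the Authorization header in the 'apikey <key>' form (hypothetical helper)."""
    return {"Authorization": "apikey " + api_key}


# Placeholder values; substitute those of your own instance.
url = build_endpoint("viz.example.com", 8080)
headers = build_headers("YOUR_API_KEY")
```

With helpers like these, the url and headers values can be passed to requests.post in place of the hard-coded strings.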

Note that you must replace the sample data request, dsreq, with your custom request. The dataset ID in this example is 11; it may be different on your system. Therefore, edit the last line of the dsreq specification to use the dataset ID of your target dataset.

import requests
import json

# Replace host and port with those of your Cloudera Data Visualization instance.
url = 'http://host:port/arc/api/data'

def _fetch_data(dsreq):
  # Replace api_key with the API key obtained for your user.
  headers = {
    'Authorization': 'apikey api_key'
  }

  params = {
    'version': 1,
    'dsreq': dsreq,
  }

  # Send the data request as form-encoded POST parameters.
  r = requests.post(url, headers=headers, data=params)
  if r.status_code != 200:
    print('Error', r.status_code, r.content)
    return

  raw = r.content
  d = json.loads(raw)
  print('\nData Request being sent is:\n',
        json.dumps(json.loads(dsreq), indent=2))
  print('\nData Response returned is:\n', json.dumps(d, indent=2))


def main():
  # CHANGE the following dsreq to a data request of your choice.
  dsreq = """{"version":1,"type":"SQL","limit":100,
"dimensions":[{"type":"SIMPLE","expr":"[manufacturer] as 'manufacturer'"}],
"aggregates":[{"expr":"sum([sodium_mg]) as 'sum(sodium_mg)'"},
{"expr":"avg([fat_grams]) as 'avg(fat_grams)'"},
{"expr":"sum(1) as 'Record Count'"}],
"filters":["[cold_or_hot] in ('C')"],
"dataset_id":11}"""

  _fetch_data(dsreq)

if __name__ == '__main__':
  main()
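
The dsreq string in the example is hand-written JSON, which is easy to break when you change the dataset ID or the expressions. As a minimal sketch, the same request can be assembled from Python values and serialized with json.dumps. The build_dsreq helper and its parameter names are hypothetical; the field names and values mirror the sample request above, and other payload options are not covered here.

```python
import json


def build_dsreq(dataset_id, dimensions, aggregates, filters=None, limit=100):
    """Serialize a data request shaped like the sample dsreq (hypothetical helper)."""
    request = {
        "version": 1,
        "type": "SQL",
        "limit": limit,
        # The sample marks every dimension as SIMPLE; this sketch does the same.
        "dimensions": [{"type": "SIMPLE", "expr": expr} for expr in dimensions],
        "aggregates": [{"expr": expr} for expr in aggregates],
        "filters": list(filters or []),
        "dataset_id": dataset_id,
    }
    return json.dumps(request)


# Rebuild the Cereals request from the example; 11 is the sample dataset ID.
dsreq = build_dsreq(
    dataset_id=11,
    dimensions=["[manufacturer] as 'manufacturer'"],
    aggregates=[
        "sum([sodium_mg]) as 'sum(sodium_mg)'",
        "avg([fat_grams]) as 'avg(fat_grams)'",
        "sum(1) as 'Record Count'",
    ],
    filters=["[cold_or_hot] in ('C')"],
)
```

The resulting string can be passed to _fetch_data in place of the literal, so that changing the dataset ID becomes a single argument change.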