Data API request payload

Before invoking the Cloudera Data Visualization Data API, you must familiarize yourself with the syntax of the request payload.

Syntax of a Request Payload

The Data API has a consistent pattern for each payload, with the following basic access syntax:
{ 
        "version":version_number,
        "type":"SQL",
        "limit":number_of_rows,
        "dimensions":
          [{
             "type":"SIMPLE/SEGMENT",
             "expr":"[dimension_1] as 'dimension_value_1', 
             "expr":"[dimension_2] as 'dimension_value_2',
             "expr":"[dimension_n] as 'dimension_value_n',
             "order": {"asc": True/False, "pri": order_sequence}
          }],        
        "aggregates":
          [{
             "expr":["aggregate_1] as 'aggregate_value_1', 
             "expr":["aggregate_2] as 'aggregate_value_2',
             "expr":["aggregate_n] as 'aggregate_value_n',
          }],
        "filters":
           [
             "[filter_1] in ('filter_value_1')",
             "[filter_2] in ('filter_value_2')",
             "[filter_n] in ('filter_value_n')"
           ],
        "having":
           [
            "[aggregate_1]) > aggregate_value_1"
           ],
        "dataset_id":dataset_id_number   
}            

Parameters of a Request Payload

The request payload parameters are defined as follows:

version
Specifies the version of the API. This is used for backward compatibility in case the API changes in the future. This is a mandatory parameter. For example, version 1.
type
Specifies the type of query. This is a mandatory parameter. For example, SQL.
limit
Specifies the number of rows to return. This is a mandatory parameter.
dimensions

List of zero or more dimensions. Specifying no dimensions or aggregates returns all columns from the dataset. Each item in the list specifies a dimension that is requested, and has the following structure of key value pairs:

  • type

    Either SIMPLE or SEGMENT. Default is SIMPLE. Use field descriptions for expr and order with SIMPLE type.

  • expr

    Either a dimension or an aggregate with the following format: "expr":"[dimension_1] as 'dimension_value'. This is a mandatory parameter.

  • order

    Maximum of two key value pairs: ‘asc’, which is set to True or False. This indicates the order and priority, which is set to an integer and specifies the sequence in which the order is to be applied.

aggregates

List of zero or more aggregates. Specifying no aggregates returns all columns from the dataset. Each item in the list specifies an aggregate that is requested, and has the following structure of key value pairs:

  • expr

    Specifies one aggregate with the following format:

    [{"expr":["aggregate_1] as 'aggregate_value_1'}]
dataset_id
Specifies the ID of the dataset object to run the data query. This is a mandatory parameter.
filters
Specifies a comma-separated list of filter expressions that apply to the dataset using the ‘WHERE’ clause. All the filters specified are combined with an ‘AND’ operation. These filters are applied as the ‘WHERE clause in the generated SQL. This is an optional parameter.
having
Specifies expressions to filter the dataset using the HAVING clause. Therefore, this parameter has an aggregation comparison. This is an optional parameter.

Example of a Request Payload

Here is an example of a Data API request payload that interfaces with the Cereals dataset, which ships as a sample with most Cloudera Data Visualization installations.

{ 
        "version":1,
        "type":"SQL",
        "limit":100,
        "dimensions":
          [{
             "type":"SIMPLE",
             "expr":"[manufacturer] as 'manufacturer'", 
             "order": {"asc":False, "pri":1
          }],        
        "aggregates":
          [{
             "expr":["avg([fat_grams]) as 'avg(fat_grams)'", 
             "expr":["sum(1) as 'Record Count'"
          }],
        "filters":
           [
             ["[cold_or_hot] in ('C')"]
           ],
        "having":
           [
            ["[avg([fat_grams]) < 5"]
           ],
        "dataset_id":11