Data API request payload
Before invoking the Cloudera Data Visualization Data API, you must familiarize yourself with the syntax of the request payload.
Syntax of a Request Payload
{
"version":version_number,
"type":"SQL",
"limit":number_of_rows,
"dimensions":
[{
"type":"SIMPLE/SEGMENT",
"expr":"[dimension_1] as 'dimension_value_1',
"expr":"[dimension_2] as 'dimension_value_2',
"expr":"[dimension_n] as 'dimension_value_n',
"order": {"asc": True/False, "pri": order_sequence}
}],
"aggregates":
[{
"expr":["aggregate_1] as 'aggregate_value_1',
"expr":["aggregate_2] as 'aggregate_value_2',
"expr":["aggregate_n] as 'aggregate_value_n',
}],
"filters":
[
"[filter_1] in ('filter_value_1')",
"[filter_2] in ('filter_value_2')",
"[filter_n] in ('filter_value_n')"
],
"having":
[
"[aggregate_1]) > aggregate_value_1"
],
"dataset_id":dataset_id_number
}
Parameters of a Request Payload
The request payload parameters are defined as follows:
- version
- Specifies the version of the API. This is used for backward compatibility in case the API changes in the future. This is a mandatory parameter. For example, version 1.
- type
- Specifies the type of query. This is a mandatory parameter. For example, SQL.
- limit
- Specifies the number of rows to return. This is a mandatory parameter.
- dimensions
List of zero or more dimensions. Specifying no dimensions or aggregates returns all columns from the dataset. Each item in the list specifies a dimension that is requested, and has the following structure of key value pairs:
-
type
Either
SIMPLE
orSEGMENT
. Default isSIMPLE
. Use field descriptions forexpr
andorder
withSIMPLE
type. -
expr
Either a dimension or an aggregate with the following format: "expr":"[dimension_1] as 'dimension_value'. This is a mandatory parameter.
-
order
Maximum of two key value pairs: ‘asc’, which is set to True or False. This indicates the order and priority, which is set to an integer and specifies the sequence in which the order is to be applied.
-
type
- aggregates
List of zero or more aggregates. Specifying no aggregates returns all columns from the dataset. Each item in the list specifies an aggregate that is requested, and has the following structure of key value pairs:
-
expr
Specifies one aggregate with the following format:
[{"expr":["aggregate_1] as 'aggregate_value_1'}]
-
expr
- dataset_id
- Specifies the ID of the dataset object to run the data query. This is a mandatory parameter.
- filters
- Specifies a comma-separated list of filter expressions that apply to the dataset using the ‘WHERE’ clause. All the filters specified are combined with an ‘AND’ operation. These filters are applied as the ‘WHERE clause in the generated SQL. This is an optional parameter.
- having
- Specifies expressions to filter the dataset using the
HAVING
clause. Therefore, this parameter has an aggregation comparison. This is an optional parameter.
Example of a Request Payload
Here is an example of a Data API request payload that interfaces with the Cereals dataset, which ships as a sample with most Cloudera Data Visualization installations.
{
"version":1,
"type":"SQL",
"limit":100,
"dimensions":
[{
"type":"SIMPLE",
"expr":"[manufacturer] as 'manufacturer'",
"order": {"asc":False, "pri":1
}],
"aggregates":
[{
"expr":["avg([fat_grams]) as 'avg(fat_grams)'",
"expr":["sum(1) as 'Record Count'"
}],
"filters":
[
["[cold_or_hot] in ('C')"]
],
"having":
[
["[avg([fat_grams]) < 5"]
],
"dataset_id":11