Supported basic data types

The following basic data types are supported with the Avro format.

Avro schema type Flink data type
string STRING
boolean BOOLEAN
bytes BYTES
int INT
long BIGINT
float FLOAT
double DOUBLE

To specify a field with additional properties, such as the decimal or array fields in the example, the type field must be a nested object which has a type field itself, as well as the needed properties.

An example of a property that must be set this way is a field’s logical type. Some types cannot be directly represented by an Avro data type, so they use one of the supported types as an underlying representation. The logical type attribute tells how it should be interpreted. For example, the decimal type – described below – is stored as bytes, while its logical type is decimal.

Decimal type
  • Only 38 precision and 18 scale are supported
  • Flink data type is DECIMAL(38,18)
  • To specify in the Avro schema: {"name": "decimal_field", "type": {"type": "bytes", "logicalType": "decimal", "precision": 38, "scale": 18}}
Array type
Avro allows arrays of supported basic types, except:
  • String
  • Decimal
  • Constructed types (nesting of arrays, rows, maps)
For example, defining an Array of long values:
  • In the table definition: arr_field ARRAY<BIGINT>
  • Avro schema:
    {“name”:”arr_field, "type": {"type":"array",
        "items":"long"}}}
Row type
Flink rows can be specified as records in the Avro schema. Fields must be named both in the SQL of the table definition, as well as in the Avro schema string.
  • Field names must match between the table declaration and the Avro schema’s record description.
  • The two name fields in the Avro schema have the following structure:
    • one on the outside is the name of the field
    • one inside is the type object, pertaining to the record definition
  • Decimal fields are not supported within rows.
  • Rows can be nested, Arrays are also allowed as fields of the Row.
Example table definition:
CREATE TABLE source(row_field ROW<f1 INT,f2 STRING,f3 BOOLEAN>) WITH (...)
Corresponding Avro schema:
'format.avro-schema' =
    	'{
      "type": "record",
      "name": "test",
      "fields" : [
        {"name": "row_field", "type": {
            "type":"record",
            "name":"row_field",
            "fields": [
            {"name":"f1", "type":"int"},
            {"name":"f2", "type":"string"},
            {"name":"f3", "type":"boolean"}]
  }}]
}'
Map type
Only Maps with String keys are supported. The value field can be any type of the supported ones, except decimal.
  • In the table definition: map_field MAP<STRING,BIGINT>
  • In the Avro schema:
    {"name": "map_field", "type": {"type":"map",
        "values":"long"}}
Nullability
To set a field nullable in the Avro schema, create a union of the field’s type and null. A nullable integer field would be defined as: {"name": "int_field", "type": ["int", "null"]}