Known issues and limitations

Learn about the known issues in Flink and SQL Stream Builder, the impact or changes to the functionality, and the workaround in Cloudera Streaming Analytics 1.6.0.

SQL Stream Builder

FLINK-18027: ROW value constructor cannot deal with complex expressions
When querying data from a table or a view with a ROW() function, an exception is thrown due to a Calcite parsing issue. For example, the following query returns an error:
CREATE VIEW example AS SELECT col1, ROW(col2) FROM table;
SELECT * FROM example;
Add a second SELECT layer to the SQL query as shown in the following example:
CREATE VIEW example AS SELECT col1, ROW(col2) FROM (SELECT col1, col2 FROM table);
SELECT * FROM example;
Cannot access API Explorer
The API Explorer page of the SSB REST API cannot be accessed when Apache Knox is used as the authentication method. This issue is not present when using SPNEGO authentication.
None
Uploading connector files fails
When trying to upload a new connector JAR larger than 1 MB, the upload process fails with an error.
Set the server.tomcat.max-swallow-size in Cloudera Manager using the following steps:
  1. Open your cluster in Cloudera Manager.
  2. Select SQL Stream Builder from the list of services.
  3. Select Configuration.
  4. Search for Streaming SQL Engine Advanced Configuration Snippet (Safety Valve) for ssb-conf/application.properties in the search bar.
  5. Add server.tomcat.max-swallow-size=2000MB to the Safety Valve.
  6. Click Save.
  7. Restart the SQL Stream Builder service.
CSA-2559: Materialized View settings can be overwritten while running job
Materialized View settings are overwritten when a new job is submitted with the same name as a running job.
None
CSA-2551: Dynamic filters are not working with greater value than a character
The dynamic filtering feature cannot be used for Materialized Views when the provided parameter value is larger than a single character.
CSA-2547: Vulnerability issue for user impersonation
With SPNEGO authentication, adding the doAs=other_user parameter makes it possible to impersonate other users, because the call is proxied to the Streaming SQL Engine as the ssb principal.
CSA-2538: Error when saving Materialized View configuration
Due to a data type mismatch for retention_interval_ms between the console and admin databases, Materialized View configurations with retention times greater than 2147483 seconds (which exceeds the 32-bit integer range when expressed in milliseconds) cannot be saved.
CSA-2529: Cannot set consumer groups for Kafka tables
Queries fail when a consumer group is added in the Kafka table settings.
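For reference, in plain Flink DDL the consumer group maps to the properties.group.id option of the Kafka connector; the table, topic, broker, and group names in the following sketch are placeholders, not values tied to this issue:
-- Hypothetical Kafka table; the consumer group is passed through
-- 'properties.group.id', which corresponds to the failing setting.
CREATE TABLE kafka_source (
  `id` INT,
  `payload` STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'example_topic',
  'properties.bootstrap.servers' = 'broker-host:9092',
  'properties.group.id' = 'example_consumer_group',
  'format' = 'json',
  'scan.startup.mode' = 'earliest-offset'
);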
CSA-2528: Improvement for Materialized View table names
The automatically created names of Materialized View tables are not expressive enough to easily work with.
Db2 CDC connector is not available from Connectors and Templates
The Db2 Change Data Capture (CDC) connector is not yet available on the Streaming SQL Console under Templates and Connectors. This does not limit the use of the Db2 connector.
You can use the following Db2 CDC example as a reference to create a table:
CREATE TABLE db2_cdc_source (
  `column_name1` INT,
  `column_name2` STRING
) WITH (
  'connector' = 'db2-cdc',
  'hostname' = '...',
  'port' = '...',
  'username' = '...',
  'password' = '...',
  'database-name' = '...',
  'schema-name' = '...',
  'table-name' = '...'
);
CSA-2016: Deleting table from other teams
There is a limitation when using the Streaming SQL Console for deleting tables. You cannot use the Delete button on the User Interface to delete a table that belongs to another team.
Use the DROP TABLE statement from the SQL window.
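A minimal sketch, using a hypothetical table name:
-- Drop the other team's table with DDL instead of the Delete button.
DROP TABLE `other_team_table`;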
CSA-1673: SSB operations are not showing in Atlas
Due to a communication issue, SQL Stream Builder (SSB) operations are not showing in Atlas.
None
CSA-1454: Timezone settings can cause unexpected behavior in Kafka tables
You must consider the timezone settings of your environment when using timestamps in a Kafka table, as they can affect the results of your query. When a timestamp in a query is converted with from_unixtime, the result is based on the timezone of the system. If the timezone is not set to UTC+0, the timestamps in the query results shift in time and are not correct.
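For example, the following sketch (with a hypothetical table and column) returns shifted timestamps on any system whose timezone is not UTC+0, because from_unixtime interprets the epoch value in the local timezone:
-- FROM_UNIXTIME uses the system timezone, so the same epoch value
-- produces different results on machines not set to UTC+0.
SELECT
  event_epoch,
  FROM_UNIXTIME(event_epoch) AS event_time_local
FROM kafka_events;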
Change your local timezone settings to UTC+0.
CSA-1231: Big numbers are incorrectly represented on the Streaming SQL Console UI
The issue impacts the following scenarios in Streaming SQL Console:
  • When your values include integers larger than 2⁵³-1, Input transformations and User Defined Functions are considered unsafe and produce incorrect results, as these numbers lose precision during parsing.
  • When your values include integers larger than 2⁵³-1, sampling to the Streaming SQL Console UI produces incorrect results, as these numbers lose precision during parsing.
None

Flink

FLINK-18027: ROW value constructor cannot deal with complex expressions
When querying data from a table or a view with a ROW() function, an exception is thrown due to a Calcite parsing issue. For example, the following query returns an error:
CREATE VIEW example AS SELECT col1, ROW(col2) FROM table;
SELECT * FROM example;
Add a second SELECT layer to the SQL query as shown in the following example:
CREATE VIEW example AS SELECT col1, ROW(col2) FROM (SELECT col1, col2 FROM table);
SELECT * FROM example;
In Cloudera Streaming Analytics, the following SQL API features are in preview:
  • Match recognize
  • Top-N (see the example after this list)
  • Stream-Table join (without rowtime input)
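As an illustration of the Top-N preview feature, a minimal sketch using the standard ROW_NUMBER() pattern; the table and column names are hypothetical:
-- Top 3 products per category by sales, using the Top-N pattern.
SELECT product_id, category, sales
FROM (
  SELECT
    product_id,
    category,
    sales,
    ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS row_num
  FROM product_sales
)
WHERE row_num <= 3;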
DataStream conversion limitations
  • Converting between Tables and POJO DataStreams is currently not supported in CSA.
  • Object arrays are not supported for Tuple conversion.
  • The java.time class conversions for Tuple DataStreams are only supported by using explicit TypeInformation: LegacyInstantTypeInfo, LocalTimeTypeInfo.getInfoFor(LocalDate/LocalDateTime/LocalTime.class).
  • Only java.sql.Timestamp is supported for rowtime conversion, java.time.LocalDateTime is not supported.
Kudu catalog limitations
  • CREATE TABLE
    • Primary keys can only be set by the kudu.primary-key-columns property. Using the PRIMARY KEY constraint is not yet possible (see the sketch after this list).
    • Range partitioning is not supported.
  • When getting a table through the catalog, NOT NULL and PRIMARY KEY constraints are ignored. All columns are described as being nullable, and not being primary keys.
  • Kudu tables cannot be altered through the catalog other than simply renaming them.
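As noted above, a table created through the Kudu catalog sets its key through the kudu.primary-key-columns property rather than a PRIMARY KEY constraint. The sketch below uses hypothetical table and column names and assumes hash partitioning via the kudu.hash-columns property, since range partitioning is not supported:
-- Primary key declared via a table property, not a PRIMARY KEY constraint.
CREATE TABLE users (
  id INT,
  name STRING
) WITH (
  'kudu.primary-key-columns' = 'id',
  'kudu.hash-columns' = 'id',
  'kudu.replicas' = '1'
);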
Schema Registry catalog limitations
  • Currently, the Schema Registry catalog and format only support reading messages with the latest enabled schema for any given Kafka topic, as of the time the SQL query is compiled.
  • No time-column and watermark support for Registry tables.
  • No CREATE TABLE support. Schemas have to be registered directly in the SchemaRegistry to be accessible through the catalog.
  • The catalog is read-only. It does not support table deletions or modifications.
  • By default, it is assumed that Kafka message values contain the schema id as a prefix, because this is the default behaviour for the SchemaRegistry Kafka producer format. To consume messages with schema written in the header, the following property must be set for the Registry client: store.schema.version.id.in.header: true.