Known issues and limitations

Learn about the known issues in Flink and SQL Stream Builder, their impact or changes to the functionality, and the possible workarounds in Cloudera Streaming Analytics 1.11.0.

SQL Stream Builder

CSA-5138 - SQL job submissions with UDF JARs fail when checkpointing is enabled
Due to the handling of ClassLoaders for custom JARs, uploading any Java UDF with checkpointing enabled causes the SQL job to fail with the following error:
ERROR com.cloudera.ssb.sqlio.service.SqlExecutorService: Error while submitting streaming job
org.apache.flink.util.FlinkRuntimeException: org.apache.flink.api.common.InvalidProgramException: Table program cannot be compiled.
Once the SQL job fails, the session on Streaming SQL Console must be reset before resubmitting the job without checkpointing.
None
CSA-4960 - Invalid job schemas for existing SSB jobs
After successfully upgrading to CSA 1.11.0 from CSA 1.8.0 or lower versions, migrating the existing jobs produces invalid job schemas in the admin database.
The mv_config object is stored in the mv_config column of the SSB jobs table. You need to manually update the jobs table to resolve the issue (example statements follow the mv_config samples below):
  • If an existing mv_config includes an unknown create field, the field must be deleted.
  • If an existing mv_config includes the deprecated minRowRetentionCount key, the key should be changed to min_row_retention_count.
The following mv_config objects show a valid and an invalid example:
  • Valid mv_config object:
    { "name": "quizzical_benz", "retention": 300, "min_row_retention_count": 0, "recreate": false, "key_column_name": "", "api_key": null, "ignore_nulls": false, "require_restart": false, "enabled": false }
  • Invalid mv_config object:
    { "create": false, "name": "quizzical_benz", "retention": 300, "minRowRetentionCount": 0, "recreate": false, "key_column_name": "", "api_key": null, "ignore_nulls": false, "require_restart": false, "enabled": false }
ENGESC-23078 - Job not found after successful job creation
After successfully creating a job in SSB, the SQL job is not found because certain tables contain empty values. This issue is indicated by the following error message in the log files:
java.lang.IllegalArgumentException: argument "content" is null
The issue only applies when upgrading from a CSA version lower than 1.9.0.
Update the empty values to the 'null' string in the mv_config and checkpoint_config columns as shown in the following example:
UPDATE jobs SET mv_config = 'null' WHERE mv_config IS NULL;
UPDATE jobs SET checkpoint_config = 'null' WHERE checkpoint_config IS NULL;
CSA-4938 - Activating an environment makes SSB unable to start
Restarting SSB with an active environment file causes a NullPointerException and SSB fails to start.
Use the following statement to update the environments table in the admin database:
UPDATE environments SET properties = '{}' WHERE properties IS NULL;
CSA-4643 - flink-yarn-session is ignoring command line parameters
When adding parameters to the Flink session using flink-yarn-session -d on the command line, the parameters are not applied to the session.
None
CSA-4858 - Kerberos encryption type detection does not always work correctly for SSB
SSB detects no supported encryption types even though there is a list of allowed encryption types in the krb5.conf file. This causes an error when generating keytabs from the principal and password pair.
  1. Run ktutil on your cluster.
  2. Create a new keytab with the following commands:
    addent -password -p <username> -k 1 -e aes256-cts
    wkt /tmp/new_keytab.keytab
  3. Upload the new keytab on Streaming SQL Console.
CSA-4861 - Error with Flink JSON row serializer init in SSB

Because the open() function of the Flink RowRowConverter is called in SSB, the transient fields of the RowRowConverter are not transferred to the Flink workers that perform the row serialization. This causes a NullPointerException (NPE). The error only occurs in case of composite fields (for example, ARRAY).
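For illustration, a table with a composite column like the following hypothetical example (defined with the Flink datagen connector) produces rows of the kind that trigger the NullPointerException:
CREATE TABLE sample_events (
  id INT,
  tags ARRAY<STRING>  -- composite field that triggers the issue during row serialization
) WITH (
  'connector' = 'datagen'
);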

None
Auto discovery is not supported for Apache Knox
You need to manually configure Knox with SQL Stream Builder to enable Knox authentication.
Complete the configuration based on the CDP Private Cloud Base version you use. For more information, see the Enabling Knox authentication for SSB documentation.
CSA-5006 - SSB service fails when using Active Directory (AD) Kerberos authentication
If you use AD Kerberos for authentication and the Load Balancer URL is not provided, it can cause the SQL Stream Builder (SSB) service to fail. The issue is caused by the keytab generation: when the keytab is generated by Cloudera Manager, it requires the principals from the AD for the Load Balancer host, and if no host is specified for the Load Balancer, the SSB service cannot be started by Cloudera Manager. This issue persists even when the Load Balancer role is not deployed or used with SSB.
Fill out the Load Balancer URL parameter in Cloudera Manager regardless of whether you use a Load Balancer with SSB. For more information, see the Enabling High Availability for SSB documentation.

Flink

FLINK-20539 - Type mismatch when using ROW() in computed column
Using the ROW() function in SQL statements fails due to a mismatch between the Calcite and Flink SQL data types.
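The failing pattern is a computed column defined with the ROW() function, as in the following hypothetical example:
CREATE TABLE events (
  user_id INT,
  item_id INT,
  pair AS ROW(user_id, item_id)  -- computed column using ROW() fails to compile
) WITH (
  'connector' = 'datagen'
);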
None

Limitations

In Cloudera Streaming Analytics, the following SQL API features are in preview:
  • Match recognize
  • Top-N
  • Stream-Table join (without rowtime input)
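As an illustration, the Top-N pattern uses ROW_NUMBER() over a partition; the table and column names in the following sketch are hypothetical:
SELECT category, product, sales
FROM (
  SELECT *,
    ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS row_num
  FROM shop_sales
) WHERE row_num <= 3;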
DataStream conversion limitations
  • Converting between Tables and POJO DataStreams is currently not supported in CSA.
  • Object arrays are not supported for Tuple conversion.
  • The java.time class conversions for Tuple DataStreams are only supported by using explicit TypeInformation: LegacyInstantTypeInfo, LocalTimeTypeInfo.getInfoFor(LocalDate/LocalDateTime/LocalTime.class).
  • Only java.sql.Timestamp is supported for rowtime conversion, java.time.LocalDateTime is not supported.
Kudu catalog limitations
  • CREATE TABLE
    • Primary keys can only be set by the kudu.primary-key-columns property. Using the PRIMARY KEY constraint is not yet possible (see the example after this list).
    • Range partitioning is not supported.
  • When getting a table through the catalog, NOT NULL and PRIMARY KEY constraints are ignored. All columns are described as being nullable, and not being primary keys.
  • Kudu tables cannot be altered through the catalog other than simply renaming them.
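A minimal sketch of creating a table through the Kudu catalog; the table and column names are hypothetical, and the kudu.hash-columns property is assumed here to define hash partitioning, since range partitioning is not supported:
CREATE TABLE users (
  id INT,
  name STRING
) WITH (
  'kudu.hash-columns' = 'id',
  'kudu.primary-key-columns' = 'id'  -- primary key set by property instead of a PRIMARY KEY constraint
);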
Schema Registry catalog limitations
  • Currently, the Schema Registry catalog / format only supports reading messages with the latest enabled schema for any given Kafka topic at the time when the SQL query was compiled.
  • No time-column and watermark support for Registry tables.
  • No CREATE TABLE support. Schemas have to be registered directly in the SchemaRegistry to be accessible through the catalog.
  • The catalog is read-only. It does not support table deletions or modifications.
  • By default, it is assumed that Kafka message values contain the schema ID as a prefix, because this is the default behaviour for the SchemaRegistry Kafka producer format. To consume messages with the schema ID written in the header, the following property must be set for the Registry client: store.schema.version.id.in.header: true.