Known issues and limitations

The following limitations apply to Cloudera Streaming Analytics 1.4.0.

SQL Stream Builder

CSA-1023: SQL Stream jobs with large schemas fail when using MySQL
SQL Stream jobs that have large schemas fail when SQL Stream Builder is configured with a MySQL database. The following error message appears when you run into this issue:
_mysql_connector.MySQLInterfaceError: Data too long for column 'sb_job_data' at row 1
The MySQL ‘text’ data type is limited to 64 KB in length. Make sure that the schema does not exceed this limit, or use the following workaround to change the ‘text’ data type to ‘longtext’, which can store up to 4 GB.
  1. Log in as the root user to MySQL:
    mysql -u root -p
    Enter password:
  2. Use the following ALTER TABLE command to change the data type to ‘longtext’ (you can verify the change with the optional query shown after these steps):
    ALTER TABLE sb_jobs MODIFY sb_job_data LONGTEXT;
  3. Update the schemas in your SQL jobs:
    1. Open the Streaming SQL Console.
    2. Select the Table tab.
    3. Search for the Kafka table where you want to update the schema.
    4. Click Edit.
    5. Select the Schema tab.
    6. Click Detect schema.
    7. Click Save changes.
  4. Restart your SQL jobs.
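To confirm the change from step 2, you can run the following optional query. It is only a sketch that assumes you are connected to the database SSB is configured to use; it checks that the sb_job_data column now uses the longtext type:
    -- Verify that the column was altered successfully
    SELECT COLUMN_NAME, DATA_TYPE
    FROM information_schema.COLUMNS
    WHERE TABLE_NAME = 'sb_jobs' AND COLUMN_NAME = 'sb_job_data';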
CSA-1232: Big numbers are incorrectly represented on the Streaming SQL Console UI
The issue impacts the following scenarios in Streaming SQL Console:
  • When your values include integers greater than 2^53-1, the Input transformations and User Defined Functions are considered unsafe and produce incorrect results, because these numbers lose precision during parsing (see the sketch after this issue).
  • When your values include integers greater than 2^53-1, sampling to the Streaming SQL Console UI produces incorrect results, because these numbers lose precision during parsing.
Workaround: None
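The boundary value is 2^53-1 = 9007199254740991. The following statement is only a sketch with a made-up alias; depending on the Flink version, you may need to select from an existing table instead of using a FROM-less SELECT. It shows the kind of value that loses precision when it is parsed and sampled to the Console UI:
    -- 9007199254740993 = 2^53 + 1; it cannot be represented exactly after parsing
    SELECT CAST(9007199254740993 AS BIGINT) AS unsafe_value;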
CSA-1378: Spring cleanup can cause exceptions and failure in SQL Stream Builder
Due to the cleanup mechanism of the Spring Boot framework used in SSB, the /tmp folder is periodically cleared on RHEL7 and Ubuntu. The cleanup removes every artifact stored there by Spring. This can cause exceptions and job failures when using SSB.
Workaround: None
CSA-1410: Restoring SSB job from savepoint fails when using MySQL
Restarting a SQL job from a savepoint can fail when using a MySQL database due to an issue with log creation.
Workaround: None
CSA-1454: Timezone settings can cause unexpected behavior in Kafka tables
You must consider the timezone settings of your environment when using timestamps in a Kafka table, as they can affect the results of your query. When a timestamp in a query is resolved with from_unixtime, it returns results based on the timezone of the system. If the timezone is not set to UTC+0, the timestamps in the query results shift in time and are not correct, as shown in the sketch after this issue.
Workaround: Change your local timezone settings to UTC+0.
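The following sketch illustrates the behaviour; the kafka_events table and its event_ts column are hypothetical. from_unixtime renders the epoch value in the timezone of the system, so the same row produces different wall-clock timestamps on hosts that are not set to UTC+0:
    -- event_ts holds epoch seconds; the rendered timestamp depends on the system timezone
    SELECT from_unixtime(event_ts) AS event_time
    FROM kafka_events;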
CSA-1479: Incorrect Materialized View settings when loading SQL jobs
When editing an existing SQL Stream job, the primary key and recreate table settings of the Materialized View are not loaded correctly and revert to their default values.
Workaround: None
CSA-1499: Table name error for Materialized Views
When selecting data from a table, the Materialized View engine returns an error because the wrong table name is used during execution.
Workaround: None

Flink

In Cloudera Streaming Analytics, the following SQL API features are in preview:
  • Match recognize
  • Top-N (see the sketch after this list)
  • Stream-Table join (without rowtime input)
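For example, a Top-N query takes the following shape. The ShopSales table and its columns are hypothetical, and because the feature is in preview, its behaviour may change:
    -- Keep the five highest-selling products per category
    SELECT *
    FROM (
      SELECT *,
        ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS row_num
      FROM ShopSales)
    WHERE row_num <= 5;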
DataStream conversion limitations
  • Converting between Tables and POJO DataStreams is currently not supported in CSA.
  • Object arrays are not supported for Tuple conversion.
  • The java.time class conversions for Tuple DataStreams are only supported by using explicit TypeInformation: LegacyInstantTypeInfo, LocalTimeTypeInfo.getInfoFor(LocalDate/LocalDateTime/LocalTime.class).
  • Only java.sql.Timestamp is supported for rowtime conversion, java.time.LocalDateTime is not supported.
Kudu catalog limitations
  • CREATE TABLE
    • Primary keys can only be set by the kudu.primary-key-columns property. Using the PRIMARY KEY constraint is not yet possible (see the sketch after this list).
    • Range partitioning is not supported.
  • When getting a table through the catalog, NOT NULL and PRIMARY KEY constraints are ignored. All columns are described as being nullable, and not being primary keys.
  • Kudu tables cannot be altered through the catalog other than simply renaming them.
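The following CREATE TABLE statement is a sketch that shows how the primary key is set through the kudu.primary-key-columns property. The table and column names are made up, and the kudu.hash-columns property is an assumption about how the table is hash partitioned, since range partitioning is not supported:
    CREATE TABLE sensor_readings (
      sensor_id INT,
      reading DOUBLE
    ) WITH (
      'kudu.primary-key-columns' = 'sensor_id',
      'kudu.hash-columns' = 'sensor_id'
    );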
Schema Registry catalog limitations
  • Currently, the Schema Registry catalog / format only supports reading messages with the latest enabled schema for any given Kafka topic at the time when the SQL query was compiled.
  • No time-column and watermark support for Registry tables.
  • No CREATE TABLE support. Schemas have to be registered directly in the SchemaRegistry to be accessible through the catalog (see the sketch after this list).
  • The catalog is read-only. It does not support table deletions or modifications.
  • By default, it is assumed that Kafka message values contain the schema id as a prefix, because this is the default behaviour for the SchemaRegistry Kafka producer format. To consume messages with schema written in the header, the following property must be set for the Registry client: store.schema.version.id.in.header: true.
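Within these constraints, tables backed by registered schemas are consumed by querying them directly; the catalog and table names in the following sketch are hypothetical:
    -- Reads the topic with the latest enabled schema at the time the query is compiled
    USE CATALOG registry_catalog;
    SELECT * FROM transactions;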