Updating SQL queries with PROCTIME function
After upgrading to Cloudera Streaming Analytics (CSA) 1.12.0, you need to migrate your SQL queries using PROCTIME() function in lookup joins, because the PROCTIME() function is no longer avaialable in CSA.
As the
PROCTIME()
shortcut function was removed from CSA, the SQL syntax
changes for lookup joins. For example, let's say you have a streaming_table
for continuously incoming data (for example, a Kafka source table), and a
kudu_table
to enrich that streaming data, and you want to join them on an
id
column. With PROCTIME()
, you would use the following
syntax:SELECT * FROM `streaming_table` AS s
JOIN `kudu_table` FOR SYSTEM_TIME AS OF PROCTIME() AS k
ON s.id = k.id;
But as PROCTIME()
is no longer available, the
syntax should look like the following
example:SELECT * FROM `streaming_table` AS s
JOIN `kudu_table` FOR SYSTEM_TIME AS OF s.proc_time AS k
ON s.id = k.id;
The difference is that the left side table is required to have
a calculated column that represents the processing time attribute. The name of the column can
be anything, but it is required to use that column for the FOR SYSTEM_TIME AS OF
...
expression instead of the raw PROCTIME()
function.You can update the existing SQL queries that use
PROCTIME()
to avoid SQL
jobs to fail using one of the following methods:- Adding the computed
PROCTIME()
column to the existing table:ALTER TABLE `streaming_table` ADD `proc_time` AS PROCTIME();
- Creating a new table through copying the original
DDL:
You can also create the new table using theCREATE TABLE `streaming_table_with_proc_time` ( -- Add proc_time column. `proc_time` AS PROCTIME() ) WITH ( -- No changes here. ) LIKE `streaming_table`;
LIKE
clause:CREATE TABLE `streaming_table_with_proc_time` ( ... -- The existing table schema. `proc_time` AS PROCTIME() ) WITH ( ... -- The existing table properties. ) LIKE `streaming_table`;