Unified Analytics overview
You can take advantage of SQL engine enhancements in the Cloudera Data Platform (CDP) by using Unified Analytics. Unified Analytics includes semantics commonality, backward compatibility, and optimizations.
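Unified Analytics is available on the following platforms:
- Red Hat OpenShift
- Embedded Container Service (ECS)
Unified Analytics provides the following SQL enhancements: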
- Automatic query rewrites to use materialized views
- Command-line materialized view recommender
- DataSketches functions and rewrites
- Ranger column masking and row filtering
- Query results cache
- SQL set operations and grouping sets
- Atlas integration
- Extensive subquery support
- Advanced join reordering with bushy plans generation
- Integrity constraints-based rewritings
- User-defined functions (UDFs) in Hive
- Other extensions to query optimization, such as column pruning, sort/limit merge and pushdown
Unified Analytics also brings significant optimization equivalency to the SQL engines, unifying common techniques such as subquery processing, join ordering, and materialized views.
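As an illustration of the automatic query rewrites mentioned in the list above, the following is a minimal sketch using standard Hive materialized view syntax. The table `sales`, its columns, and the view name `sales_by_region` are hypothetical and do not come from the product documentation. Whether a given query is actually rewritten depends on your configuration and on the freshness of the materialized view.

```sql
-- Hypothetical source table; names and types are illustrative only.
CREATE TABLE sales (
  item_id INT,
  region  STRING,
  amount  DECIMAL(10,2)
);

-- Precompute an aggregate as a materialized view.
CREATE MATERIALIZED VIEW sales_by_region AS
SELECT region, SUM(amount) AS total_amount
FROM sales
GROUP BY region;

-- A query of this shape is a candidate for automatic rewriting:
-- the optimizer may answer it from sales_by_region instead of scanning sales.
SELECT region, SUM(amount) AS total_amount
FROM sales
GROUP BY region;
```

Running EXPLAIN on the last query is one way to check whether the plan reads from the materialized view.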
Lexical conventions
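The following statement shows a string literal that escapes an embedded single quotation mark with a backslash:
INSERT INTO MOVIES_INFO VALUES (1,cast('Toy Story (1995)' as varchar(50)), 'Animation|Children\'s|Comedy'), ...
How identifiers are quoted depends on the hive.support.quoted.identifiers property. Set this property to one of the following values:
- none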
Quotation of identifiers and special characters in identifiers are not allowed, but regular expressions in backticks are supported for column names.
- column
Use the backtick character to enclose identifiers that have special characters, for example: `col1`. Use single quotation marks to enclose string literals, for example: 'value'. Double quotation marks are also accepted, but not recommended.
- standard (default)
The SQL standard way to enclose identifiers. Use double quotation marks to enclose identifiers that have special characters, for example: "col1". Use single quotation marks to enclose string literals, for example: 'value'.
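The difference between the column and standard values can be sketched as follows. This example assumes the property can be set at the session level with SET, which may not hold in every deployment, and it uses a hypothetical table `t` with a column `col1`.

```sql
-- Assumption: the property is settable per session; t and col1 are hypothetical.

-- column: backticks enclose identifiers, single quotation marks enclose literals.
SET hive.support.quoted.identifiers=column;
SELECT `col1` FROM t WHERE `col1` = 'value';

-- standard: double quotation marks enclose identifiers,
-- single quotation marks enclose literals.
SET hive.support.quoted.identifiers=standard;
SELECT "col1" FROM t WHERE "col1" = 'value';
```

Under standard, a double-quoted token is an identifier, so a query that wraps string literals in double quotation marks would need to be changed to use single quotation marks.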
Limitations
- Unified Analytics does not support left/right ANTI JOIN syntax.
- Unified Analytics does not support the complex types ARRAY, STRUCT, and MAP.
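For the ANTI JOIN limitation, one common workaround in SQL generally is a correlated NOT EXISTS subquery. The sketch below is not taken from the product documentation, and the tables `customers` and `orders` and their columns are hypothetical.

```sql
-- Not supported, per the first limitation listed above:
-- SELECT c.id FROM customers c LEFT ANTI JOIN orders o ON c.id = o.customer_id;

-- Returns the same rows: customers that have no matching order.
SELECT c.id
FROM customers c
WHERE NOT EXISTS (
  SELECT 1
  FROM orders o
  WHERE o.customer_id = c.id
);
```

NOT EXISTS is used here rather than NOT IN because NOT IN returns no rows when the subquery produces a NULL.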