Migrating Hive Workloads to CDP Private Cloud
Migrating Hive workloads from CDH
Changes to CDH Hive Tables
Configuration changes
Hive Configuration Property Changes
Customizing critical Hive configurations
Setting Hive Configuration Overrides
Hive Configuration Requirements and Recommendations
Configuring HMS for high availability
Setting up Hive metastore for Atlas
Changing the Hive warehouse location
Security tasks
Making the Hive plugin for Ranger visible
Configuring authorization to tables
Setting up access control lists
Configure encryption zone security
Configure edge nodes as gateways
Configure HiveServer HTTP mode
Key syntax changes
Handling table reference syntax
LOCATION and MANAGEDLOCATION clauses
Key semantic changes and workarounds
Changing incompatible column types
Understanding CREATE TABLE behavior
Configuring legacy CREATE TABLE behavior
Dropping partitions
Handling the Keyword APPLICATION
Handling output of greatest and least functions
Renaming tables
TRUNCATE TABLE on an external table
Other syntax and semantic changes
Syntax and semantic changes CDH 6.2.1 to CDP 7.0.3.2
Aliasing tables
ANALYZE TABLE ... COMPUTE STATISTICS PARTIALSCAN removed
Decimal to string change
Decimal literals
hive.stats.collect.rawdatasize removal
HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
Limit scanned partitions
Overflow handling of decimals
Functions that changed
ACOS(2) and ASIN(2) return NULL
CAST function results
Casting types with leading or trailing spaces
CORR and COVAR_SAMP compliant with SQL:2011
LENGTH function supported data types
STDDEV_SAMP and VAR_SAMP
NULL related behaviors
ORDER BY clause treatment of NULLs
Disallow enabling/enforcing NOT NULL
Default NULL ordering change
Enforcement of NOT NULL constraint
Timestamp or date related behaviors
ADD_MONTHS function fix
ADD_MONTHS date validation
Casting invalid dates
FROM_UNIXTIME and UNIX_TIMESTAMP time zone
Handling of CURRENT_TIMESTAMP output format
Handling of Julian dates in UDFs
Handling return type for old date functions
Support for SQL:2016 datetime formats (limited formats)
UNIX_TIMESTAMP behavior
TIMESTAMP based on UTC
UNIX_TIMESTAMP conversion of TIMESTAMPLOCALTZ
Semantic changes and workarounds CDP 7.1.1
NVL UDF implementation changes
Improved Handling of External Table Inserts in HDFS
Semantic changes and workarounds CDP 7.1.4
Exclusive write lock for MERGE INSERT
Lock implementations to allow zero-wait readers
UNBOUNDED representation in Window functions
Support for 0 ROWS PRECEDING or FOLLOWING
Semantic changes and workarounds CDP 7.1.5
Sort behavior in SHOW COLUMNS
Event notification cleanup interval
Semantic changes and workarounds CDP 7.1.6
Support for SQL:2016 datetime formats (text, FM, FX)
Casting Timestamp to numeric and vice-versa
Handling trailing zeros of decimal constants
Semantic changes and workarounds CDP 7.1.7
Precision and scale changes
Semantic changes and workarounds CDP 7.1.7 SP1
Date and timestamp parser changes from LENIENT to STRICT
Date strings are parsed using local timezone
Semantic changes and workarounds CDP 7.1.7 SP2
Date and timestamp format changes
Semantic changes and workarounds CDP 7.1.7 SP2 CHFx
New property to control datetime formatter
Dates are parsed by ignoring trailing invalid characters
Semantic changes and workarounds CDP 7.1.8 CHFx
Handling table column named default
Fix precision and scale inference for aggregate rewriting in Calcite
Migrating Spark Apps
Preventing SparkSQL incompatibility
Spark integration with Hive
Removing Hive on Spark Configurations
Disabling Partition Type Checking
Converting Hive CLI scripts to Beeline
Hive unsupported interfaces and features
Migrating Hive workloads from HDP 2.6.5
Changes to HDP Hive tables
Checking and correcting Hive table locations
Configuration changes
Hive Configuration Property Changes
Customizing critical Hive configurations
Setting Hive Configuration Overrides
Hive Configuration Requirements and Recommendations
Configuring HMS for high availability
Setting up Hive metastore for Atlas
Changing the Hive warehouse location
Removing the LLAP Queue
Security tasks
Making the Hive plugin for Ranger visible
Configuring authorization to tables
Setting up access control lists
Configure encryption zone security
Configure edge nodes as gateways
Configure HiveServer HTTP mode
Handling syntax changes
Handling table reference syntax
LOCATION and MANAGEDLOCATION clauses
Key semantic changes and workarounds
Casting timestamps
Changing incompatible column types
Understanding CREATE TABLE behavior
Configuring legacy CREATE TABLE behavior
Dropping partitions
Handling output of greatest and least functions
Renaming tables
TRUNCATE TABLE on an external table
Migrating Spark Apps
Spark integration with Hive
Identifying and fixing invalid Hive schema versions
Fixing statistics
Converting Hive CLI scripts to Beeline
Hive unsupported interfaces and features
Replicating Hive data from HDP 3 to CDP
Replicating Hive data
Configuring the CDP cluster
Mandatory CDP policy-level properties
Optional CDP policy-level properties
Supported scheduled query operations
Configuring the HDP cluster
Mandatory HDP cluster configuration properties
Mandatory HDP policy-level properties
Optional HDP policy-level properties
Configuring wire-encrypted clusters
Example commands for replicating HDP 3 workloads
Troubleshooting Hive replication using REPL
REPL Command Known Issues
Patches required on HDP
Patches required on CDP
Verifying the Hive data replication
Setting up the HDP cluster
Verifying replication
Handling a failed verification
Validating external table replication
Enabling background threads after migration
Migration paths from HDP 3 to CDP for LLAP users
Migration paths for Hive users
Migration to CDP Private Cloud Base or CDP Public Cloud
Migration to Cloudera Data Warehouse
Apache Tez processing of Hive jobs
Migration paths for Spark users
Migration to CDP Private Cloud Base
HWC changes from HDP to CDP
Migrating Hive workloads from CDP Private Cloud Base to CDW Private Cloud
Planning a CDW Virtual Warehouse instance
Apache Tez processing of Hive jobs
Migrate Hive workloads from HDP (LLAP) to CDW (LLAP)
Migrate from CDP PVC Base (Hive on Tez) to CDW (LLAP)
Migrating Hive workloads to ACID
Tables in Hive 1 and 2 vs. Hive 3
Compatible storage formats
Table design considerations
Hive ingest patterns introduction
Classic ingest patterns
ACID ingest patterns
Handling government regulations in ACID tables
Key concepts about ACID ingest patterns