In-Place Upgrade HDP 3 to CDP Private Cloud Base
HDP to CDP Upgrade Overview
In-Place Upgrade Overview
CDP Upgrade Readiness
How much time should I plan for to complete my upgrade?
Cluster environment readiness
Disk space and mountpoint considerations
Downloading and Publishing the Package Repository
Downloading and Publishing the Parcel Repository
Hadoop Users (user:group) and Kerberos Principals
Sample data ingestion
Merge Independent Hive and Spark Catalogs
Ambari and HDP Upgrade Checklist
Ambari upgrade checklist
Download cluster blueprints without hosts
HDP upgrade checklist
Checklist for large clusters
Before upgrading any cluster
Managing MPacks
Changes to Ambari services and views
HDP Core component version changes
Upgrading the cluster's underlying OS
In-Place and Restore
Move and Decommission
Upgrading Ambari
Before you upgrade Ambari
Backup Ambari
Setting up a local repository
Updating Ambari repo files
Updating HDP repo files
Case study for setting up an HDP-GPL local repository
Setting up local repository with temporary internet access
Case study for setting up local repository
Update version repository base urls
Preparing Ambari Repository Configuration File to use Local Repository
Preparing to Upgrade Ambari
Upgrade to Ambari 7.1.x.0
Download cluster blueprints
Mandatory Post-Upgrade Tasks
Upgrading Ambari Infra
Upgrading Ambari Log Search
Upgrading Ambari Metrics
Upgrading HDP to Cloudera Runtime 7.1.x
HDP Prerequisites
Upgrade process
Before upgrading any cluster
Backup HDP Cluster
Backup and Restore Databases
Backup Ranger
Backup Ranger Admin Database
Backup Ranger KMS Database
Backup Atlas
Backup HBase tables
Backup Ambari Infra Solr
Backup Ambari-Metrics
Backup Hive
Backup HBase
Backup Kafka
Backup Oozie
Backup Knox
Backup Log Search
Backup Zeppelin
Backup HDFS
Backup ZooKeeper
Backup databases
Before you upgrade
Checkpoint HDFS
Pre-upgrade steps
Ranger Service connection with Oracle database
Ranger admin password
Preparing Spark for upgrade
Backing up Ambari Infra Solr data
Back up and upgrade Ambari Infra Solr and Ambari Log Search
Generate migration configuration
Back up Ambari Infra Solr data
Remove Existing Collections and Upgrade Binaries
Preparing HBase for upgrade
Preparing the backend HMS database for upgrade
Turn off YARN GPU
Preparing HDP Search for upgrade
Before you begin
Download Solr configuration from HDP Search ZooKeeper
Transition Solr configuration
Cloudera Manager versions 7.1.1 to 7.2.4
Cloudera Manager versions 7.3.1 or higher
Validate the configuration
Test the configuration
Preparing ZooKeeper for upgrade
Preparing Kafka for upgrade
Extract Kafka broker ID
Register software repositories
HDP Intermediate bits for 7.1.x.0 Repositories
Software download matrix for 3.1.5 to CDP 7.1.x
AM2CM legacy tools download
Install software on the hosts
Perform the HDP upgrade
Perform express upgrade
Post-HDP-upgrade tasks
Upload HDFS entity information
Ambari infra-migrate and restore
Ambari Metrics and Log Search
Back up the Ranger configuration
Backup Infra Solr collections
Troubleshooting the HDP upgrade
YARN Registry DNS instance fails to start
HDP 3.1.5 to HDP 7.1.7 Intermediate bits Kafka upgrade
Rollback Ambari 7.1.x to Ambari 2.7.5
Rollback HDP Services 3.1.5 from CDP 7.1.x
Overview
ZooKeeper
Ambari-Metrics
Ambari Infra Solr
Ranger
Restore Ranger Admin Database
Restore Ranger KMS Database
HDFS
YARN
HBase
Kafka
Atlas
Restore HBase Tables
Restore ATLAS_ENTITY_AUDIT_EVENTS table
Restore Solr snapshots
Hive
Spark
Oozie
Knox
Zeppelin
Log Search
Transitioning to Cloudera Manager
Pre-transition steps
Databases
Kerberos
Kerberos principal
HDFS
Preparing HDFS
Backup the non-default Rack Awareness Topology script
Spark
Spark2/Livy
Ranger
Solr
Cloudera Manager Installation and Setup
Installing JDBC Driver
Proxy Cloudera Manager through Apache Knox
Transitioning HDP to Cloudera Private Cloud Base
Transitioning HDP 3.1.5 cluster to CDP Private Cloud Base 7.1.x cluster using the AM2CM tool
Post transition steps
Generating keytabs in Cloudera Manager
Enable Auto Start setting
ZooKeeper
Delete ZNodes
Ranger
Ranger KMS
Add Ranger policies for components on the CDP Cluster
Set maximum retention days for Ranger audits
Search post-HDP-upgrade tasks
HDFS
Ports
TLS/SSL
HDFS HA
Custom Topology
Add Balancer Role to HDFS
Other review configurations for HDFS
Configuring HDFS properties to optimize log collection
Solr
Restore Solr collections on CDP cluster
Kafka
Change Kafka port value
Unsetting Kafka Protocol version
Impala
YARN
Start job history
YARN MapReduce framework jars
GPU Scheduling
YARN CGroups
Reset ZNode ACLs
Placement rules evaluation engine
Converting old mapping rule format to JSON-based placement rule format
Setting the owner and permissions of /user/yarn
Spark
Livy2
Tez
Hive
Identifying and fixing invalid Hive schema versions
Create Hive sys database
Setting up Hive metastore for Atlas
HMS health check
HBase
Hue
Installing Python 3.8
Installing Python 3.8 on CentOS 7 for Hue
Installing Python 3.8 on RHEL 8 for Hue
Installing Python 3.8 on SLES 12 for Hue
Installing Python 3.8 on Ubuntu 18 for Hue
Installing the psycopg2 Python package for PostgreSQL database
Installing MySQL client for MySQL databases
Installing MySQL client for MariaDB databases
Ozone
Oozie
Validate Database URL
Installing the new Shared Libraries
Update Oozie properties
Adding Oozie service dependencies
Access Oozie load balancer URL
Oozie Load Balancer configuration
Atlas advanced configuration snippet (Safety valve)
Migrating Atlas data
Phoenix
Map Phoenix schemas to HBase namespaces
Starting all services
Knox
Topology migration
Migrate Credential Aliases
Migrate signing key
Configure Apache Knox authentication for AD/LDAP
Client Configurations
Securing ZooKeeper
Zeppelin Shiro configurations
Migrating Spark workloads to CDP
Spark 1.6 to Spark 2.4 Refactoring
Handling prerequisites
Spark 1.6 to Spark 2.4 changes
New Spark entry point SparkSession
DataFrame API registerTempTable deprecated
union replaces unionAll
Empty schema not supported
Referencing a corrupt JSON/CSV record
Dataset and DataFrame API explode deprecated
CSV header and schema match
Table properties support
CREATE OR REPLACE VIEW and ALTER VIEW not supported
Managed table location
Write to Hive bucketed tables
Rounding in arithmetic operations
Precedence of set operations
HAVING without GROUP BY
CSV bad record handling
Spark 2.4 CSV example
Configuring storage locations
Querying Hive managed tables from Spark
Compiling and running Spark workloads
Compiling and running a Java-based job
Compiling and running a Scala-based job
Running a Python-based job
Running a job interactively
Post-migration tasks
Spark 2.3 to Spark 2.4 Refactoring
Handling prerequisites
Spark 2.3 to Spark 2.4 changes
Empty schema not supported
CSV header and schema match
Table properties support
Managed table location
Precedence of set operations
HAVING without GROUP BY
CSV bad record handling
Spark 2.4 CSV example
Configuring storage locations
Querying Hive managed tables from Spark
Compiling and running Spark workloads
Post-migration tasks
Apache Hive Changes in CDP
Hive Configuration Property Changes
Customizing critical Hive configurations
Setting Hive Configuration Overrides
Hive Configuration Requirements and Recommendations
Removing the LLAP Queue
Configuring HiveServer for ETL using YARN queues
Configuring authorization to tables
Updating Hive and Impala JDBC/ODBC drivers
Getting the JDBC driver
Getting the ODBC driver
Setting up access control lists
Configure encryption zone security
Renaming tables
Configure edge nodes as gateways
Configure HiveServer HTTP mode
Configuring HMS for high availability
Installing Hive on Tez and adding a HiveServer role
Handling table reference syntax
Add Backticks to Table References
Unsupported Interfaces and Features
Changes to HDP Hive tables
Configuring External Authentication for Cloudera Manager
Additional Services
Installing DAS using Ambari
Check cluster configuration for Hive and Tez
Add the DAS service
DAS post-installation tasks
Additional configuration tasks
Setting up the tmp directory
Configuring DAS for SSL/TLS
Set up trusted CA certificate
Set up self-signed certificates
Configure SSL/TLS in Ambari
Configuring user authentication in Ambari
Configuring user authentication using Knox SSO
Configuring user authentication using Knox proxy
Configuring user authentication using SPNEGO
Enabling logout option for secure clusters
Troubleshooting DAS installation
Problem area: Queries page
Your queries are not appearing on the Queries page
Query column is empty, yet you can see the DAG ID and Application ID
Query column is not empty, but you cannot see the DAG ID and Application ID
You cannot view queries from other users
Problem area: Compose page
You cannot see your databases or the query editor is missing
You cannot view new databases and tables, or cannot see changes to existing databases or tables
Replication failure in the DAS Event Processor
Problem area: Reports page
DAS service installation fails with the "python files missing" message
DAS does not log me out as expected, or I stay logged in longer than the time specified in the Ambari configuration
Getting a 401 - Unauthorized access error message while accessing DAS
Setting up quick links for the DAS UI
Installing DAS using Cloudera Manager
Adding Hue service with Cloudera Manager
Install and configure MySQL database
Add the Hue service using Cloudera Manager
Enable Kerberos for authentication
Integrate Hue with Knox
Grant Ranger permissions to new users or groups
Adding Query Processor service to a cluster
Applications Upgrade
Procedure to Rollback from CDP 7.1.7 SP1 to CDP 7.1.7