Cloudera Embedded Container Service
Day Two Operations Guide
Overview for Cloudera Embedded Container Service day two operations
Prerequisites for Cloudera Embedded Container Service day two operations
Basic operations
Collecting diagnostic data
Proactive monitoring
Environment health checks
Host-level tasks
Starting, stopping, restarting, and refreshing Cloudera Embedded Container Service Clusters
Adding hosts to a Cloudera Embedded Container Service Cluster
Installing NVIDIA GPU software in ECS
Decommissioning Cloudera Embedded Container Service Hosts
ECS Server High Availability
Enabling ECS Server HA after Cloudera Embedded Container Service Installation
Installing iptables on the new Cloudera Embedded Container Service master nodes
Adding hosts to the containerized cluster
Adding Role Instances to Docker Server
Adding Role Instances to Containerized Cluster
Starting Docker Server on Nodes
Starting ECS Server on Nodes
Rolling Restart of a Cloudera Embedded Container Service
Checking Nodes and Pods in the UI
Enabling ECS Server HA and promoting agents after Cloudera Embedded Container Service Installation
Enabling ECS Server deployment for High Availability
Preparing the cluster for High Availability
High-level steps to enable a Cloudera Embedded Container Service High Availability cluster
Verifying DNS setup
Installing Load Balancer
Promoting Cloudera Embedded Container Service Agents to Cloudera Embedded Container Service Servers
Refreshing Cloudera Embedded Container Service
Creating an environment-wide backup
Creating backup of Cloudera Control Plane
Troubleshooting DRS
CDP Control Plane UI or the Backup and Restore Manager becomes inaccessible after a failed restore event
Timeout error appears in Backup and Restore Manager
Stale configurations in Cloudera Manager after a restore event
Timeout error during backup of OCP clusters
Managing certificates
Adjusting the expiration time of Cloudera Embedded Container Service cluster certificates