Cloudera Embedded Container Service
Day Two Operations Guide
Overview for Cloudera Embedded Container Service day two operations
Prerequisites for Cloudera Embedded Container Service day two operations
Basic operations
Collecting diagnostic data
Proactive monitoring
Environment health checks
Host-level tasks
Starting, stopping, restarting, and refreshing Cloudera Embedded Container Service Clusters
Adding hosts to a Cloudera Embedded Container Service Cluster
Installing NVIDIA GPU software in ECS
Decommissioning Cloudera Embedded Container Service Hosts
ECS Server High Availability
Enabling ECS Server HA after Cloudera Embedded Container Service Installation
Installing iptables on the new Cloudera Embedded Container Service master nodes
Adding hosts to the containerized cluster
Adding Role Instances to Docker Server
Adding Role Instances to Containerized Cluster
Starting Docker Server on Nodes
Starting ECS Server on Nodes
Rolling Restart of a Cloudera Embedded Container Service
Checking Nodes and Pods in the UI
Enabling ECS Server HA and promoting agents after Cloudera Embedded Container Service Installation
Enabling ECS Server deployment for High Availability
Preparing the cluster for High Availability
High-level steps for enabling a Cloudera Embedded Container Service High Availability cluster
Verifying DNS setup
Installing Load Balancer
Promoting Cloudera Embedded Container Service Agents to Cloudera Embedded Container Service Servers
Refreshing Cloudera Embedded Container Service
Creating an environment-wide backup
Creating a backup of Cloudera Control Plane
Troubleshooting DRS
Cloudera Control Plane UI or the Backup and Restore Manager becomes inaccessible after a failed restore event
Timeout error appears in Backup and Restore Manager
Timeout error during backup of OCP clusters
Stale configurations in Cloudera Manager after a restore event
Existing namespaces are not deleted automatically after the restore event
Backup event fails during volume snapshot creation process
Restore event for an environment backup fails with an exception
Managing certificates
Adjusting the expiration time of Cloudera Embedded Container Service cluster certificates