Known Issues in MapReduce and YARN
This topic describes known issues, unsupported features, and limitations for using MapReduce and YARN in this release of Cloudera Runtime.
Known Issues
- JobHistory URL mismatch after server relocation
- After moving the JobHistory Server to a new host, the URLs listed for the JobHistory Server on the ResourceManager web UI still point to the old JobHistory Server. This affects existing jobs only. New jobs started after the move are not affected.
- CDH-49165: History link in ResourceManager web UI broken for killed Spark applications
- When a Spark application is killed, the history link in the ResourceManager web UI does not work.
- CDH-6808: Routable IP address required by ResourceManager
- ResourceManager requires routable host:port addresses for yarn.resourcemanager.scheduler.address and does not support using the wildcard 0.0.0.0 address.
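As an illustrative sketch only (the host name below is a placeholder, not part of this documentation), the scheduler address would be set to a concrete, routable host:port pair rather than the wildcard address:

```xml
<!-- yarn-site.xml sketch: use a routable address, not 0.0.0.0.
     rm-host.example.com is a placeholder; 8030 is the default scheduler port. -->
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>rm-host.example.com:8030</value>
</property>
```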
- OPSAPS-52066: Stacks under Logs Directory for Hadoop daemons are not accessible from Knox Gateway.
- Stacks under the Logs directory for Hadoop daemons, such as NameNode, DataNode, ResourceManager, NodeManager, and JobHistoryServer, are not accessible from Knox Gateway.
- CDPD-2936: Application logs are not accessible in WebUI2 or Cloudera Manager
- Due to log aggregation, logs of running containers stored in the NodeManager local directory cannot be accessed from either Cloudera Manager or WebUI2.
- OPSAPS-50291: Environment variables HADOOP_HOME, PATH, LANG, and TZ are not getting whitelisted
- It is possible to whitelist the HADOOP_HOME, PATH, LANG, and TZ environment variables, but the container launch environments do not have these variables set up automatically.
- COMPX-1445: Queue Manager operations are failing when Queue Manager is installed separately from YARN
- If Queue Manager is not selected during the YARN installation, Queue Manager operations fail: it reports that 0 queues are configured and shows several failures. This happens because the ZooKeeper configuration store is not enabled.
- COMPX-1451: Queue Manager does not support multiple ResourceManagers
- When YARN High Availability is enabled, there are multiple ResourceManagers. Queue Manager receives multiple ResourceManager URLs for a High Availability cluster, but it picks the active ResourceManager URL only when the Queue Manager page is loaded. Queue Manager cannot handle it gracefully if the currently active ResourceManager goes down while the user is still using the Queue Manager UI.
- COMPX-8687: Missing access check for getAppAttempts
- When the Job ACL feature is enabled using Cloudera Manager (the mapreduce.cluster.acls.enabled property), the property is not generated into all configuration files, including the yarn-site.xml configuration file. As a result, the ResourceManager process uses the default value of this property. The default value of mapreduce.cluster.acls.enabled is false.
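One possible mitigation sketch (an assumption, not an official fix from this bulletin): explicitly set the property through the YARN Service Advanced Configuration Snippet (Safety Valve) for yarn-site.xml in Cloudera Manager, so that the ResourceManager does not silently fall back to the false default:

```xml
<!-- Sketch of a yarn-site.xml safety-valve entry; whether this is appropriate
     depends on your cluster's ACL configuration. Verify before applying. -->
<property>
  <name>mapreduce.cluster.acls.enabled</name>
  <value>true</value>
</property>
```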
Limitations
- Capacity Scheduler
-
- As Capacity Scheduler is the default scheduler, the Dynamic Resource Pool User Interface is not available by default.
- Capacity Scheduler can be configured only through safety-valves in Cloudera Manager.
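As an illustrative sketch of the safety-valve approach (the queue names and capacities below are placeholders, not recommendations), Capacity Scheduler properties would be supplied as capacity-scheduler.xml entries through the corresponding Advanced Configuration Snippet (Safety Valve) in Cloudera Manager:

```xml
<!-- Sketch of capacity-scheduler.xml entries supplied via a safety valve.
     Queue names "default" and "analytics" and the 70/30 split are placeholders. -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>default,analytics</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.default.capacity</name>
  <value>70</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.analytics.capacity</name>
  <value>30</value>
</property>
```

Sibling queue capacities under a parent must sum to 100, which is why the placeholder values are split 70/30.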
Unsupported Features
-
The following YARN features are currently not supported in Cloudera Data Platform:
- GPU support for Docker
- Hadoop Pipes
- Fair Scheduler
- Application Timeline Server (ATS 2 and ATS 1.5)
- Container Resizing
- Distributed or Centralized Allocation of Opportunistic Containers
- Distributed Scheduling
- Native Services
- Pluggable Scheduler Configuration
- Queue Priority Support
- Reservation REST APIs
- Resource Estimator Service
- Resource Profiles
- (non-Zookeeper) ResourceManager State Store
- Shared Cache
- YARN Federation
- New Aggregated Log File Format
- Node Labels
- Rolling Log Aggregation
- YARN WebUI v2
- Docker on YARN (DockerContainerExecutor)
- Moving jobs between queues
- Dynamic Resource Pools
- Upgrading from Cloudera Data Hub (CDH) or Hortonworks Data Platform (HDP) to Cloudera Data Platform - Data Center (CDP-DC) 7.0
The CDP-DC 7.0 release only supports Capacity Scheduler for Hadoop YARN clusters, and you cannot upgrade to the CDP-DC 7.0 release from existing CDH or HDP clusters. Therefore, this release does not support any migration from CDH or HDP clusters, including migration from Fair Scheduler to Capacity Scheduler. If you want to migrate from Fair Scheduler to Capacity Scheduler, you must first upgrade to the Cloudera Data Platform - Data Center 7.1 release and then perform the migration. For more information, see Migrating from Fair Scheduler to Capacity Scheduler.
Technical Service Bulletins
- TSB 2021-539: Capacity Scheduler queue pending metrics can become negative in certain production workload scenarios causing blocked queues
- The pending metrics of Capacity Scheduler queues can become negative in certain production workload scenarios. Once a metric becomes negative, the scheduler is unable to schedule any further resource requests on the affected queue. As a result, new applications are stuck in the ACCEPTED state unless the YARN ResourceManager is restarted or failed over.
- Knowledge article
- For the latest update on this issue see the corresponding Knowledge article: TSB 2021-539: Capacity Scheduler queue pending metrics can become negative in certain production workload scenarios causing blocked queues