Managing Data Operating System

Prerequisites for Running Containerized Spark Jobs

To containerize Spark jobs on YARN, you must first ensure that the YARN cluster is enabled to run Docker containers.
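Enabling Docker on YARN is done through the NodeManager configuration (and a matching [docker] section in container-executor.cfg). The following yarn-site.xml snippet is a minimal sketch of the properties involved; the property names are standard Hadoop settings, but the values shown (such as the allowed container networks) are illustrative assumptions and must match your cluster.

  <!-- yarn-site.xml: enable the Docker Linux container runtime (values are illustrative) -->
  <property>
    <name>yarn.nodemanager.runtime.linux.allowed-runtimes</name>
    <value>default,docker</value>
  </property>
  <property>
    <name>yarn.nodemanager.runtime.linux.docker.allowed-container-networks</name>
    <value>host,bridge</value>
  </property>
  <property>
    <name>yarn.nodemanager.runtime.linux.docker.default-container-network</name>
    <value>host</value>
  </property>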

During application submission, ensure that you specify the following parameters (a sample spark-submit command follows this list):
  • YARN_CONTAINER_RUNTIME_TYPE=docker
  • YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=<docker_image>
  • YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK=host
  • YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=<any volume mounts needed by the Spark application>
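
These parameters are passed as Spark configuration properties: spark.yarn.appMasterEnv.* sets them for the ApplicationMaster and spark.executorEnv.* sets them for the executors. The following spark-submit command is a minimal sketch of that pattern; the Docker image name, mount paths, and application JAR are placeholders chosen for illustration, not values from this documentation.

  spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE=docker \
    --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=registry.example.com/spark:latest \
    --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK=host \
    --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=/etc/hadoop/conf:/etc/hadoop/conf:ro \
    --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_TYPE=docker \
    --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=registry.example.com/spark:latest \
    --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK=host \
    --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=/etc/hadoop/conf:/etc/hadoop/conf:ro \
    --class org.apache.spark.examples.SparkPi \
    spark-examples.jar 10

The mounts parameter takes a comma-separated list of source:destination[:mode] entries; the read-only mount of the Hadoop client configuration shown above is only one example of a volume a Spark application commonly needs.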