Adding Spark to ML Runtime Sessions

You can add Spark to ML Runtime sessions by using ML Runtimes addons. Enabling Spark enables both Spark and the Hadoop CLI.

Administrators can also configure custom Spark settings at the workbench level, under Site Administration > Runtimes > Spark Configuration. When set, the custom Spark configuration is merged with the default Spark configuration used in Cloudera AI sessions and applies automatically to all newly launched Spark sessions within the workbench.
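As a rough illustration of what merging a custom configuration into the defaults means, the following Python sketch combines two sets of Spark properties. It is not the Cloudera AI implementation. It assumes the settings are written as whitespace-separated key and value pairs (as in a spark-defaults.conf file) and that a custom value replaces the default when both define the same key. The function names and the sample values are hypothetical.

```python
def parse_spark_conf(text):
    """Parse spark-defaults.conf style text into a dict of property -> value."""
    conf = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        # The key is the first token; the rest of the line is the value
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        conf[key] = value
    return conf


def merge_spark_conf(default_text, custom_text):
    """Merge custom properties over the defaults.

    Assumption: on a key collision the custom (administrator) value wins.
    """
    merged = parse_spark_conf(default_text)
    merged.update(parse_spark_conf(custom_text))
    return merged


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only
    defaults = "spark.executor.memory 1g\nspark.executor.cores 1\n"
    custom = "# set by the administrator\nspark.executor.memory 4g\n"
    for key, value in sorted(merge_spark_conf(defaults, custom).items()):
        print(key, value)
```

Under these assumptions, properties that appear only in the defaults are kept unchanged, and properties that appear only in the custom configuration are added.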

To add Spark to sessions in projects that use ML Runtimes images:

  1. View all available ML Runtime addons by selecting Site Administration > Runtimes.
  2. To enable the default Spark configuration, start a New Session for an ML Runtimes project.
  3. Click the Enable Spark option, then select the Spark version.
  4. Click Start Session.
  5. You can now run a Spark job in your session.
    The Logs tab displays the executors in your Spark job.
  6. After you start a Spark job, you can access the Spark UI by clicking the Spark UI button at the top of the Session window.
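Once a session with Spark enabled is running, a small PySpark job is a quick way to confirm that executors start and that the Spark UI has something to show. The following is a minimal sketch, assuming a Python runtime in which the `pyspark` package is available. The application name and the helper function `count_even_numbers` are hypothetical, chosen only for illustration.

```python
def count_even_numbers(spark, upper=1000):
    """Run a small distributed job: count the even integers in [0, upper)."""
    # spark.range creates a DataFrame with a single column named "id"
    df = spark.range(upper)
    return df.filter(df.id % 2 == 0).count()


def main():
    # Imported inside main so the helper above can be defined without pyspark
    from pyspark.sql import SparkSession

    # getOrCreate picks up the Spark configuration provided to the session
    spark = SparkSession.builder.appName("spark-addon-smoke-test").getOrCreate()
    try:
        print("Even numbers below 1000:", count_even_numbers(spark))
    finally:
        # Stop the application so its executors are released
        spark.stop()


if __name__ == "__main__":
    main()
```

While the job runs, its executors should appear in the Logs tab, and the job itself should be visible in the Spark UI.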