Using Cloudera's Distribution of Apache Spark 2

For an architectural overview of how Cloudera's Distribution of Apache Spark 2 works with Cloudera Data Science Workbench, see Overview: Cloudera Distribution of Apache Spark 2. The rest of this guide describes how to set Spark 2 environment variables, manage package dependencies, and configure logging. It also includes instructions and sample code for running R, Scala, and Python projects with Spark 2.

Continue reading:

Spark 2 Configuration
Accessing Spark 2 UIs
Using Spark 2 from Python
Using Spark 2 from R
Using Spark 2 from Scala
Setting Up an HTTP Proxy for Spark 2

Spark 2 Configuration