Spark 3 on CDSW
Using CDS 2.x Powered by Apache Spark
Configuring CDS 2.x Powered by Apache Spark 2
Spark Configuration Files
Configuring Global Properties Using Cloudera Manager
Configuring Spark Environment Variables Using Cloudera Manager
Managing Memory Available for Spark Drivers
Managing Dependencies for Spark 2 Jobs
Spark Logging Configuration
Running Spark Jobs on an HDP Cluster
Setting Up an HTTP Proxy for Spark 2
Using Spark 2 from Python
Setting Up a PySpark Project
Spark on ML Runtimes
Example: Monte Carlo Estimation
Example: Locating and Adding JARs to Spark 2 Configuration
Example: Distributing Dependencies on a PySpark Cluster
Using Spark 2 from R
Installing sparklyr
Connecting to Spark 2
Using Spark 2 from Scala
Accessing Spark 2 from the Scala Engine
Example: Read Files from the Cluster Local Filesystem
Example: Using External Packages by Adding JARs or Dependencies
Adding Remote Packages
Adding Remote or Local JARs