Configure Gateway Hosts Using Cloudera Manager
Cloudera Data Science Workbench hosts must be added to your CDH cluster as gateway hosts, with gateway roles properly configured.
- If you have not already done so and plan to use PySpark, install either the Anaconda parcel or Python (versions 2.7.11 and 3.6.1) on your CDH cluster.
- Configure Apache Spark on your gateway hosts.
- Use Cloudera Manager to add gateway hosts to your CDH cluster.
- Test Spark 2 integration on the gateway hosts.
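
As a rough illustration of the final step, a small PySpark job can serve as a smoke test that a gateway host is able to submit work to the cluster. The sketch below is not an official Cloudera test: the application name `gateway-smoke-test`, the function names, and the sample count are all hypothetical, and it assumes PySpark for Spark 2 is available on the host.

```python
import random
from operator import add


def in_unit_circle(_):
    """Return 1 if a random point in the unit square falls inside the quarter circle."""
    x, y = random.random(), random.random()
    return 1 if x * x + y * y <= 1.0 else 0


def estimate_pi(hits, samples):
    """Turn a hit count into a Monte Carlo estimate of pi."""
    return 4.0 * hits / samples


def run_spark_check(samples=100000):
    """Run the estimate as a Spark job; assumes PySpark (Spark 2) is installed."""
    # Imported here so the helpers above stay usable without PySpark.
    from pyspark.sql import SparkSession

    # Hypothetical application name, chosen only for this sketch.
    spark = SparkSession.builder.appName("gateway-smoke-test").getOrCreate()
    try:
        rdd = spark.sparkContext.parallelize(range(samples), 4)
        hits = rdd.map(in_unit_circle).reduce(add)
        return estimate_pi(hits, samples)
    finally:
        spark.stop()
```

Calling `run_spark_check()` from a script submitted on a gateway host should return a value close to 3.14 if the job reaches the cluster. A failure to create the session or to schedule tasks usually points back to the Spark or gateway role configuration described above.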