Architecture Overview

Cloudera Manager
Cloudera Manager is an end-to-end application used for managing CDP clusters. When a CDP service (such as Impala or Spark) is added to the cluster, Cloudera Manager configures cluster hosts with one or more functions, called roles.

ML Runtimes
ML Runtimes are responsible for running data science workloads and intermediating access to the underlying cluster.

Cloudera Data Science Workbench Web Application
The Cloudera Data Science Workbench web application is typically hosted on the master host, at http://cdsw.<your_domain>.com.

CDS 2.x Powered by Apache Spark
Apache Spark is a general-purpose framework for distributed computing that offers high performance for both batch and stream processing. It exposes APIs for Java, Python, R, and Scala, as well as an interactive shell for you to run jobs.
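
As an illustration of the Python API mentioned above, the following is a minimal sketch of a batch word count, not an excerpt from the product documentation. The function names, the application name, and the `local[*]` master setting are assumptions made for the example. On a managed cluster the master is normally supplied by the environment rather than hard-coded. The Spark function requires the `pyspark` package; the plain-Python function is included only as a reference for comparison.

```python
from collections import Counter
from operator import add


def tokenize(line):
    """Split a line into lowercase, whitespace-separated words."""
    return line.lower().split()


def count_words_locally(lines):
    """Reference implementation in plain Python, for comparison."""
    return dict(Counter(word for line in lines for word in tokenize(line)))


def count_words_with_spark(lines):
    """Same computation expressed with Spark's RDD API (requires pyspark)."""
    # Imported here so the rest of the module loads without pyspark installed.
    from pyspark.sql import SparkSession

    # "local[*]" runs Spark in-process using all local cores (assumption:
    # chosen for illustration; a cluster deployment would configure this).
    spark = (
        SparkSession.builder
        .master("local[*]")
        .appName("word-count-sketch")  # hypothetical application name
        .getOrCreate()
    )
    try:
        counts = (
            spark.sparkContext.parallelize(lines)  # distribute the input lines
            .flatMap(tokenize)                     # one record per word
            .map(lambda word: (word, 1))           # pair each word with a count of 1
            .reduceByKey(add)                      # sum the counts per word
            .collect()                             # bring results back to the driver
        )
        return dict(counts)
    finally:
        spark.stop()
```

Calling `count_words_with_spark(["Spark runs", "spark jobs"])` should produce the same word-to-count mapping as `count_words_locally` on the same input, with the work distributed across Spark executors instead of running in a single Python process.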