Spark Guide

Chapter 2. Prerequisites

Before installing Spark, make sure your cluster meets the following prerequisites.

Table 2.1. Prerequisites for Running Spark 1.4.1

HDP Cluster Stack Version
  • 2.3.2 or later

(Optional) Ambari Version
  • or later

Software dependencies
  • Spark requires HDFS and YARN.

  • PySpark requires Python to be installed on all nodes.

  • SparkR (tech preview) requires R binaries to be installed on all nodes.
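Before installing, you can verify the interpreter dependencies above. The sketch below is a minimal, hypothetical check (not part of the Spark installation itself): it reports whether `python` and `R` are present on the local node, and would be run on every node in the cluster, for example via ssh or pdsh.

```shell
# Check that the interpreters PySpark and SparkR depend on are installed
# on this node. This only reports; it does not install anything.
for cmd in python R; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd: found at $(command -v "$cmd")"
  else
    echo "$cmd: MISSING"
  fi
done
```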


HDP 2.3.2 supports Spark 1.3.1 and Spark 1.4.1. When you upgrade your cluster to HDP 2.3.2, Spark is automatically upgraded to 1.4.1. If you wish to return to Spark 1.3.1, follow the Spark Manual Downgrade procedure in the HDP 2.3.2 release notes.
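After an upgrade (or a manual downgrade), you can confirm which Spark version is active on a node. The sketch below assumes only that `spark-submit` may be on the PATH; it degrades gracefully if Spark is not installed where you run it.

```shell
# Report the active Spark version on this node, e.g. to confirm the
# automatic upgrade to 1.4.1 or a manual downgrade to 1.3.1.
if command -v spark-submit >/dev/null 2>&1; then
  spark-submit --version 2>&1 | head -n 5
else
  echo "spark-submit not found on PATH"
fi
```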