Apache Spark Incompatible Changes
- As of CDH 5.1, before you can run Spark in standalone mode, you must set the spark.master property in /etc/spark/conf/spark-defaults.conf, as follows:
spark.master=spark://MASTER_IP:MASTER_PORT
where MASTER_IP is the IP address of the host the Spark master is running on and MASTER_PORT is its port. With this setting, all jobs run in standalone mode by default; you can override the default on the command line.
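As a minimal illustration of the command-line override (the class name and jar are hypothetical), a single job can be forced to run in local mode regardless of the configured default:
spark-submit --class com.example.WordCount --master local[2] wordcount.jar
The --master local[2] option makes this one job run locally with two worker threads, leaving the spark.master default in spark-defaults.conf in effect for other jobs.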
- This release of Spark includes changes that will enable Spark to avoid breaking compatibility in the future. As a result, most applications will require a recompile to run against Spark 1.0, and some will require changes in source code. The details are as follows:
- There are two changes in the core Scala API:
- The cogroup and groupByKey operators now return Iterables over their values instead of Seqs. This change means that the set of values corresponding to a particular key need not all reside in memory at the same time; see the sketch after this list.
- SparkContext.jarOfClass now returns Option[String] instead of Seq[String].
- Spark’s Java APIs have been updated to accommodate Java 8 lambdas. See Migrating from pre-1.0 Versions of Spark for more information. Note: CDH 5.1 does not support Java 8.
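A minimal Scala sketch (the object and value names are hypothetical) of the source changes the two core Scala API changes can require:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._  // implicit pair-RDD conversions (needed before Spark 1.3)
import org.apache.spark.rdd.RDD

object MigrationSketch {
  def main(args: Array[String]) {
    val sc = new SparkContext(new SparkConf().setAppName("MigrationSketch"))
    val pairs = sc.parallelize(Seq(("a", 1), ("a", 2), ("b", 3)))

    // groupByKey formerly returned RDD[(K, Seq[V])]; it now returns
    // RDD[(K, Iterable[V])], so code relying on Seq-specific methods
    // (indexing, length, and so on) needs an explicit conversion.
    val grouped: RDD[(String, Iterable[Int])] = pairs.groupByKey()
    val asSeqs = grouped.mapValues(_.toSeq)
    asSeqs.collect().foreach(println)

    // jarOfClass formerly returned Seq[String]; it now returns Option[String],
    // so the result is unwrapped rather than iterated.
    val jar: Option[String] = SparkContext.jarOfClass(this.getClass)
    jar.foreach(path => println("Application jar: " + path))

    sc.stop()
  }
}

Recompiling against Spark 1.0 surfaces both changes as type errors, which is the simplest way to find affected call sites.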