HDP-2.3.4 Release Notes

Spark

Important

Hortonworks strongly recommends that all users running HDP 2.3.4 upgrade to HDP 2.3.4.7.

HDP 2.3.4 provides Spark 1.5.2 and the following Apache patches:

  • SPARK-10058: CORE, TESTS, Fix the flaky tests in HeartbeatReceiverSuite.

  • SPARK-10389: SQL, Support ORDER BY non-attribute grouping expression on Aggregate.

  • SPARK-10515: When killing executor, the pending replacement executors should not be lost.

  • SPARK-10534: SQL, ORDER BY clause allows only columns that are present in the select projection list.

  • SPARK-10577: PYSPARK, DataFrame hint for broadcast join.

  • SPARK-10581: DOCS, Groups are not resolved in scaladoc in SQL classes.

  • SPARK-10619: Can't sort columns on Executor Page.

  • SPARK-10741: SQL, Hive Query Having/OrderBy against Parquet table is not working.

  • SPARK-10790: YARN, Fix initial executor number not set issue and consolidate the codes.

  • SPARK-10812: YARN, Fix shutdown of token renewer.

  • SPARK-10812: YARN, Spark hadoop util support switching to yarn.

  • SPARK-10825: CORE, TESTS, Fix race conditions in StandaloneDynamicAllocationSuite.

  • SPARK-10829: SQL, Fix 2 bugs for filter on partitioned columns.

  • SPARK-10833: BUILD, Inline, organize BSD/MIT licenses in LICENSE.

  • SPARK-10845: SQL, Makes spark.sql.hive.version a SQLConfEntry.

  • SPARK-10858: YARN, archives/jar/files rename with # doesn't work unl.

  • SPARK-10859: SQL, fix stats of StringType in columnar cache.

  • SPARK-10871: include number of executor failures in error msg.

  • SPARK-10885: STREAMING, Display the failed output op in Streaming UI.

  • SPARK-10889: STREAMING, Bump KCL to add MillisBehindLatest metric.

  • SPARK-10901: YARN, spark.yarn.user.classpath.first doesn't work.

  • SPARK-10904: SPARKR, Fix to support `select(df, c("col1", "col2"))`.

  • SPARK-10914: UnsafeRow serialization breaks when two machines have different Oops size.

  • SPARK-10932: PROJECT INFRA, Port two minor changes to release-build.sh from scripts' old repo.

  • SPARK-10934: SQL, handle hashCode of unsafe array correctly.

  • SPARK-10952: Only add hive to classpath if HIVE_HOME is set.

  • SPARK-10955: STREAMING, Add a warning if dynamic allocation for Streaming applications.

  • SPARK-10959: PYSPARK, StreamingLogisticRegressionWithSGD does not train with given regParam and convergenceTol parameters.

  • SPARK-10960: SQL, SQL with windowing function should be able to refer column in inner select.

  • SPARK-10971: SPARKR, RRunner should allow setting path to Rscript.

  • SPARK-10973: ML, PYTHON, Fix IndexError exception on SparseVector when asked for index after the last non-zero entry.

  • SPARK-10980: SQL, fix bug in create Decimal.

  • SPARK-10981: SPARKR, SparkR Join improvements.

  • SPARK-11009: SQL, fix wrong result of Window function in cluster mode.

  • SPARK-11023: YARN, Avoid creating URIs from local paths directly.

  • SPARK-11026: YARN, spark.yarn.user.classpath.first does work for 'spark-submit --jars hdfs://user/foo.jar'.

  • SPARK-11032: SQL, correctly handle having.

  • SPARK-11039: DOCS, WEBUI, Document additional UI configurations.

  • SPARK-11047: Internal accumulators miss the internal flag when replaying events in the history server.

  • SPARK-11051: CORE, Do not allow local checkpointing after the RDD is materialized and checkpointed.

  • SPARK-11056: Improve documentation of SBT build.

  • SPARK-11063: STREAMING, Change preferredLocations of Receiver's RDD to hosts rather than hostports.

  • SPARK-11066: Update DAGScheduler's "misbehaved ResultHandler".

  • SPARK-11094: Strip extra strings from Java version in test runner.

  • SPARK-11103: SQL, Filter applied on Merged Parquet schema with new column fail.

  • SPARK-11104: STREAMING, Fix a deadlock in StreamingContext.stop.

  • SPARK-11126: SQL, Fix a memory leak in SQLListener._stageIdToStageMetrics.

  • SPARK-11126: SQL, Fix the potential flaky test.

  • SPARK-11135: SQL, Exchange incorrectly skips sorts when existing ordering is non-empty subset of required ordering.

  • SPARK-11153: SQL, Disables Parquet filter push-down for string and binary columns.

  • SPARK-11188: SQL, Elide stacktraces in bin/spark-sql for AnalysisExceptions.

  • SPARK-11233: SQL, register cosh in function registry.

  • SPARK-11244: SPARKR, sparkR.stop() should remove SQLContext.

  • SPARK-11246: SQL, Table cache for Parquet broken in 1.5.

  • SPARK-11251: Fix page size calculation in local mode.

  • SPARK-11264: bin/spark-class can't find assembly jars with certain GREP_OPTIONS set.

  • SPARK-11270: STREAMING, Add improved equality testing for TopicAndPartition from the Kafka Streaming API.

  • SPARK-11287: Fixed class name to properly start TestExecutor from deploy.client.TestClient.

  • SPARK-11294: SPARKR, Improve R doc for read.df, write.df, saveAsTable.

  • SPARK-11299: DOC, Fix link to Scala DataFrame Functions reference.

  • SPARK-11302: MLLIB, Multivariate Gaussian Model with Covariance matrix returns incorrect answer in some cases.

  • SPARK-11303: SQL, filter should not be pushed down into sample.

  • SPARK-11417: SQL, no @Override in codegen.

  • SPARK-11424: Guard against double-close() of RecordReaders.

  • SPARK-11434: SQL, Fix test "Filter applied on merged Parquet schema with new column fails".

  • SPARK-5966: WIP, spark-submit deploy-mode cluster is not compatible with master local>.

  • SPARK-8386: SQL, add write.mode for insertIntoJDBC when the parameter overwrite is false.

HDP 2.3.2 provided Spark 1.4.1 and the following Apache patches:

NEW FEATURES

  • SPARK-1537 Add integration with Yarn's Application Timeline Server.

  • SPARK-6112 Provide external block store support through HDFS RAM_DISK.

BUG FIXES

  • SPARK-10623 NoSuchElementException thrown when ORC predicate push-down is turned on.

HDP 2.3.0 provided Spark 1.3.1 and the following Apache patches:

IMPROVEMENTS

  • SPARK-7326 (Backport) Performing window() on a WindowedDStream doesn't work all the time.

  • JDK 1.7 repackaging.