Developing Apache Spark Applications

Introduction

Apache Spark enables you to quickly develop applications and process jobs.

Apache Spark is designed for fast application development and processing. Spark Core is the underlying execution engine; other services, such as Spark SQL, MLlib, and Spark Streaming, are built on top of Spark Core.

Depending on your use case, you can extend your use of Spark into several domains, including the following, which are described in this chapter:

  • Spark DataFrames

  • Spark SQL

  • Calling Hive user-defined functions from Spark SQL

  • Spark Streaming

  • Accessing HBase tables, HDFS files, and ORC data (Hive)

  • Using custom libraries

For more information about using Livy to submit Spark jobs, see "Submitting Spark Applications Through Livy" in this guide.