Accessing external storage from Spark
Spark can access all storage sources supported by Hadoop, including a local file system, HDFS, HBase, Amazon S3, and Microsoft ADLS.
Spark supports many file types, including text files, RCFile, SequenceFile, Hadoop InputFormat, Avro, and Parquet, as well as compression of all supported file types.
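As a minimal sketch of how the storage system and the file type combine in practice, the following PySpark function reads a text file from the local file system, a Parquet file from HDFS, and an Avro file from Amazon S3. The paths, the bucket name, and the function name read_examples are hypothetical placeholders, not values from this documentation. The sketch also assumes that the cluster is already configured with credentials for S3 and that the external spark-avro module is available to the application.

```python
# Hypothetical locations (assumptions for illustration only).
# The URI scheme selects the storage system: file:// for the local
# file system, hdfs:// for HDFS, and s3a:// for Amazon S3.
EXAMPLE_PATHS = {
    "local_text": "file:///tmp/example/events.txt",
    "hdfs_parquet": "hdfs:///user/example/events.parquet",
    "s3_avro": "s3a://example-bucket/events.avro",
}


def read_examples(spark, paths=EXAMPLE_PATHS):
    """Read three file types from three storage systems.

    `spark` is expected to be an existing pyspark.sql.SparkSession.
    Returns a tuple of (text, parquet, avro) DataFrames.
    """
    # Plain text: each line becomes one row in a single "value" column.
    text_df = spark.read.text(paths["local_text"])

    # Parquet: the schema is read from the file metadata.
    parquet_df = spark.read.parquet(paths["hdfs_parquet"])

    # Avro: requires the external spark-avro module on the classpath.
    avro_df = spark.read.format("avro").load(paths["s3_avro"])

    return text_df, parquet_df, avro_df
```

In an application, the function would typically be called with a session created by SparkSession.builder.getOrCreate(). Only the URI scheme changes between storage systems; the reader call is chosen by file type.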
For developer information about working with external storage, see External Datasets in the upstream Apache Spark RDD Programming Guide.
