Spark Connect Sessions
Learn what a Spark Connect Session is, which Cloudera Runtime component versions are supported, and what the known limitations are.
What a Spark Connect Session is
A session is an interactive, short-lived development environment for running Spark commands. A Spark Connect Session is a type of CDE Session that exposes the Spark Connect interface, so you can connect to Spark from any remote Python environment.
Spark Connect allows you to connect remotely to Spark clusters. It is an API that uses the DataFrame API and unresolved logical plans as the protocol. Because the client is separated from the server, Spark and its open ecosystem can be used from anywhere: Spark Connect can be embedded in modern data applications, IDEs, and notebooks. For more information about Spark Connect, identify the Spark version of your Virtual Cluster and see the Spark Connect Overview page for that version in the Spark documentation.
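As a rough illustration of the client and server separation described above, the following minimal sketch uses the open-source PySpark API (`SparkSession.builder.remote`) with a Spark Connect connection string. It is not the CDE-specific connection procedure: CDE may require its own client library and authentication. The helper names `build_remote_url` and `open_session` are hypothetical, and the port default of 15002 is the open-source Spark Connect default, not a value taken from CDE.

```python
from typing import Optional


def build_remote_url(host: str, port: int = 15002, token: Optional[str] = None) -> str:
    """Build a Spark Connect connection string of the form sc://host:port/;key=value."""
    url = f"sc://{host}:{port}"
    if token:
        # Optional parameters follow "/;" and are written as key=value pairs.
        url += f"/;token={token}"
    return url


def open_session(remote_url: str):
    """Create a SparkSession whose DataFrame operations run on a remote Spark Connect server."""
    # Imported lazily so the URL helper can be used where PySpark is not installed.
    from pyspark.sql import SparkSession

    return SparkSession.builder.remote(remote_url).getOrCreate()
```

With a reachable endpoint, a call such as `open_session(build_remote_url("spark-connect.example.com"))` would return a session on which ordinary DataFrame calls like `spark.range(5).show()` are sent to the server as unresolved logical plans. The host name here is a placeholder.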
Supported versions of Cloudera Runtime components
Ensure that you are using Spark 3.5.1 before you use Spark Connect Sessions.
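The version requirement above can be checked programmatically. The sketch below is a hypothetical helper, not part of CDE or PySpark: it compares the leading `major.minor.patch` of a version string against 3.5.1, on the assumption that distribution builds may append extra suffixes to the base Spark version.

```python
import re

# Spark version required for Spark Connect Sessions, as stated in this section.
REQUIRED_SPARK_VERSION = (3, 5, 1)


def is_supported_spark_version(version: str) -> bool:
    """Return True if the version string starts with the required major.minor.patch."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        # Not a recognizable version string.
        return False
    return tuple(int(part) for part in match.groups()) == REQUIRED_SPARK_VERSION
```

On an existing session object, the value to pass in would typically be `spark.version`.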
Limitations
- Profile support: Spark Connect does not support profiles in configuration files, even though the CDE clients support "Profiles" in their configuration files.