Creating Sessions in Cloudera Data Engineering

A Cloudera Data Engineering Session is an interactive, short-lived development environment for running Spark commands, so that you can iterate on and build your Spark workloads.

The commands that are run in a Cloudera Data Engineering Session are called Statements. You can submit the Statements through the connect CLI command or the Interact tab in the Cloudera Data Engineering UI for a Session. Python and Scala are the supported Session types. Learn how to use Cloudera Data Engineering Sessions using the user interface and CLI.

In Cloudera Data Engineering, sessions are associated with virtual clusters. Before you can create a session, you must create a virtual cluster that can run it. For more information, see Creating virtual clusters.

In the Cloudera console, click the Data Engineering tile. The Home page displays.
Click Sessions in the left navigation menu and then click Create Session.
Enter a Name for the Session.
Select a Type, for example, PySpark, Scala, or Spark Connect.
Select a Timeout value.
The Session will stop after the indicated time has passed.
Optionally, enter a Description for the Session.
Optionally, enter the Configurations.
note
The Spark session is created during the job run or session creation. Most Spark configurations cannot be modified at runtime and must be specified at job run or session creation. You can check whether a configuration can be modified by using spark.conf.isModifiable. For example:
```
spark.conf.isModifiable("spark.executor.memory")
False
```
Set the Compute options.
- Optional: GPU Acceleration (Technical Preview): You can accelerate your session using GPUs. Select the Enable GPU Accelerations checkbox to enable GPU acceleration, and configure selectors and tolerations if you want to run the session on specific GPU nodes. When you run the session, it requests GPU resources.
Optional: Under Files and Resources, you can upload Jar, Python, Egg, Zip, and other files. You can also add a resource, repositories, or a Python environment to be used in this session.

note
Uploading files and resources is supported only from Cloudera Data Services on premises 1.5.4 SP1 onwards.

Files that are uploaded to a session are stored in the app/mount directory.
Click Create.
The Connect tab displays a list of connectivity options available to interact with the Session. The Interact tab allows you to interact with the Session, and becomes available once the Session is running.
To delete a Session, open the Session and click Delete.

note
Deleting a Session terminates it if it is active and removes any attached logs and details.
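
The note about Spark configurations in the steps above shows a single spark.conf.isModifiable call. When a Session needs several settings, it can help to check them together before deciding which ones belong in the Configurations field at creation time. The following is a minimal sketch, not part of the product: the helper name split_by_modifiability and the example keys are assumptions made for illustration, and the only Spark call it relies on is isModifiable on the object passed in as conf.

```python
from typing import Iterable, List, Tuple


def split_by_modifiability(conf, keys: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split configuration keys into (modifiable at runtime, fixed at creation).

    `conf` is expected to behave like `spark.conf`, that is, to expose
    `isModifiable(key)` returning a bool. This helper is hypothetical and
    only wraps that one call.
    """
    modifiable: List[str] = []
    fixed: List[str] = []
    for key in keys:
        # Keys reported as not modifiable must be supplied when the
        # Session (or job run) is created.
        if conf.isModifiable(key):
            modifiable.append(key)
        else:
            fixed.append(key)
    return modifiable, fixed


# Inside a running PySpark Session, where `spark` is already defined,
# the helper could be called like this (example keys are assumptions):
#
#   runtime_ok, creation_only = split_by_modifiability(
#       spark.conf,
#       ["spark.executor.memory", "spark.sql.shuffle.partitions"],
#   )
#   print("Set at Session creation:", creation_only)
```

Any key that lands in the second list would need to be entered under Configurations when the Session is created, because it cannot be changed afterwards from the Interact tab.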
