Creating sessions in Cloudera Data Engineering

A Cloudera Data Engineering session is an interactive, short-lived development environment for running Spark commands, so that you can iterate on and build your Spark workloads.

The commands that are run in a Cloudera Data Engineering session are called Statements. You can submit Statements through the connect CLI command or through the Interact tab of a session in the Cloudera Data Engineering UI. Python and Scala are the supported session types. Learn how to use Cloudera Data Engineering sessions through the user interface and the CLI.

In Cloudera Data Engineering, sessions are associated with virtual clusters. Before you can create a session, you must create a virtual cluster that can run it. For more information, see Creating virtual clusters.

  1. In the Cloudera console, click the Data Engineering tile. The Home page displays.
  2. Click Sessions in the left navigation menu and then click Create Session.
  3. Enter a Name for the session.
  4. Select a Type, for example, PySpark, Scala, or Spark Connect.
  5. Select a Timeout value.
    The session will stop after the indicated time has passed.
  6. Optional: Enter a Description for the session.
  7. Optional: Enter the Configurations.
  8. Optional: Click the Data Connector drop-down list and select the name of the data connector. The UI displays the storage information that is internally overwritten. For more information about how to add an Ozone data connector, see Adding Ozone data connector for Cloudera Data Engineering service.
  9. Set the Compute options.
    • Optional: GPU Acceleration (Technical Preview): You can accelerate your session using GPUs. Select the Enable GPU Accelerations checkbox to enable GPU acceleration, and configure selectors and tolerations if you want to run the session on specific GPU nodes. When the session runs, it requests GPU resources.
  10. Optional: In the Files and Resources section, you can upload Jar, Python, Egg, Zip, and other files. You can also add a resource, repositories, or a Python environment to be used in this session.
    Files that are uploaded to a session are stored in the app/mount directory.
  11. Share the session with a user or group.
    1. In the Sharing Settings section, click Add User or Group. The Add User or Group pop-up appears.
    2. In the Search for a User or a Group field, type the user or group name and select the required user or group from the list.
    3. From the Access Level drop-down list, select Full or Read Only, depending on the access you want to provide.
    4. Click Add.
  12. Click Create.
    The Connect tab displays a list of connectivity options available to interact with the session. The Interact tab allows you to interact with the session, and becomes available once the session is running.
  13. To delete a session, open the session and click Delete.
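
The fields collected in the steps above can be summarized as a small data structure, which may be useful when reviewing or documenting session definitions before entering them in the UI. The following is a minimal sketch, not part of any Cloudera API: the class name SessionRequest, its field names, the validate method, and the sample values (such as "8h" for the timeout) are all hypothetical. It assumes that Name, Type, and Timeout are required because the procedure does not mark them as optional.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Session types listed in step 4 of the procedure.
SESSION_TYPES = ("PySpark", "Scala", "Spark Connect")
# Access levels listed in the Sharing Settings step.
ACCESS_LEVELS = ("Full", "Read Only")


@dataclass
class SessionRequest:
    """Hypothetical container mirroring the Create Session form fields."""

    name: str
    session_type: str
    timeout: str  # the format of the Timeout value is an assumption
    description: Optional[str] = None
    configurations: Dict[str, str] = field(default_factory=dict)
    data_connector: Optional[str] = None
    gpu_acceleration: bool = False
    files: List[str] = field(default_factory=list)
    # Maps a user or group name to an access level.
    sharing: Dict[str, str] = field(default_factory=dict)

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the form looks complete."""
        problems = []
        if not self.name.strip():
            problems.append("Name is required")
        if self.session_type not in SESSION_TYPES:
            problems.append(f"Type must be one of {SESSION_TYPES}")
        if not self.timeout.strip():
            problems.append("Timeout is required")
        for principal, level in self.sharing.items():
            if level not in ACCESS_LEVELS:
                problems.append(
                    f"Access level for {principal!r} must be one of {ACCESS_LEVELS}"
                )
        return problems


if __name__ == "__main__":
    # All values below are illustrative only.
    request = SessionRequest(
        name="etl-dev",
        session_type="PySpark",
        timeout="8h",
        sharing={"data-eng-team": "Read Only"},
    )
    print(request.validate())
```

The sketch only checks the fields locally; it does not create a session. Creating the session still requires the UI steps above or the CLI.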