Running raw Scala code in Cloudera Data Engineering

Cloudera Data Engineering (CDE) supports running raw Scala code from the command line, without compiling it into a JAR file. You can use the cde spark submit command to run a .scala file. CDE recognizes the file as Scala code and runs it using spark-shell in batch mode rather than spark-submit.

Limitations:
  • When setting the Log Level from the user interface, the setting is not applied to the raw Scala jobs.
  • Do not use package <something> in the raw Scala job file as Raw Scala File is used for Scripting and not for Jar development and packaging.

Run cde spark submit as follows to run a Scala file:

cde spark submit filename.scala --jar <jar_dependency_1> --jar <jar_dependency_2> ...