Spark Logging Configuration
Cloudera Data Science Workbench allows you to update Spark’s internal logging configuration on a per-project basis. Spark 2 uses Apache Log4j, which can be configured through a properties file.
By default, a log4j.properties file found in the root of your project will be appended to the existing Spark logging properties for every session and job. To specify a custom location, set the environment variable LOG4J_CONFIG to the file's location relative to your project.
The Log4j documentation has more details on logging options.
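For example, if the configuration file lives in a conf/ subdirectory of the project rather than the root (the directory name here is an assumption for illustration), you could point sessions at it by setting the environment variable, for instance in the project's environment settings:

    LOG4J_CONFIG=conf/log4j.properties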
Increasing the logging verbosity or pushing logs to an alternate location can be very helpful when debugging troublesome jobs. For example, the following log4j.properties file in the root of a project sets the logging level to INFO for Spark jobs:

    shell.log.level=INFO
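To push logs to an alternate location, you can define an additional appender in the same file. The following sketch uses a standard Log4j 1.x FileAppender; the output path is an assumption chosen for illustration:

    # Route all logging through a file appender (illustrative path)
    log4j.rootLogger=INFO, file
    log4j.appender.file=org.apache.log4j.FileAppender
    log4j.appender.file.File=/tmp/spark-debug.log
    log4j.appender.file.layout=org.apache.log4j.PatternLayout
    # Timestamp, level, logger name, and message on each line
    log4j.appender.file.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n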
PySpark logging levels should be set as follows:

    log4j.logger.org.apache.spark.api.python.PythonGatewayServer=<LOG_LEVEL>
And Scala logging levels should be set as follows:

    log4j.logger.org.apache.spark.repl.Main=<LOG_LEVEL>
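Putting these together, a complete log4j.properties that raises both PySpark and Scala sessions to DEBUG might look like the following sketch (the choice of DEBUG here is an assumption for illustration):

    shell.log.level=INFO
    # Verbose output from PySpark sessions and jobs
    log4j.logger.org.apache.spark.api.python.PythonGatewayServer=DEBUG
    # Verbose output from Scala sessions and jobs
    log4j.logger.org.apache.spark.repl.Main=DEBUG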
      