Running Apache Spark ApplicationsPDF version

Running an interactive session with the Livy API

Running an interactive session with Livy is similar to using Spark shell or PySpark, but the shell does not run locally. Instead, it runs in a remote cluster, transferring data back and forth through a network.

The Livy REST API supports GET, POST, and DELETE calls for interactive sessions.

The following example shows how to create an interactive session, submit a statement, and retrieve the result of the statement; the return ID could be used for further queries.

  1. Create an interactive session. The following POST request starts a new Spark cluster with a remote Spark interpreter; the remote Spark interpreter is used to receive and execute code snippets, and return the result.
    POST /sessions
            host = 'http://localhost:8998'
            data = {'kind': 'spark'}
            headers = {'Content-Type': 'application/json'}
            r = requests.post(host + '/sessions', data=json.dumps(data), headers=headers)
            r.json()
    
    {u'state': u'starting', u'id': 0, u'kind': u'spark'}
  2. Submit a statement. The following POST request submits a code snippet to a remote Spark interpreter, and returns a statement ID for querying the result after execution is finished.
    POST /sessions/{sessionId}/statements
            data = {'code': 'sc.parallelize(1 to 10).count()'}
            r = requests.post(statements_url, data=json.dumps(data), headers=headers)
            r.json()
    
    {u'output': None, u'state': u'running', u'id': 0}
  3. Get the result of a statement. The following GET request returns the result of a statement in JSON format, which you can parse to extract elements of the result.
    GET /sessions/{sessionId}/statements/{statementId}
            statement_url = host + r.headers['location']
            r = requests.get(statement_url, headers=headers)
            pprint.pprint(r.json())
    
    {u'id': 0,
       u'output': {u'data': {u'text/plain': u'res0: Long = 10'},
                   u'execution_count': 0,
                   u'status': u'ok'},
       u'state': u'available'}

    The remainder of this section describes Livy objects and REST API calls for interactive sessions.