Setting up Python for PyFlink
Before you can use Flink with the Python API, you must either install and configure Python on every relevant node, or create and initialize a Python virtual environment.
- Connect to the Flink Gateway node using the CLI. You are prompted to provide your password to the cluster.
ssh root@[***FLINK GATEWAY NODE***]
- Check the version of Python. If the command fails or the version is lower than 3.6, install Python.
python --version
- Create a Python virtual environment using the following command:
conda create --copy -y -n flink_venv python=3.8
- Install PyFlink using the following command:
python -m pip install apache-flink==1.19.1
- Install PyFlink on the YARN NameNode as well using the same steps.
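The version check and the PyFlink installation from the steps above can also be verified from Python itself. The following is a minimal sketch, assuming Python 3.8 or later (as in the flink_venv environment) so that importlib.metadata is available. The helper names python_is_supported and installed_version are hypothetical and are not part of PyFlink.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 6)           # minimum version named in the steps above
EXPECTED_PYFLINK = "1.19.1"   # version pinned in the pip command above


def python_is_supported(minimum=MIN_PYTHON):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    found = installed_version("apache-flink")
    if found is None:
        print("apache-flink is not installed in this environment")
    elif found != EXPECTED_PYFLINK:
        print("apache-flink", found, "found, expected", EXPECTED_PYFLINK)
    else:
        print("apache-flink", found, "is installed")
```

Running the script on each node where PyFlink was installed gives a quick indication of whether the node matches the versions used in this procedure.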
Alternatively, create and initialize a Python virtual environment:
- Connect to the Flink Gateway node using the CLI. Provide your workload password when prompted.
ssh <[***WORKLOAD USERNAME***]>@[***FLINK MANAGER NODE***]
- Create a Python virtual environment using the following command:
conda create --copy -y -n flink_venv python=3.8
- Activate the newly created virtual environment:
conda activate flink_venv
- Install PyFlink to the flink_venv virtual environment using the following command:
python -m pip install apache-flink==1.19.1
- Create a ZIP archive from the flink_venv virtual environment so it can be deployed with a Flink job:
cd path/to/flink_venv && zip -r venv.zip .
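If the zip utility is not available on the node, the archive step can be approximated with the Python standard library. The following is a minimal sketch, not the documented method. The function name zip_environment is hypothetical. It stores file paths relative to the environment directory, which mirrors running zip -r venv.zip . from inside that directory, but it skips empty directories and stores the contents of symbolic links rather than the links themselves.

```python
import os
import zipfile


def zip_environment(env_dir, archive_path):
    """Zip the files under env_dir with paths relative to env_dir."""
    env_dir = os.path.abspath(env_dir)
    archive_path = os.path.abspath(archive_path)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(env_dir):
            for name in files:
                full_path = os.path.join(root, name)
                if full_path == archive_path:
                    continue  # do not add the archive to itself
                # Relative names keep bin/, lib/ and so on at the archive root.
                archive.write(full_path, os.path.relpath(full_path, env_dir))
    return archive_path
```

For example, zip_environment("path/to/flink_venv", "path/to/flink_venv/venv.zip") produces an archive with the same top-level layout as the command above.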
When the Python installation is complete, you can submit Flink applications that were created using the Python API.
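As an illustration of how the archived environment might be referenced at submission time, the sketch below assembles a flink run command line. It assumes the -pyarch, -pyexec, and -py options of the open-source Flink CLI, and it assumes that the interpreter sits at bin/python inside the archive, which follows from zipping the contents of flink_venv directly. The function name build_flink_run_command and the job file job.py are hypothetical. Verify the options and any deployment-specific arguments against your Flink version before use.

```python
import os
import shlex


def build_flink_run_command(job_file, archive="venv.zip",
                            python_in_archive="bin/python"):
    """Build an argument list for submitting a PyFlink job with an archive."""
    # The archive is referenced by its file name when pointing at the
    # interpreter inside it (assumption based on the Flink CLI documentation).
    executable = os.path.basename(archive) + "/" + python_in_archive
    return [
        "flink", "run",
        "-pyarch", archive,      # ship the zipped virtual environment
        "-pyexec", executable,   # interpreter used to run Python workers
        "-py", job_file,         # entry point of the PyFlink application
    ]


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(shlex.join(build_flink_run_command("job.py")))
```

Printing the command rather than running it keeps the sketch safe to try on a node where Flink is not yet configured.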