Running a Flink job
After developing your application, you can submit your Flink job in YARN per-job or session mode. To submit the job, run the Flink client from the command line with the run command, supplying any required security parameters and other configuration options.
- You have deployed the Flink parcel on your cluster.
- You have HDFS Gateway, Flink and YARN Gateway roles assigned to the host you are using for Flink submission. For instructions, see the Cloudera Manager documentation.
- You have established your HDFS home directory.
The following is a working example of a WordCount application that reads a text file from HDFS and counts how many times each word occurs.
> hdfs dfs -put /opt/cloudera/parcels/FLINK/lib/flink/README.txt /tmp

> flink run --detached \
      /opt/cloudera/parcels/FLINK/lib/flink/examples/streaming/WordCount.jar \
      --input hdfs:///tmp/README.txt \
      --output hdfs:///tmp/ReadMe-Counts

> hdfs dfs -tail /tmp/ReadMe-Counts
...
(and,7)
(source,1)
(code,2)
...
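On a Kerberos-secured cluster, you can pass the security parameters as dynamic properties on the same run command. The following is a minimal sketch: the keytab file and principal names are placeholders, and depending on your Flink version the generic -D option may be needed instead of the YARN-specific -yD option.

> flink run --detached \
      -yD security.kerberos.login.keytab=test_user.keytab \
      -yD security.kerberos.login.principal=test_user \
      /opt/cloudera/parcels/FLINK/lib/flink/examples/streaming/WordCount.jar \
      --input hdfs:///tmp/README.txt \
      --output hdfs:///tmp/ReadMe-Counts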
You can set how to run your Flink job with the execution.target setting in the Flink configuration file. By default, execution.target is set to yarn-per-job, but you can change it to yarn-session. It is recommended to use the per-job configuration for simple jobs, and the session configuration for the SQL client.
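For example, to make session mode the default, you could add the following entry to the Flink configuration file. This is a sketch that assumes a flink-conf.yaml style configuration; the exact file name and location depend on your deployment:

execution.target: yarn-session

With this setting, flink run submits jobs to an already running Flink session on YARN, which you typically start beforehand with the yarn-session.sh script shipped with Flink.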