Running a Flink job

After developing your application, you can submit your Flink job in YARN per-job or session mode. To submit the job, run the Flink client from the command line with the run command, passing the security parameters and any other configuration the job needs.

Before you submit a job, make sure that the following prerequisites are met:
  • You have deployed the Flink parcel on your cluster.
  • You have HDFS Gateway, Flink and YARN Gateway roles assigned to the host you are using for Flink submission. For instructions, see the Cloudera Manager documentation.
  • You have established your HDFS home directory.
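The following is a working example of a word count application that reads text from a file in HDFS and counts how often each word occurs. The commands copy the input file to HDFS, submit the job in detached mode, and then display the end of the output.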
> hdfs dfs -put /opt/cloudera/parcels/FLINK/lib/flink/README.txt /tmp
> flink run --detached \
 /opt/cloudera/parcels/FLINK/lib/flink/examples/streaming/WordCount.jar \
 --input hdfs:///tmp/README.txt \
 --output hdfs:///tmp/ReadMe-Counts
> hdfs dfs -tail /tmp/ReadMe-Counts
You can choose how your Flink job runs through a setting in the Flink configuration file. By default, this setting is yarn-per-job, but you can change it to yarn-session. Use the per-job configuration for simple jobs, and the session configuration when you work with the SQL client.
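To see which mode a host is currently configured for, you could read the value directly from the configuration file. The following is only a minimal sketch: it assumes the setting in question is Flink's execution.target option, and the helper name get_flink_setting is hypothetical and not part of Flink or Cloudera Manager.

```shell
# Hypothetical helper (not a Flink command): print the value of one key
# from a flink-conf.yaml style file made of simple "key: value" lines.
#   $1 = path to the configuration file
#   $2 = key to look up (assumed here to be execution.target)
get_flink_setting() {
  # Split each line on ":" followed by optional spaces and print the
  # value of the first line whose key matches exactly. Comment lines and
  # other keys do not match, so they are skipped. Values that themselves
  # contain ": " are not handled by this sketch.
  awk -F': *' -v key="$2" '$1 == key { print $2; exit }' "$1"
}
```

For example, if the client configuration were stored at /etc/flink/conf/flink-conf.yaml (an assumed location that may differ on your cluster), running get_flink_setting /etc/flink/conf/flink-conf.yaml execution.target would print the configured value, such as yarn-per-job, or nothing if the key is not set in that file.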