Running a Flink job
Learn about running a built-in example application in a few simple steps. This application demonstrates the Flink client for submitting YARN jobs.
- You have deployed the Flink parcel on your CDP Private Cloud Base cluster.
- You have HDFS Gateway, Flink and YARN Gateway roles assigned to the host you are using for Flink submission. For instructions, see the Cloudera Manager documentation.
- You have established your HDFS home directory.
The following is a working example of a word count application that reads text from a file in HDFS and counts how many times each word occurs.
> hdfs dfs -put /opt/cloudera/parcels/FLINK/lib/flink/README.txt /tmp
> flink run --detached \
    /opt/cloudera/parcels/FLINK/lib/flink/examples/streaming/WordCount.jar \
    --input hdfs:///tmp/README.txt \
    --output hdfs:///tmp/ReadMe-Counts
> hdfs dfs -tail /tmp/ReadMe-Counts
...
(and,7)
(source,1)
(code,2)
...
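The output consists of (word,count) lines, so it can be post-processed with standard shell tools. The following is a minimal sketch, not part of the Flink distribution: top_words is a hypothetical helper name, and it assumes every input line has the exact (word,count) form shown above. It does not merge repeated entries for the same word.

```shell
#!/bin/sh
# Hypothetical helper: print the N highest counts from "(word,count)" lines
# read on standard input. N defaults to 10.
top_words() {
  # Remove the surrounding parentheses, then sort on the second
  # comma-separated field (the count), numerically, in descending order.
  tr -d '()' | sort -t, -k2,2nr | head -n "${1:-10}"
}
```

For example, assuming the output path used above, you could run: hdfs dfs -cat /tmp/ReadMe-Counts | top_words 5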
You can set how to run your Flink job with the execution.target setting in the Flink configuration file. By default, execution.target is set to yarn-per-job, but you can change it to yarn-session. It is recommended to use the per-job configuration for simple jobs, and the session configuration when you use the SQL client.
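In a YAML-style Flink configuration file, the setting would typically appear as a line such as execution.target: yarn-session. To check which value is currently in effect on a host, a small helper like the following can be used. This is a sketch under stated assumptions: flink_execution_target is a hypothetical function name, the setting is assumed to be written as a top-level "key: value" line, and the fallback to yarn-per-job only mirrors the default described above.

```shell
#!/bin/sh
# Hypothetical helper: print the execution.target value found in the given
# Flink configuration file. If the key is absent or the file cannot be read,
# print yarn-per-job, the default described in this section.
flink_execution_target() {
  conf_file="$1"
  # Extract the value after "execution.target:", trimming surrounding
  # whitespace. If the key appears more than once, keep the last occurrence.
  value=$(sed -n 's/^execution\.target:[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' \
    "$conf_file" 2>/dev/null | tail -n 1)
  echo "${value:-yarn-per-job}"
}
```

Call the function with the path to your configuration file, for example flink_execution_target /etc/flink/conf/flink-conf.yaml. The path is an assumption and may differ on your cluster.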