Running a Flink job

Run a built-in example application in a few simple steps to verify your setup. The example demonstrates how to use the Flink client to submit a job to YARN.

Before you begin, ensure the following:
  • You have deployed the Flink parcel on your CDP Data Center cluster.
  • You have the HDFS Gateway, Flink, and YARN Gateway roles assigned to the host you are using for Flink submission. For instructions, see the Cloudera Manager documentation.
  • You have established your HDFS home directory.
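The following is a working example of a word count application that reads text from a file in HDFS and counts how many times each word occurs. The commands upload the input file to HDFS, submit the job to YARN with the Flink client, and display the end of the output.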
> hdfs dfs -put /opt/cloudera/parcels/FLINK/lib/flink/README.txt /tmp
> flink run -m yarn-cluster \
 /opt/cloudera/parcels/FLINK/lib/flink/examples/streaming/WordCount.jar \
 --input hdfs:///tmp/README.txt \
 --output hdfs:///tmp/ReadMe-Counts
> hdfs dfs -tail /tmp/ReadMe-Counts
...
(and,7)
(source,1)
(code,2)
...
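
To sanity-check the job output, you can count the words of the same input file with standard shell tools. The following is a minimal sketch, not part of Flink: count_words is a hypothetical helper, and it assumes that the example lowercases the text and splits it on non-word characters. Check the WordCount source shipped with your Flink version if the counts differ.

```shell
# Hypothetical helper (not part of Flink): reads text from stdin and
# prints one "(word,count)" line per distinct word.
# Assumption: words are lowercased and split on non-word characters.
count_words() {
  tr '[:upper:]' '[:lower:]' |     # lowercase the input
    tr -cs '[:alnum:]_' '\n' |     # put each word on its own line
    sed '/^$/d' |                  # drop empty lines
    sort | uniq -c |               # count occurrences of each word
    awk '{ printf "(%s,%d)\n", $2, $1 }'
}

# Usage, assuming the input path from the example above:
#   hdfs dfs -cat /tmp/README.txt | count_words
```

Depending on the Flink version and execution mode, the streaming example may write a running count each time a word is seen, so compare the local result with the highest count reported for each word in the job output.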