Running a Spark MLlib example

To try Spark MLlib using one of the Spark example applications, do the following:

  1. Download MovieLens sample data and copy it to HDFS:
    $ wget --no-check-certificate \
    https://raw.githubusercontent.com/apache/spark/branch-2.4/data/mllib/sample_movielens_data.txt
    $ hdfs dfs -copyFromLocal sample_movielens_data.txt /user/hdfs
  2. Run the Spark MLlib MovieLens example application, which calculates recommendations based on movie reviews:
    $ spark-submit --master local --class org.apache.spark.examples.mllib.MovieLensALS \
    SPARK_HOME/lib/spark-examples.jar \
    --rank 5 --numIterations 5 --lambda 1.0 --kryo sample_movielens_data.txt