Developing Apache Spark ApplicationsPDF version

Enabling Native Acceleration For MLlib

MLlib algorithms are compute intensive and benefit from hardware acceleration. To enable native acceleration for MLlib, perform the following tasks.

  • Install the appropriate libgfortran 4.6+ package for your operating system. No compatible version is available for RHEL 6.
    OS Package Name Package Version
    RHEL 7.1 libgfortran 4.8.x
    SLES 11 SP3 libgfortran3 4.7.2
    Ubuntu 12.04 libgfortran3 4.6.3
    Ubuntu 14.04 libgfortran3 4.8.4
    Debian 7.1 libgfortran3 4.7.2
  • Install the GPL Extras parcel or package.
You can verify that native acceleration is working by examining logs after running an application. To verify native acceleration with an MLlib example application:
  1. Do the steps in "Running a Spark MLlib Example."
  2. Check the logs. If native libraries are not loaded successfully, you see the following four warnings before the final line, where the RMSE is printed:

    15/07/12 12:33:01 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
    15/07/12 12:33:01 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
    15/07/12 12:33:01 WARN LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeSystemLAPACK
    15/07/12 12:33:01 WARN LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeRefLAPACK
    Test RMSE = 1.5378651281107205.

    You see this on a system with no libgfortran. The same error occurs after installing libgfortran on RHEL 6 because it installs version 4.4, not 4.6+.

    After installing libgfortran 4.8 on RHEL 7, you should see something like this:
    15/07/12 13:32:20 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
    15/07/12 13:32:20 WARN LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeSystemLAPACK
    Test RMSE = 1.5329939324808561.