Grid Displays

Cloudera Machine Learning supports native grid displays of DataFrames across several languages.

Python 3

Using DataFrames with the pandas package requires per-session activation:
import pandas as pd
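As a minimal sketch of what might follow the import, the snippet below builds a small pandas DataFrame and leaves it as the final expression of the cell, which is how a DataFrame is usually handed to the display. The column names and values are made up for illustration.

```python
import pandas as pd

# Illustrative data only: five integers and their squares.
df = pd.DataFrame({
    "id": range(1, 6),
    "square": [i * i for i in range(1, 6)],
})

# Evaluating the DataFrame as the last expression hands it to the session's display.
df
```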

For PySpark DataFrames, import pandas and call df.toPandas() on the PySpark DataFrame. This brings the entire DataFrame into local memory as a pandas DataFrame, which then displays as a grid.
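Because toPandas() collects every row into local memory, it can help to cap the row count first. The helper below is one possible sketch: to_local_pandas and max_rows are hypothetical names chosen for illustration, and the only PySpark calls it relies on are DataFrame.limit() and DataFrame.toPandas().

```python
def to_local_pandas(spark_df, max_rows=1000):
    """Collect at most max_rows rows of a PySpark DataFrame into pandas.

    to_local_pandas and max_rows are illustrative names, not part of any API.
    """
    # limit() is part of the Spark query plan, so at most max_rows rows
    # are transferred into local memory by toPandas().
    return spark_df.limit(max_rows).toPandas()
```

Assuming df is an existing PySpark DataFrame, evaluating to_local_pandas(df) as the last expression of a cell would then produce the pandas result for display.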


R

In R, DataFrames display as grids by default. For example, to view the Iris data set, you would just use:
iris

Similar to PySpark, bringing sparklyr data into local memory, for example with as.data.frame(), outputs a grid display.
sparkly_df %>% as.data.frame()


Scala

Calling the display() function on an existing DataFrame triggers a collect, which brings the data back to the driver for display.

val df = sc.parallelize(1 to 100).toDF()
display(df)