-
In your session, open the workbench and add the following code.
-
Obtain the JDBC connection string, as described above, and paste it
into the script where the “jdbc” string is shown. You will also need to
insert your user name and password, or create environment variables for
holding those values.
from pyspark.sql import SparkSession
from pyspark_llap.sql.session import HiveWarehouseSession
spark = SparkSession\
.builder\
.appName("CDW-CML-JDBC-Integration")\
.config("spark.security.credentials.hiveserver2.enabled","false")\
.config("spark.datasource.hive.warehouse.read.jdbc.mode", "client")\
.config("spark.sql.hive.hiveserver2.jdbc.url",
"jdbc:hive2://hs2-aws-2-hive-viz.env-j2ln9x.dw.ylcu-atmi.cloudera.site/default;transportMode=http;httpPath=cliservice;ssl=true;retries=3;user=<username>;password=<password>")\
.getOrCreate()
hive = HiveWarehouseSession.session(spark).build()
hive.showDatabases().show()
hive.setDatabase("default")
hive.showTables().show()
hive.sql("select * from foo").show()