Close HiveWarehouseSession operations
Spark can invoke operations, such as cache(), persist(), and rdd(), on a DataFrame you obtain from running a HiveWarehouseSession executeQuery() or table().
The Spark operations can lock Hive resources. You can release any locks and resources by calling the HiveWarehouseSession close().
Calling close() invalidates the HiveWarehouseSession instance, and you cannot perform any further operations on the instance.
Call close() when you finish running all other operations on the instance of HiveWarehouseSession.
import com.hortonworks.hwc.HiveWarehouseSession
import com.hortonworks.hwc.HiveWarehouseSession._
val hive = HiveWarehouseSession.session(spark).build()
hive.setDatabase("tpcds_bin_partitioned_orc_1000")
val df = hive.executeQuery("select * from web_sales")
. . . //Any other operations
hive.close()
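One way to make sure close() runs even when a query fails is to wrap the work in try/finally. The following is a minimal sketch rather than a prescribed pattern: it assumes an HWC-enabled Spark application with a reachable Hive instance, and the count() action merely stands in for whatever operations your application performs.

```scala
import com.hortonworks.hwc.HiveWarehouseSession
import org.apache.spark.sql.SparkSession

// Assumption: HWC is on the classpath and configured.
// In spark-shell, `spark` already exists and this line simply returns it.
val spark = SparkSession.builder().getOrCreate()

val hive = HiveWarehouseSession.session(spark).build()
try {
  hive.setDatabase("tpcds_bin_partitioned_orc_1000")
  val df = hive.executeQuery("select * from web_sales")
  // Finish every action on df before the session is closed.
  println(s"web_sales rows: ${df.count()}")
} finally {
  // Runs on success and on failure, releasing locks and resources.
  hive.close()
}
```

Because close() invalidates the instance, keep every use of hive and of the DataFrame inside the try block.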
You can also call close() at the end of an iteration if the application is
designed to run in a microbatch, or iterative, manner that does not need to
share previous states.
After you call close(), no more operations can occur on the DataFrame obtained by executeQuery() or table().
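The iterative case described above might look like the following sketch. It assumes that a new HiveWarehouseSession is built for each iteration and that each iteration materializes its result (here with count()) before close() is called. The loop of three batches is purely illustrative and not part of the original example.

```scala
import com.hortonworks.hwc.HiveWarehouseSession
import org.apache.spark.sql.SparkSession

// Assumption: HWC is on the classpath and configured.
val spark = SparkSession.builder().getOrCreate()

// Hypothetical driver loop: three independent batches that share no state.
for (batch <- 1 to 3) {
  // Assumption: each iteration builds its own session.
  val hive = HiveWarehouseSession.session(spark).build()
  try {
    hive.setDatabase("tpcds_bin_partitioned_orc_1000")
    // Materialize the result while the session is still open.
    val rowCount = hive.executeQuery("select * from web_sales").count()
    println(s"batch $batch: $rowCount rows")
  } finally {
    // Release locks and resources before the next iteration starts.
    hive.close()
  }
}
```

Only the plain value rowCount is meant to outlive each iteration. The DataFrame itself is not kept, because it cannot be used once the session is closed.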