Accessing Data from Apache Hive
The following code sample demonstrates how to connect to HiveServer2 and query Hive, here by listing the available tables and loading the result into a pandas DataFrame.
Python
# Install the client libraries (shell escape syntax for a notebook or interactive session)
!pip3 install impyla
!pip3 install thrift_sasl

import os
import pandas
from impala.dbapi import connect
from impala.util import as_pandas

# Specify HIVE_HS2_HOST host name as an environment variable in your project settings
HIVE_HS2_HOST = os.environ.get('HIVE_HS2_HOST', '<hiveserver2_hostname>')

# This connection string depends on your cluster setup and authentication mechanism
conn = connect(host=HIVE_HS2_HOST,
               port=10000,
               auth_mechanism='GSSAPI',
               kerberos_service_name='hive')

# Run a query and load the result set into a pandas DataFrame
cursor = conn.cursor()
cursor.execute('SHOW TABLES')
tables = as_pandas(cursor)
tables
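Because the connection arguments depend on the cluster setup and authentication mechanism, it can help to assemble them in one place. The following is a minimal sketch, not part of the original sample: the helper name hive_connect_kwargs and the environment variables HIVE_HS2_PORT, HIVE_AUTH_MECHANISM, HIVE_KRB_SERVICE, HIVE_USER, and HIVE_PASSWORD are hypothetical names chosen for illustration. Only HIVE_HS2_HOST comes from the sample above. The keyword names it produces (host, port, auth_mechanism, kerberos_service_name, user, password) are parameters of impyla's connect function.

```python
import os


def hive_connect_kwargs(env=None):
    """Build keyword arguments for impala.dbapi.connect.

    All variable names except HIVE_HS2_HOST are hypothetical and exist
    only for this illustration.
    """
    env = os.environ if env is None else env
    auth = env.get("HIVE_AUTH_MECHANISM", "GSSAPI").upper()

    kwargs = {
        "host": env["HIVE_HS2_HOST"],
        "port": int(env.get("HIVE_HS2_PORT", "10000")),
        "auth_mechanism": auth,
    }

    if auth == "GSSAPI":
        # Kerberos: the service name must match the HiveServer2 principal.
        kwargs["kerberos_service_name"] = env.get("HIVE_KRB_SERVICE", "hive")
    elif auth in ("LDAP", "PLAIN"):
        # Username/password authentication.
        kwargs["user"] = env["HIVE_USER"]
        kwargs["password"] = env["HIVE_PASSWORD"]
    elif auth != "NOSASL":
        raise ValueError("Unsupported auth mechanism: " + auth)

    return kwargs
```

With this helper, the connection in the sample could be opened as conn = connect(**hive_connect_kwargs()), so that switching a project from Kerberos to LDAP only requires changing the project's environment variables.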