Accessing JDBC external data in Impala
Learn about using the hive.sql.query table property to access external data through Impala.
While using the hive.sql.table property limits you to simple table access, using the hive.sql.query property allows you to use a custom SQL query to return a virtual table or subquery. The hive.sql.query property enables Impala to interact with complex joins, filters, or transformations defined on external systems without requiring schema-level access to base tables.
Using the hive.sql.query property provides the following benefits:
- Complex logic exposure – You can expose materialized logic, such as filters or joins, from Hive to Impala consumers in a secure and abstract way.
- Data virtualization – You can hide the physical data location and structure from the querying engine.
- External system integration – You can interact with views defined on external systems without direct access to base tables.
Creating a table with a custom query
The following example shows how to create an external JDBC table that uses a query to
filter data from a source table named country:
CREATE EXTERNAL TABLE country_filtered (
id INT,
name STRING
)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES (
"hive.sql.database.type" = "MYSQL",
"hive.sql.jdbc.driver" = "com.mysql.cj.jdbc.Driver",
"hive.sql.jdbc.url" = "jdbc:mysql://localhost:3306/functional",
"hive.sql.dbcp.username" = "hiveuser",
"hive.sql.dbcp.password" = "password",
"hive.sql.query" = "SELECT id, name FROM country WHERE id > 1"
);
When you define this table, Impala executes the specified SELECT statement against the external database.
