Using Impala to query Kudu tables
If you want to use Impala to query Kudu tables, you have to create a mapping between the Impala and Kudu tables.
Neither Kudu nor Impala needs special configuration for you to use the Impala Shell or the Impala API to insert, update, delete, or query Kudu data. However, you do need to create a mapping between the Impala and Kudu tables. The Kudu web UI provides the Impala query that maps to an existing Kudu table.
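As a minimal sketch of what such a mapping query can look like, the following helper prints a CREATE EXTERNAL TABLE statement for an existing Kudu table. The function name mapping_sql and the table names my_mapping_table and my_kudu_table are placeholders chosen for illustration, and the exact statement should be taken from the Kudu web UI for your table.

```shell
#!/bin/sh
# Print an Impala statement that maps an existing Kudu table.
# $1: name of the Impala table to create (placeholder)
# $2: name of the existing Kudu table (placeholder)
mapping_sql() {
  impala_table=$1
  kudu_table=$2
  printf "CREATE EXTERNAL TABLE %s\nSTORED AS KUDU\nTBLPROPERTIES ('kudu.table_name' = '%s');\n" \
    "$impala_table" "$kudu_table"
}

# Show the statement. To run it, pass it to impala-shell with -q, for example:
#   impala-shell -q "$(mapping_sql my_mapping_table my_kudu_table)"
mapping_sql my_mapping_table my_kudu_table
```

Printing the statement first makes it easy to compare it with the query shown in the Kudu web UI before executing anything.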
- Make sure you are using the impala-shell binary provided by the default CDH Impala binary. The following example shows how you can verify this using the alternatives command on a RHEL 6 host. Do not copy and paste the alternatives --set command directly, because the file names are likely to differ.
  $ sudo alternatives --display impala-shell
  impala-shell - status is auto.
   link currently points to /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.25/bin/impala-shell
  /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.25/bin/impala-shell - priority 10
  Current `best' version is /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.25/bin/impala-shell.
- Although it is not required, configure Impala with the locations of the Kudu Masters using the --kudu_master_hosts=<master1>[:port] flag. If this flag is not set, you must provide this configuration manually each time you create a table, by specifying the kudu.master_addresses property inside a TBLPROPERTIES clause. If you are using Cloudera Manager, no such configuration is needed, because the Impala service automatically recognizes the Kudu Master hosts. However, if your Impala queries don't work as expected, use the following steps to make sure that the Impala service is set to be dependent on Kudu:
  - Go to the Impala service.
  - Click the Configuration tab and search for
  - Make sure that the Kudu Service property is set to the right Kudu service.
  - Click Save Changes.
Before you carry out any of the operations listed within this section, make sure that this configuration has been set.
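If the --kudu_master_hosts flag is not set, the master addresses go into each CREATE TABLE statement. The sketch below prints one such statement. The helper name create_table_sql, the table definition, and the example.com hosts are all hypothetical, 7051 is assumed to be the Kudu master RPC port in your deployment, and the partitioning syntax may differ between Impala versions.

```shell
#!/bin/sh
# Print a CREATE TABLE statement that names the Kudu masters explicitly.
# $1: comma-separated list of Kudu master addresses (hypothetical hosts below)
create_table_sql() {
  masters=$1
  cat <<EOF
CREATE TABLE my_first_table (
  id BIGINT,
  name STRING,
  PRIMARY KEY (id)
)
PARTITION BY HASH (id) PARTITIONS 16
STORED AS KUDU
TBLPROPERTIES ('kudu.master_addresses' = '$masters');
EOF
}

# Print the statement for two illustrative masters.
create_table_sql "master1.example.com:7051,master2.example.com:7051"
```

With the flag set on the Impala daemons, or with Cloudera Manager managing the dependency, the TBLPROPERTIES clause can be left out.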
Start Impala Shell using the impala-shell command. By default, impala-shell attempts to connect to the Impala daemon on localhost on port 21000. To connect to a different host, use the
To automatically connect to a specific Impala database, use the -d <database> option. For instance, if all your Kudu tables are in Impala in the database impala_kudu, use -d impala_kudu to use this database.
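The connection options can be combined on one command line. The following sketch only assembles and prints the command rather than running it. The helper name impala_shell_cmd and the host impalad-host.example.com are made up for illustration, and the sketch assumes that the -i option of impala-shell selects the Impala daemon to connect to.

```shell
#!/bin/sh
# Assemble an impala-shell command line without running it.
# $1: database to select with -d (optional)
# $2: impalad host[:port] to pass with -i (optional; assumed option)
impala_shell_cmd() {
  db=${1:-}
  host=${2:-}
  cmd="impala-shell"
  if [ -n "$host" ]; then cmd="$cmd -i $host"; fi
  if [ -n "$db" ]; then cmd="$cmd -d $db"; fi
  echo "$cmd"
}

# Default daemon (localhost:21000), database impala_kudu.
impala_shell_cmd impala_kudu
# A specific daemon (hypothetical host), same database.
impala_shell_cmd impala_kudu impalad-host.example.com:21000
```

Run the printed command directly, or wrap the call in eval once the output looks right.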
To quit the Impala Shell, use the following command: