Sqoop Hive import stops when HS2 does not use Kerberos authentication
Learn how to resolve Sqoop Hive imports that stop when either LDAP authentication or no authentication mechanism is enabled for the cluster.
Condition
The Sqoop Hive import job stops, and the last log lines resemble the following:
23/07/24 18:10:17 INFO hive.HiveImport: Loading uploaded data into Hive
23/07/24 18:10:17 INFO hive.HiveImport: Collecting environment variables which need to be preserved for beeline invocation
...
23/07/24 18:10:20 INFO hive.HiveImport: SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
23/07/24 18:10:21 INFO hive.HiveImport: Connecting to jdbc:hive2://HOSTNAME/default;serviceDiscoveryMode=zooKeeper;ssl=true;sslTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks;trustStorePassword=changeit;zooKeeperNamespace=hiveserver2
Cause
This issue occurs when the JDBC connection string that Sqoop uses to connect to HiveServer2 (HS2) does not use Kerberos. It affects unsecured clusters and clusters with LDAP authentication enabled, where the beeline-site.xml configuration file does not use Kerberos authentication.
The underlying problem is that Beeline prompts for a username and password before it connects. Because the Sqoop Hive import runs as a non-interactive session, you cannot provide the credentials, and the import job stops.
Solution
If... | Then... |
---|---|
No authentication is enabled for the cluster | Include the --hs2-url option in the Sqoop import command and provide the JDBC connection string: --hs2-url <HS2 JDBC string>. This allows for a successful connection without prompting for the credentials. |
LDAP authentication is enabled for the cluster | Include the --hs2-user and --hs2-password options in the Sqoop import command and provide the credentials: --hs2-user <username> --hs2-password <password> |
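
The two cases in the table can be sketched as a small shell helper that selects the extra HS2 options for a given authentication mode. This is only an illustrative sketch: the function name hs2_options and the variables HS2_JDBC_URL, HS2_USER, HS2_PASSWORD, SOURCE_JDBC_URL, and TABLE are hypothetical placeholders, not part of Sqoop, and the commented sqoop import invocation assumes a typical import that already uses --connect, --table, and --hive-import.

```shell
# Hypothetical helper (not part of Sqoop): prints the extra HS2 options
# for a Sqoop Hive import, one token per line, based on the auth mode.
# Assumes HS2_JDBC_URL, HS2_USER, and HS2_PASSWORD are set by the caller.
hs2_options() {
  case "$1" in
    none)
      # No authentication: pass the HS2 JDBC connection string explicitly.
      printf '%s\n' --hs2-url "$HS2_JDBC_URL"
      ;;
    ldap)
      # LDAP authentication: pass the credentials so Beeline does not prompt.
      printf '%s\n' --hs2-user "$HS2_USER" --hs2-password "$HS2_PASSWORD"
      ;;
    *)
      echo "unknown authentication mode: $1" >&2
      return 1
      ;;
  esac
}

# Example use (commented out; all values are placeholders, and simple
# word splitting assumes the values contain no whitespace):
#   sqoop import --connect "$SOURCE_JDBC_URL" --table "$TABLE" \
#     --hive-import $(hs2_options ldap)
```

Keeping the mode selection in one place makes it harder to launch a non-interactive import without the options that prevent the Beeline credential prompt.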