Creating a CDSW data connection to a data warehouse
Learn how to connect natively to data stored in a data warehouse when using CDP Data
Visualization in Cloudera Data Science Workbench (CDSW).
You must connect to your data prior to using the data modeling and visualization
functionalities. The following steps show you how to create a new CDSW data connection to a
running Impala system.
When you create a connection, you automatically have privileges to create and manage
datasets on this connection, and to build dashboards and visuals in these datasets.
For more information on the Manage data connections privilege, see RBAC
permissions.
For instructions on how to define privileges for a role, see Setting role
privileges.
For instructions on how to assign the administrator role to a user, see Promoting
a user to administrator.
On the main navigation bar, click DATA.
The DATA interface appears, open on the
Datasets tab.
In the Data side menu bar, click NEW
CONNECTION.
The
Create New Data Connection modal window appears.
Select a Connection type from the drop-down
list.
Provide a name for the connection.
Enter the hostname or IP address of the running coordinator.
In this
example, you can see to create an Impala connection. You can get the coordinator hostname
from the JDBC URL of the Impala DW.
Under Port #, enter the port number.
Use your workload username and password as credentials.
Click the Advanced tab and make the appropriate
selections.
Click TEST.
If the connection is valid, the
system returns a Connection Verified message.
Click CONNECT.
You have set up a connection to a running data
warehouse.