Install the Data Plane Profiler Agent
DSS requires that the DP Profiler Agent be installed on all clusters. The Profiler is installed on the Ambari host, using an Ambari management pack (MPack). An MPack bundles service definitions, stack definitions, and stack add-on service definitions.
You must have root access to the Ambari Server host node to
perform this task.
Important: Before you start the installation, you must have downloaded the required repository tarballs from the Hortonworks customer portal, following the instructions provided as part of the product procurement process. The repository tarballs for the Data Plane Profiler agent are different from the DSS app repository tarballs.
- Log in as root to an Ambari host on a cluster.
ssh root@<ambari-ip-address>
- Install the Data Plane Profiler MPack by running the following command, replacing <mpack-file-name> with the name of the MPack.
ambari-server install-mpack --mpack <mpack-file-name> --verbose
- Restart the Ambari server.
ambari-server restart
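The two commands above can be combined into a small wrapper. The following is a minimal sketch, not an official installer: the function name `install_profiler_mpack` is hypothetical, and it assumes it is run as root on the Ambari Server host. It only checks that the MPack file exists and that `ambari-server` is on the PATH before running the same two commands.

```shell
#!/usr/bin/env bash
# Sketch: install the DP Profiler MPack, then restart Ambari.
# install_profiler_mpack is an illustrative name, not part of Ambari.

install_profiler_mpack() {
  local mpack="${1:-}"

  # Refuse to continue without a readable MPack file.
  if [ -z "$mpack" ] || [ ! -f "$mpack" ]; then
    echo "MPack file not found: ${mpack:-<none given>}" >&2
    return 1
  fi

  # The ambari-server CLI is only available on the Ambari Server host.
  if ! command -v ambari-server >/dev/null 2>&1; then
    echo "ambari-server not found on PATH; run this on the Ambari Server host" >&2
    return 1
  fi

  # Same commands as in the steps above; stop if the install fails.
  ambari-server install-mpack --mpack "$mpack" --verbose || return 1
  ambari-server restart
}
```

To use it, source the file and call `install_profiler_mpack <mpack-file-name>`. The restart is skipped when the MPack installation fails, so a failed install does not bounce the server.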
- Launch Ambari in a browser and log in.
http://<ambari-server-host>:8080
Default credentials are:
- Username: admin
- Password: admin
- Click Admin>Manage Ambari.
- Click Versions, and then do the following on the Versions page:
- Click the HDP version in the Name column.
- Change the Base URL path for the DSS service to point to the local repository, for example:
http://webserver.com/DSS/centos7/1.2.0.0-X
URLs shown are for example purposes only. Actual URLs might be different.
- Click the Ambari logo to return to the main Ambari page.
- In the Ambari Services navigation pane, click Actions>Add Service.
The Add Service Wizard displays.
- On the Choose Services page of the Wizard, select the Dataplane Profiler service to install in Ambari, and then follow the on-screen instructions.
Other required services are automatically selected.
- When prompted to confirm the addition of dependent services, confirm all of them.
This adds the other required services.
- On the Assign Masters page, you can choose the default settings.
- On the Customize Services page, fill out the database details and other required fields that are highlighted.
Make sure to enter the credentials that you set while configuring the external database. Change the username profileragent to the value set in the external database.
Note: Make sure to add the database driver to the machine, based on the external database that you configured.
- Complete the remaining installation wizard steps and exit the wizard.
- Ensure that all components required for your DataPlane Platform have started successfully.
Note: On the installation verification screen, an earlier version of the DSS repositories might appear in the labels. You can ignore the version number shown in the labels and proceed.
- Enable Knox SSO for DP Profiler Agent.
- Set dpprofiler.sso.knox.enabled to true in the Advanced dpprofiler-env section of the DP Profiler Configs in Ambari.
- Run the following CLI command to export the Knox certificate:
$JAVA_HOME/bin/keytool -export -alias gateway-identity -rfc -file knox-pub-key.cert -keystore /usr/hdp/current/knox-server/data/security/keystores/gateway.jks
When prompted, enter the Knox master password.
- After generating the certificate, paste the contents of the certificate in the dpprofiler.sso.knox.public.key field under Advanced dpprofiler-env properties of DP Profiler Configs in Ambari.
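The certificate export can be wrapped with a few sanity checks. This is a sketch under stated assumptions: the function name `export_knox_cert` and the overridable `KNOX_KEYSTORE` variable are hypothetical, and the keystore path is the one given in the step above. It verifies that JAVA_HOME points at a JDK with `keytool` and that the keystore is readable, then runs the same `keytool` command.

```shell
#!/usr/bin/env bash
# Sketch: export the Knox gateway certificate for the DP Profiler SSO setting.
# KNOX_KEYSTORE defaults to the path used in the step above; override it if
# your Knox layout differs (the override is an assumption of this sketch).
KNOX_KEYSTORE="${KNOX_KEYSTORE:-/usr/hdp/current/knox-server/data/security/keystores/gateway.jks}"

export_knox_cert() {
  local out="${1:-knox-pub-key.cert}"

  # keytool ships with the JDK, so JAVA_HOME must be set and valid.
  if [ -z "${JAVA_HOME:-}" ] || [ ! -x "$JAVA_HOME/bin/keytool" ]; then
    echo "JAVA_HOME must point to a JDK that provides bin/keytool" >&2
    return 1
  fi

  # The keystore is normally readable only by the Knox user or root.
  if [ ! -r "$KNOX_KEYSTORE" ]; then
    echo "Cannot read Knox keystore: $KNOX_KEYSTORE" >&2
    return 1
  fi

  # keytool prompts for the Knox master password.
  "$JAVA_HOME/bin/keytool" -export -alias gateway-identity -rfc \
    -file "$out" -keystore "$KNOX_KEYSTORE"
}
```

After `export_knox_cert` succeeds, `cat knox-pub-key.cert` prints the certificate text to paste into the dpprofiler.sso.knox.public.key field.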
- Open the quick link of the profiler for service verification.
- Add /profilers to the quick link URL. If the quick link is xyz:21900, change it to xyz:21900/profilers.
Note: For non-Kerberized clusters, this request returns the list of all registered profilers. For Kerberos-enabled clusters where Knox is not enabled for DP Profiler Agent, you will see an HTTP 401 response, which is expected.
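The same verification can be done from a terminal with `curl`. The sketch below is illustrative only: the function names are hypothetical, and it assumes that a successful request on a non-Kerberized cluster returns HTTP 200, which the text above implies but does not state.

```shell
#!/usr/bin/env bash
# Sketch: check the DP Profiler Agent /profilers endpoint.

# Build the verification URL from the quick link, e.g. xyz:21900.
profilers_url() {
  # Strip one trailing slash, if present, then append /profilers.
  echo "${1%/}/profilers"
}

# Request the endpoint and interpret the HTTP status code.
check_profilers() {
  local url code
  url="$(profilers_url "$1")"
  code="$(curl -s -o /dev/null -w '%{http_code}' "$url")"
  case "$code" in
    200) echo "OK: $url returned the registered profilers" ;;   # assumed success code
    401) echo "HTTP 401 from $url: expected on Kerberos-enabled clusters without Knox SSO" ;;
    *)   echo "Unexpected HTTP status $code from $url" >&2; return 1 ;;
  esac
}
```

For example, `check_profilers xyz:21900` requests xyz:21900/profilers and reports which of the two documented outcomes occurred.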
- After you install the profiler agent using the Add Service Wizard in Ambari, the NodeManager hosts do not have the dpprofiler user.
For Ambari to create this user automatically, restart all NodeManagers by going to Services->YARN->Restart NodeManagers. NodeManagers can be restarted in a rolling fashion; the Ambari UI shows restart batching options.
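To confirm that the restart created the user, you can check each NodeManager host. This is a minimal sketch with hypothetical function names; it relies only on the standard `id` and `hostname` utilities and must be run on the NodeManager host itself.

```shell
#!/usr/bin/env bash
# Sketch: verify that the dpprofiler user exists on this host.

# Succeeds when the named local user exists.
user_exists() {
  id -u "$1" >/dev/null 2>&1
}

check_dpprofiler_user() {
  if user_exists dpprofiler; then
    echo "dpprofiler user present on $(hostname)"
  else
    echo "dpprofiler user missing on $(hostname); restart the NodeManager on this host from Ambari" >&2
    return 1
  fi
}
```

Running `check_dpprofiler_user` on every NodeManager host after the rolling restart shows which hosts, if any, still need a restart.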
Note: During DP Profiler Agent installation, two new Atlas types, dss_hive_column_profile_data and dss_hive_table_profile_data, are registered. These types contain attributes to store metrics computed by DSS profilers. In addition, the existing Atlas types hive_table and hive_column are updated to add an additional attribute, profileData. For the hive_table type, the profileData attribute is a reference to dss_hive_table_profile_data, and for the hive_column type, the profileData attribute is a reference to dss_hive_column_profile_data.
Important: As part of the installation of DataPlane Profiler Agent on HDP 3.x versions, make sure you enter the details of the DP Profiler extra JARs when prompted as part of the Advanced dpprofiler-env properties. To get the version of the JARs, log in to the Livy machine and navigate to this location:
/usr/hdp/current/hive-warehouse-connector/hive-warehouse-connector-assembly-<version>.jar
Extract the exact location with the specific version details and paste it in the Ambari section. Enter the value of the property as follows:
file:///usr/hdp/current/hive-warehouse-connector/hive-warehouse-connector-assembly-<version>.jar
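The property value can be derived on the Livy machine instead of being typed by hand. The following sketch assumes the JAR lives in the directory shown above; the function name `hwc_jar_uri` is hypothetical. It prints the first matching assembly JAR as a file:// URI.

```shell
#!/usr/bin/env bash
# Sketch: print the file:// URI of the Hive Warehouse Connector assembly JAR.

hwc_jar_uri() {
  # Default directory taken from the text above; pass another one to override.
  local dir="${1:-/usr/hdp/current/hive-warehouse-connector}"
  local jar

  for jar in "$dir"/hive-warehouse-connector-assembly-*.jar; do
    # An unmatched glob stays literal, so test that the file really exists.
    if [ -f "$jar" ]; then
      echo "file://$jar"
      return 0
    fi
  done

  echo "No hive-warehouse-connector-assembly JAR found in $dir" >&2
  return 1
}
```

Run `hwc_jar_uri` on the Livy machine and paste the printed value into the extra JARs property. If several versions are present, only the first match is printed, so check the directory contents in that case.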
- If TDE zones are set up in the cluster and any of the following locations falls within a TDE zone, the dpprofiler user must have Decrypt_EEK access to the key or keys used to encrypt that zone.
- /user/dpprofiler
- /ranger/audit/hiveServer2
- /apps/dpprofiler
- all locations of Hive tables
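To see which of the fixed locations above fall inside an encryption zone, the zone list can be compared against them. This is a sketch under assumptions: the function names are hypothetical, and it expects input lines of the form `<zone-path> <key-name>`, which is how `hdfs crypto -listZones` is assumed to print zones here. Hive table locations vary per cluster and are not covered.

```shell
#!/usr/bin/env bash
# Sketch: report which DP Profiler locations lie inside a TDE zone.

# Fixed locations from the list above (Hive table locations are not included).
PROFILER_PATHS="/user/dpprofiler /ranger/audit/hiveServer2 /apps/dpprofiler"

# Succeeds when path $1 equals zone root $2 or lies beneath it.
path_in_zone() {
  case "$1/" in
    "${2%/}/"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Reads "<zone-path> <key-name>" lines on stdin and prints
# "<path> <zone-path> <key-name>" for each profiler path inside a zone.
zones_covering_profiler_paths() {
  local zone key p
  while read -r zone key; do
    [ -n "$zone" ] || continue
    for p in $PROFILER_PATHS; do
      if path_in_zone "$p" "$zone"; then
        echo "$p $zone $key"
      fi
    done
  done
}
```

A possible invocation, run as an HDFS administrator, is `hdfs crypto -listZones | zones_covering_profiler_paths`. Each printed key name is a key for which the dpprofiler user needs Decrypt_EEK access.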
- In the Advanced dpprofiler-config section of the DP Profiler service in Ambari, make sure you enter the Zookeeper Connection String details.