Data Steward Studio Installation

Ambari Dataplane Profiler Configs

From Ambari > Dataplane Profiler > Configs, you can view or update your database or advanced configurations.

Dataplane Profiler Database Configs

From Ambari > Dataplane Profiler > Configs > Database, you can view or update the DataPlane Profiler Database configurations.

Table 1. Database configs

| Value | Description | Example |
|---|---|---|
| DP Profiler Database | Database type or flavor used for the DSS profiler. Note: Make sure that the database type is entered in lower case. | h2, mysql, or postgres |
| Database Username | Name of the database user that the profiler service uses to connect to the database. For MySQL or Postgres, this user must be created in the database first. | profileragent |
| Database Name | Name must be "profileragent". Important: Do not modify. | profileragent |
| Database URL | The URL of the DP profiler database. Note: Make sure that the database type within the URL is in lower case, such as h2, mysql, or postgresql. | H2: jdbc:h2:/var/lib/profiler_agent/h2/profileragent;DATABASE_TO_UPPER=false;DB_CLOSE_DELAY=-1 |
| | | MySQL: jdbc:mysql://hostname:3306/profileragent?autoreconnect=true |
| | | POSTGRES: jdbc:postgresql://hostname:5432/profileragent |
| Database Host | Database host name for the Profiler Agent server. | <hostname> |
| Database password | The password for your DP database. | <your_password> |

Note: On HDP 3.x versions, the Profiler Agent service in the Ambari UI does not have a separate tab for configuring the database. All database configuration is available as part of Dataplane Profiler Database Configs.
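As an illustration of how the database type, host, and URL in Table 1 fit together, the following minimal Python sketch builds the example JDBC URL for a given database type. The helper name `build_jdbc_url` and the host `db.example.com` are hypothetical and not part of DSS; the URL templates are copied from the Database URL examples.

```python
# URL templates copied from the Database URL examples in Table 1.
JDBC_URL_TEMPLATES = {
    "h2": "jdbc:h2:/var/lib/profiler_agent/h2/profileragent;DATABASE_TO_UPPER=false;DB_CLOSE_DELAY=-1",
    "mysql": "jdbc:mysql://{host}:3306/profileragent?autoreconnect=true",
    "postgres": "jdbc:postgresql://{host}:5432/profileragent",
}


def build_jdbc_url(db_type: str, host: str = "") -> str:
    """Hypothetical helper: return the example JDBC URL for a database type."""
    if db_type not in JDBC_URL_TEMPLATES:
        # Mixed-case input such as "MySQL" is rejected here, because the
        # database type must be entered in lower case.
        raise ValueError(f"unsupported or non-lower-case database type: {db_type!r}")
    if db_type != "h2" and not host:
        raise ValueError("a database host is required for mysql and postgres")
    return JDBC_URL_TEMPLATES[db_type].format(host=host)


if __name__ == "__main__":
    # "db.example.com" is a placeholder host name.
    print(build_jdbc_url("postgres", "db.example.com"))
```

Rejecting anything that is not an exact lower-case match is one simple way to catch the lower-case requirement before the value is entered in Ambari.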

Dataplane Profiler Advanced Configs

From Ambari > Dataplane Profiler > Configs > Advanced, you can view or update the DataPlane Profiler advanced configurations.

Table 2. Advanced dpprofiler-config

| Value | Description | Example |
|---|---|---|
| Dependent Cluster Configurations | Provides various cluster configurations, including atlasUrl, rangerAuditDir, metastoreUrl, metastoreKeytab, and metastorePrincipal. | atlasUrl=application-properties/atlas.rest.address;rangerAuditDir=ranger-env/xasecure.audit.destination.hdfs.dir;metastoreUrl=hive-site/hive.metastore.uris;metastoreKeytab=hive-site/hive.metastore.kerberos.keytab.file;metastorePrincipal=hive-site/hive.metastore.kerberos.principal |
| Additional Cluster Configurations | Additional configuration items of services in the cluster that can be set for use by profilers. | |
| Profilers local home directory | Local directory for the profilers. | /usr/dss/current/profilers |
| Profilers shared results directory | The HDFS directory where DSS Profilers store their metrics output. Ensure the dpprofiler user has full access to this directory. | /user/dpprofiler/dwh |
| Profilers shared binaries directory | HDFS directory for the profilers. | /apps/dpprofiler/profilers |
| SPNEGO Cookie Name | Cookie name that is returned to the client after successful SPNEGO authentication. | dpprofiler.spnego.cookie |
| SPNEGO Signature Secret | Secret for verifying and signing the generated cookie after successful authentication. | ***some***secret** |
| Maximum assets submitted per profiler job | Maximum number of assets to be submitted in one profiler job. | 50 |
| Maximum number of concurrent profiler jobs | Number of profiler jobs active at a point in time, per profiler. | 2 |
| Job scan interval | Time in seconds after which the profiler looks for an asset in the queue and schedules the job if the queue is not empty. | 30 |
| Maximum number of assets queued for submission | Maximum size of the profiler queue. When the queue is full, any new asset submission request is rejected. | 500 |
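The Dependent Cluster Configurations example is a semicolon-separated list of `name=value` entries. The Python sketch below splits it into its parts for inspection. Reading each value as `<config type>/<property name>` is an assumption based on the shape of the example, and `parse_dependent_configs` is a hypothetical helper, not a DSS function.

```python
from typing import Dict, Tuple

# Example value copied from the Dependent Cluster Configurations row.
DEPENDENT_CONFIGS = (
    "atlasUrl=application-properties/atlas.rest.address;"
    "rangerAuditDir=ranger-env/xasecure.audit.destination.hdfs.dir;"
    "metastoreUrl=hive-site/hive.metastore.uris;"
    "metastoreKeytab=hive-site/hive.metastore.kerberos.keytab.file;"
    "metastorePrincipal=hive-site/hive.metastore.kerberos.principal"
)


def parse_dependent_configs(value: str) -> Dict[str, Tuple[str, str]]:
    """Hypothetical helper: map each name to (config type, property name)."""
    mapping: Dict[str, Tuple[str, str]] = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing semicolon
        name, _, target = entry.partition("=")
        # Assumption: the part before the first "/" is the config type.
        config_type, _, prop = target.partition("/")
        mapping[name] = (config_type, prop)
    return mapping


if __name__ == "__main__":
    for name, (config_type, prop) in parse_dependent_configs(DEPENDENT_CONFIGS).items():
        print(f"{name}: {config_type} -> {prop}")
```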
Table 3. Advanced dpprofiler-env

| Value | Description | Example |
|---|---|---|
| Profiler service local configuration directory | Configuration files directory. | /etc/profiler_agent/conf |
| Profiler service local data directory | Data directory. If using h2, data is stored here. | /var/lib/profiler_agent |
| Profiler service HTTP Port | Port where the profiler agent runs. | 21900 |
| Profiler Knox SSO Enabled | Enable this to use Knox SSO for Profiler. | |
| Profiler Knox SSO Public Key | Knox SSO public certificate. | Run the following CLI command to export the Knox certificate: $JAVA_HOME/bin/keytool -export -alias gateway-identity -rfc -file knox-pub-key.cert -keystore /usr/hdp/current/knox-server/data/security/keystores/gateway.jks When prompted, enter the Knox master password. After generating the certificate, paste the contents of the certificate in this field. |
| Profiler Service Keytab | Profiler agent keytab location. | /etc/security/keytabs/dpprofiler.kerberos.keytab |
| Profiler Service Principal | Profiler agent Kerberos principal. | dpprofiler${principalSuffix}@REALM.COM, where principalSuffix is a random string that Ambari generates for a cluster. This string uniquely identifies services on a cluster when multiple clusters are managed by a single KDC. |
| Number of retries for refresh of Kerberos ticket | Maximum number of retries allowed for refreshing the Kerberos ticket. | 5 |
| Profiler service local log directory | Log directory. | /var/log/profiler_agent |
| Profiler service local PID file directory | PID directory. | /var/run/profiler_agent |
| Profiler Service SPNEGO Kerberos Keytab | SPNEGO keytab location. | /etc/security/keytabs/spnego.service.keytab |
| Profiler Service SPNEGO Kerberos Principal | SPNEGO Kerberos principal. | HTTP/${FQDN}@REALM.COM, where FQDN is the fully qualified domain name of the machine. |
| Profiler Service Logging configuration | Content for logback.xml. | The logback.xml content follows this table. |

<configuration>

  <conversionRule conversionWord="coloredLevel" converterClass="play.api.libs.logback.ColoredLevel" />

  <appender name="FILE" class="ch.qos.logback.core.FileAppender">
    <file>{{dpprofiler_log_dir}}/application.log</file>
    <encoder>
      <pattern>%date [%level] from %logger in %thread - %message%n%xException</pattern>
    </encoder>
  </appender>

  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%coloredLevel %logger{15} - %message%n%xException{10}</pattern>
    </encoder>
  </appender>

  <appender name="ASYNCFILE" class="ch.qos.logback.classic.AsyncAppender">
    <appender-ref ref="FILE" />
  </appender>

  <appender name="ASYNCSTDOUT" class="ch.qos.logback.classic.AsyncAppender">
    <appender-ref ref="STDOUT" />
  </appender>

  <logger name="play" level="INFO" />
  <logger name="application" level="DEBUG" />

  <!-- Off these ones as they are annoying, and anyway we manage configuration ourselves -->
  <logger name="com.avaje.ebean.config.PropertyMapLoader" level="OFF" />
  <logger name="com.avaje.ebeaninternal.server.core.XmlConfigLoader" level="OFF" />
  <logger name="com.avaje.ebeaninternal.server.lib.BackgroundThread" level="OFF" />
  <logger name="com.gargoylesoftware.htmlunit.javascript" level="OFF" />

  <root level="WARN">
    <appender-ref ref="ASYNCFILE" />
    <appender-ref ref="ASYNCSTDOUT" />
  </root>

</configuration>
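Before pasting edited logback.xml content into Ambari, it can help to check which loggers and levels the content defines. The following Python sketch does this with the standard library XML parser. It works on an abbreviated copy of the configuration above, and `summarize_logback` is a hypothetical helper, not part of DSS or Logback.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the logback.xml content shown above (appenders omitted).
LOGBACK_XML = """
<configuration>
  <logger name="play" level="INFO" />
  <logger name="application" level="DEBUG" />
  <logger name="com.gargoylesoftware.htmlunit.javascript" level="OFF" />
  <root level="WARN">
    <appender-ref ref="ASYNCFILE" />
    <appender-ref ref="ASYNCSTDOUT" />
  </root>
</configuration>
"""


def summarize_logback(xml_text):
    """Hypothetical helper: return (logger levels, root level, root appender refs)."""
    # Parsing fails with xml.etree.ElementTree.ParseError if the XML is malformed.
    root = ET.fromstring(xml_text.strip())
    loggers = {el.get("name"): el.get("level") for el in root.findall("logger")}
    root_el = root.find("root")
    root_level = root_el.get("level")
    appenders = [ref.get("ref") for ref in root_el.findall("appender-ref")]
    return loggers, root_level, appenders


if __name__ == "__main__":
    loggers, root_level, appenders = summarize_logback(LOGBACK_XML)
    print("root level:", root_level, "appenders:", appenders)
    for name, level in loggers.items():
        print(f"{name}: {level}")
```

A parse error here indicates malformed XML, which is cheaper to find locally than after saving the configuration.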
Table 4. Advanced dpprofiler-livy-config

| Value | Description | Example |
|---|---|---|
| Read session driver core count | Number of cores to use for the driver process for the read session. | 1 |
| Read session driver memory size | Amount of memory to use for the driver process for the read session. | 1g |
| Read session executor core count | Number of cores to use for each executor for the read session. | 1 |
| Read session executor memory size | Amount of memory to use per executor for the read session. | 1g |
| Read session heartbeat timeout | Timeout in seconds after which the read session is orphaned. | 172800 |
| Read session name | Name of the read session. | dpprofiler-read |
| Read session executor count | Number of executors to launch for the read session. | 2 |
| Read session queue name | Name of the YARN queue for the read sessions. | default |
| Read session timeout | Timeout for read requests that use the interactive session. | 90 |
| Write session driver core count | Number of cores to use for the driver process for the write session. | 1 |
| Write session driver memory size | Amount of memory to use for the driver process for the write session. | 1g |
| Write session executor core count | Number of cores to use for each executor for the write session. | 1 |
| Write session executor memory size | Amount of memory to use per executor for the write session. | 1g |
| Write session heartbeat timeout | Timeout in seconds after which the write session is orphaned. | 172800 |
| Write session name | Name of the write session. | dpprofiler-write |
| Write session executor count | Number of executors to launch for the write session. | 2 |
| Write session queue name | Name of the YARN queue for the write sessions. | default |
| Write session timeout | Timeout for write requests that use the interactive session. | 90 |
| Session Lifetime in Minutes | Lifetime of a session, in minutes after its creation, before it is swapped. | 2880. For smaller clusters, it is recommended to set this to a smaller value, such as 240. |
| Session Lifetime in Requests | Maximum number of requests a session can process before it is swapped. | 500 |
| Session creation retry count | Maximum number of attempts for session creation. The session is declared dead after this many retries. | 20 |
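The two session lifetime settings can be read as two independent limits: a session is swapped once it is old enough or has served enough requests. The Python sketch below models that reading with the example values from Table 4. The names `SessionLimits` and `should_swap` are hypothetical, and treating a session as due for a swap exactly when it reaches a limit (rather than after exceeding it) is an assumption, not documented profiler behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionLimits:
    """Example values from Table 4; the field names are illustrative."""
    lifetime_minutes: int = 2880
    lifetime_requests: int = 500


def should_swap(age_minutes: float, requests_served: int,
                limits: SessionLimits = SessionLimits()) -> bool:
    """Return True when either lifetime limit has been reached (assumed rule)."""
    return (age_minutes >= limits.lifetime_minutes
            or requests_served >= limits.lifetime_requests)


if __name__ == "__main__":
    # A five-hour-old session under the default limits and under the
    # smaller-cluster setting of 240 minutes.
    print(should_swap(300, 10))
    print(should_swap(300, 10, SessionLimits(lifetime_minutes=240)))
```

Under this reading, lowering Session Lifetime in Minutes to 240 on a smaller cluster means sessions are replaced every four hours, regardless of how few requests they have served.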
Table 5. Custom dpprofiler-config

| Value | Description | Example |
|---|---|---|
| dpprofiler.user | User for the Profiler Agent. Important: Do not modify. | dpprofiler |
Table 6. Custom dpprofiler-env

| Value | Description | Example |
|---|---|---|

Table 7. Custom dpprofiler-livy-config

| Value | Description | Example |
|---|---|---|