On-Premises Installation

This guide describes how to install and configure Workload XM 2.1 on-premises.

Introduction

Workload XM on-premises must be installed in a dedicated cluster, separate from your development, test, or production workload clusters. This configuration minimizes the impact on your workload clusters and avoids the need to upgrade your CDH clusters to meet the requirements of Workload XM.

The following terminology is used in this guide:

  • Workload XM on-premises - A CDP Data Center cluster managed by Cloudera Manager where Workload XM is installed on-premises. This cluster is used exclusively for running Workload XM.

  • CDH Cluster - A CDH or CDP Data Center cluster managed by Cloudera Manager where analytical workloads are run that will be analyzed by Workload XM.

This guide contains the prerequisites for installing and deploying Workload XM on-premises.

Architecture

The image below illustrates how Workload XM on-premises interacts with your other clusters. The Workload XM instance that you have installed on-premises is on the left, in a cluster that contains the other required services, which are listed in the Prerequisites section of this guide. On the right are the clusters that are running the workloads. Telemetry data is passed from these clusters to the Workload XM on-premises cluster.



Prerequisites

Before you install Workload XM on-premises, verify the following prerequisites:

Files

To get started, you'll need the installer files. For this version of Workload XM on-premises, these are the files you need:

  • Parcel: WXM-2.1.0.2.1.0-b804-3162213-el7.parcel
  • Parcel SHA: WXM-2.1.0.2.1.0-b804-3162213-el7.parcel.sha
  • CSD: WXM-2.1.0.2.1.0-b804-3162213.jar

You can get these files here: https://www.cloudera.com/downloads/workloadxm.html. If you do not have access to the downloads, contact your administrator for login information.
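
Before you copy the parcel anywhere, it can help to confirm that the download is intact. The following is a minimal sketch, assuming the .sha file holds a SHA-1 hex digest as its first field (the usual convention for Cloudera parcels, but not stated in this guide). The function name verify_parcel is illustrative.

```shell
#!/bin/sh
# Compare a parcel's SHA-1 digest with the digest stored in its .sha file.
# Assumption: the first whitespace-separated field of the .sha file is the
# SHA-1 hex digest of the parcel.
verify_parcel() {
    parcel="$1"
    sha_file="${2:-$1.sha}"
    expected=$(awk 'NR==1 {print $1}' "$sha_file") || return 2
    actual=$(sha1sum "$parcel" | awk '{print $1}')
    if [ -n "$expected" ] && [ "$expected" = "$actual" ]; then
        echo "OK: $parcel"
    else
        echo "MISMATCH: $parcel" >&2
        return 1
    fi
}
```

For example, `verify_parcel WXM-2.1.0.2.1.0-b804-3162213-el7.parcel` looks for the matching .sha file next to the parcel and prints OK when the digests agree.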

Hardware Requirements

In addition to the requirements for CDP Data Center and the other services you have installed, running Workload XM on-premises requires:

  • 5 nodes. Each node must have:
    • 16 Cores
    • 64 GB RAM
    • 12 TB disk space

For detailed information about the requirements for CDH, Cloudera Manager, and the other services, see the CDH Requirements and Supported Versions documentation.

Operating System Requirements

Workload XM on-premises supports the following operating systems:

CentOS and RHEL versions:

  • 7.6, 7.5, 7.4, 7.3, 7.2

CDP Data Center and Cloudera Manager Versions

Workload XM on-premises must run on CDP Data Center 7.0.3 or later with Cloudera Manager 7.0.3 or later.

Analysis of data from the following CDH and CDP clusters is supported:

  • CDH 5.x clusters:
    • CDH version 5.8 and later
    • Cloudera Manager version 5.15.1 and later
  • CDH 6.x clusters:
    • CDH version 6.1 and later
    • Cloudera Manager version 6.1 and later
  • CDP 7.x clusters:
    • CDP 7.0.3
    • CDP 7.1.1 (Hive on Tez is also supported)

Services

You must have the following services installed on the Workload XM cluster:

  • HBase
  • HDFS
  • Hive
  • Hue - Hue is optional, but is recommended for troubleshooting and data extraction
  • Impala
  • Phoenix
  • YARN
  • Zookeeper

Security

See the relevant sections below to configure Kerberos and TLS/SSL.

Configuring Kerberos

If you install Workload XM in a Kerberized environment, Workload XM must be able to create Phoenix tables for its data storage. To enable this, you must add the wxm user to the hbase.superuser property, as in the image below:



Configuring TLS

Cloudera recommends that you configure your cluster to use auto-TLS to ease the process of configuring TLS/SSL. See the Enable Auto-TLS documentation for instructions on how to do this.

SSL is supported in the following contexts:

  • Between the browser and the Workload XM UI
  • Between the Telemetry Publisher and Workload XM API

Advanced TLS Configuration

Workload XM supports TLS on several connections within the application. The properties listed here are grouped by the connection that you want to encrypt:

  • Between the browser and the Workload XM UI
    • Enable TLS/SSL for Console Server (ssl.enabled)
    • Console Server TLS/SSL Server Private Key File (PEM Format) (ssl.privatekey.path)
    • Console Server TLS/SSL Server Certificate File (PEM Format) (ssl.cert.path)
    • Console Server TLS/SSL Private Key Password (ssl.privatekey.password)
  • Between the Console Server (and other REST clients) and the API Server, Databus API Server, Admin API Server
    • Console Server TLS/SSL Certificate Trust Store File (ssl.cacert.path)
    • Admin API Server TLS/SSL Certificate Trust Store File (ssl.trustStore.path)
    • Enable TLS/SSL for Databus API Server (ssl.enabled)
    • Databus API Server TLS/SSL Server JKS Keystore File Location (ssl.keyStore.path)
    • Databus API Server TLS/SSL Server JKS Keystore File Password (ssl.keyStore.password)
    • Databus API Server TLS/SSL Server JKS Keystore Key Password (ssl.keyManager.password)
    • Enable TLS/SSL for API Server (ssl.enabled)
    • API Server TLS/SSL Server JKS Keystore File Location (ssl.keyStore.path)
    • API Server TLS/SSL Server JKS Keystore File Password (ssl.keyStore.password)
    • API Server TLS/SSL Server JKS Keystore Key Password (ssl.keyManager.password)
    • Enable TLS/SSL for Admin API Server (ssl.enabled)
    • Admin API Server TLS/SSL Server JKS Keystore File Location (ssl.keyStore.path)
    • Admin API Server TLS/SSL Server JKS Keystore File Password (ssl.keyStore.password)
    • Admin API Server TLS/SSL Server JKS Keystore Key Password (ssl.keyManager.password)
  • Between the Pipelines Server, Analytic Database Server, Entities Server, Databus Server, SDX Server and the Phoenix Server, Impala Server
    • Pipelines Server TLS/SSL Client Trust Store File (ssl.trustStore.path)
    • Pipelines Server TLS/SSL Client Trust Store Password (ssl.trustStore.password)
    • Analytic Database Server TLS/SSL Client Trust Store File (ssl.trustStore.path)
    • Analytic Database Server TLS/SSL Client Trust Store Password (ssl.trustStore.password)
    • Entities Server TLS/SSL Client Trust Store File (ssl.trustStore.path)
    • Entities Server TLS/SSL Client Trust Store Password (ssl.trustStore.password)
    • Databus Server TLS/SSL Client Trust Store File (ssl.trustStore.path)
    • Databus Server TLS/SSL Client Trust Store Password (ssl.trustStore.password)
    • SDX Server TLS/SSL Client Trust Store File (ssl.trustStore.path)
    • SDX Server TLS/SSL Client Trust Store Password (ssl.trustStore.password)
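
Before you enter the Console Server certificate and private key paths, you may want to confirm that the two PEM files belong together. The sketch below compares the public key embedded in the certificate with the public key derived from the private key, using standard openssl subcommands. The function name cert_matches_key is illustrative, and the check is a general TLS sanity test, not a Workload XM feature.

```shell
#!/bin/sh
# Succeeds when the PEM certificate ($1) and PEM private key ($2) share the
# same public key. The paths are whatever you plan to enter for
# ssl.cert.path and ssl.privatekey.path.
cert_matches_key() {
    cert="$1"
    key="$2"
    cert_pub=$(openssl x509 -in "$cert" -noout -pubkey) || return 2
    key_pub=$(openssl pkey -in "$key" -pubout) || return 2
    [ "$cert_pub" = "$key_pub" ]
}
```

If the private key is encrypted, openssl prompts for the passphrase, which is the value you would supply for ssl.privatekey.password.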

Authentication

See the relevant sections below to configure local authentication and LDAP authentication.

Local Authentication

After you install Workload XM, you can specify a user file that contains the users that can log in to the UI.

To specify the user file, open the Workload XM Service and click the Configuration tab. Search for the following properties:

  • The User Authorization File Directory (user-file.dir) is the local directory for storing the user authorization file required by the Console Server. The default is /etc/wxm/conf.
  • The User Authorization File Name (user-file.name) is the name of the user authorization file required by the Console Server. This file is stored in the directory set by the user-file.dir parameter, and is created at service startup if it does not already exist. If this property is not set, it defaults to user-file.json.

You can manage users with the Console Server executable. On the Workload XM host, navigate to the following directory:

${PARCELS_ROOT}/WXM/lib/thunderhead-sigma-console

To view the help output, enter this command:

./onprem-linux -h

You can add, remove, and list users with the following commands, respectively:

./onprem-linux user add --user-file <user-file.dir>/<user-file.name>
./onprem-linux user remove --user-file <user-file.dir>/<user-file.name>
./onprem-linux user list --user-file <user-file.dir>/<user-file.name>

When you add a user, follow the prompts to create the username and password. If you try to edit a user file that does not exist, a prompt asks if you would like to create the file.

To change a user's username or password, first remove the user, then add it back again with the new credentials.
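
As an illustration of the remove-then-add procedure, the sketch below builds the --user-file path from the two configuration values and prints the two commands in order. It assumes that --user-file takes the directory and the file name joined into one path, and it uses the defaults named above (/etc/wxm/conf and user-file.json). The helper names are hypothetical, and the commands are only printed, not run, because onprem-linux prompts interactively.

```shell
#!/bin/sh
# Join the configured directory and file name into the --user-file path.
user_file_path() {
    dir="${1%/}"    # drop one trailing slash, if present
    name="$2"
    printf '%s/%s\n' "$dir" "$name"
}

# Print (do not run) the two commands needed to change a user's credentials.
print_change_user_commands() {
    path=$(user_file_path "${1:-/etc/wxm/conf}" "${2:-user-file.json}")
    echo "./onprem-linux user remove --user-file $path"
    echo "./onprem-linux user add --user-file $path"
}
```

Calling `print_change_user_commands` with no arguments prints the commands for the default user file location.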

LDAP Authentication

Workload XM supports LDAP authentication through the following properties:

  • Enable LDAP (ldap.enabled)
  • LDAP URL (ldap.url)
  • LDAP Bind User Distinguished Name (ldap.bind_dn)
  • LDAP Bind Password (ldap.bind_password)
  • LDAP Search Base (ldap.search_base)
  • LDAP Search Filter Property (ldap.search_filter_property)
  • LDAP Server CA Certificate (ldap.ca_cert)

For more information about LDAP authentication, see Configuring External Authentication and Authorization for Cloudera Manager.

Component Layout

Cloudera recommends the following component layout for a 5-node cluster:

Node 1 hosts the master components of all services, Nodes 2, 3, and 4 host the worker components of all services, and Node 5 hosts the Workload XM components.

HDFS
  • Node 1: Balancer, Gateway, NameNode, NFS Gateway, SecondaryNameNode
  • Nodes 2, 3, 4: DataNode
  • Node 5: Gateway
YARN
  • Node 1: Gateway, JobHistory Server, ResourceManager
  • Nodes 2, 3, 4: NodeManager
HBase
  • Node 1: Gateway, Master, Thrift Server
  • Nodes 2, 3, 4: RegionServer
  • Node 5: Gateway
Phoenix
  • Node 1: Query Server
Hive
  • Node 1: Gateway, Metastore Server, HiveServer2 (Hive on Tez for CDP)
  • Node 5: Gateway
Impala
  • Node 1: Catalog Server, StateStore
  • Nodes 2, 3, 4: Impala Daemon
Hue
  • Node 1: Load Balancer, Hue Server
Cloudera Management
  • Node 1: Alert Publisher, Event Server, Host Monitor, Reports Manager, Service Monitor
Workload XM
  • Node 5: Console Server
Zookeeper
  • Nodes 2, 3, 4: Server

The following components will be placed automatically:

  • Admin API Server
  • Analytic Database Server
  • API Server
  • Baseline Server
  • Databus API Server
  • Databus Server
  • Entities Server
  • Pipelines Server
  • SDX Server

Add Safety Valves

Add the safety valves in the Phoenix and HBase site.xml files. You must have the Cluster Administrator or Full Administrator role to complete this task.

  1. In Cloudera Manager, open the HBase service and click the Configuration tab.
  2. Search for snippet and find the following safety valve: HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml
  3. Click View as XML and add the following text to the existing text:
    <property>
       <name>hbase.regionserver.wal.codec</name>
       <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
       <description>Set hbase.regionserver.wal.codec to enable custom Write Ahead Log ("WAL") edits to be written</description>
    </property>
    <property>
       <name>hbase.region.server.rpc.scheduler.factory.class</name>
       <value>org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory</value>
       <description>Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates</description>
    </property>
    <property>
       <name>hbase.rpc.controllerfactory.class</name>
       <value>org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory</value>
       <description>Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates</description>
    </property>
    <property>
       <name>phoenix.functions.allowUserDefinedFunctions</name>
       <value>true</value>
       <description>enable UDF functions</description>
    </property>
    <property>
       <name>phoenix.queryserver.serialization</name>
       <value>JSON</value>
       <description>serialization format between client and query server</description>
    </property>
    <property>
       <name>hbase.server.keyvalue.maxsize</name>
       <value>52428800</value>
       <description>limits max file size for blobs</description>
    </property>
    <property>
       <name>phoenix.schema.isNamespaceMappingEnabled</name>
       <value>true</value>
    </property>



  4. Confirm that the WAL Codec Class property has the following text:
    org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec
  5. Click Save Changes.
  6. Still in the Configuration tab, search for the following property: Maximum Size of HBase Client KeyValue

    Set the property to 50 MiB.



  7. If you are installing Workload XM in a Kerberized environment, make sure the wxm user has been added to the hbase.superuser property, as described in the Security section above.



  8. Click Save Changes.
  9. Go back to the Cloudera Manager home page and open the Phoenix service. From there, open the Configuration tab.
  10. Search for snippet and find the following safety valve: Query Server Advanced Configuration Snippet (Safety Valve) for phoenix-site.xml
  11. Click View as XML and enter the following text:
    <property>
       <name>phoenix.queryserver.serialization</name>
       <value>JSON</value>
       <description>serialization format between client and query server</description>
    </property>
    <property>
       <name>phoenix.schema.isNamespaceMappingEnabled</name>
       <value>true</value>
    </property>



  12. Click Save Changes.
  13. On the Cloudera Manager home page, click Deploy Client Configuration. Restart the HBase and Phoenix services. To restart a service, click the menu button next to the service and select Restart.
  14. To validate the Phoenix installation, follow the instructions here: Validating the Phoenix installation
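
After the client configuration is deployed, one way to spot-check the safety valve settings is to read them back from a deployed hbase-site.xml. The sketch below is a simple text match rather than a real XML parser, and it assumes that each `<name>` element is followed directly by its `<value>`. The function names are illustrative, and the file path you pass (for example /etc/hbase/conf/hbase-site.xml on a gateway host) is an assumption about where your deployment writes client configuration. The expected key-value size is computed as 50 × 1024 × 1024 = 52428800 bytes, which matches the 50 MiB setting above.

```shell
#!/bin/sh
# Print the <value> of one property from a Hadoop-style *-site.xml file.
# Sketch only: the file is joined into a single line and matched with sed,
# so <name> must be immediately followed by <value>.
site_xml_value() {
    file="$1"
    name="$2"
    tr -d '\n' < "$file" |
        sed -n "s|.*<name>$name</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}

# Check a few of the properties this guide adds; returns 1 if any differ.
check_wxm_hbase_site() {
    site_file="$1"
    status=0
    for pair in \
        "hbase.regionserver.wal.codec=org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec" \
        "phoenix.schema.isNamespaceMappingEnabled=true" \
        "phoenix.queryserver.serialization=JSON" \
        "hbase.server.keyvalue.maxsize=$((50 * 1024 * 1024))"
    do
        prop=${pair%%=*}
        want=${pair#*=}
        got=$(site_xml_value "$site_file" "$prop")
        if [ "$got" = "$want" ]; then
            echo "ok       $prop"
        else
            echo "MISSING  $prop (found '$got')"
            status=1
        fi
    done
    return "$status"
}
```

Run `check_wxm_hbase_site <path to hbase-site.xml>` and review any line marked MISSING in Cloudera Manager before you continue.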

Extract and Deploy WXM Package

Deploy Workload XM with the CSD. As mentioned in the Prerequisites section of this guide, you need the following files:

  • Parcel: WXM-2.1.0.2.1.0-b804-3162213-el7.parcel
  • Parcel SHA: WXM-2.1.0.2.1.0-b804-3162213-el7.parcel.sha
  • CSD: WXM-2.1.0.2.1.0-b804-3162213.jar

If you do not already have the files, download them here: https://www.cloudera.com/downloads/workloadxm.html. If you do not have access to the downloads, contact your administrator for login information.

  1. Log in to the Cloudera Manager Server host and save the CSD in the following location:
    /opt/cloudera/csd
  2. From the /opt/cloudera/csd directory, set the file ownership and permissions with the following commands:
    chown cloudera-scm:cloudera-scm WXM-*
    chmod 644 WXM-*
  3. Restart the Cloudera Manager Server with the following command:
    service cloudera-scm-server restart
  4. Log in to Cloudera Manager. In the Cloudera Management Service section in the left panel, click the menu button and select Restart. This restart is required in addition to the Cloudera Manager Server restart in the previous step.
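
The copy, ownership, and permission steps above can be collected into one helper. This is a minimal sketch, not a Cloudera-provided script: the function name install_csd is hypothetical, the target directory defaults to /opt/cloudera/csd from step 1, and the chown is skipped when the script is not run as root or the cloudera-scm account does not exist.

```shell
#!/bin/sh
# Copy a CSD jar into the CSD directory and set the mode and owner that the
# steps above describe. Prints the path of the installed file.
install_csd() {
    csd_jar="$1"
    csd_dir="${2:-/opt/cloudera/csd}"
    mkdir -p "$csd_dir" || return 1
    cp "$csd_jar" "$csd_dir/" || return 1
    target="$csd_dir/$(basename "$csd_jar")"
    chmod 644 "$target" || return 1
    # chown needs root and an existing cloudera-scm account; skip otherwise.
    if [ "$(id -u)" -eq 0 ] && id cloudera-scm >/dev/null 2>&1; then
        chown cloudera-scm:cloudera-scm "$target" || return 1
    fi
    echo "$target"
}
```

After `install_csd WXM-2.1.0.2.1.0-b804-3162213.jar` succeeds, continue with the Cloudera Manager Server restart and the Cloudera Management Service restart described in steps 3 and 4.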

Install Workload XM On-Premises

Next, install Workload XM on-premises!

Activating the Workload XM Parcel

  1. In Cloudera Manager, click Hosts > Parcels. This opens the Parcels page.
  2. Click Distribute for the Workload XM parcel.



  3. When the parcel is distributed, the parcel moves to the Distributed status section in the left menu. Find it there and click Activate.



  4. In the confirmation window that opens, click OK.
  5. The Workload XM parcel will appear as distributed and activated:



Deploy Workload XM

Before you start, gather the following information, which you will use when you add the Workload XM service:

  • Phoenix Query Server Host
    • Open the Phoenix service and click the Instances tab. The Query Server host is listed there.
  • Impala Daemon Host
    • Open the Impala service and click the Instances tab. The Impala Daemon hosts are listed there. Pick any host that is running the Impala Daemon.
  1. Open the Cluster menu and select Add Service.



  2. In the window that opens, select Workload XM from the list of services and click Continue.
  3. On the Assign Roles page, you can assign roles for the Console Server.
  4. In the Review Changes page, you need to configure a few things. This is where you enter the Phoenix and Impala information you gathered at the beginning of this section:
    • Phoenix Query Server Host
    • Impala Daemon Host



  5. You can also see how much memory is being allocated for each group. If there is insufficient memory, Cloudera Manager lowers the default and shows a warning.
    • The following table contains the recommended heap sizes for various components:

      Service                   Heap Size
      Analytic Database Server  8 GB
      API Server                4 GB
      Admin API Server          2 GB
      Baseline Server           8 GB
      Databus API Server        4 GB
      Databus Server            2 GB
      Entities Server           8 GB
      Pipelines Server          8 GB
  6. On the bottom of the Review Changes page, you can edit these properties:
    • Console Service TLS/SSL Server Private Key File. Enter the location of your private key file.
    • Console Service TLS/SSL Server Certificate File. Enter the location of the certificate.
    • Console Service TLS/SSL Private Key password. Enter the password for your private key.
  7. Leave the Console Service TLS/SSL Server CA Certificate text box blank.



  8. Click Continue.
  9. You can watch the progress of the installation. When the installation is complete, the Workload XM Service will appear in the list of services in Cloudera Manager.
  10. Restart the Cloudera Management Service.
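
As a quick check on the recommended heap sizes in step 5, the sketch below adds them up. The total is only meaningful for a single host if all eight roles are placed on the same node, which is an assumption: this guide does not say where Cloudera Manager places the automatically assigned roles. The function name heap_total_gb is illustrative.

```shell
#!/bin/sh
# Sum a list of heap sizes given in GB.
heap_total_gb() {
    total=0
    for gb in "$@"; do
        total=$((total + gb))
    done
    echo "$total"
}

# Recommended heap sizes in table order: Analytic Database, API, Admin API,
# Baseline, Databus API, Databus, Entities, Pipelines.
heap_total_gb 8 4 2 8 4 2 8 8    # prints 44
```

Under that assumption, the recommended heaps total 44 GB, which leaves 20 GB of a 64 GB node for the operating system and any other roles on the host.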

Configure Telemetry Publisher on the CDH Cluster

After Workload XM on-premises is installed, you must configure Telemetry Publisher on your CDH cluster.

Complete the following steps on the cluster where the workloads are running:

  1. In Cloudera Manager, open the Cloudera Management Service and open the Configuration tab.
  2. Search for the following property: Telemetry Publisher Advanced Configuration Snippet (Safety Valve) for telemetrypublisher.conf
  3. Add the text below for the property. If TLS/SSL is enabled for the Databus API Server (ssl.enabled), then use https for the telemetry.altus.url property. Otherwise, use http.
    telemetry.upload.job.logs=true
    telemetry.altus.url=<http|https>://<host of the Databus API Service>:12022
    telemetry.wa.enabled=true
    
  4. Next, follow the instructions for configuring Telemetry Publisher in the Workload XM documentation: Using Workload Experience Manager

    After you configure Telemetry Publisher, restart the Cloudera Management Service. In the Cloudera Management Service section in the left panel, click the menu button and select Restart.

  5. At this point, and before you log in to Workload XM, you can configure local authentication or LDAP authentication. For information on how to do this, see the Authentication section above.
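
The safety valve text in step 3 differs only in the URL scheme and host. The sketch below generates the three lines for a given Databus API Server host, choosing https when TLS/SSL is enabled and http otherwise. The function name telemetry_snippet and the example host name are hypothetical. Port 12022 is taken from step 3.

```shell
#!/bin/sh
# Print the telemetrypublisher.conf safety valve lines.
# $1 = host of the Databus API Server
# $2 = "true" if TLS/SSL is enabled for the Databus API Server (default false)
telemetry_snippet() {
    host="$1"
    tls="${2:-false}"
    if [ "$tls" = "true" ]; then
        scheme=https
    else
        scheme=http
    fi
    cat <<EOF
telemetry.upload.job.logs=true
telemetry.altus.url=$scheme://$host:12022
telemetry.wa.enabled=true
EOF
}
```

For example, `telemetry_snippet wxm-node5.example.com true` prints the snippet with an https URL, ready to paste into the safety valve.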

Login to Workload XM On-Premises

Now you're ready to log in to the UI!

  1. Open the Workload XM on-premises service and click the Workload XM UI button. This opens the Workload XM login screen.
  2. Enter your username and password to log in. The default is admin/admin.
  3. In the Clusters page, click on a cluster name to view the workload analytics for that cluster. Note that if you have not configured Telemetry Publisher on the CDH cluster, as described in the previous section, the cluster name will not show up here.
  4. For more information about how to use the Workload XM UI, see the Workload XM documentation here: Using Workload Experience Manager

And you're finished! The sections below contain additional information about the log files and ports that you may find useful.

Ports

Workload XM on-premises uses the following ports:

Service                   Web Port      GRPC Port
API Server                12011, 12012  -
Databus API Server        12021, 12022  -
Analytic Database Server  12031         12032
Baseline Server           12041         12042
Databus Server            12051         12052
Entities Server           12061         12062
Pipelines Server          12071         12072
Admin API Server          12111         12112

The following ports are exposed service-side:

  • The Phoenix Query Server Port (phoenix.queryserver.port) is the port for the Phoenix Query Server that will be used by Workload XM.
  • The Impala Daemon Port (impala.daemon.port) is the port for the Impala Daemon that will be used by Workload XM.

For each role, refer to the chart below for a complete list of configurable ports and their default values. The purpose of each type of port is as follows:

  • UI Port (ui.port) - Serves the Workload XM UI. It is served over HTTPS if TLS/SSL is enabled, and over HTTP otherwise.
  • API Port (api.port) - Listens to REST calls to API-based servers. It is served over HTTPS if TLS/SSL is enabled, and over HTTP otherwise.
  • Metrics Port (webservice.port) - Exposes an interface to metrics that Workload XM roles collect.
  • GRPC Port (grpc.port) - Listens to GRPC requests against the backend servers. This protocol is used for inter-role communication.

Service                   UI Port    API Port    Metrics Port       GRPC Port
                          (ui.port)  (api.port)  (webservice.port)  (grpc.port)
Console Server            12001      -           -                  -
API Server                -          12012       12011              -
Databus API Server        -          12022       12021              -
Analytic Database Server  -          -           12031              12032
Baseline Server           -          -           12041              12042
Databus Server            -          -           12051              12052
Entities Server           -          -           12061              12062
Pipelines Server          -          -           12071              12072
SDX Server                -          -           12081              12082
Admin API Server          -          -           12111              12112
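
To confirm that a port from the table is reachable from another host (for example, from a CDH cluster that must reach the Databus API Server), a TCP connection test is often enough. The sketch below uses bash's built-in /dev/tcp redirection with a 3-second timeout, so it requires bash and the coreutils timeout command on the machine where you run it. The function names and the example host name are hypothetical.

```shell
#!/bin/sh
# Succeeds if a TCP connection to host $1 on port $2 can be opened.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Report the state of each listed port on one host.
report_ports() {
    host="$1"
    shift
    for port in "$@"; do
        if port_open "$host" "$port"; then
            echo "$host:$port open"
        else
            echo "$host:$port closed"
        fi
    done
}
```

For example, `report_ports wxm-node5.example.com 12001 12012 12022` checks the Console Server UI port and the two API ports. A port reported as closed can mean that the role is not running, that it runs on a different host, or that a firewall blocks the connection.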

Troubleshooting

Failure Creating Phoenix Schema

You may encounter a stack trace that looks like this:

Role Log

at org.apache.phoenix.shaded.org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at org.apache.phoenix.shaded.org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
at org.apache.phoenix.shaded.org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
at org.apache.phoenix.shaded.org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at org.apache.phoenix.shaded.org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLException: ERROR 725 (43M08): Cannot create schema because config phoenix.schema.isNamespaceMappingEnabled for enabling name space mapping isn't enabled. schemaName=SIGMA_DB

To resolve this, add the following property to the hbase-site.xml safety valve:

<property><name>phoenix.schema.isNamespaceMappingEnabled</name><value>true</value></property>

After you add the property, redeploy the client configurations and restart HBase and the dependent services.
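
To check whether a role log contains this particular failure, you can search it for the Phoenix error code shown in the stack trace. This is a minimal sketch: the function name is illustrative, and the log file path is whatever role log you are inspecting, since this guide does not list log locations.

```shell
#!/bin/sh
# Succeeds if the given log file contains the Phoenix namespace-mapping
# error (ERROR 725, SQLSTATE 43M08) shown in the stack trace above.
has_namespace_mapping_error() {
    grep -F -q 'ERROR 725 (43M08)' "$1"
}
```

For example, `has_namespace_mapping_error <role log file> && echo "namespace mapping is not enabled"` prints the message only when the error is present.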

Installation Fails with Message

  1. The installation fails with the following message:



    Verify whether the parcel is correctly distributed and activated.

  2. The parcels page shows the following errors while distributing the parcel:



    This error happens when the parcel for Workload XM was placed in an incorrect directory and cloudera-scm-server was restarted. Remove the parcel from any unwanted directory and place it back in the parcel-repo directory. Then restart cloudera-scm-server and cloudera-scm-agent.
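
To locate copies of the Workload XM parcel outside the parcel repository, a find command such as the one sketched below can help. It assumes the repository is /opt/cloudera/parcel-repo, which is the usual Cloudera Manager default but may differ in your installation. The function name is hypothetical.

```shell
#!/bin/sh
# List WXM parcel files (and their .sha files) under $1 that are not inside
# the parcel repository directory $2.
find_stray_parcels() {
    search_root="$1"
    repo_dir="${2:-/opt/cloudera/parcel-repo}"
    find "$search_root" -type f -name 'WXM-*.parcel*' ! -path "$repo_dir/*" 2>/dev/null
}
```

For example, `find_stray_parcels /opt/cloudera` lists candidate files to review. Not every result is misplaced, because Cloudera Manager agents may keep their own cached copies of distributed parcels, so inspect each path before you remove anything.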