On-Premises Installation

This guide describes how to install and configure Workload XM 2.0 on-premises.

Introduction

Workload XM on-premises must be installed in a dedicated cluster, separate from your development, test, or production workload clusters. This configuration minimizes the impact on your workload clusters and avoids the need to upgrade your CDH clusters to meet the requirements of Workload XM.

The following terminology is used in this guide:

  • Workload XM on-premises - A CDP cluster managed by Cloudera Manager where Workload XM is installed on-premises. This cluster is used exclusively for running Workload XM.

  • CDH Cluster - A CDH cluster managed by Cloudera Manager that runs the analytical workloads that Workload XM analyzes.

This guide contains the prerequisites for running Workload XM on-premises and instructions for installing and deploying Workload XM.

Prerequisites

Before you install Workload XM on-premises, verify the following prerequisites:

Hardware Requirements

In addition to the requirements for CDH and the other services you have installed, running Workload XM on-premises requires:

  • 5 nodes
  • 64 GB RAM
  • 12 TB disk space

For detailed information about the requirements for CDH, Cloudera Manager, and the other services, see the CDH Requirements and Supported Versions documentation.

Operating System Requirements

Workload XM on-premises supports the following operating systems:

CentOS and RHEL versions:

  • 7.6, 7.5, 7.4, 7.3, 7.2

CDH and Cloudera Manager Versions

Workload XM on-premises must run on CDP.

Analysis of data from the following CDH clusters is supported:

  • CDH 5.x clusters:
    • CDH version 5.8 and later
    • Cloudera Manager version 5.15.1 and later
  • CDH 6.x clusters:
    • CDH version 6.1 and later
    • Cloudera Manager version 6.1 and later
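
As a minimal sketch, the version rules above can be encoded in a small shell check. The function names version_ge, cdh_supported, and cm_supported are hypothetical helpers, not part of any Cloudera tool, and the comparison assumes GNU sort with the -V (version sort) option is available.

```shell
# Hypothetical helpers that encode the supported-version list above.

version_ge() {
  # Succeeds when version $1 is greater than or equal to version $2.
  # Relies on GNU sort -V to order the two version strings.
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

cdh_supported() {
  # CDH 5.x needs 5.8 or later; CDH 6.x needs 6.1 or later.
  case "$1" in
    5.*) version_ge "$1" "5.8" ;;
    6.*) version_ge "$1" "6.1" ;;
    *)   return 1 ;;
  esac
}

cm_supported() {
  # Cloudera Manager 5.x needs 5.15.1 or later; 6.x needs 6.1 or later.
  case "$1" in
    5.*) version_ge "$1" "5.15.1" ;;
    6.*) version_ge "$1" "6.1" ;;
    *)   return 1 ;;
  esac
}
```

For example, `cdh_supported 5.14.2 && cm_supported 5.15.1 && echo "supported"` prints supported, while `cdh_supported 5.7.2` returns a non-zero status.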

Services

You must have the following services installed on the Workload XM cluster:

  • HBase
  • HDFS
  • Hive
  • Hue (optional, but recommended for troubleshooting and data extraction)
  • Impala
  • Phoenix
  • YARN
  • ZooKeeper

Security

For this version of Workload XM on-premises, Kerberos must not be installed and the Workload XM cluster must not be Kerberized.

SSL between the browser and the Workload XM UI is supported.

Auto-TLS must also be disabled in Cloudera Manager.

LDAP authentication is supported. Workload XM has a similar LDAP configuration to Cloudera Manager. For more information, see Configuring External Authentication and Authorization for Cloudera Manager.

Component Layout

Cloudera recommends the following component layout for a 5-node cluster. Node 1 hosts the master components of all services, Nodes 2, 3, and 4 host the worker components of all services, and Node 5 hosts the Workload XM components.

HDFS (Non-HA / Non-Kerberized)
  • Node 1: Balancer, Gateway, NameNode, NFS Gateway, SecondaryNameNode
  • Nodes 2, 3, 4: DataNode
  • Node 5: Gateway
YARN
  • Node 1: Gateway, JobHistory Server, ResourceManager
  • Nodes 2, 3, 4: NodeManager
HBase
  • Node 1: Gateway, Master, Thrift Server
  • Nodes 2, 3, 4: RegionServer
  • Node 5: Gateway
Phoenix
  • Node 1: Query Server
Hive
  • Node 1: Gateway, Metastore Server, HiveServer2 (Hive on Tez for CDP)
  • Node 5: Gateway
Impala
  • Node 1: Catalog Server, StateStore
  • Nodes 2, 3, 4: Impala Daemon
Hue
  • Node 1: Load Balancer, Hue Server
Cloudera Management
  • Node 1: Alert Publisher, Event Server, Host Monitor, Reports Manager, Service Monitor
Workload XM
  • Node 5: Console Server
ZooKeeper
  • Nodes 2, 3, 4: Server

The following components will be placed automatically:

  • Databus API Server
  • Admin API Server
  • API Server
  • Baseline Server
  • Databus Server
  • Entities Server
  • Pipelines Server
  • Analytic Database Server

Add Safety Valves

Add the safety valves to the HBase hbase-site.xml and Phoenix phoenix-site.xml files. You must have the Cluster Administrator or Full Administrator role to complete this task.

  1. In Cloudera Manager, open the HBase service and click the Configuration tab.
  2. Search for snippet and find the following safety valve: HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml
  3. Click View as XML and add the following text to the existing text:
    <property>
       <name>hbase.regionserver.wal.codec</name>
       <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
       <description>Set hbase.regionserver.wal.codec to enable custom Write Ahead Log ("WAL") edits to be written</description>
    </property>
    <property>
       <name>hbase.region.server.rpc.scheduler.factory.class</name>
       <value>org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory</value>
       <description>Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates</description>
    </property>
    <property>
       <name>hbase.rpc.controllerfactory.class</name>
       <value>org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory</value>
       <description>Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates</description>
    </property>
    <property>
       <name>phoenix.functions.allowUserDefinedFunctions</name>
       <value>true</value>
       <description>enable UDF functions</description>
    </property>
    <property>
       <name>phoenix.queryserver.serialization</name>
       <value>JSON</value>
       <description>serialization format between client and query server</description>
    </property>
    <property>
       <name>hbase.server.keyvalue.maxsize</name>
       <value>52428800</value>
       <description>limits max file size for blobs</description>
    </property>
    



  4. Confirm that the WAL Codec Class property has the following text:
    org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec
  5. Click Save Changes.
  6. Still in the Configuration tab, search for the following property: Maximum Size of HBase Client KeyValue

    Set the property to 50 MiB.



  7. Click Save Changes.
  8. Go back to the Cloudera Manager home page and open the Phoenix service. From there, open the Configuration tab.
  9. Search for snippet and find the following safety valve: Query Server Advanced Configuration Snippet (Safety Valve) for phoenix-site.xml
  10. Click View as XML and enter the following text:
    <property>
       <name>phoenix.queryserver.serialization</name>
       <value>JSON</value>
       <description>serialization format between client and query server</description>
    </property>
    <property>
       <name>phoenix.schema.isNamespaceMappingEnabled</name>
       <value>true</value>
    </property>



  11. Click Save Changes.
  12. On the Cloudera Manager home page, click Deploy Client Configuration. Restart the HBase and Phoenix services. To restart a service, click the menu button next to the service and select Restart.
  13. To validate the Phoenix installation, follow the instructions here: Validating the Phoenix installation
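
After the restart, one way to spot a missed safety valve entry is to search a rendered site file for the property names added above. The following is a minimal sketch: missing_properties is a hypothetical helper, it assumes each <name> element sits on a single line as in the snippets above, and the path to the file you inspect is something you supply yourself.

```shell
# Hypothetical helper: print every property name that does NOT appear in the given site file.
# Usage: missing_properties <site.xml> <property-name>...
missing_properties() {
  site_file="$1"
  shift
  for prop_name in "$@"; do
    # -F matches the text literally, so dots in property names are not treated as wildcards.
    grep -qF "<name>${prop_name}</name>" "$site_file" || printf '%s\n' "$prop_name"
  done
}
```

For example, `missing_properties /path/to/hbase-site.xml hbase.regionserver.wal.codec hbase.rpc.controllerfactory.class hbase.server.keyvalue.maxsize` prints nothing when all three properties are present, and prints the name of each one that is absent otherwise.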

Extract and Deploy WXM Package

Deploy Workload XM with the CSD. These instructions assume that you have received the CSD from Cloudera or an ISV.

You will need this file to complete these steps: WXM-2.0.0.jar

  1. Log in to the Cloudera Manager Server host and save the CSD in the following location:
    /opt/cloudera/csd
  2. From the /opt/cloudera/csd directory, set the file ownership and permissions with the following commands:
    chown cloudera-scm:cloudera-scm WXM-*
    chmod 644 WXM-*
  3. Restart the Cloudera Manager Server with the following command:
    service cloudera-scm-server restart
  4. Log in to Cloudera Manager. In the Cloudera Management Service section in the left panel, click the menu button and select Restart.
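
If Cloudera Manager does not pick up the CSD after the restart, the ownership and mode set in step 2 are worth rechecking. The sketch below assumes GNU stat (as shipped with CentOS and RHEL); has_owner_and_mode is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the file has the expected owner, group, and octal mode.
# Usage: has_owner_and_mode <file> <user:group> <octal-mode>
has_owner_and_mode() {
  # GNU stat: %U = owner name, %G = group name, %a = permissions in octal.
  [ "$(stat -c '%U:%G %a' "$1")" = "$2 $3" ]
}
```

For example, `has_owner_and_mode /opt/cloudera/csd/WXM-2.0.0.jar cloudera-scm:cloudera-scm 644 || echo "fix ownership or mode"` prints the message only when the file does not match.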

Install Workload XM On-Premises

Next, install Workload XM on-premises!

Activating the Workload XM Parcel

Place the parcel and .sha file on the Cloudera Manager Server host.

You will need these two files to complete these steps:

  • WXM-2.0.0-el7.parcel
  • WXM-2.0.0-el7.parcel.sha
  1. Log in to the Cloudera Manager Server host and save the parcel and .sha file in the following location:
    /opt/cloudera/parcel-repo/
  2. From the /opt/cloudera/parcel-repo/ directory, use the following commands to change the ownership and permissions on the files:
    chown cloudera-scm:cloudera-scm WXM-*
    chmod 644 WXM-*
    
  3. In Cloudera Manager, click Hosts > Parcels. This opens the Parcels page.
  4. Click Distribute for the Workload XM parcel.



  5. When the parcel is distributed, the button changes from Distribute to Activate. Click Activate.



  6. A window opens asking if you’re sure you want to activate the parcel. You are sure. Click OK.
  7. The Workload XM parcel now appears as distributed and activated.
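
If the parcel does not show up on the Parcels page or fails to distribute, a corrupted download is one possible cause. The sketch below compares the parcel's checksum with the accompanying .sha file. It assumes the .sha file holds the SHA-1 hash of the parcel as its first field, which is the usual convention for Cloudera Manager parcel repositories but is not stated in this guide. parcel_sha_matches is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if <parcel>.sha matches the SHA-1 of <parcel>.
# Usage: parcel_sha_matches <parcel-file>
parcel_sha_matches() {
  # First whitespace-separated field of the .sha file (assumed to be a SHA-1 hex digest).
  expected_sha="$(awk '{ print $1; exit }' "$1.sha")"
  # sha1sum prints "<digest>  <filename>"; keep only the digest.
  actual_sha="$(sha1sum "$1" | awk '{ print $1 }')"
  [ -n "$expected_sha" ] && [ "$expected_sha" = "$actual_sha" ]
}
```

For example, `parcel_sha_matches /opt/cloudera/parcel-repo/WXM-2.0.0-el7.parcel && echo "checksum OK"` prints the message only when the two digests agree.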



Deploy Workload XM

Before you start, gather the following information, which you will use when you add the Workload XM service:

  • Phoenix Query Server Host
    • Open the Phoenix service and click the Instances tab. The Query Server host is listed there.
  • Impala Daemon Host
    • Open the Impala service and click the Instances tab. The Impala Daemon host is listed there.
  1. Open the Cluster menu and select Add Service.



  2. In the window that opens, select Workload XM from the list of services and click Continue.
  3. On the Assign Roles page, you can assign roles for the following services:
    • Workload XM Console Service
    • Workload XM ADB Service
    • Workload XM Pipelines Service



  4. On the Review Changes page, enter the Phoenix and Impala information you gathered at the beginning of this section:
    • Phoenix Query Server Host
    • Impala Daemon Host



  5. You can also see how much memory is being allocated for each group. If there is insufficient memory, Cloudera Manager lowers the default and shows a warning.
    • The following table contains the recommended heap sizes for various components:

      Service               Heap Size
      ADB Service           8 GB
      API Service           4 GB
      Admin API Service     2 GB
      Baseline Service      8 GB
      DBUS API Service      4 GB
      DBUS Service          2 GB
      Entities Service      8 GB
      Pipelines Service     8 GB
      SDX Service           4 GB
      Time Series Service   4 GB
      Upload Service        2 GB
  6. At the bottom of the Review Changes page, you can edit these properties:
    • Console Service TLS/SSL Server Private Key File. Enter the location of your private key file.
    • Console Service TLS/SSL Server Certificate File. Enter the location of the certificate.
    • Console Service TLS/SSL Private Key password. Enter the password for your private key.
  7. Leave the Workload XM Console Service Default Group text box blank.



  8. Click Continue.
  9. You can watch the progress of the installation. When the installation is complete, the Workload XM service will appear in the list of services in Cloudera Manager.
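
When reviewing the memory allocation in step 5, it can help to know what the recommended heap sizes add up to. The sketch below simply sums the eleven values from the heap size table. total_heap_gb is a hypothetical helper, and the total covers only the services listed in that table.

```shell
# Hypothetical helper: add up a list of heap sizes given in whole gigabytes.
total_heap_gb() {
  heap_total=0
  for heap_gb in "$@"; do
    heap_total=$((heap_total + heap_gb))
  done
  echo "$heap_total"
}

# Values in table order: ADB, API, Admin API, Baseline, DBUS API, DBUS,
# Entities, Pipelines, SDX, Time Series, Upload.
total_heap_gb 8 4 2 8 4 2 8 8 4 4 2   # prints 54
```

The recommended heaps therefore total 54 GB, which is worth comparing against the memory available on the hosts that run these roles.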

Configure Telemetry Publisher on the CDH Cluster

After Workload XM on-premises is installed, you must configure Telemetry Publisher on your CDH cluster.

Complete the following steps on the cluster where the workloads are running:

  1. In Cloudera Manager, open the Cloudera Management Service and open the Configuration tab.
  2. Search for the following property: Telemetry Publisher Advanced Configuration Snippet (Safety Valve) for telemetrypublisher.conf
  3. Add this text for the property:
    telemetry.altus.url=http://<host of the Databus API Service>:12022
  4. Next, follow the instructions for configuring Telemetry Publisher in the Workload XM documentation: Using Workload Experience Manager

    After you configure Telemetry Publisher, restart the Cloudera Management Service. In the Cloudera Management Service section in the left panel, click the menu button and select Restart.
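
As a rough sketch of step 3, the safety valve line can be generated from the Databus API Service host name, and the endpoint can be probed from a CDH cluster host before you restart anything. Both function names are hypothetical. The probe only shows that something answers HTTP on port 12022; it does not validate the Databus API itself.

```shell
# Hypothetical helper: print the safety valve line for a given Databus API Service host.
# Usage: telemetry_url_line <databus-api-host>
telemetry_url_line() {
  printf 'telemetry.altus.url=http://%s:12022\n' "$1"
}

# Hypothetical helper: succeed if an HTTP request to the host on port 12022
# receives any response within 5 seconds.
# Usage: databus_reachable <databus-api-host>
databus_reachable() {
  curl --silent --output /dev/null --max-time 5 "http://$1:12022/"
}
```

For example, `telemetry_url_line wxm-node5.example.com` prints `telemetry.altus.url=http://wxm-node5.example.com:12022`, where the host name is a placeholder for your own Databus API Service host.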

Log In to Workload XM On-Premises

Now you’re ready to log in to the UI!

  1. Open the Workload XM on-premises service and click the Workload XM UI button. This opens the Workload XM login screen.
  2. Enter your username and password to log in.
  3. On the Clusters page, click a cluster name to view the workload analytics for that cluster. Note that if you have not configured Telemetry Publisher on the CDH cluster, as described in the previous section, the cluster name will not show up here.
  4. For more information about how to use the Workload XM UI, see the Workload XM documentation here: Using Workload Experience Manager

And you're finished! The sections below contain additional information about the log files and ports that you may find useful.

Logs

You can view the Workload XM on-premises logs by following these steps:

  1. Open the Workload XM Service in Cloudera Manager.
  2. The Status Summary box on the right lists the Workload XM roles. Click a role name to open the role page.
  3. On the role page, there is a Log Files link in the menu bar. Click that and select Role Log Files.



  4. You can also click Download Full Log to download the logs.



Specify the User File

Optionally, you can specify a user file that contains the users who can log in to the UI.

To specify the user file, open the Workload XM service and click the Configuration tab. Search for the ZZZ property and enter the path and filename of the user file.

After you add the user file, you can edit the file with the executable. On the Workload XM host, navigate to the following directory:
/opt/cloudera/parcels/WXM-1.0/??? 

To view the help output, enter this command:

./onprem-wxm -h

You can add a user, remove a user, and list users with the following commands, respectively:

./onprem-wxm user add --user-file <user-file>
./onprem-wxm user remove --user-file <user-file>
./onprem-wxm user list --user-file <user-file>

When you add a user, follow the prompts to create the username and password. If you try to edit a user file that does not exist, a prompt asks you if you would like to create the file.

Ports

Workload XM on-premises uses the following ports:

Service              Web Port        gRPC Port
Console Service      12001
API Service          12011, 12012
DBUS API Service     12021, 12022
ADB Service          12031           12032
Baseline Service     12041           12042
DBUS Service         12051           12052
Entities Service     12061           12062
Pipelines Service    12071           12072
Admin API Service    12111           12112
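
When troubleshooting connectivity, it can be useful to check which of these ports are actually listening on a Workload XM host. The following is a minimal sketch, assuming the ss utility from iproute2 is installed. wxm_ports and port_listed are hypothetical helpers, and port_listed reads `ss -ltn` output from standard input so that it can also be run against saved output.

```shell
# Hypothetical helper: print every port from the table above.
wxm_ports() {
  echo "12001 12011 12012 12021 12022 12031 12032 12041 12042 12051 12052 12061 12062 12071 12072 12111 12112"
}

# Hypothetical helper: read `ss -ltn` output on stdin and succeed if any
# local address (4th column) ends in :<port>.
# Usage: ss -ltn | port_listed <port>
port_listed() {
  awk -v suffix=":$1" '
    NR > 1 && substr($4, length($4) - length(suffix) + 1) == suffix { found = 1 }
    END { exit !found }
  '
}
```

A possible usage loop is `for p in $(wxm_ports); do ss -ltn | port_listed "$p" || echo "port $p is not listening"; done`. Not every port is expected on every host, because the roles are spread across the cluster.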