Cloudera Navigator Metadata Server
Describes how to add and configure the Navigator Metadata Server role.
Continue reading:
- Adding the Navigator Metadata Server Role
- Starting, Stopping, and Restarting the Navigator Metadata Server
- Configuring the Navigator Metadata Server Storage Directory
- Configuring the Navigator Metadata Server Port
- Navigator Metadata Server Sizing and Performance Recommendations
- Moving a Navigator Metadata Server Role
- Enabling Hive Metadata Extraction in a Secure Cluster
- Configuring the Metadata Server to Mask Personally Identifiable Information
- Configuring a JMS Server for Policy Messages
- Enabling and Disabling Policy Expressions
Adding the Navigator Metadata Server Role
Minimum Required Role: Navigator Administrator (also provided by Full Administrator)
Before adding the Navigator Metadata Server role, configure the database where policies, roles, and audit report metadata are stored.
- Do one of the following:
- Select .
- On the Status tab of the Home page, in the Cloudera Management Service table, click the Cloudera Management Service link.
- Click the Instances tab.
- Click the Add Role Instances button. The Customize Role Assignments page displays.
- Assign the Navigator role to a host.
- Customize the assignment of role instances to hosts. The wizard evaluates the hardware configurations of the hosts to determine the best hosts for each role. The wizard assigns all
worker roles to the same set of hosts to which the HDFS DataNode role is assigned. These assignments are typically acceptable, but you can reassign them if necessary.
Click a field below a role to display a dialog containing a list of hosts. If you click a field containing multiple hosts, you can also select All Hosts to assign the role to all hosts or Custom to display the pageable hosts dialog.
The following shortcuts for specifying hostname patterns are supported:
- Range of hostnames (without the domain portion)

| Range Definition | Matching Hosts |
|---|---|
| 10.1.1.[1-4] | 10.1.1.1, 10.1.1.2, 10.1.1.3, 10.1.1.4 |
| host[1-3].company.com | host1.company.com, host2.company.com, host3.company.com |
| host[07-10].company.com | host07.company.com, host08.company.com, host09.company.com, host10.company.com |

- IP addresses
- Rack name
Click the View By Host button for an overview of the role assignment by hostname ranges.
- When you are satisfied with the assignments, click Continue. The Database Setup screen displays.
- Configure database settings:
- Choose the database type:
- Leave the default setting of Use Embedded Database to have Cloudera Manager create and configure required databases. Make a note of the auto-generated
passwords.
- Select Use Custom Databases to specify external databases.
- Enter the database host, database type, database name, username, and password for the database that you created when you set up the database.
- Click Test Connection to confirm that Cloudera Manager can communicate with the database using the information you have supplied. If the test succeeds in all cases, click Continue; otherwise check and correct the information you have provided for the database and then try the test again. (For some servers, if you are using the embedded database, you will see a message saying the database will be created at a later step in the installation process.) The Review Changes screen displays.
- Click Finish.
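The hostname range shortcuts listed in the role-assignment step can be illustrated with a short sketch. The following Python function is not part of Cloudera Manager; `expand_host_range` is a hypothetical helper that assumes a pattern contains at most one bracketed numeric range and that leading zeros in the start value indicate zero padding, as in the host[07-10] example.

```python
import re
from typing import List

# Matches one bracketed numeric range such as "[07-10]".
_RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_host_range(pattern: str) -> List[str]:
    """Expand a single [start-end] range in a host pattern into hostnames."""
    match = _RANGE.search(pattern)
    if match is None:
        return [pattern]  # No range present; treat the pattern as a literal host.
    start, end = match.group(1), match.group(2)
    # Assumption: keep zero padding only when the start value has a leading zero.
    width = len(start) if start.startswith("0") else 0
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    return [
        f"{prefix}{str(n).zfill(width)}{suffix}"
        for n in range(int(start), int(end) + 1)
    ]


if __name__ == "__main__":
    for host in expand_host_range("host[07-10].company.com"):
        print(host)
```

Running the sketch against the three patterns in the table should reproduce the hosts listed in the Matching Hosts column.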
Starting, Stopping, and Restarting the Navigator Metadata Server
- Do one of the following:
- Select .
- On the Status tab of the Home page, in the Cloudera Management Service table, click the Cloudera Management Service link.
- Click the Instances tab.
- Do one of the following depending on your role:
Minimum Required Role: Full Administrator
- Check the checkbox next to the Navigator Metadata Server role.
- Select . Click Action to confirm the action, where Action is Start, Stop, or Restart.
Minimum Required Role: Navigator Administrator (also provided by Full Administrator)
- Click the Navigator Metadata Server role link.
- Select Action this Navigator Metadata Server, where Action is Start, Stop, or Restart, and click to confirm the action.
Configuring the Navigator Metadata Server Storage Directory
Minimum Required Role: Navigator Administrator (also provided by Full Administrator)
Describes how to configure where the Navigator Metadata Server stores extracted data. The default is /var/lib/cloudera-scm-navigator.
- Do one of the following:
- Select .
- On the Status tab of the Home page, in the Cloudera Management Service table, click the Cloudera Management Service link.
- Click the Configuration tab.
- Click the Navigator Metadata Server Default Group.
- Specify the directory in the Navigator Metadata Server Storage Dir property.
- Click Save Changes.
- Click the Instances tab.
- Check the checkbox next to the Navigator Metadata Server role.
- Select .
Configuring the Navigator Metadata Server Port
Minimum Required Role: Navigator Administrator (also provided by Full Administrator)
Describes how to configure the port on which the Navigator UI is accessed. The default is 7187.
- Do one of the following:
- Select .
- On the Status tab of the Home page, in the Cloudera Management Service table, click the Cloudera Management Service link.
- Click the Configuration tab.
- Select .
- Specify the port in the Navigator Metadata Server Port property.
- Click Save Changes.
- Click the Instances tab.
- Check the checkbox next to the Navigator Metadata Server role.
- Select .
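After changing the port, one way to confirm that the Navigator UI port accepts connections is a plain TCP check. The sketch below uses only the Python standard library; `is_port_open` is a hypothetical helper, the hostname in the usage comment is a placeholder, and 7187 is the default port stated above. A successful connection shows only that something is listening on the port, not that the Navigator UI itself is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage (hostname is a placeholder for the Metadata Server host):
# print(is_port_open("navigator-host.example.com", 7187))
```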
Navigator Metadata Server Sizing and Performance Recommendations
Minimum Required Role: Navigator Administrator (also provided by Full Administrator)
Two activities determine Navigator Metadata Server resource requirements:
- Extracting metadata from the cluster and creating relationships
- Querying
The Navigator Metadata Server uses Solr to store, index, and query metadata. Indexing happens during extraction. Querying is fast and efficient because the data is indexed.
Memory and CPU requirements are based on the amount of data that is stored and indexed. With 6 GB of RAM and 8-10 cores, Solr can process 6 million entities in 25-30 minutes or 80 million entities in 8-9 hours. With less than 6 GB of RAM, expect excessive garbage collection and possibly out-of-memory exceptions. For large clusters, Cloudera advises at least 8 GB of RAM and 8 cores. The Solr instance runs in-process with Navigator, so set the Java heap for the Navigator Metadata Server according to the size of the cluster.
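The throughput implied by these figures can be worked out with simple arithmetic. The following sketch is only a rough illustration: `throughput_range` is a hypothetical helper, and treating the quoted times as a steady rate is an assumption, since actual performance depends on hardware and on the data being indexed.

```python
from typing import Tuple


def throughput_range(entities: int, min_minutes: float, max_minutes: float) -> Tuple[float, float]:
    """Return (slowest, fastest) entities per minute for a quoted time range."""
    return entities / max_minutes, entities / min_minutes


# Figures quoted in the text for 6 GB of RAM and 8-10 cores.
small = throughput_range(6_000_000, 25, 30)           # 6 million in 25-30 minutes
large = throughput_range(80_000_000, 8 * 60, 9 * 60)  # 80 million in 8-9 hours

print(f"6M entities:  {small[0]:,.0f} to {small[1]:,.0f} entities/minute")
print(f"80M entities: {large[0]:,.0f} to {large[1]:,.0f} entities/minute")
```

The quoted figures work out to roughly 200,000 to 240,000 entities per minute for the smaller data set and roughly 148,000 to 167,000 for the larger one, so processing time should not be assumed to scale strictly linearly with entity count.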
By default, during the Cloudera Manager Installation wizard the Navigator Audit Server and Navigator Metadata Server are assigned to the same host as the Cloudera Management Service monitoring roles. This configuration works for a small cluster, but should be updated before the cluster grows. You can either change the configuration at installation time or move the Navigator Metadata Server if necessary.
Moving a Navigator Metadata Server Role
Minimum Required Role: Navigator Administrator (also provided by Full Administrator)
- Stop the Navigator Metadata Server role, delete it from the existing host, and add it to a new host.
- If the Solr data path is not on NFS/SAN, move the data to the same path on the new host.
- Start the Navigator Metadata Server role.
Enabling Hive Metadata Extraction in a Secure Cluster
Minimum Required Role: Navigator Administrator (also provided by Full Administrator)
The Navigator Metadata Server uses the hue user to connect to the Hive Metastore. The hue user is able to connect to the Hive Metastore by default. However, if the Hive service Hive Metastore Access Control and Proxy User Groups Override property and/or the HDFS service Hive Proxy User Groups property have been changed from their default values to settings that prevent the hue user from connecting to the Hive Metastore, Navigator Metadata Server will be unable to extract metadata from Hive. If this is the case, modify the Hive service Hive Metastore Access Control and Proxy User Groups Override property and/or the HDFS service Hive Proxy User Groups property so that the hue user can connect as follows:
- Go to the Hive or HDFS service.
- Click the Configuration tab.
- Expand the category.
- In the Hive service Hive Metastore Access Control and Proxy User Groups Override field or the HDFS service Hive Proxy User Groups field, click the Value column, and click to add a new row.
- Type hue.
- Click Save Changes to commit the changes.
- Restart the service.
Configuring the Metadata Server to Mask Personally Identifiable Information
Minimum Required Role: Navigator Administrator (also provided by Full Administrator)
- Do one of the following:
- Select .
- On the Status tab of the Home page, in the Cloudera Management Service table, click the Cloudera Management Service link.
- Click the Configuration tab.
- Expand the Navigator Metadata Server Default Group category.
- Click the Advanced category.
- Configure the PII Masking Regular Expression property with a regular expression that matches the credit card number formats to be masked. The default expression is:
(4[0-9]{12}(?:[0-9]{3})?)|(5[1-5][0-9]{14})|(3[47][0-9]{13})|(3(?:0[0-5]|[68][0-9])[0-9]{11})|(6(?:011|5[0-9]{2})[0-9]{12})|((?:2131|1800|35\\d{3})\\d{11})
which is constructed from the following subexpressions:
- Visa - (4[0-9]{12}(?:[0-9]{3})?)
- MasterCard - (5[1-5][0-9]{14})
- American Express - (3[47][0-9]{13})
- Diners Club - (3(?:0[0-5]|[68][0-9])[0-9]{11})
- Discover - (6(?:011|5[0-9]{2})[0-9]{12})
- JCB - ((?:2131|1800|35\\d{3})\\d{11})
- Click Save Changes to commit the changes.
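To see what the default expression matches before changing it, the subexpressions can be tried against sample values. The sketch below is a minimal illustration in Python, not a description of how the Metadata Server applies the property. It assumes that the doubled backslashes in the default value are string escaping, so `\\d` is written as `\d` in the raw strings; `mask_pii` and the `XXXX` replacement text are hypothetical.

```python
import re

# Default subexpressions from the list above, joined with "|".
# Assumption: "\\d" in the property value is an escaped "\d".
PII_PATTERN = re.compile(
    r"(4[0-9]{12}(?:[0-9]{3})?)"          # Visa
    r"|(5[1-5][0-9]{14})"                 # MasterCard
    r"|(3[47][0-9]{13})"                  # American Express
    r"|(3(?:0[0-5]|[68][0-9])[0-9]{11})"  # Diners Club
    r"|(6(?:011|5[0-9]{2})[0-9]{12})"     # Discover
    r"|((?:2131|1800|35\d{3})\d{11})"     # JCB
)


def mask_pii(text: str, mask: str = "XXXX") -> str:
    """Replace every match of the card-number pattern with a mask string."""
    return PII_PATTERN.sub(mask, text)


if __name__ == "__main__":
    # 4111111111111111 is a widely used Visa test number, not a real card.
    print(mask_pii("SELECT * FROM t WHERE card = '4111111111111111'"))
```

Note that the expression has no word boundaries, so it also matches card-like digit runs embedded in longer numbers; test any custom expression against representative data.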
Configuring a JMS Server for Policy Messages
- Do one of the following:
- Select .
- On the Status tab of the Home page, in the Cloudera Management Service table, click the Cloudera Management Service link.
- Click the Configuration tab.
- Expand the Navigator Metadata Server Default Group category.
- Expand the Policies category.
- Set the following properties:

| Property | Description |
|---|---|
| JMS URL | The URL of the JMS server to which notifications of changes to entities affected by policies are sent. Default: tcp://localhost:61616. |
| JMS User | The JMS user to which notifications of changes to entities affected by policies are sent. Default: Navigator. |
| JMS Password | The password of the JMS user to which notifications of changes to entities affected by policies are sent. Default: admin. |
| JMS Queue | The JMS queue to which notifications of changes to entities affected by policies are sent. Default: admin. |

- Click Save Changes to commit the changes.
- Restart the Metadata Server role.
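Before saving a JMS URL, it can help to check that the value has the scheme://host:port shape of the default, tcp://localhost:61616. The following sketch is a hedged example using Python's standard library; `parse_jms_url` is a hypothetical helper, and it deliberately handles only simple single-broker URLs, not any extended URL syntax a particular JMS provider may accept.

```python
from typing import Tuple
from urllib.parse import urlsplit


def parse_jms_url(url: str) -> Tuple[str, str, int]:
    """Split a simple scheme://host:port URL into (scheme, host, port)."""
    parts = urlsplit(url)
    # parts.port itself raises ValueError if the port is not a valid number.
    if not parts.scheme or parts.hostname is None or parts.port is None:
        raise ValueError(f"expected scheme://host:port, got {url!r}")
    return parts.scheme, parts.hostname, parts.port


if __name__ == "__main__":
    print(parse_jms_url("tcp://localhost:61616"))
```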
Enabling and Disabling Policy Expressions
Minimum Required Role: Navigator Administrator (also provided by Full Administrator)
- Do one of the following:
- Select .
- On the Status tab of the Home page, in the Cloudera Management Service table, click the Cloudera Management Service link.
- Click the Configuration tab.
- Expand the Navigator Metadata Server Default Group category.
- Expand the Policies category.
- Check or uncheck the Enable Expression Input checkbox.
- Click Save Changes to commit the changes.
- Restart the Metadata Server role.