New Features and Changes in Cloudera Navigator Data Management
New Features in Cloudera Navigator
The following sections describe what's new in each Cloudera Navigator release.
- What's New in Cloudera Navigator 2.15.2
- What's New in Cloudera Navigator 2.15.1
- What's New in Cloudera Navigator 2.14.2
- What's New in Cloudera Navigator 2.14.1
- What's New in Cloudera Navigator 2.14.0
- What's New in Cloudera Navigator 2.13.4
- What's New in Cloudera Navigator 2.13.2
- What's New in Cloudera Navigator 2.13.1
- What's New in Cloudera Navigator 2.13.0
- What's New in Cloudera Navigator 2.12.3
- What's New in Cloudera Navigator 2.12.1
- What's New in Cloudera Navigator 2.12.0
- What's New in Cloudera Navigator 2.11.2
- What's New in Cloudera Navigator 2.11.1
- What's New in Cloudera Navigator 2.11.0
- What's New in Cloudera Navigator 2.10.2
- What's New in Cloudera Navigator 2.10.1
- What's New in Cloudera Navigator 2.10.0
- What's New in Cloudera Navigator 2.9.2
- What's New in Cloudera Navigator 2.9.1
- What's New in Cloudera Navigator 2.9.0
- What's New in Cloudera Navigator 2.8.3
- What's New in Cloudera Navigator 2.8.2
- What's New in Cloudera Navigator 2.8.1
- What's New in Cloudera Navigator 2.8.0
- What's New in Cloudera Navigator 2.7.5
- What's New in Cloudera Navigator 2.7.4
- What's New in Cloudera Navigator 2.7.3
- What's New in Cloudera Navigator 2.7.2
- What's New in Cloudera Navigator 2.7.1
- What's New in Cloudera Navigator 2.7.0
- What's New in Cloudera Navigator 2.6.6
- What's New in Cloudera Navigator 2.6.5
- What's New in Cloudera Navigator 2.6.4
- What's New in Cloudera Navigator 2.6.3
- What's New in Cloudera Navigator 2.6.2
- What's New in Cloudera Navigator 2.6.1
- What's New in Cloudera Navigator 2.6.0
- What's New in Cloudera Navigator 2.5.1
- What's New in Cloudera Navigator 2.5.0
- What's New in Cloudera Navigator 2.4.6
- What's New in Cloudera Navigator 2.4.4
- What's New in Cloudera Navigator 2.4.3
- What's New in Cloudera Navigator 2.4.2
- What's New in Cloudera Navigator 2.4.1
- What's New in Cloudera Navigator 2.4.0
- What's New in Cloudera Navigator 2.3.10
- What's New in Cloudera Navigator 2.3.9
- What's New in Cloudera Navigator 2.3.8
- What's New in Cloudera Navigator 2.3.3
- What's New in Cloudera Navigator 2.3.1
- What's New in Cloudera Navigator 2.3.0
- What's New in Cloudera Navigator 2.2.10
- What's New in Cloudera Navigator 2.2.9
- What's New in Cloudera Navigator 2.2.4
- What's New in Cloudera Navigator 2.2.3
- What's New in Cloudera Navigator 2.2.2
- What's New in Cloudera Navigator 2.2.1
- What's New in Cloudera Navigator 2.2.0
- What's New in Cloudera Navigator 2.1.6
- What's New in Cloudera Navigator 2.1.5
- What's New in Cloudera Navigator 2.1.4
- What's New in Cloudera Navigator 2.1.2
- What's New in Cloudera Navigator 2.1.1
- What's New in Cloudera Navigator 2.1.0
- What's New in Cloudera Navigator 2.0.5
- What's New in Cloudera Navigator 2.0.3
- What's New in Cloudera Navigator 2.0.2
- What's New in Cloudera Navigator 2.0.1
- What's New in Cloudera Navigator 2.0.0
What's New in Cloudera Navigator 2.15.2
There are no new features in Cloudera Navigator 2.15.2. See Issues Fixed in Cloudera Navigator 2.15.2. See also Known Issues and Workarounds in Cloudera Navigator Data Management for all known issues to date.
What's New in Cloudera Navigator 2.15.1
In addition to several new features and changes to existing features listed below, this release also includes several resolved issues. See Issues Fixed in Cloudera Navigator 2.15.1 for details. See also Known Issues and Workarounds in Cloudera Navigator Data Management for all known issues to date.
- New default audit extraction filters
- Improved HDFS metadata extraction performance
- Improved metadata purge performance
- New Cloudera Manager health alerts for Navigator Metadata Server
- New audit events for Fine Grained Privileges and Ownership
New features include:
- Improvements to audit extraction filters
This release introduces changes to default HDFS audit filters to better focus the collection of audit events. The following rules were added or updated:
- All HDFS denied access events are accepted. Previously, HDFS denied access events from filtered users and filtered HDFS directories were not recorded.
- HDFS events from the Hive, Spark, and Impala staging directories are discarded. HDFS events from the following job history directories are also discarded:
- /user/history/done_intermediate
- /user/spark/applicationHistory
- /user/spark/spark2ApplicationHistory
- All HDFS delete and rename operations from directories that are not already filtered are accepted.
In addition, at the end of the filter list, Cloudera Manager provides a rule that filters events from HDFS getfileinfo operations. By default, this filter rule has no effect. Cloudera recommends that you change this rule to Discard to stop capturing events for this HDFS operation, which indicates access to file metadata but not access to file data. This change alone may reduce HDFS audit data size by up to 30%.
When upgrading to this release, you may not see the new filters if you have customized your HDFS audit filters previously:
- If HDFS filters have been changed from the default, the new filters will NOT take effect.
- If HDFS filters have not been changed from the default, the new filters will take effect.
To review the recommended filters and determine further optimizations for your system's requirements, see Recommended Audit Filters and Reviewing Default Audit Filters.
- Improved HDFS metadata extraction performance
Previously, Navigator processing for HDFS metadata extraction could take long enough that Navigator could not finish indexing the fsimage before the edit log was checkpointed into a new fsimage. This caused extraction to get stuck in a loop of always parsing the fsimage in bulk extraction mode (slow) rather than moving into incremental extraction from the edit log (fast). This release includes changes that allow Navigator to avoid this "bulk extraction loop", reducing the duration of the initial HDFS extraction by as much as half. This change also significantly improves the performance of subsequent extractions, both by making it much more likely that extraction will be incremental and through additional performance improvements in this fix.
- Improved metadata purge performance
This release includes changes that significantly reduce the time required to complete purging for HDFS entities and relations. The purge behavior is changed such that HDFS entities are not included in the purge if they are referenced as an endpoint in a data flow lineage relation.
- New Cloudera Manager health alerts for Navigator Metadata Server
This release includes new checks for Navigator Metadata Server health. If the checks fail, they trigger a health alert in Cloudera Manager for the Navigator Metadata Server. The checks include "Solr Element Count Threshold" and "Solr Relation Count Threshold". These checks trigger an alert if documents in Navigator Metadata Server's embedded Solr collection are threatening to exceed the maximum allowed number for either the element core or the relation core. Typically, if either of these alerts triggers, Navigator has encountered a problem that produces more relations than it should. The alerts let you know that this is happening while it can still be addressed efficiently.
- New audit events for Fine Grained Privileges and Ownership
Navigator Audit Server receives audit events from Sentry, Hive, and Impala that are produced when a user changes ownership for a database, table, or view. For more information, see Sentry's Object Ownership.
What's New in Cloudera Navigator 2.14.0
In addition to several new features and changes to existing features listed below, this release also includes several resolved issues. See Issues Fixed in Cloudera Navigator 2.14.0 for details. See also Known Issues and Workarounds in Cloudera Navigator Data Management for all known issues to date.
New features include:
- Dashboard widget for small files
Small files can be a performance problem on a Hadoop cluster, particularly when table partitions are assembled from many small files. The Navigator dashboard now provides widgets that visualize the presence of small files based on the files' owner, location, and size. An overview section shows the fraction of small files to large files in the cluster, including a warning if the fraction of small files to larger files exceeds 50%. This metric identifies "small files" as being smaller than 256 KiB.
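The small-file metric can be sketched as follows. This is a minimal illustration, not Navigator's implementation: the function names are hypothetical, and it assumes the "fraction" is the share of small files among all files, which the text above does not state precisely.

```python
# Hypothetical sketch of the small-file metric described above.
# Assumption: the fraction is (small files) / (all files).
SMALL_FILE_THRESHOLD_BYTES = 256 * 1024  # 256 KiB, as stated in the release note


def small_file_share(file_sizes):
    """Return the share of files that are smaller than 256 KiB."""
    if not file_sizes:
        return 0.0
    small = sum(1 for size in file_sizes if size < SMALL_FILE_THRESHOLD_BYTES)
    return small / len(file_sizes)


def should_warn(file_sizes, limit=0.5):
    """Return True when the share of small files exceeds the 50% limit."""
    return small_file_share(file_sizes) > limit
```

A file of exactly 256 KiB is not counted as small in this sketch, because the note defines small files as "smaller than" the threshold.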
- Dashboard widget for active users
This release includes an additional Navigator dashboard widget that shows the active users by query. The metric includes all Hive and Impala query operation executions.
- Preventing concurrent logins
An option is now available to limit the number of simultaneous sessions authenticated against the same user name. The property nav.max.concurrent.sessions takes an integer value as a limit; set the value to -1 (default) to turn off the limit. To change the default behavior, in Cloudera Manager, add the property with a new value to the "Navigator Metadata Server Advanced Configuration Snippet (Safety Valve) for cloudera-navigator.properties." Restart the Navigator Metadata Server to apply the change.
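The meaning of the property value can be sketched as follows. This is only an illustration of the semantics described above, not Navigator code: the function name is hypothetical, and the sketch assumes that a login is refused once the limit is reached, which the release note does not spell out. In the safety valve, the entry would be a single key=value line such as nav.max.concurrent.sessions=2.

```python
# Hypothetical sketch of how a nav.max.concurrent.sessions value is interpreted.
# Assumption: a new login is refused when the user already holds `limit` sessions.
def login_allowed(active_sessions: int, max_concurrent_sessions: int = -1) -> bool:
    """Return True if the user may open one more session."""
    if max_concurrent_sessions == -1:
        # -1 is the default and turns the limit off.
        return True
    return active_sessions < max_concurrent_sessions
```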
What's New in Cloudera Navigator 2.13.2
In addition to the new features listed below, this release also includes changes and resolved issues. See What's Changed in Cloudera Navigator 2.13.2 for changes, Issues Fixed in Cloudera Navigator 2.13.2 for resolved issues and Known Issues and Workarounds in Cloudera Navigator Data Management for known issues.
The new feature is:
- Preventing concurrent logins
An option is now available to limit the number of simultaneous sessions authenticated against the same user name. The property nav.max.concurrent.sessions takes an integer value as a limit; set the value to -1 (default) to turn off the limit. To change the default behavior, in Cloudera Manager, add the property with a new value to the "Navigator Metadata Server Advanced Configuration Snippet (Safety Valve) for cloudera-navigator.properties." Restart the Navigator Metadata Server to apply the change.
What's New in Cloudera Navigator 2.13.1
This release of Cloudera Navigator includes no new features or fixed issues. See Known Issues and Workarounds in Cloudera Navigator Data Management for known issues.
What's New in Cloudera Navigator 2.13.0
In addition to several new features and changes to existing features listed below, this release also includes several resolved issues. See Issues Fixed in Cloudera Navigator 2.13.0 for details. See also Known Issues and Workarounds in Cloudera Navigator Data Management for all known issues to date.
- Spark2 operations supported in search and lineage
- Improved Hive extraction performance
- Group by helps explore connections among search results
- User interface improvements
- Audit Event filtering with "Not Like"
New features include:
- Spark2 operations supported in search and lineage
This release introduces support for Spark2 operations and operation executions in Navigator metadata management, including search and lineage. Navigator extracts metadata for Spark2 events and generates lineage relations using that information. Spark operations and operation executions appear in Navigator with source type "Spark" and source names that start with "SPARK2_ON_YARN".
All the features of Spark 1 are available for Spark 2, starting with Spark 2 version 2.3.
At this time, lineage for Spark2 is not supported on Altus clusters.
- Improved Hive extraction performance
This release includes two areas of performance improvement for Hive metadata extraction:
- Partitions: Metadata extraction performance for Hive partitions is considerably improved.
- Incremental extraction: Metadata extraction through the Hive Metadata Server (HMS) now extracts only new and changed metadata. The result is more frequent extractions for Hive metadata, which reduces lag time for users, and significantly less load on HMS.
- Group by helps explore connections among search results
Metadata searches in Navigator now include the ability to group search results by common properties. Group by lets you use technical, managed, and custom metadata to quickly identify small files, active SQL users, table-creation trends, and other data aggregation trends revealed by metadata properties.
- User interface improvements
This release includes an improved layout for search results so they are easier to read and understand.
Also, the Navigator console includes some display changes to make detail screens easier to read:
- Tags are now included in the Custom Metadata group.
- Description, Managed Metadata, and Custom Metadata now always show above Technical Metadata, even if the sections are empty.
In addition, on the detail page for a database entity, clicking the table count summary statistic now takes you to search results listing the tables in the database.
- Audit Event filtering with "Not Like"
The Audit Events list now allows filtering using the "not like" operator in addition to "like", "equals", and "not equals". Use this operator to remove events from the list based on a partial match such as part of an IP address.
What's New in Cloudera Navigator 2.12.3
In addition to the new feature listed below, this release also includes a change and several resolved issues. See What's Changed in Cloudera Navigator 2.12.3 for changes, Issues Fixed in Cloudera Navigator 2.12.3 for resolved issues and Known Issues and Workarounds in Cloudera Navigator Data Management for known issues.
The new feature is:
- Preventing concurrent logins
An option is now available to limit the number of simultaneous sessions authenticated against the same user name. The property nav.max.concurrent.sessions takes an integer value as a limit; set the value to -1 (default) to turn off the limit. To change the default behavior, in Cloudera Manager, add the property with a new value to the "Navigator Metadata Server Advanced Configuration Snippet (Safety Valve) for cloudera-navigator.properties." Restart the Navigator Metadata Server to apply the change.
What's New in Cloudera Navigator 2.12.1
In addition to the new features listed below, this release also includes several resolved issues. See Issues Fixed in Cloudera Navigator 2.12.1 for details. See also Known Issues and Workarounds in Cloudera Navigator Data Management for all known issues to date.
New features include:
- Improved performance for extracting Hive metadata from the HiveServer2 metastore.
This release includes an optimization for extracting metadata for Hive partitions that considerably improves the extraction performance.
What's New in Cloudera Navigator 2.12.0
In addition to several new features and changes to existing features listed below, this release also includes several resolved issues. See Issues Fixed in Cloudera Navigator 2.12.0 for details. See also Known Issues and Workarounds in Cloudera Navigator Data Management for all known issues to date.
- Custom extraction filters for improved performance and reduced storage consumption
- Support for TLS/SSL for Navigator Audit Server
- Cloudera Navigator console enhancements and new features
New features include:
- New custom filters for more selective extraction also help reduce storage consumption.
Administrators can now create custom blacklists (and whitelists) to specifically exclude (or include) HDFS and Amazon S3 entities during the extraction process. The result is faster extraction, because Navigator collects only what you really want, and less storage consumed by Cloudera Navigator. Users with the Cloudera Manager Full Administrator or Navigator Administrator user roles can configure the settings using the Cloudera Manager Admin Console. The entity filters for HDFS and for Amazon S3 buckets are independently configurable. One or more HDFS paths can be blacklisted, while one or more S3 buckets can be whitelisted or blacklisted. For example, you can exclude an HDFS path from the extraction process and explicitly include two Amazon S3 buckets (with all other S3 buckets disregarded).
Rather than hard-coding paths, regular expressions can be used. For example, to exclude all the files and directories under /tmp, include the blacklist filter:
/tmp(?:/.*)?
Unlike the HDFS filter, which functions solely as a blacklist to exclude specified HDFS paths from the extraction process, the S3 Filter list serves as a blacklist or whitelist for the extraction process, depending on the S3 Filter Default Action specified as follows:
- ACCEPT: The buckets listed are blacklisted from the extraction process. The extraction process filters out nothing (accepts everything) but ignores the buckets specified in the S3 Filter list.
- DISCARD: The extraction process filters out (discards) everything, collecting only from the buckets specified in the S3 Filter list.
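The filter behavior described above can be sketched as follows. This is an illustration under stated assumptions, not Navigator's implementation: the function names are hypothetical, the sketch assumes a blacklist expression must match the whole path, and it treats S3 Filter entries as exact bucket names. Python's re module is used here only to demonstrate the pattern.

```python
import re

# The blacklist expression from the example above.
HDFS_BLACKLIST = [re.compile(r"/tmp(?:/.*)?")]


def hdfs_path_extracted(path, blacklist=HDFS_BLACKLIST):
    """Return True if no blacklist pattern matches the whole path (assumed semantics)."""
    return not any(pattern.fullmatch(path) for pattern in blacklist)


def s3_bucket_extracted(bucket, filter_list, default_action):
    """Apply the S3 Filter list according to the S3 Filter Default Action."""
    listed = bucket in filter_list  # assumption: entries are exact bucket names
    if default_action == "ACCEPT":
        # Accept everything except the listed buckets (blacklist).
        return not listed
    if default_action == "DISCARD":
        # Discard everything except the listed buckets (whitelist).
        return listed
    raise ValueError("default_action must be ACCEPT or DISCARD")
```

Under these assumptions, /tmp and everything below it is skipped, while a sibling such as /tmpfiles is still extracted, because the optional group requires a slash directly after /tmp.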
- TLS/SSL supported by Navigator Audit Server data collection for enhanced security.
The Navigator Audit Server can now be configured to use TLS/SSL to encrypt communications over HTTP (HTTPS) and from RPC client-server processes (for example, the Cloudera Manager Agent process used to transmit audit data to the server). As with Navigator Metadata Server, TLS/SSL for Navigator Audit Server is configured using the Cloudera Manager Admin Console.
- Cloudera Navigator console new visual element renderings (selectors, icons, fonts, and the like).
New renderings in the Cloudera Navigator console include the following:
- New date-and-time selector for Data Stewardship Dashboard.
The Cloudera Navigator console (Analytics > Data Stewardship > Dashboard) now has a full date and range picker rather than the drop-down period selector of prior releases. The activity summary number now also shows the sum of events over the given time period (rather than the last point on the graph, as in prior releases), and the time period displays in the graph.
- Lineage for tables now includes database in Selected Entity.
In previous releases, the Database, Parent, and other fetched properties displayed on the details page, but not when selected in a lineage graph. In this release, databases are now shown in the Selected Entity box in lineage diagrams when tables are selected.
- Search results display more quickly in Cloudera Navigator console.
Counts for selected entities are now being cached for faster Search display. After the initial search for any selected Filter, the facet counts for each facet are populated and the results are cached and refreshed at regular intervals. Thanks to this caching, the Cloudera Navigator console is much more responsive for searching. The larger the cluster, the more noticeable the improvement.
- Improved formatting of SQL query text for ease of viewing and use.
The information pop-up for Query Text (on the Details for a selected entity) includes better use of white space and other formatting improvements for enhanced readability. (The screenshot below is based on simple sample code that does not show many of the new and improved visual cues for SQL syntax).
- New button to copy SQL to clipboard now available for Query Text.
The Query Text pop-up now includes a new copy-to-clipboard button that lets SQL users quickly copy the SQL text displayed in the pop-up to the clipboard for subsequent use elsewhere. As shown in the screenshot above, the location of this button is in the upper right-hand corner of the Query Text pop-up.
- User feedback module available.
When Analytics is enabled (Navigator Metadata Server set to Allow Usage Data Collection), a Feedback button becomes enabled for the Cloudera Navigator console (right-hand side, vertical orientation—see screenshot above) that lets users log actions and send feedback.
See What's Changed in Cloudera Navigator 2.12.0 for other changes in this release.
What's New in Cloudera Navigator 2.11.2
This release of Cloudera Navigator data management includes no new features. See Issues Fixed in Cloudera Navigator 2.11.2 for resolved issues and see Known Issues and Workarounds in Cloudera Navigator Data Management for known issues.
What's New in Cloudera Navigator 2.11.1
In addition to support for clusters deployed to Amazon Web Services (AWS) using Cloudera Altus, this release of Cloudera Navigator data management includes some resolved issues. See Issues Fixed in Cloudera Navigator 2.11.1 for details. See Known Issues and Workarounds in Cloudera Navigator Data Management for known issues and workarounds.
- Obtain metadata and lineage from Altus transient clusters.
Cloudera Navigator can now extract Hive and MapReduce metadata and lineage from transient clusters deployed to AWS using Cloudera Altus. Entities from clusters instantiated by Cloudera Altus users can be seen in lineage diagrams in a centralized Cloudera Navigator instance running on a persistent or long-running Cloudera Manager cluster.
This new feature is supported with new properties and attributes displayed in the Cloudera Navigator console, to distinguish long-running from Altus clusters and to distinguish among transient clusters. For example, the Cluster Group property identifies all clusters that have been created using the same Altus Environment Name and Altus Cluster name. The Cluster Instance distinguishes each instance in the same group from the others, so that metadata and lineage extracted from transient clusters over time can be identified. Here is a summary of new Source Type and Type attributes for clusters:
- Source Type: Cluster
- Cluster-name (Cluster Group)
- Deployment Type: Long-running, Altus
- Classname: Cluster Group, Cluster Instance
- Cluster Template, Cluster Instance (Classname)
Entities extracted and their labels as seen in the Cloudera Navigator console include the following:
Entity | Label
altusAwsRegion | Altus AWS Region
altusClusterCrn | Altus Cluster CRN
altusClusterType | Altus Cluster Type
altusComputeWorkerInstanceType | Altus Compute Worker Instance Type
altusEnvCrn | Altus Env CRN
altusWorkerInstanceType | Altus Worker Instance Type
What's New in Cloudera Navigator 2.11.0
In addition to the new features and the changes listed below, this release resolves some issues. See Issues Fixed in Cloudera Navigator 2.11.0.
- Automated Purge feature enabled by default.
The Purge feature is enabled by default in this release. It uses the default settings shown in the table below unless you change them on the new Purge Settings tab in the Cloudera Navigator console.
Property | Default | Usage note
How often | Weekly | The Purge process runs automatically each week on the day and time specified.
Day | Saturday | Select a day for the purge that will have minimal impact on your user community.
Time | 12 Midnight | Select a time that will have minimal impact on production.
Maximum purge duration | 12 hours | Set the amount of time you want to allow for the Purge process to run. The process will not run beyond your specified duration even if it has not completed the purge. Entities purged to that point remain purged. No other Cloudera Navigator operations can occur during the Purge process.
Purge entities deleted more than* | 60 days | Select the number of days after entity deletion that will elapse until the purge process removes it. A setting of 60 days purges entities deleted over 61 days ago but retains entities deleted within the last 60 days.
Purge SELECT operations* | Enabled | Hive and Impala SELECT operations older than the number of days specified in the Only Purge SELECT operations older than* setting are purged.
Only Purge SELECT operations older than* | 60 days | The purge will include only those SELECT operations older than this specified number of days.
- Scalability enhancement for Hive and Impala query execution.
Heavy Hive and Impala usage can consume lots of storage for similar queries in the Cloudera Navigator data directory (Solr repository). Specifically, because each query can produce (query * query(parts)) entities, a large number of complex queries can result in thousands (or millions) of entities, impeding the lineage tracing process.
To reduce the number of entities created, this release of Cloudera Navigator includes an improved internal query compiler that takes advantage of the fact that many queries vary only by the value of literals. The compiler de-duplicates query executions by dropping the literals when possible and using anonymous queries (using ? rather than literal) instead.
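The idea of dropping literals can be sketched as follows. This is a simplified illustration, not the Navigator query compiler: the function name and the regular expression are assumptions, and the sketch handles only single-quoted strings and plain numbers.

```python
import re

# Matches a single-quoted SQL string (with '' as an escaped quote) or a bare number.
# Digits inside identifiers such as t1 are left alone because of the word boundaries.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def anonymize(sql):
    """Replace literals with ? so that queries differing only by literals compare equal."""
    return _LITERAL.sub("?", sql)


first = anonymize("SELECT * FROM sales WHERE region = 'EMEA' AND year = 2017")
second = anonymize("SELECT * FROM sales WHERE region = 'APAC' AND year = 2018")
# Both calls return: SELECT * FROM sales WHERE region = ? AND year = ?
```

Because both queries reduce to the same anonymous form, a de-duplicating store would need to keep only one entity for them rather than two.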
- Enhancements to Cloudera Navigator console to support new or improved features:
- New Administration page for Purge Settings. Purge Settings is a new tab in the Cloudera Navigator console that lets you schedule weekly Purge operations to remove deleted and stale entities from the Navigator Metadata Server. The Purge is enabled and runs automatically using the default schedule settings listed in the table. Change as needed:
- Log in to the Cloudera Navigator console.
- Select Administration > Purge Settings.
- Data Stewardship Dashboard. The Data Stewardship Dashboard now displays:
- Counts of S3 and HDFS objects
- Counts of encrypted and unencrypted objects
- Correct timezone (the server's timezone) despite any differences between timezone of client system displaying the Dashboard and the server
- Managed Metadata allows deleting namespaces. The Managed Metadata page now provides a Manage Namespaces button that lists namespaces and lets you delete them from the system.
- Log in to the Cloudera Navigator console.
- Select Administration > Managed Metadata.
- Click the Manage Namespaces button.
- Click an empty namespace from the list and select Delete from the Actions selector.
- Metadata and lineage extraction from Isilon OneFS. Dell EMC Isilon OneFS and Cloudera Navigator are now fully integrated, enabling Cloudera Navigator to extract metadata and lineage from clusters that use this advanced storage technology.
- New configuration setting available for multiple Hue instances behind external load balancers.
Cloudera Navigator has supported Hue integration (since Cloudera Navigator 2.1 release) by providing links to the appropriate Hue tool in Search results, which open the Hue interface to the entity.
However, for multiple Hue instances deployed behind external load-balancers, such as in certain Hue high availability configurations, the links are properly generated only if the host name of the external load-balancer is set as the preferred URL, by configuring the Advanced Configuration Property (Safety Valve) as detailed in the following steps:
- Log in to Cloudera Manager Admin Console using an account with Full Administrator privileges.
- Select Clusters > Cloudera Management Service.
- Click the Configuration tab.
- Under filter Scope, click Navigator Metadata Server.
- Under the Category filter, click Advanced.
- Scroll to the property Navigator Metadata Server Advanced Configuration Snippet (Safety Valve) for cloudera-navigator.properties.
- Set the nav.hue.preferred_baseurl.{clusterName} to the fully-qualified domain name of your Hue load-balancer using this format:
nav.hue.preferred_baseurl.{clusterName}=hue-load-balancer.example.com
For example, if a load-balancer for multiple Hue services is set up on host hue_lb_nginx.example.com, set the value of the property to hue_lb_nginx.example.com.
- Click Save.
- Restart the Navigator Metadata Server role. Links to Hue will now be properly created and available in the Search results lists in the Cloudera Navigator console.
- Hue download operations (EXPORT, DOWNLOAD) included in audit data. Download and export operations from Hue are now collected by Navigator Audit Server. The Audit Event includes the following details:
{
  "username": "username",
  "impersonator": "hue",
  "eventTime": epoch-time-int,
  "operationText": "User username exported to HDFS destination: path/here/filename",
  "service": "service-name",
  "url": "/service-name-result-path",
  "allowed": true,
  "operation": "EXPORT",
  "ipAddress": "client-ip-address"
}
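As a rough illustration of how such an event might be consumed, the sketch below parses one event and checks whether it is a Hue download or export. Every concrete value in the sample (user name, timestamp, service, URL, path, IP address) is made up for the example, and the helper name is hypothetical. Only the field names come from the event layout above.

```python
import json

# Sample event with invented values; only the field names follow the layout above.
# The unit of eventTime is not stated in the release note, so the number is arbitrary.
SAMPLE_EVENT = """
{
  "username": "alice",
  "impersonator": "hue",
  "eventTime": 1500000000000,
  "operationText": "User alice exported to HDFS destination: /user/alice/out.csv",
  "service": "hive",
  "url": "/hive/results",
  "allowed": true,
  "operation": "EXPORT",
  "ipAddress": "203.0.113.10"
}
"""


def is_hue_download(event):
    """Return True for EXPORT or DOWNLOAD operations performed through Hue."""
    return (
        event.get("impersonator") == "hue"
        and event.get("operation") in {"EXPORT", "DOWNLOAD"}
    )


event = json.loads(SAMPLE_EVENT)
print(event["username"], event["operation"], is_hue_download(event))
```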
What's New in Cloudera Navigator 2.10.2
This release of Cloudera Navigator data management includes no new features. See Issues Fixed in Cloudera Navigator 2.10.2 for resolved issues and see Known Issues and Workarounds in Cloudera Navigator Data Management for known issues.
What's New in Cloudera Navigator 2.10.1
This release of Cloudera Navigator data management includes no new features. See Issues Fixed in Cloudera Navigator 2.10.1 for resolved issues and see Known Issues and Workarounds in Cloudera Navigator Data Management for known issues.
What's New in Cloudera Navigator 2.10.0
- Data Stewardship Dashboard. The Data Stewardship Dashboard is fully supported and enabled by default in this release. Originally introduced in Cloudera Navigator 2.9 but disabled by default, the Data Stewardship Dashboard displays summary activity about data entities—tables created, tables altered, databases created, SQL queries started, for example. The time frame for the summary is selectable using the Dashboard's trend widget (daily, weekly, monthly, quarterly) to instantly adjust the overall picture of your data. Displayed below the summary results are more detailed views of the data, ranked by count and listing details for the entities as captured by the Navigator Metadata Server. For example, top files by size lists the top 10 files by file name, path, file size, with graphical bar chart rendering for easy visual comparison.
- Data Explorer. The Data Explorer is an interactive companion to the Data Stewardship Dashboard for viewing and comparing selected data sources over time in terms of averages or trends. Users can select any of the various data sources (databases created, tables created, tables altered, and so on), select chart tools (average, trendline), and select a time period over which to render the data.
- Spark (Spark 1.6) Lineage Support. The collection mechanism for Spark job lineage has been enhanced to capture Spark application execution (input data, output data, and column-level lineage when possible) for each invocation of spark-submit. In addition, lineage for Spark jobs is now collected automatically. Support for this capability in the prior release required manual configuration using a safety valve (advanced configuration snippet). Currently, support is for Spark 1.6 (not Spark 2) and lineage is collected for data read and written using Spark Dataframes and SparkSQL only (RDDs not yet supported). See Known Issues and Workarounds in Cloudera Navigator Data Management for other limitations.
- Integration with Navigator for Hue for metadata discovery. Search and tag partitions, databases, views, tables, columns. This capability must be enabled. See How to Enable and Use Navigator in Hue for details.
- Improved logging for Amazon S3 data. Navigator Server logs now capture information (INFO) about Amazon S3 buckets from which the server is both collecting and not collecting metadata. This enhancement can facilitate better root cause analysis for issues that may arise.
What's New in Cloudera Navigator 2.9.2
This release has no new features. However, some issues have been resolved. See Issues Fixed in Cloudera Navigator 2.9.2 for details. See Known Issues and Workarounds in Cloudera Navigator Data Management for a current listing of all known issues and workarounds.
What's New in Cloudera Navigator 2.9.1
A number of issues have been fixed. See Issues Fixed in Cloudera Navigator 2.9.1.
- Navigator can now extract technical metadata from data stored on Amazon S3. You can filter results based on the S3 source type and other related S3 properties. See Navigator and S3.
- Solr indexing has been optimized to improve search speed and reduce memory requirements. When Cloudera Navigator is started after an upgrade, you will see a message indicating upgrade progress. See Upgrading the Cloudera Navigator Management Component.
- Managed metadata properties editing has been enhanced. Properties can now be individually viewed, updated, or deleted. Deleted properties are still available from the Administration page and can be restored until they are purged and permanently removed.
- Policy editor has been enhanced:
- All types of managed metadata can be assigned (text, number, Boolean, date, and enumeration) instead of only text.
- UI support for managed metadata assignment has been added.
- Managed metadata assignments can now be multi-valued.
- Auditing health check - You can configure an auditing pipeline health check to verify that auditing is working and is not silently down. See Verifying that Auditing Is Running.
- You can now specify which columns to export when you export an audit report.
- The overall look and feel of the user interface for Cloudera Navigator has been updated.
- Navigator Data Stewardship dashboard - Cloudera Navigator 2.9 introduces the Data Stewardship dashboard, which provides "at-a-glance" information and metrics to help you understand the state of the data and data usage. The dashboard displays information about table, file, and database activity, operations and operation executions, and other information captured by the Navigator Metadata Server. The dashboard is currently unsupported. See Data Stewardship Dashboard.
What's New in Cloudera Navigator 2.8.3
This release of Cloudera Navigator data management includes no new features. See Issues Fixed in Cloudera Navigator 2.8.3 for resolved issues and see Known Issues and Workarounds in Cloudera Navigator Data Management for known issues.
What's New in Cloudera Navigator 2.8.1
A number of issues have been fixed. See Issues Fixed in Cloudera Navigator 2.8.1.
What's New in Cloudera Navigator 2.8.0
- Support for purging metadata for Hive and Impala SELECT queries, and for YARN, Sqoop, and Pig operations has been added. See Purging Metadata for HDFS Entities, Hive and Impala Select Queries, and YARN, Sqoop, and Pig Operations.
- A new user role that allows editing custom metadata—Custom Metadata Administrator—has been added. Grant this role to end users so they can tag their data sets with custom metadata without giving them permission to edit managed metadata. See User Roles.
- You can configure display of inputs and outputs in the entity Details page. See Enabling Inputs and Outputs to Display.
What's New in Cloudera Navigator 2.7.3
An issue has been fixed. See Issues Fixed in Cloudera Navigator 2.7.3.
What's New in Cloudera Navigator 2.7.2
A number of issues have been fixed. See Issues Fixed in Cloudera Navigator 2.7.2.
What's New in Cloudera Navigator 2.7.1
An issue has been fixed. See Issues Fixed in Cloudera Navigator 2.7.1.
What's New in Cloudera Navigator 2.7.0
- Platform enhancements
- Added support for assigning managed metadata through policies. Only single-valued managed metadata of type Text is currently supported. See Using Policies to Automate Metadata Tagging.
- Added support for purging physical operations corresponding to logical operations. See Performing Actions on Entities.
- Cross Site Request Forgery (CSRF) protection is enabled by default. If you currently use cookies for authentication in API requests with a non-GET method (for example, the POST method), you must also obtain CSRF tokens for the request, or disable CSRF protection. You can disable CSRF protection by setting nav.disable_api_csrf_security=true.
What's New in Cloudera Navigator 2.6.5
A number of issues have been fixed. See Issues Fixed in Cloudera Navigator 2.6.5.
What's New in Cloudera Navigator 2.6.2
A number of issues have been fixed. See Issues Fixed in Cloudera Navigator 2.6.2.
What's New in Cloudera Navigator 2.6.1
A number of issues have been fixed. See Issues Fixed in Cloudera Navigator 2.6.1.
What's New in Cloudera Navigator 2.6.0
- Platform enhancements
- Added support for defining new types of business metadata. See Defining Properties for Managed Metadata.
- Enhanced entity details with type-specific information and behavior. See Displaying Entity Details.
- Added support for filtering lineage. See Exploring Lineage Diagrams.
- Added support for purging the metadata store. See Purging Metadata for HDFS Entities, Hive and Impala Select Queries, and YARN, Sqoop, and Pig Operations.
- Added support for securing audit messages sent to Kafka. See Publishing Audit Events to Kafka.
What's New in Cloudera Navigator 2.5.0
An issue has been fixed. See Issues Fixed in Cloudera Navigator 2.5.0.
What's New in Cloudera Navigator 2.4.4
Several issues have been fixed. See Issues Fixed in Cloudera Navigator 2.4.4.
What's New in Cloudera Navigator 2.4.2
A number of issues have been fixed. See Issues Fixed in Cloudera Navigator 2.4.2.
What's New in Cloudera Navigator 2.4.1
An issue has been fixed. See Issues Fixed in Cloudera Navigator 2.4.1.
What's New in Cloudera Navigator 2.4.0
- Platform enhancements
- New entity details page.
- HDFS metadata and audit analytics.
- Command actions, integrated with policies and search, for archiving and purging files.
- Redesigned policies page for scalability.
- Publishing of audit events to Kafka topics. Kafka audit event publishing does not support authorization and Cloudera does not recommend its use in production.
- Navigator SDK for metadata and lineage augmentation.
- Significant scale improvements. Large lineage graphs render faster and consume less memory.
- Expanded service coverage
- Hive metadata: support for extended attributes.
- Hive on Spark metadata and lineage.
- Oozie metadata and lineage: added support for the hive2 action, which Cloudera recommends that you use in preference to the hive action.
- Hue auditing (CDH 5.5.0 and higher): added login and logout, user and group events.
- Cloudera Manager auditing: login and logout events.
- Navigator Metadata Server auditing: successful and failed login events.
- Added filtering of audit events for Sentry, Solr, Impala, and Navigator Metadata Server.
What's New in Cloudera Navigator 2.3.9
An issue has been fixed. See Issues Fixed in Cloudera Navigator 2.3.9.
What's New in Cloudera Navigator 2.3.8
Several issues have been fixed. See Issues Fixed in Cloudera Navigator 2.3.8.
What's New in Cloudera Navigator 2.3.3
Several issues have been fixed. See Issues Fixed in Cloudera Navigator 2.3.3.
What's New in Cloudera Navigator 2.3.1
- Navigator self-audit events have been enhanced with additional information, such as the names of audit reports and policies.
- Performance and stability improvements
What's New in Cloudera Navigator 2.3.0
- Platform enhancements
- Redesigned metadata search provides autocomplete, enhanced filtering, and saving searches.
- Added support for SAML for single sign-on.
- Expanded service coverage
- Added Impala (CDH 5.4 and higher) metadata and lineage
- Added Cloudera Search (CDH 5.4 and higher) auditing
- Added auditing for Navigator Metadata Server activity, such as audit views, metadata searches, and policy editing
- Added support for inferring the schema of HDFS Avro and Parquet entities
- Added Spark (CDH 5.4 and higher) metadata and lineage.
What's New in Cloudera Navigator 2.2.9
An issue has been fixed. See Issues Fixed in Cloudera Navigator 2.2.9.
What's New in Cloudera Navigator 2.2.4
An issue has been fixed. See Issues Fixed in Cloudera Navigator 2.2.4.
What's New in Cloudera Navigator 2.2.3
An issue has been fixed. See Issues Fixed in Cloudera Navigator 2.2.3.
What's New in Cloudera Navigator 2.2.2
An issue has been fixed. See Issues Fixed in Cloudera Navigator 2.2.2.
What's New in Cloudera Navigator 2.2.1
A number of issues have been fixed. See Issues Fixed in Cloudera Navigator 2.2.1.
What's New in Cloudera Navigator 2.2.0
- Policies are generally available and are always enabled. Policy properties now support Java expressions.
- Search - Search functionality now includes autocomplete.
What's New in Cloudera Navigator 2.1.6
An issue has been fixed. See Issues Fixed in Cloudera Navigator 2.1.6.
What's New in Cloudera Navigator 2.1.5
A number of issues have been fixed. See Issues Fixed in Cloudera Navigator 2.1.5.
What's New in Cloudera Navigator 2.1.4
An issue has been fixed. See Issues Fixed in Cloudera Navigator 2.1.4.
What's New in Cloudera Navigator 2.1.2
A number of issues have been fixed. See Issues Fixed in Cloudera Navigator 2.1.2.
What's New in Cloudera Navigator 2.1.1
A number of issues have been fixed. See Issues Fixed in Cloudera Navigator 2.1.1.
What's New in Cloudera Navigator 2.1.0
- Auditing Component
- New tab in the Cloudera Navigator console (the user interface) featuring saved audit reports. The Cloudera Navigator auditing user interface is no longer available from Cloudera Manager. Instead, auditing is integrated with lineage, discovery, and the policy engine. The Cloudera Navigator console is hosted by the Navigator Metadata Server, which is now required for the auditing component.
- Sentry auditing now includes Sentry commands issued from Impala.
- Metadata Component
- Search results include links to the appropriate Hue browser tool for each entity type:
| Type | Hue tool |
| --- | --- |
| HDFS directories and files | File Browser |
| Hive database and tables | Hive Metastore (HMS) Manager |
| MapReduce, YARN, Pig | Job Browser |
- Policies - support rules for modifying metadata and sending notifications when entities are extracted.
- Security
- Role-Based Access Control - support assigning groups to roles that constrain access to Navigator features
- Authentication - LDAP and Active Directory authentication of Navigator users
- TLS/SSL - enable TLS/SSL for encrypted communication
- API
- Version changed to v3
- Supports auditing and policies
What's New in Cloudera Navigator 2.0.5
An issue was fixed. See Issues Fixed in Cloudera Navigator 2.0.5.
What's New in Cloudera Navigator 2.0.3
A number of issues have been fixed. See Issues Fixed in Cloudera Navigator 2.0.3.
What's New in Cloudera Navigator 2.0.2
An issue was fixed. See Issues Fixed in Cloudera Navigator 2.0.2.
What's New in Cloudera Navigator 2.0.1
- Masking of personally identifiable information (PII) in query strings that appear in audit events and lineage. Enabled by default.
- REST API support for registering custom metadata for entities before they appear in Navigator.
What's New in Cloudera Navigator 2.0.0
- Auditing Component
- Added support for auditing the Sentry service
- Added support for publishing audit logs to syslog
- Metadata Component
- Newly designed Query Builder with faceted filtering
- Simplified Pig lineage
- Added support for Sqoop and Oozie lineage
- Many performance and stability improvements
- Security - includes Navigator Encrypt and Navigator Key Trustee, formerly known as Gazzang zNcrypt and Gazzang zTrustee. These features provide enterprise-grade encryption and key management. For information on these features, see the Cloudera Security Datasheet and contact your account team.
Changed Features in Cloudera Navigator
The sections below describe changes to functionality, if any, for each Cloudera Navigator release.
What's Changed in Cloudera Navigator 2.15.2
This maintenance release of Cloudera Navigator data management includes no changed features. See Issues Fixed in Cloudera Navigator 2.15.2 for resolved issues and see Known Issues and Workarounds in Cloudera Navigator Data Management for known issues.
What's Changed in Cloudera Navigator 2.15.1
- New Default HDFS audit filters
This release changes the default behavior of HDFS audit event collection only for new installations, or for upgraded installations in which the HDFS audit event filters had not been changed from their original default settings.
For more information, see New Default HDFS audit filters in the What's New section.
- HDFS Analytics disabled by default
The HDFS Analytics available in the Navigator console is now disabled by default. To enable the HDFS Analytics, set the following properties in the Cloudera Management Service configuration:
- Set navigator.analytics.enabled to true in the Navigator Audit Server Advanced Configuration Snippet (Safety Valve) for cloudera-navigator.properties.
- Set nav.analytics.audit.enabled to true in the Navigator Metadata Server Advanced Configuration Snippet (Safety Valve) for cloudera-navigator.properties.
What's Changed in Cloudera Navigator 2.14.2
This maintenance release of Cloudera Navigator data management includes no changed features. See Issues Fixed in Cloudera Navigator 2.14.2 for resolved issues and see Known Issues and Workarounds in Cloudera Navigator Data Management for known issues.
What's Changed in Cloudera Navigator 2.14.1
This maintenance release of Cloudera Navigator data management includes no changed features. See Issues Fixed in Cloudera Navigator 2.14.1 for resolved issues and see Known Issues and Workarounds in Cloudera Navigator Data Management for known issues.
What's Changed in Cloudera Navigator 2.14.0
- Name and description lengths increased
This release increases the number of characters available to store entity names and descriptions:
- Name length was 40 characters and is now 500.
- Description length was 500 and is now not limited.
The Navigator console limits manual entry of these values to 40 and 500 characters, so if you edit existing metadata for an entity, you won't be able to save changes if the text is longer than the console limits.
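The difference between the stored limits and the console limits can be sketched as two simple length checks. This is only an illustration of the numbers quoted above; the function and constant names are hypothetical and are not part of any Navigator API.

```python
# Limits quoted in the release note; all names here are hypothetical.
STORED_NAME_MAX = 500           # stored name length as of this release
CONSOLE_NAME_MAX = 40           # console limit for manually entered names
CONSOLE_DESCRIPTION_MAX = 500   # console limit for manually entered descriptions


def fits_storage(name: str, description: str) -> bool:
    """Stored names may be up to 500 characters; descriptions are not limited."""
    return len(name) <= STORED_NAME_MAX


def fits_console(name: str, description: str) -> bool:
    """The console accepts edits only within its own 40/500 character limits."""
    return (len(name) <= CONSOLE_NAME_MAX
            and len(description) <= CONSOLE_DESCRIPTION_MAX)


long_name = "n" * 120
print(fits_storage(long_name, "short description"))   # True
print(fits_console(long_name, "short description"))   # False
```

In this sketch, a 120-character name is within the stored limit but exceeds the console limit, which is the case where a console edit could not be saved.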
What's Changed in Cloudera Navigator 2.13.2
- Name and description lengths increased
This release increases the number of characters available to store entity names and descriptions:
- Name length was 40 characters and is now 500.
- Description length was 500 and is now not limited.
The Navigator console limits manual entry of these values to 40 and 500 characters, so if you edit existing metadata for an entity, you won't be able to save changes if the text is longer than the console limits.
What's Changed in Cloudera Navigator 2.13.0
- Changes to Navigator roles and privileges.
Along with enhancing the metadata management capabilities of Navigator, the Navigator roles and the privileges they describe have changed. The new role names and privileges are described in User Roles and Privileges Reference. One specific change is that the privilege for editing the name and description metadata for Navigator entities is now part of the Managed and Custom Metadata Editor role. Users with that role or the Full Administrator role can add and update entity names and descriptions in the Navigator console.
- Changes to managed metadata administration.
The Administration Managed Metadata page now shows warnings when properties exist without being associated with entity types. Also, when you add a regular expression to validate text-type values for managed metadata, the regular expression is now tested interactively.
What's Changed in Cloudera Navigator 2.12.3
- Name and description lengths increased
This release increases the number of characters available to store entity names and descriptions when created or updated through the API:
- Name length was 40 characters and is now 500.
- Description length was 500 and is now not limited.
The Navigator console limits manual entry of these values to 40 and 500 characters, so if you edit existing metadata for an entity, you won't be able to save changes if the text is longer than the console limits.
What's Changed in Cloudera Navigator 2.12.0
- As of Cloudera Manager 5.13, there is a significant change in the volume of HDFS audits that Navigator collects. The Cloudera Manager Backup and Disaster Recovery (BDR) feature includes HDFS replication jobs. Previously, these jobs used the "run as" user to return the access control list (ACL) from the source files; as of 5.13, they use the source cluster's HDFS principal to get the source file ACL. The change improves BDR performance, but it also means that audits that were previously collected for a specific user/owner are now collected for the HDFS principal. If the increase in audits causes problems, consider changing the HDFS audit filter to add the HDFS principal so that the additional audit records are bypassed.
- Deleted entities no longer display by default on Search.
Previously, the Cloudera Navigator console displayed both deleted and non-deleted entities by default (Search did not have a filter for Deleted entities). The default non-filtered Search now shows only non-deleted entities prior to any filtering, and a new facet lets users filter out Deleted entities from results or find only Deleted entities.
- Changes to Purge schedule settings.
Labels and descriptions for the configurable purge options have been clarified. Cloudera Navigator's configurable Purge operation covers operation executions as well as SELECT operations. The labeling and text have been changed to more accurately describe the functionality. The new labels and descriptions are shown below:
| Property | Default | Range of selectable values and usage note |
| --- | --- | --- |
| Purge HDFS entities deleted more than* | 60 days | Select 1 day through 10 days, 20 days through 100 days (in 10-day increments), 150 days, 365 days. These are the number of days after entity deletion that elapse until the purge process removes it. For example, a setting of 1 day purges entities deleted before yesterday but retains entities deleted yesterday. |
| Purge SELECT operations* | Enabled | Hive and Impala SELECT operations older than days specified in Only Purge SELECT operations older than will be purged. |
| Purge operations older than* | 60 days | Select 10 days through 100 days (10-day increments), 150 days, 365 days. Yarn, Sqoop, and Pig operations older than the specified date will be purged. If Purge SELECT Operations is enabled, Hive and Impala SELECT operations older than the specified date will also be purged. |
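The 1-day example given for Purge HDFS entities deleted more than can be read as a comparison of calendar dates. The following is a minimal sketch of that reading only; it assumes day-granularity comparison, the function name is hypothetical, and it does not describe how Navigator actually implements purge.

```python
from datetime import date, timedelta


def is_purge_eligible(deleted_on: date, today: date, setting_days: int) -> bool:
    """Return True if an entity deleted on `deleted_on` falls before the cutoff.

    Assumption: eligibility is decided by whole calendar days, so an entity is
    eligible only when its deletion date is strictly earlier than
    `today - setting_days`.
    """
    cutoff = today - timedelta(days=setting_days)
    return deleted_on < cutoff


today = date(2024, 1, 10)  # arbitrary example date
print(is_purge_eligible(date(2024, 1, 9), today, 1))  # deleted yesterday: False
print(is_purge_eligible(date(2024, 1, 8), today, 1))  # deleted before yesterday: True
```

With a setting of 1 day, the entity deleted yesterday is retained and the one deleted the day before is eligible, matching the example in the usage note.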
What's Changed in Cloudera Navigator 2.11.1
This maintenance release of Cloudera Navigator data management includes no changed features. See Issues Fixed in Cloudera Navigator 2.11.1 for resolved issues and see Known Issues and Workarounds in Cloudera Navigator Data Management for known issues.
What's Changed in Cloudera Navigator 2.11.0
- Automated Purge function supplants safety valve setting used in prior releases. The Administration tab of the Cloudera Navigator console now includes a Purge Settings page for scheduling the Purge process according to your specifications. If your current system has an Advanced Configuration Snippet (Safety Valve) for nav.maintenance.purge.schedule set for a cron job to run purge, you can remove the setting. The safety valve setting is ignored in favor of the schedule set using the Cloudera Navigator console.
To remove the purge schedule setting from safety valve (if it exists):
- Log in to Cloudera Manager Admin Console.
- Select Clusters > Cloudera Management Service.
- Under Scope filter, click Navigator Metadata Server.
- Under Category filter, click Advanced.
- Scroll to the Navigator Metadata Server Advanced Configuration Snippet (Safety Valve) for cloudera-navigator.properties. If a property named nav.maintenance.purge.schedule exists and has been set to a cron schedule, remove it. The property is ignored in favor of the schedule set using the Cloudera Navigator console.
What's Changed in Cloudera Navigator 2.10.2
This release of Cloudera Navigator data management includes no changed features. See Issues Fixed in Cloudera Navigator 2.10.2 for resolved issues and see Known Issues and Workarounds in Cloudera Navigator Data Management for known issues.
What's Changed in Cloudera Navigator 2.10.1
This release of Cloudera Navigator data management includes no changed features. See Issues Fixed in Cloudera Navigator 2.10.1 for resolved issues and see Known Issues and Workarounds in Cloudera Navigator Data Management for known issues.
What's Changed in Cloudera Navigator 2.10.0
- Cloudera Navigator Data Stewardship Analytics (Data Stewardship Dashboard) is now fully enabled. See Data Stewardship Dashboard (above) for details.
- Configurable aggregation scheme for metrics. Every 120 minutes, an internal task runs to extract, transform, and load various counts collected by Navigator to derive metrics and populate the Data Stewardship Dashboard with values. By default, data is aggregated for a 1-day time period. However, the window can be changed by setting the Advanced Configuration Snippet (Safety Valve). The values for new partitions, new tables, and new files depend on the value for this window.
| Safety Valve | Description |
| --- | --- |
| nav.dashboard.etl.maxDataSetSize=n | Cloudera Manager 5.11/Navigator 2.10 (and later): Default window size is 1 (1 day). Cloudera Manager 5.10/Navigator 2.9 (and prior): Default window size is 7 days. |
| nav.dashboard.etl.include.hdfs=true | Enables the Access Denied field on the Dashboard to display values. |
- Log in to Cloudera Manager Admin Console.
- Navigate to Cloudera Management Service and then select Navigator Metadata Server (under Scope filter).
- Click Configuration and then click Service Configuration.
- Scroll to the Navigator Metadata Server Advanced Configuration Snippet (Safety Valve) for cloudera-navigator.properties.
- Enter one or both safety valves into the field as needed. For example:
nav.dashboard.etl.maxDataSetSize=3
nav.dashboard.etl.include.hdfs=true
What's Changed in Cloudera Navigator 2.9.2
This release of Cloudera Navigator data management includes no changed features. See Issues Fixed in Cloudera Navigator 2.9.2 for resolved issues and see Known Issues and Workarounds in Cloudera Navigator Data Management for known issues.
What's Changed in Cloudera Navigator 2.9.1
This release of Cloudera Navigator data management includes no changed features. See Issues Fixed in Cloudera Navigator 2.9.1 for resolved issues and see Known Issues and Workarounds in Cloudera Navigator Data Management for known issues.
What's Changed in Cloudera Navigator 2.9.0
Name and description fields, which were previously in custom metadata, have been moved to managed metadata.
What's Changed in Cloudera Navigator 2.8.3
This release of Cloudera Navigator data management includes no changed features. See Issues Fixed in Cloudera Navigator 2.8.3 for resolved issues and see Known Issues and Workarounds in Cloudera Navigator Data Management for known issues.
What's Changed in Cloudera Navigator 2.7.0
- Improvement (reduction) to index size by removal of certain types of relations.
Until now, Cloudera Navigator captured the underlying MapReduce and YARN operations for Hive, Sqoop, and Pig operations, and stored them as physical operations. In this release, previously captured physical operations, operation executions, and associated relations are removed during purge because they were not used in lineage. The upgrade process now starts with Navigator removing the logical-physical relations to operations and operation executions, the relations themselves, their physical endpoints, and any relations connected to those physical endpoints. The end result is improved scalability.
What's Changed in Cloudera Navigator 2.6.0
- Queries generated using filters now use the Lucene Query Parser +, - Boolean operator syntax. See Constructing Compound Search Strings.
- Hive on Spark metadata extraction is now supported.
- The Navigator Metadata Server requires 192 MiB of Java PermGen space instead of 128 MiB. The value of this internal setting used by the JDK is increased automatically when upgrading to Cloudera Manager 5.7 (Cloudera Navigator 2.6).
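The + and - Boolean operator syntax mentioned above for filter-generated queries comes from the Lucene query parser, where + marks a clause as required and - marks it as prohibited. The sketch below shows how such a string could be assembled. The helper name and the field names (sourceType, type, deleted) are assumptions chosen for illustration, not a list of fields confirmed by this document, and the helper does no quoting or escaping of values.

```python
from typing import Iterable, Tuple

Clause = Tuple[str, str]  # (field, value)


def build_filter_query(required: Iterable[Clause],
                       prohibited: Iterable[Clause]) -> str:
    """Join field:value clauses using Lucene's + (must) and - (must not) prefixes."""
    parts = [f"+{field}:{value}" for field, value in required]
    parts += [f"-{field}:{value}" for field, value in prohibited]
    return " ".join(parts)


# Hypothetical field names, used only to show the shape of the result.
query = build_filter_query(
    required=[("sourceType", "hive"), ("type", "table")],
    prohibited=[("deleted", "true")],
)
print(query)  # +sourceType:hive +type:table -deleted:true
```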
What's Changed in Cloudera Navigator 2.4.0
- Cloudera Navigator no longer supports JDK 1.6.
- In the Administration tab, after roles have been assigned to at least one group, Groups with Navigator Roles is the default selection.
What's Changed in Cloudera Navigator 2.3.3
- In the Search UI, facet values with the count of 0 are not displayed.
What's Changed in Cloudera Navigator 2.3.0
- The PII masking regular expression is superseded by log file redaction in CDH 5.4.
What's Changed in Cloudera Navigator 2.2.0
- Metadata Component - Policies created with Cloudera Navigator 2.1 (containing the Beta version policy engine) are not retained when upgrading to Cloudera Navigator 2.2.