Cloudera Data Management
This guide describes how to perform data management using Cloudera Navigator. Data management activities include auditing access to data residing in HDFS and Hive metastores, reviewing and updating metadata, and discovering the lineage of data objects.
Cloudera Navigator is a fully integrated data-management and security system for the Hadoop platform. Cloudera Navigator enables you to work effectively with data at scale and helps various stakeholders answer the following questions:
- Compliance groups
  - Who accessed the data, and what did they do with it?
  - Are we prepared for an audit?
  - Is our sensitive data protected?
- Hadoop administrators and DBAs
  - How can we boost productivity and cluster performance?
  - How is data being used?
  - How can data be optimized for future workloads?
- Data stewards and curators
  - How can data assets be managed and organized?
  - What is the lifecycle of the data?
- Data scientists and Business Intelligence users
  - Where is the most important data?
  - Is this data trustworthy?
  - What is the relationship between data sets?
Cloudera Navigator provides the following components to help you answer these questions and meet data-management and security requirements:
- Data Management - Provides visibility into and control over the data in Hadoop datastores, and the computations performed on that data. Hadoop administrators, data stewards, and data scientists can use Cloudera Navigator to:
  - Audit data access and verify access privileges - The goal of auditing is to capture a complete and immutable record of all activity within a system. Cloudera Navigator auditing adds secure, real-time audit components to key data and access frameworks. Compliance groups can use Cloudera Navigator to configure, collect, and view audit events that show who accessed data, and how.
  - Search metadata and visualize lineage - Cloudera Navigator metadata management allows DBAs, data stewards, business analysts, and data scientists to define, search for, amend the properties of, and tag data entities and view relationships between datasets.
  - Policies - Data stewards can use Cloudera Navigator policies to define automated actions, based on data access or on a schedule, to add metadata, create alerts, and move or purge data.
  - Analytics - Hadoop administrators can use Cloudera Navigator analytics to examine data usage patterns and create policies based on those patterns.
- Data Encryption - Data encryption and key management provide a critical layer of protection against potential threats by malicious actors on the network or in the datacenter. Encryption and key management are also requirements for meeting key compliance initiatives and ensuring the integrity of your enterprise data. The following Cloudera Navigator components enable compliance groups to manage encryption:
  - Cloudera Navigator Encrypt transparently encrypts and secures data at rest without requiring changes to your applications and ensures there is minimal performance lag in the encryption or decryption process.
  - Cloudera Navigator Key Trustee Server is an enterprise-grade virtual safe-deposit box that stores and manages cryptographic keys and other security artifacts.
  - Cloudera Navigator Key HSM allows Cloudera Navigator Key Trustee Server to seamlessly integrate with a hardware security module (HSM).
You can install Cloudera Navigator data management and data encryption components independently.
Related Information
- Installing the Cloudera Navigator Data Management Component
- Upgrading the Cloudera Navigator Data Management Component
- Cloudera Navigator Data Management Component Administration
- Configuring Authentication in the Cloudera Navigator Data Management Component
- Configuring TLS/SSL for the Cloudera Navigator Data Management Component
- Cloudera Navigator Data Management Component User Roles