Cloudera Enterprise 5.4.x

      • Flume Source Metrics
      • Garbage Collector Metrics
      • HBase Metrics
      • HBase REST Server Metrics
      • HBase RegionServer Replication Peer Metrics
      • HBase Thrift Server Metrics
      • HDFS Metrics
      • HDFS Cache Directive Metrics
      • HDFS Cache Pool Metrics
      • HRegion Metrics
      • HTable Metrics
      • History Server Metrics
      • Hive Metrics
      • Hive Metastore Server Metrics
      • HiveServer2 Metrics
      • Host Metrics
      • Host Monitor Metrics
      • HttpFS Metrics
      • Hue Metrics
      • Hue Server Metrics
      • Impala Metrics
      • Impala Catalog Server Metrics
      • Impala Daemon Metrics
      • Impala Llama ApplicationMaster Metrics
      • Impala Query Metrics
      • Impala StateStore Metrics
      • Isilon Metrics
      • Java KeyStore KMS Metrics
      • JobHistory Server Metrics
      • JobTracker Metrics
      • JournalNode Metrics
      • Kafka Metrics
      • Kafka Broker Metrics
      • Kafka Broker Topic Metrics
      • Kafka MirrorMaker Metrics
      • Kerberos Ticket Renewer Metrics
      • Key Management Server Metrics
      • Key Management Server Proxy Metrics
      • Key Trustee KMS Metrics
      • Key-Value Store Indexer Metrics
      • Lily HBase Indexer Metrics
      • Logger Metrics
      • MapReduce Metrics
      • Master Metrics
      • Monitor Metrics
      • NFS Gateway Metrics
      • NameNode Metrics
      • Navigator Audit Server Metrics
      • Navigator Metadata Server Metrics
      • Network Interface Metrics
      • NodeManager Metrics
      • Oozie Metrics
      • Oozie Server Metrics
      • RegionServer Metrics
      • Reports Manager Metrics
      • ResourceManager Metrics
      • SecondaryNameNode Metrics
      • Sentry Metrics
      • Sentry Server Metrics
      • Server Metrics
      • Service Monitor Metrics
      • Solr Metrics
      • Solr Replica Metrics
      • Solr Server Metrics
      • Solr Shard Metrics
      • Spark Metrics
      • Spark (Standalone) Metrics
      • Sqoop 1 Client Metrics
      • Sqoop 2 Metrics
      • Sqoop 2 Server Metrics
      • Tablet Server Metrics
      • TaskTracker Metrics
      • Time Series Table Metrics
      • Tracer Metrics
      • User Metrics
      • WebHCat Server Metrics
      • Worker Metrics
      • YARN (MR2 Included) Metrics
      • YARN Pool Metrics
      • ZooKeeper Metrics
  • Cloudera Security
    • Authentication
      • Configuring Authentication in Cloudera Manager
        • Cloudera Manager User Accounts
        • Configuring External Authentication for Cloudera Manager
        • Kerberos Concepts - Principals, Keytabs and Delegation Tokens
        • Enabling Kerberos Authentication Using the Wizard
          • Step 1: Install Cloudera Manager and CDH
          • Step 2: If You are Using AES-256 Encryption, Install the JCE Policy File
          • Step 3: Get or Create a Kerberos Principal for the Cloudera Manager Server
          • Step 4: Enabling Kerberos Using the Wizard
          • Step 5: Create the HDFS Superuser
          • Step 6: Get or Create a Kerberos Principal for Each User Account
          • Step 7: Prepare the Cluster for Each User
          • Step 8: Verify that Kerberos Security is Working
          • Step 9: (Optional) Enable Authentication for HTTP Web Consoles for Hadoop Roles
        • Enabling Kerberos Authentication for Single User Mode or Non-Default Users
        • Configuring a Cluster with Custom Kerberos Principals
        • Viewing and Regenerating Kerberos Principals
        • Mapping Kerberos Principals to Short Names
        • Using Auth-to-Local Rules to Isolate Cluster Users
        • Configuring Kerberos for Flume Thrift Source and Sink
        • Configuring YARN for Long-running Applications
        • Enabling Kerberos Authentication Without the Wizard
          • Step 1: Install Cloudera Manager and CDH
          • Step 2: If You are Using AES-256 Encryption, Install the JCE Policy File
          • Step 3: Get or Create a Kerberos Principal for the Cloudera Manager Server
          • Step 4: Import KDC Account Manager Credentials
          • Step 5: Configure the Kerberos Default Realm in the Cloudera Manager Admin Console
          • Step 6: Stop All Services
          • Step 7: Enable Hadoop Security
          • Step 8: Wait for the Generate Credentials Command to Finish
          • Step 9: Enable Hue to Work with Hadoop Security using Cloudera Manager
          • Step 10: (Flume Only) Use Substitution Variables for the Kerberos Principal and Keytab
          • Step 11: (CDH 4.0 and 4.1 only) Configure Hue to Use a Local Hive Metastore
          • Step 12: Start All Services
          • Step 13: Deploy Client Configurations
          • Step 14: Create the HDFS Superuser Principal
          • Step 15: Get or Create a Kerberos Principal for Each User Account
          • Step 16: Prepare the Cluster for Each User
          • Step 17: Verify that Kerberos Security is Working
          • Step 18: (Optional) Enable Authentication for HTTP Web Consoles for Hadoop Roles
      • Configuring Authentication in the Cloudera Navigator Data Management Component
        • Configuring External Authentication for the Cloudera Navigator Data Management Component
        • Managing Users and Groups for the Cloudera Navigator Data Management Component
      • Configuring Authentication in CDH Using the Command Line
        • Enabling Kerberos Authentication for Hadoop Using the Command Line
          • Step 1: Install CDH 5
          • Step 2: Verify User Accounts and Groups in CDH 5 Due to Security
          • Step 3: If you are Using AES-256 Encryption, install the JCE Policy File
          • Step 4: Create and Deploy the Kerberos Principals and Keytab Files
          • Step 5: Shut Down the Cluster
          • Step 6: Enable Hadoop Security
          • Step 7: Configure Secure HDFS
          • Optional Step 8: Configuring Security for HDFS High Availability
          • Optional Step 9: Configure secure WebHDFS
          • Optional Step 10: Configuring a secure HDFS NFS Gateway
          • Step 11: Set Variables for Secure DataNodes
          • Step 12: Start up the NameNode
          • Step 12: Start up a DataNode
          • Step 14: Set the Sticky Bit on HDFS Directories
          • Step 15: Start up the Secondary NameNode (if used)
          • Step 16: Configure Either MRv1 Security or YARN Security
            • Configuring MRv1 Security
            • Configuring YARN Security
        • Flume Authentication
          • Configuring Flume's Security Properties
          • Flume Account Requirements
          • Testing the Flume HDFS Sink Configuration
          • Writing to a Secure HBase cluster
        • HBase Authentication
          • Configuring Kerberos Authentication for HBase
          • Configuring Secure HBase Replication
          • Configuring the HBase Client TGT Renewal Period
        • HCatalog Authentication
        • Hive Authentication
          • HiveServer2 Security Configuration
          • Hive Metastore Server Security Configuration
          • Using Hive to Run Queries on a Secure HBase Server
        • HttpFS Authentication
        • Hue Authentication
          • Configuring Kerberos Authentication for Hue
          • Integrating Hue with LDAP
          • Configuring Hue for SAML
        • Impala Authentication
          • Enabling Kerberos Authentication for Impala
          • Enabling LDAP Authentication for Impala
          • Using Multiple Authentication Methods with Impala
          • Configuring Impala Delegation for Hue and BI Tools
        • Llama Authentication
        • Oozie Authentication
          • Configuring Kerberos Authentication for the Oozie Server
          • Configuring Oozie HA with Kerberos
        • Search Authentication
        • Spark Authentication
        • Sqoop 2 Authentication
        • ZooKeeper Authentication
        • FUSE Kerberos Configuration
        • Using kadmin to Create Kerberos Keytab Files
        • Configuring the Mapping from Kerberos Principals to Short Names
        • Enabling Debugging Output for the Sun Kerberos Classes
      • Configuring a Cluster-dedicated MIT KDC with Cross-Realm Trust
      • Integrating Hadoop Security with Active Directory
      • Integrating Hadoop Security with Alternate Authentication
      • Hadoop Users in Cloudera Manager and CDH
      • Authenticating Kerberos Principals in Java Code
      • Using a Web Browser to Access an URL Protected by Kerberos HTTP SPNEGO
      • Troubleshooting Authentication Issues
    • Encryption
      • TLS/SSL Certificates Overview
        • Creating Certificates
        • Creating Java Keystores and Truststores
        • Private Key and Certificate Reuse Across Java Keystores and OpenSSL
      • Configuring TLS Security for Cloudera Manager
        • Configuring TLS Encryption Only for Cloudera Manager
        • Level 1: Configuring TLS Encryption for Cloudera Manager Agents
        • Level 2: Configuring TLS Verification of Cloudera Manager Server by the Agents
        • Level 3: Configuring TLS Authentication of Agents to the Cloudera Manager Server
        • HTTPS Communication in Cloudera Manager
        • Troubleshooting SSL/TLS Connectivity
        • Deploying the Cloudera Manager Keystore for Level 1 TLS with Self-Signed Certificates
      • Configuring SSL for the Cloudera Navigator Data Management Component
      • Configuring SSL for Cloudera Management Service Roles
      • Configuring SSL/TLS Encryption for CDH Services
        • Configuring SSL for HDFS, YARN and MapReduce
        • Configuring SSL for HBase
        • Configuring SSL for Flume Thrift Source and Sink
        • Configuring Encrypted Communication Between Hive and Client Drivers
        • Configuring SSL for Hue
        • Configuring SSL for Impala
        • Configuring SSL for Oozie
        • Configuring SSL for Solr
        • Configuring HttpFS to use SSL
        • Encrypted Shuffle and Encrypted Web UIs
      • Deployment Planning for Data at Rest Encryption
        • Data at Rest Encryption Reference Architecture
        • Data at Rest Encryption Requirements
        • Resource Planning for Data at Rest Encryption
      • Cloudera Navigator Key Trustee Server
        • Backing Up and Restoring Key Trustee Server
        • Initializing Standalone Key Trustee Server
        • Configuring a Mail Transfer Agent for Key Trustee Server
        • Verifying Cloudera Navigator Key Trustee Server Operations
        • Managing Key Trustee Server Organizations
        • Managing Key Trustee Server Certificates
      • Cloudera Navigator Key HSM
        • Initializing Navigator Key HSM
        • HSM-Specific Setup for Cloudera Navigator Key HSM
        • Validating Key HSM Settings
        • Creating a Key Store with CA-Signed Certificate
        • Managing the Navigator Key HSM Service
        • Integrating Key HSM with Key Trustee Server
      • Cloudera Navigator Encrypt
        • Registering Navigator Encrypt with Key Trustee Server
        • Preparing for Encryption Using Cloudera Navigator Encrypt
        • Encrypting and Decrypting Data Using Cloudera Navigator Encrypt
        • Migrating eCryptfs-Encrypted Data to dm-crypt
        • Navigator Encrypt Access Control List
        • Maintaining Navigator Encrypt
      • HDFS Data At Rest Encryption
        • Configuring the Key Management Server (KMS)
        • Securing the Key Management Server (KMS)
        • Integrating HDFS Encryption with Navigator Key Trustee Server
        • Configuring CDH Services for HDFS Encryption
        • Troubleshooting HDFS Encryption
      • Configuring Encrypted HDFS Data Transport
      • Configuring Encrypted HBase Data Transport
    • Authorization
      • Cloudera Manager User Roles
      • Cloudera Navigator Data Management Component User Roles
      • HDFS Extended ACLs
      • Configuring LDAP Group Mappings
      • Authorization With Apache Sentry (Incubating)
        • The Sentry Service
          • Installing and Upgrading the Sentry Service
          • Migrating from Sentry Policy Files to the Sentry Service
          • Configuring the Sentry Service
          • Sentry Debugging and Failure Scenarios
          • Hive SQL Syntax for Use with Sentry
          • Synchronizing HDFS ACLs and Sentry Permissions
          • Reporting Metrics for the Sentry Service
        • Sentry Policy File Authorization
          • Installing and Upgrading Sentry for Policy File Authorization
          • Configuring Sentry Policy File Authorization Using Cloudera Manager
          • Configuring Sentry Policy File Authorization Using the Command Line
        • Enabling Sentry Authorization for Impala
        • Enabling Sentry Authorization for Search using the Command Line
      • Configuring HBase Authorization
    • Sensitive Data Redaction
    • Overview of Impala Security
      • Security Guidelines for Impala
      • Securing Impala Data and Log Files
      • Installation Considerations for Impala Security
      • Securing the Hive Metastore Database
      • Securing the Impala Web User Interface
    • Miscellaneous Topics
      • Jsvc, Task Controller and Container Executor Programs
        • MRv1 ONLY: Task-controller Error Codes
        • YARN ONLY: Container-executor Error Codes
      • Sqoop, Pig, and Whirr Security Support Status
      • Setting Up a Gateway Node to Restrict Cluster Access
      • Logging a Security Support Case
      • Using Antivirus Software on CDH Hosts
  • Cloudera Impala Guide
    • Concepts and Architecture
      • Components
      • Developing Applications
      • Role in the Hadoop Ecosystem
    • Deployment Planning
      • Requirements
      • Designing Schemas
    • Tutorials
    • Administration
      • Setting Timeouts
      • Load-Balancing Proxy for HA
      • Managing Disk Space
      • Auditing
      • Viewing Lineage Info
    • SQL Reference
      • Comments
      • Data Types
        • BIGINT
        • BOOLEAN
        • CHAR
        • DECIMAL
        • DOUBLE
        • FLOAT
        • INT
        • REAL
        • SMALLINT
        • STRING
        • TIMESTAMP
        • TINYINT
        • VARCHAR
      • Literals
      • SQL Operators
      • Schema Objects and Object Names
        • Aliases
        • Databases
        • Functions
        • Identifiers
        • Tables
        • Views
      • SQL Statements
        • DDL Statements
        • DML Statements
        • ALTER TABLE
        • ALTER VIEW
        • COMPUTE STATS
        • CREATE DATABASE
        • CREATE FUNCTION
        • CREATE ROLE
        • CREATE TABLE
        • CREATE VIEW
        • DESCRIBE
        • DROP DATABASE
        • DROP FUNCTION
        • DROP ROLE
        • DROP STATS
        • DROP TABLE
        • DROP VIEW
        • EXPLAIN
        • GRANT
        • INSERT
        • INVALIDATE METADATA
        • LOAD DATA
        • REFRESH
        • REVOKE
        • SELECT
          • Joins
          • ORDER BY Clause
          • GROUP BY Clause
          • HAVING Clause
          • LIMIT Clause
          • OFFSET Clause
          • UNION Clause
          • Subqueries
          • WITH Clause
          • DISTINCT Operator
          • Hints
        • SET
          • Query Options for the SET Statement
            • ABORT_ON_DEFAULT_LIMIT_EXCEEDED
            • ABORT_ON_ERROR
            • ALLOW_UNSUPPORTED_FORMATS
            • APPX_COUNT_DISTINCT
            • BATCH_SIZE
            • COMPRESSION_CODEC
            • DEBUG_ACTION
            • DEFAULT_ORDER_BY_LIMIT
            • DISABLE_CODEGEN
            • DISABLE_UNSAFE_SPILLS
            • EXEC_SINGLE_NODE_ROWS_THRESHOLD
            • EXPLAIN_LEVEL
            • HBASE_CACHE_BLOCKS
            • HBASE_CACHING
            • MAX_ERRORS
            • MAX_IO_BUFFERS
            • MAX_SCAN_RANGE_LENGTH
            • MEM_LIMIT
            • NUM_NODES
            • NUM_SCANNER_THREADS
            • PARQUET_COMPRESSION_CODEC
            • PARQUET_FILE_SIZE
            • QUERY_TIMEOUT_S
            • REQUEST_POOL
            • RESERVATION_REQUEST_TIMEOUT
            • SUPPORT_START_OVER
            • SYNC_DDL
            • V_CPU_CORES
        • SHOW
        • USE
      • Built-In Functions
        • Mathematical Functions
        • Type Conversion Functions
        • Date and Time Functions
        • Conditional Functions
        • String Functions
        • Miscellaneous Functions
        • Aggregate Functions
          • APPX_MEDIAN
          • AVG
          • COUNT
          • GROUP_CONCAT
          • MAX
          • MIN
          • NDV
          • STDDEV, STDDEV_SAMP, STDDEV_POP
          • SUM
          • VARIANCE, VARIANCE_SAMP, VARIANCE_POP, VAR_SAMP, VAR_POP
        • Analytic Functions
        • Impala User-Defined Functions (UDFs)
      • SQL Differences Between Impala and Hive
      • Porting SQL
    • The Impala Shell
      • Configuration Options
      • Connecting to impalad
      • Running Commands and SQL Statements
      • Command Reference
    • Performance Tuning
      • Performance Best Practices
      • Join Performance
      • Table and Column Statistics
      • Benchmarking
      • Controlling Resource Usage
      • HDFS Caching
      • Testing Impala Performance
      • EXPLAIN Plans and Query Profiles
      • HDFS Block Skew
    • Scalability Considerations
    • Partitioning
    • File Formats
      • Text Data Files
      • Parquet Data Files
      • Avro Data Files
      • RCFile Data Files
      • SequenceFile Data Files
    • HBase Tables
    • S3 Tables
    • Isilon Storage
    • Logging
    • Troubleshooting Impala
      • Web User Interface
    • Ports Used by Impala
    • Impala Reserved Words
    • Impala Frequently Asked Questions
  • Cloudera Search Guide
    • Cloudera Search User Guide
      • Cloudera Search Overview
      • Understanding Cloudera Search
        • Cloudera Search and Other Cloudera Components
        • Cloudera Search Architecture
        • Cloudera Search Tasks and Processes
      • Cloudera Search Tutorial
        • Validating the Deployment with the Solr REST API
        • Preparing to Index Data with Cloudera Search
        • Using MapReduce Batch Indexing with Cloudera Search
        • Near Real Time (NRT) Indexing Using Flume and the Solr Sink
          • Deploying Solr Sink into the Flume Agent
          • Configuring the Flume Solr Sink
          • Configuring Flume Solr Sink to Sip from the Twitter Firehose
          • Starting the Flume Agent
          • Indexing a File Containing Tweets with Flume HTTPSource
          • Indexing a File Containing Tweets with Flume SpoolDirectorySource
        • Using Hue with Cloudera Search
      • Solrctl Reference
      • Spark Indexing
      • MapReduce Batch Indexing Reference
        • MapReduceIndexerTool
          • MapReduceIndexerTool Metadata
        • HdfsFindTool
      • Flume Near Real-Time Indexing Reference
        • Flume Morphline Solr Sink Configuration Options
        • Flume Morphline Interceptor Configuration Options
        • Flume Solr UUIDInterceptor Configuration Options
        • Flume Solr BlobHandler Configuration Options
        • Flume Solr BlobDeserializer Configuration Options
      • Extracting, Transforming, and Loading Data With Cloudera Morphlines
        • Example Morphline Usage
      • Using the Lily HBase Batch Indexer for Indexing
        • HBaseMapReduceIndexerTool
      • Configuring the Lily HBase NRT Indexer Service for Use with Cloudera Search
        • Using the Lily HBase NRT Indexer Service
      • Schemaless Mode Overview and Best Practices
      • Using Search through a Proxy for High Availability
      • Migrating Solr Replicas
      • Using Custom JAR Files with Search
      • Troubleshooting Cloudera Search
        • Static Solr Log Analysis
      • Cloudera Search Frequently Asked Questions
  • Cloudera Glossary

Getting Started with Mahout

To get started with Mahout, follow the instructions in the Apache Mahout Quickstart.


© 2021 Cloudera, Inc. All rights reserved. Apache Hadoop and associated open source project names are trademarks of the Apache Software Foundation. For a complete list of trademarks, click here.

If this documentation includes code, including but not limited to, code examples, Cloudera makes this available to you under the terms of the Apache License, Version 2.0, including any required notices. A copy of the Apache License Version 2.0 can be found here.

Page generated February 3, 2021.