Data Access
Cloudera Data Platform Runtime includes Apache Hive 3 and Apache Impala for storing and accessing data in tables registered in the Hive metastore. Hive 3 addresses enterprise data warehouse demands for transactional data in the ORC file format. Impala runs high-performance, low-latency SQL queries on data in Parquet and other formats. Hue is a web-based interactive editor for querying data through the Hive metastore; it also creates Oozie workflows. Data Analytics Studio (DAS) is a web application for performing operations on Hive tables; it also recommends ways to optimize the performance of your queries.
- Starting Apache Hive
- Describes how to launch Hive, execute Hive commands, and issue Hive queries from Beeline.
- Using Hive
- Covers how to use Hive 3 to query flat and transactional data using SQL statements.
- Managing Apache Hive
- Includes information about mature ACID v2 transaction operations.
- Configuring Apache Hive
- Describes how to set up Hive to generate statistics and control the number of concurrent connections to Hive.
- Configuring Apache Hive Metastore
- Covers how to configure Hive metastore (HMS) to access metadata of multiple services, such as Hive, Impala, and Spark.
- Securing Apache Hive
- Discusses how to choose an authorization model based on how your organization uses Hive.
- Integrating Apache Hive with Apache Spark, Apache Kafka, and BI
- Covers moving data between Spark and Hive, accessing data in Kafka from Hive, and using the JdbcStorageHandler to access an external data source, such as one used by Business Intelligence (BI) tools.
- Migrating Data Using Sqoop
- Explains how to move data from relational databases directly to Hive or to the file system or object store and how to move data back to Hive.
- Managing Apache Impala
- Presents the task topics for configuring and managing Impala.
- Using Hue
- Describes how to use Hue to query Apache Impala data sets and browse metadata in Apache Atlas.
- Administering Hue
- Describes how to configure Hue, customize its web UI, and enable integration with Apache Atlas.
- Securing Hue
- Describes how to set Hue user and application permissions, configure SSL connections and LDAP authentication, and integrate Hue with Apache Ranger and Knox.
- Tuning Hue
- Describes how to add a load balancer and configure high availability for Hue and for its connections to other components, such as Hive, Impala, and HDFS.
- Search Tutorial
- A tutorial on using Cloudera Search.
- Securing Cloudera Search
- Describes how to secure Solr network connections and configure authentication and authorization.
- Tuning Cloudera Search
- Describes how to optimize Cloudera Search performance for various use cases.
- Managing Cloudera Search
- Describes how to configure and manage Cloudera Search.
- Cloudera Search ETL
- Describes how to perform ETL using Cloudera Search and Morphlines.
- Indexing Data Using Cloudera Search
- Describes how to index data using Cloudera Search.