Storage

Cloudera Runtime provides different types of storage components that you can use depending on your data requirements. Apache Hadoop HDFS is a distributed file system for storing large volumes of data. Apache Ozone is a scalable, redundant, and distributed object store optimized for big data workloads. Apache Kudu completes Apache Hadoop’s storage layer, enabling fast analytics on fast data.

Apache Hadoop HDFS

Managing Data Storage

Provides information about optimizing data storage, accessing data through APIs and services, and managing data across clusters.

Configuring Data Protection

Provides information about configuring data protection on a Hadoop cluster.
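
For example, directory snapshots are one of the HDFS data protection features. The following sketch, which assumes a placeholder directory /data/warehouse, uses the Hadoop Java API to make the directory snapshottable and take a point-in-time snapshot; an administrator would normally enable snapshots with the hdfs dfsadmin -allowSnapshot command instead.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    public class SnapshotExample {
        public static void main(String[] args) throws Exception {
            // Loads core-site.xml/hdfs-site.xml from the classpath.
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // The directory is a placeholder; it must be snapshottable
            // before a snapshot can be taken.
            Path dir = new Path("/data/warehouse");
            ((DistributedFileSystem) fs).allowSnapshot(dir);

            // Take a read-only, point-in-time snapshot of the directory.
            Path snapshot = fs.createSnapshot(dir, "before-cleanup");
            System.out.println("Created snapshot at " + snapshot);
        }
    }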

Accessing Cloud Data

Describes the configuration parameters used to access data stored in the cloud.
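
As an illustration, the sketch below reads an object through the S3A connector. The bucket, object name, and inline credentials are placeholders; in practice, credentials are normally supplied through core-site.xml or a credential provider rather than in code.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class S3AReadExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Placeholder credentials, shown in code only for illustration.
            conf.set("fs.s3a.access.key", "ACCESS_KEY");
            conf.set("fs.s3a.secret.key", "SECRET_KEY");

            // Placeholder bucket and object names.
            Path path = new Path("s3a://example-bucket/raw/events.csv");
            try (FileSystem fs = path.getFileSystem(conf);
                 BufferedReader reader = new BufferedReader(
                         new InputStreamReader(fs.open(path), StandardCharsets.UTF_8))) {
                // Print the first line of the object.
                System.out.println(reader.readLine());
            }
        }
    }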

Configuring HDFS ACLs

Describes the procedure to configure Access Control Lists (ACLs) on Apache Hadoop HDFS.
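
For example, ACL entries can also be added programmatically through the Hadoop FileSystem API once ACLs are enabled on the NameNode (dfs.namenode.acls.enabled). In this sketch, the directory path and user name are placeholders.

    import java.util.Collections;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.permission.AclEntry;
    import org.apache.hadoop.fs.permission.AclEntryScope;
    import org.apache.hadoop.fs.permission.AclEntryType;
    import org.apache.hadoop.fs.permission.FsAction;

    public class AclExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Grant read/execute access on a directory to an additional user.
            AclEntry entry = new AclEntry.Builder()
                    .setScope(AclEntryScope.ACCESS)
                    .setType(AclEntryType.USER)
                    .setName("analyst")
                    .setPermission(FsAction.READ_EXECUTE)
                    .build();
            Path dir = new Path("/data/sales");
            fs.modifyAclEntries(dir, Collections.singletonList(entry));

            // Print the resulting ACL for verification.
            System.out.println(fs.getAclStatus(dir));
        }
    }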

Configuring Fault Tolerance

Describes the procedure to configure HDFS high availability on a cluster.

Apache Ozone

Storing Data using Ozone

Describes how to configure the Ozone object store and manage the data stored in it.
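
As a minimal illustration, the sketch below writes a key with the Ozone Java client. The volume, bucket, and key names are placeholders, and the sketch assumes they do not already exist.

    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.hdds.conf.OzoneConfiguration;
    import org.apache.hadoop.ozone.client.ObjectStore;
    import org.apache.hadoop.ozone.client.OzoneBucket;
    import org.apache.hadoop.ozone.client.OzoneClient;
    import org.apache.hadoop.ozone.client.OzoneClientFactory;
    import org.apache.hadoop.ozone.client.OzoneVolume;
    import org.apache.hadoop.ozone.client.io.OzoneOutputStream;

    public class OzoneWriteExample {
        public static void main(String[] args) throws Exception {
            // Picks up ozone-site.xml from the classpath.
            OzoneConfiguration conf = new OzoneConfiguration();
            try (OzoneClient client = OzoneClientFactory.getRpcClient(conf)) {
                ObjectStore store = client.getObjectStore();

                // Create a volume and a bucket (placeholder names).
                store.createVolume("vol1");
                OzoneVolume volume = store.getVolume("vol1");
                volume.createBucket("bucket1");
                OzoneBucket bucket = volume.getBucket("bucket1");

                // Write a small payload as a key.
                byte[] data = "hello ozone".getBytes(StandardCharsets.UTF_8);
                try (OzoneOutputStream out = bucket.createKey("key1", data.length)) {
                    out.write(data);
                }
            }
        }
    }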

Configuring Ozone Security

Describes how to secure data in Ozone clusters and how to secure access to that data.

Apache Kudu

Configuring Apache Kudu

Describes common Apache Kudu configuration tasks.

Managing Apache Kudu

Describes common Apache Kudu management tasks and workflows.

Managing Apache Kudu Security

Provides information about how to configure and manage security for Apache Kudu.

Backing Up and Recovering Apache Kudu

Provides information about how to back up and recover Apache Kudu tables.

Developing Applications with Apache Kudu

Provides reference examples for using the C++ and Java client APIs to develop applications with Apache Kudu.
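
As a brief illustration of the Java client API, the following sketch creates a hash-partitioned table and inserts one row. The master address and table name are placeholders.

    import java.util.Arrays;

    import org.apache.kudu.ColumnSchema;
    import org.apache.kudu.Schema;
    import org.apache.kudu.Type;
    import org.apache.kudu.client.CreateTableOptions;
    import org.apache.kudu.client.Insert;
    import org.apache.kudu.client.KuduClient;
    import org.apache.kudu.client.KuduSession;
    import org.apache.kudu.client.KuduTable;
    import org.apache.kudu.client.PartialRow;

    public class KuduClientExample {
        public static void main(String[] args) throws Exception {
            // Placeholder master address; 7051 is the default master RPC port.
            try (KuduClient client =
                     new KuduClient.KuduClientBuilder("kudu-master-1:7051").build()) {

                // Define a two-column schema with an INT32 primary key.
                Schema schema = new Schema(Arrays.asList(
                    new ColumnSchema.ColumnSchemaBuilder("id", Type.INT32).key(true).build(),
                    new ColumnSchema.ColumnSchemaBuilder("name", Type.STRING).build()));

                // Create the table with 4 hash partitions on the key column.
                CreateTableOptions options =
                    new CreateTableOptions().addHashPartitions(Arrays.asList("id"), 4);
                client.createTable("example_table", schema, options);

                // Insert a single row.
                KuduTable table = client.openTable("example_table");
                KuduSession session = client.newSession();
                Insert insert = table.newInsert();
                PartialRow row = insert.getRow();
                row.addInt("id", 1);
                row.addString("name", "first row");
                session.apply(insert);
                session.close();
            }
        }
    }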

Using Apache Impala with Apache Kudu

Provides information about how to use Apache Kudu as a storage engine for Apache Impala.
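
For example, an Impala DDL statement with the STORED AS KUDU clause creates a table whose data is stored in Kudu. The sketch below issues that statement over JDBC using the HiveServer2-compatible driver; the driver choice, host, port, and authentication settings are assumptions for an unsecured test cluster.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class ImpalaKuduExample {
        public static void main(String[] args) throws Exception {
            // Assumes the Hive JDBC driver is on the classpath; Impala speaks the
            // HiveServer2 protocol on port 21050 by default.
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            String url = "jdbc:hive2://impala-host:21050/default;auth=noSasl";

            try (Connection conn = DriverManager.getConnection(url);
                 Statement stmt = conn.createStatement()) {
                // Create an Impala table backed by Kudu storage.
                stmt.execute(
                    "CREATE TABLE events (" +
                    "  id BIGINT," +
                    "  name STRING," +
                    "  PRIMARY KEY (id)" +
                    ") PARTITION BY HASH PARTITIONS 4 " +
                    "STORED AS KUDU");
            }
        }
    }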

Using Hive Metastore with Apache Kudu

Provides information about how to integrate Hive Metastore with Apache Kudu.

Monitoring Apache Kudu

Provides information about how to monitor Kudu metrics and cluster health, and how to collect diagnostic information.
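
For instance, every Kudu daemon serves its metrics as JSON from the /metrics endpoint of its embedded web server. The sketch below fetches that page with the standard Java HTTP client; the host name is a placeholder, and 8051 is the default Kudu master web UI port (tablet servers default to 8050).

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class KuduMetricsExample {
        public static void main(String[] args) throws Exception {
            // Request the JSON metrics page from a placeholder master host.
            HttpRequest request = HttpRequest.newBuilder(
                    URI.create("http://kudu-master-1:8051/metrics")).GET().build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        }
    }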