Overview

Cloudera Operational Database supports two storage layers: high performance storage, and cloud storage on Amazon S3 or on Azure Data Lake Storage Gen2 through the abfs connector. Cloud storage is the default option, but you can choose high performance storage instead.

You can use high performance storage in either an on-premises or a cloud environment to maintain low latency and high throughput for your application. If your applications are latency sensitive, Cloudera recommends that you use high performance storage for your Cloudera Operational Database databases. Across all workloads, the average read and write latency for Cloudera Operational Database on high performance storage is 500% better than for Cloudera Operational Database on S3 storage.

HDFS volume type

Hadoop Distributed File System (HDFS) is a Java-based file system for storing large volumes of data. Designed to span large clusters of commodity servers, HDFS provides scalable and reliable data storage.

An HDFS cluster contains the following main components: a NameNode and DataNodes. The NameNode manages the cluster metadata that includes file and directory structures, permissions, modifications, and disk space quotas. The file content is split into multiple data blocks, with each block replicated at multiple DataNodes.
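The block splitting and replication described above can be illustrated with a minimal sketch. It is not how HDFS is implemented: the 128 MiB block size and replication factor of 3 are assumed values (common Apache Hadoop defaults, not stated on this page), the `plan_blocks` function and DataNode names are hypothetical, and the round-robin placement stands in for the rack-aware placement that a real NameNode performs.

```python
import math

# Assumed values for illustration only (common Apache Hadoop defaults);
# this page does not state the block size or the replication factor.
BLOCK_SIZE_BYTES = 128 * 1024 * 1024
REPLICATION_FACTOR = 3


def plan_blocks(file_size_bytes, datanodes,
                block_size=BLOCK_SIZE_BYTES,
                replication=REPLICATION_FACTOR):
    """Return a list of (block_index, block_length, [datanode, ...]) tuples."""
    if replication > len(datanodes):
        raise ValueError("replication factor exceeds the number of DataNodes")
    block_count = math.ceil(file_size_bytes / block_size)
    plan = []
    for index in range(block_count):
        start = index * block_size
        # The last block may be shorter than the configured block size.
        length = min(block_size, file_size_bytes - start)
        # Simplified round-robin placement; real HDFS placement is rack-aware.
        replicas = [datanodes[(index + r) % len(datanodes)]
                    for r in range(replication)]
        plan.append((index, length, replicas))
    return plan


# Hypothetical example: a 300 MiB file on a four-DataNode cluster.
plan = plan_blocks(300 * 1024 * 1024, ["dn1", "dn2", "dn3", "dn4"])
for index, length, replicas in plan:
    print(index, length, replicas)
```

Under these assumptions the file occupies two full blocks and one partial block, and each block is stored on three different DataNodes, so the loss of a single DataNode does not make any block unavailable.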

For more information, see HDFS Overview.

When you create a Cloudera Operational Database database and select HDFS as the storage type, you can store your data on one of the following two volume types.

HDD (Hard Disk Drives): HDD is a cost-efficient option if you want to access the data frequently. This is a preferred storage option for applications that are not very sensitive to latency.
SSD (Solid State Drives): SSDs provide more storage volume, speed, and efficiency; however, this storage option can be more expensive than HDD.

For more information on the differences between SSD and HDD, see the Amazon documentation.

Worker nodes

Worker nodes store the data on your Cloudera Operational Database cluster. The general responsibilities of the worker nodes on your Cloudera Operational Database cluster include processing the data stored in the cluster and managing the network traffic. The worker nodes also ensure smooth operations between the applications across multiple Cloudera Operational Database clusters.

When you create a Cloudera Operational Database cluster, the minimum and maximum number of worker nodes vary by scale type (Micro duty, Light duty, or Heavy duty).

Micro duty: Minimum node count: 1. Maximum node count: 5.
Light duty: Minimum node count: 3. Maximum node count: 100.
Heavy duty: Minimum node count: 3. Maximum node count: 800.
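As a quick way to apply the limits listed above, the following sketch checks a requested worker node count before you create a cluster. The limits are copied from this page; the `WORKER_NODE_LIMITS` dictionary and the `check_worker_count` function are hypothetical names for illustration and are not part of any Cloudera API.

```python
# Limits copied from the list above. The names used here are hypothetical
# and are not part of any Cloudera API or CLI.
WORKER_NODE_LIMITS = {
    "micro duty": (1, 5),
    "light duty": (3, 100),
    "heavy duty": (3, 800),
}


def check_worker_count(duty_type, node_count):
    """Return True if node_count is within the limits for duty_type."""
    key = duty_type.strip().lower()
    if key not in WORKER_NODE_LIMITS:
        raise ValueError(f"unknown type: {duty_type!r}")
    minimum, maximum = WORKER_NODE_LIMITS[key]
    return minimum <= node_count <= maximum


print(check_worker_count("Light duty", 2))
print(check_worker_count("Heavy duty", 800))
```

A check like this only catches out-of-range requests early; the service itself remains the authority on which node counts it accepts.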
