Overview

Cloudera Operational Database (COD) supports two kinds of storage layer: high performance storage, and cloud storage (Amazon S3, or Azure Data Lake Gen2 through the abfs connector). Cloud storage is the default option, but you can choose high performance storage instead.

You can use high performance storage in either an on-premises or a cloud environment to maintain low latency and high throughput for your applications. If your applications are latency sensitive, Cloudera recommends that you use high performance storage for your COD databases. Across all workloads, the average read and write latency for COD on high performance storage is 500% better than for COD on S3 storage.

HDFS volume type

Hadoop Distributed File System (HDFS) is a Java-based file system for storing large volumes of data. Designed to span large clusters of commodity servers, HDFS provides scalable and reliable data storage.

An HDFS cluster has two main components: a NameNode and DataNodes. The NameNode manages the cluster metadata, which includes file and directory structures, permissions, modifications, and disk space quotas. The file content is split into multiple data blocks, and each block is replicated across multiple DataNodes.

For more information, see HDFS Overview.
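
The block splitting and replication described above can be illustrated with a small calculation. This is a minimal sketch, not COD code: the 128 MiB block size and replication factor of 3 are assumed here because they are the stock HDFS defaults, and a given cluster may be configured differently. The function name block_layout is hypothetical.

```python
# Assumed values: stock HDFS defaults, not COD-specific settings.
BLOCK_SIZE_BYTES = 128 * 1024 * 1024  # default dfs.blocksize
REPLICATION_FACTOR = 3                # default dfs.replication


def block_layout(file_size_bytes, block_size=BLOCK_SIZE_BYTES,
                 replication=REPLICATION_FACTOR):
    """Return (blocks, block replicas, raw bytes stored) for one file."""
    if file_size_bytes < 0:
        raise ValueError("file size must be non-negative")
    # Round up: a partially filled last block still counts as a block.
    blocks = (file_size_bytes + block_size - 1) // block_size
    replicas = blocks * replication
    # The last block only occupies the bytes it holds, so raw usage is
    # the file size multiplied by the replication factor.
    raw_bytes = file_size_bytes * replication
    return blocks, replicas, raw_bytes


one_gib = 1024 ** 3
print(block_layout(one_gib))  # (8, 24, 3221225472)
```

Under these assumptions, a 1 GiB file is split into 8 blocks, stored as 24 block replicas on the DataNodes, and consumes 3 GiB of raw disk space.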

When you create a COD database and select HDFS as the storage type, you have the following two options for storing your data.

  • HDD (Hard Disk Drives): HDD is a cost-efficient option if you want to access the data frequently. This is the preferred storage option for applications that are not very sensitive to latency.
  • SSD (Solid State Drives): SSDs provide more storage volume, speed, and efficiency; however, this storage option can be more expensive than HDD.

For more information on the differences between SSD and HDD, see the Amazon documentation.

Worker nodes

Worker nodes store the data on your COD cluster. They also process the data stored in the cluster, manage the network traffic, and ensure smooth operations between the applications across multiple COD clusters.

When you create a COD cluster, the minimum and maximum number of worker nodes vary by scale type:

  • Micro duty: Minimum node count: 1. Maximum node count: 5.
  • Light duty: Minimum node count: 3. Maximum node count: 100.
  • Heavy duty: Minimum node count: 3. Maximum node count: 800.
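
The limits listed above can be checked before you request a cluster. The following is a minimal sketch that only encodes the numbers from this list; the dictionary keys and the function name check_worker_count are hypothetical and are not part of any COD API or CLI.

```python
# Limits copied from the list above. The keys and the function name are
# hypothetical and exist only for this illustration.
WORKER_NODE_LIMITS = {
    "micro": (1, 5),
    "light": (3, 100),
    "heavy": (3, 800),
}


def check_worker_count(duty, count):
    """Return True if count is within the listed range for the duty type."""
    try:
        minimum, maximum = WORKER_NODE_LIMITS[duty]
    except KeyError:
        raise ValueError(f"unknown duty type: {duty!r}") from None
    return minimum <= count <= maximum


print(check_worker_count("light", 2))    # False: below the minimum of 3
print(check_worker_count("heavy", 800))  # True: at the maximum
```

Keeping the limits in one table makes it easy to update them if the supported ranges change.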