Operational Database Overview

Cloudera Operational Database delivers a real-time, always available, scalable operational database that serves traditional structured data alongside unstructured data within a unified operational and warehousing platform.

Cloudera Operational Database is a CDP Public Cloud service for self-service creation of an operational database. Cloudera Operational Database is designed to be used by developers. You do not need a dedicated team of system administrators to run an on-premise cluster of physical servers. You can go directly to CDP and provision a new database, therefore letting you rapidly prototype and deploy new applications onto the public cloud.

Apache HBase

Apache HBase is a column-oriented database management system that runs on top of the Hadoop Distributed File System (HDFS), a main component of Apache Hadoop. It enables random, strictly consistent, and real-time access to petabytes of data.

Apache Phoenix

Apache Phoenix is a SQL layer for Apache HBase that provides a programmatic ANSI SQL interface. Apache Phoenix implements best-practice optimizations to enable software engineers to develop HBase based next-generation applications that operationalize big data.

Apache Kafka

Apache Kafka is a distributed streaming platform that is used to build real-time applications.

Apache Spark

Apache Spark is a general framework for distributed computing that offers high performance for both batch and interactive processing. It exposes APIs for Java, Python, and Scala and consists of Spark core and several related projects.

Apache Nifi

Apache Nifi is used to develop, manage, and monitor data flows.

Apache Hive

Apache Hive is a data warehouse software that facilitates reading, writing, and managing large datasets residing in distributed storage using SQL.

Apache Hue

Apache Hue is a SQL assistant for databases and data warehouses.

Apache Impala

Apache Impala is a modern, open source, distributed SQL query engine for Apache Hadoop.

Apache Knox

Apache Know Gateway provides access to Apache Hadoop through proxying of HTTP resources. It provides proxying services, authentication services, and client DSL or SDK services.

Cloudera Operational Database is powered by Apache HBase and Apache Phoenix. In Cloudera Operational Database, you use object store such as Amazon S3 and Microsoft ADLS Gen2 to store the Apache HBase files. You have the choice to either develop applications using one of the native Apache HBase API, or you can use Apache Phoenix for data access. Apache Phoenix is a SQL layer that provides a programmatic ANSI SQL interface. It works on top of Apache HBase, and it makes it possible to handle data using standard SQL queries and Apache Phoenix commands. You can use Cloudera Operational Database in the public cloud or on-premises.

Public cloud

Cloudera Operational Database (COD) that is a managed database platform as a service (dbPaaS).

On-premises

  • CDP Private Cloud Base

Cloudera Operational Database plays the crucial role of a data store in the enterprise data lifecycle. The following graphic shows some of the key elements in a typical operational database deployment.

The operational database has the following components:

  • Apache Phoenix provides a SQL interface that runs on top of Apache HBase.
  • Apache HBase provides the key-value stores with massive scalability, so you can store unlimited amounts of data in a single platform and handle growing demands for serving data.
  • Apache ZooKeeper provides a distributed configuration service, a synchronization service, and a naming registry.
  • Apache Knox Gateway provides perimeter security so that the enterprise can confidently extend access to new users.
  • Apache HDFS is used to write the Apache HBase WALs.
  • Hue provides a web-based editor to create and browse Apache HBase tables.
  • Object store such as Amazon S3 and Microsoft ADLS Gen2 is used to store the Apache HBase HFiles.
  • Cloudera Shared Data Experience (SDX) is used for security and governance capabilities. Security and governance policies are set once and applied across all data and workloads.
  • IDBroker provides an authentication mechanism that is built as part of the Apache Knox authentication service. It allows an authenticated and authorized user to exchange a set of credentials or a token for cloud vendor access tokens.