Data Mart clusters
Learn about the default Data Mart and Real Time Data Mart clusters, including cluster definition and template names, included services, and compatible Runtime version.
Data Mart is an MPP SQL database powered by Apache Impala designed to support custom Data Mart applications at big data scale. Impala easily scales to petabytes of data, processes tables with trillions of rows, and allows users to store, browse, query, and explore their data in an interactive way.
Data Mart clusters
The Data Mart template provides a ready to use, fully capable, standalone deployment of Impala. Upon deployment, it can be used as a standalone Data Mart to which users point their BI dashboards using JDBC/ODBC end points. Users can also choose to author SQL queries in Cloudera’s web-based SQL query editor, Hue, and run them with Impala providing a delightful end-user focused and interactive SQL/BI experience.
- Cluster definition names
-
-
Data Mart for AWS
-
Data Mart for Azure
-
Data Mart for Google Cloud
-
- Cluster template name
- CDP - Data Mart: Apache Impala, Hue
- Included services
-
- HDFS
- Hue
- Impala
- Compatible Runtime versions
- 7.1.0, 7.2.0, 7.2.1, 7.2.2, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.2.11, 7.2.12, 7.2.14, 7.2.15, 7.2.16, 7.2.17, 7.2.18
Real Time Data Mart clusters
The Real-Time Data Mart template provides a ready-to-use, fully capable, standalone deployment of Impala and Kudu. You can use a Real Time Data Mart cluster as a standalone Data Mart which allows high throughput streaming ingest, supporting updates and deletes as well as inserts. You can immediately query data through BI dashboards using JDBC/ODBC end points. You can choose to author SQL queries in Cloudera's web-based SQL query editor, Hue. Executing queries with Impala, you will enjoy an end-user focused and interactive SQL/BI experience. This template is commonly used for Operational Reporting, Time Series, and other real time analytics use cases.
- Cluster definition names
-
-
Real-time Data Mart for AWS
-
Real-time Data Mart for Azure
-
Real-time Data Mart for Google Cloud
-
- Cluster template name
-
CDP - Real-time Data Mart: Apache Impala, Hue, Apache Kudu, Apache Spark
- Included services
-
- HDFS
- Hue
- Impala
- Kudu
- Spark 2
- Yarn
- Compatible Runtime versions
-
7.1.0, 7.2.0, 7.2.1, 7.2.2, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.2.11, 7.2.12, 7.2.14, 7.2.15, 7.2.16, 7.2.17
- Cluster definition names
-
-
Real-time Data Mart - Spark3 for AWS
-
Real-time Data Mart - Spark3 for Azure
-
Real-time Data Mart - Spark3 for Google Cloud
-
- Cluster template name
-
Real-time Data Mart: Apache Impala, Hue, Apache Kudu, Apache Spark3
- Included services
-
- HDFS
- Hue
- Impala
- Kudu
- Spark 3
- Yarn
- Compatible Runtime versions
- 7.2.16, 7.2.17, 7.2.18
High availability
Cloudera recommends that you use high availability (HA), and track any services that are not capable of restarting or performing failover in some way.
Impala HA
- Catalog service
- Statestore service
Kudu HA
Both Kudu Masters and TabletServers offer high availability.