Introduction
Welcome to the Cloudbreak 2.5 Technical Preview documentation!
Cloudbreak simplifies the provisioning, management, and monitoring of on-demand HDP and HDF clusters in virtual and cloud environments. It uses your cloud infrastructure to create host instances, and uses Apache Ambari (via Ambari blueprints) to provision and manage these clusters.
You can create clusters by using the Cloudbreak web UI, the Cloudbreak CLI, or the Cloudbreak REST API. Clusters can be launched on the public cloud platforms Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform (GCP), and on the private cloud platform OpenStack.
Primary Use Cases
Cloudbreak allows you to create, manage, and monitor your HDP and HDF clusters on your chosen cloud platform:
- Dynamically deploy, configure, and manage clusters on public and private clouds (AWS, Azure, Google Cloud, OpenStack).
- Use automated scaling to seamlessly manage elasticity requirements as cluster workloads change.
- Secure your cluster by enabling Kerberos.
Default Cluster Configurations
Cloudbreak includes default cluster configurations (in the form of blueprints) and supports using your own custom cluster configurations (in the form of custom blueprints).
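To make the idea of a blueprint more concrete, the following sketch builds the general shape of an Ambari blueprint as a Python dictionary and serializes it to JSON. It is illustrative only: the blueprint name, host group names, cardinalities, and component selection are assumptions made for this example, it is not one of the default blueprints shipped with Cloudbreak, and it is not guaranteed to pass validation as a deployable cluster definition.

```python
import json

# Hypothetical, minimal blueprint skeleton. The blueprint name, host groups,
# and component choices below are illustrative assumptions, not a Cloudbreak default.
blueprint = {
    "Blueprints": {
        "blueprint_name": "example-custom-blueprint",  # hypothetical name
        "stack_name": "HDP",
        "stack_version": "2.6",
    },
    "host_groups": [
        {
            "name": "master",  # hypothetical host group
            "cardinality": "1",
            "components": [
                {"name": "NAMENODE"},
                {"name": "RESOURCEMANAGER"},
                {"name": "ZOOKEEPER_SERVER"},
            ],
        },
        {
            "name": "worker",  # hypothetical host group
            "cardinality": "3",
            "components": [
                {"name": "DATANODE"},
                {"name": "NODEMANAGER"},
            ],
        },
    ],
}


def host_group_names(bp):
    """Return the names of the host groups declared in a blueprint dict."""
    return [group["name"] for group in bp["host_groups"]]


# Serialize the structure to JSON text.
blueprint_json = json.dumps(blueprint, indent=2)
print(blueprint_json)
```

A blueprint describes which components run on which group of hosts, not the hosts themselves, which is why the same blueprint can be reused across cloud platforms.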
The following default cluster configurations are available:
Platform Version: HDP 2.6
Cluster Type | Main Services | Description | List of All Services Included |
---|---|---|---|
Data Science | Spark 2, Zeppelin | Useful for data science with Spark 2 and Zeppelin. | HDFS, YARN, MapReduce2, Tez, Hive, Pig, Sqoop, ZooKeeper, Ambari Metrics, Spark 2, Zeppelin |
EDW - Analytics | Hive 2 LLAP, Zeppelin | Useful for EDW analytics using Hive LLAP. | HDFS, YARN, MapReduce2, Tez, Hive 2 LLAP, Druid, Pig, ZooKeeper, Ambari Metrics, Spark 2 |
EDW - ETL | Hive, Spark 2 | Useful for ETL data processing with Hive and Spark 2. | HDFS, YARN, MapReduce2, Tez, Hive, Pig, ZooKeeper, Ambari Metrics, Spark 2 |
Platform Version: HDF 3.1
Cluster Type | Main Services | Description | List of All Services Included |
---|---|---|---|
Flow Management | NiFi | Useful for flow management with NiFi. | NiFi, ZooKeeper, Ambari Metrics |
Core Concepts
Refer to Architecture and Core Concepts.
Get Started
To get started with Cloudbreak:
- Select the cloud platform on which you would like to launch Cloudbreak.
- Select the deployment option that you would like to use.
- Launch Cloudbreak.
Select Cloud Platform
You can deploy and use Cloudbreak on the following cloud platforms:
- Amazon Web Services (AWS)
- Microsoft Azure
- Google Cloud Platform (GCP)
- OpenStack
Select Deployment Option
There are two basic deployment options:
Deployment option | When to use |
---|---|
Option 1: Instantiate Cloudbreak using one of the provided pre-built cloud images | This is the basic deployment option and the easiest way to get started. The cloud images include the Cloudbreak deployer pre-installed on a CentOS VM. |
Option 2: Install the Cloudbreak deployer on your own VM | This is an advanced deployment option. Select this option if you have custom VM requirements. The supported operating systems are RHEL, CentOS, and Oracle Linux 7 (64-bit). |
Launch Cloudbreak
(Option 1) You can launch Cloudbreak from one of the pre-built images provided for your chosen cloud platform.
(Option 2) Alternatively, you can launch Cloudbreak on your own VM on your chosen cloud platform. This is an advanced deployment option that you should use only if you have custom VM requirements.
In general, the steps include meeting the prerequisites, launching Cloudbreak on a VM, and creating the Cloudbreak credential. After performing these steps, you can create a cluster based on one of the default blueprints or upload your own blueprint and then create a cluster.
Note
The Cloudbreak software runs in your cloud environment. You are responsible for any cloud infrastructure charges incurred while running Cloudbreak and the clusters that it manages.