Creating a cluster on Azure

Use these steps to create a cluster.

Troubleshooting cluster creation

If you experience problems during cluster creation, refer to Troubleshooting cluster creation.

Steps

  1. Log in to the Cloudbreak UI.

  2. Click the Create Cluster button. The Create Cluster wizard is displayed.
    By default, Basic view is displayed. To view advanced options, click Advanced. To learn about advanced options, refer to Advanced cluster options.

  3. On the General Configuration page, specify the following general parameters for your cluster:

    Parameter | Description
    Select Credential | Choose a previously created credential.
    Cluster Name | Enter a name for your cluster. The name must be between 5 and 40 characters, must start with a letter, and must only include lowercase letters, numbers, and hyphens.
    Region | Select the Azure region in which you would like to launch your cluster. For information on available Azure regions, refer to Azure documentation.
    Platform Version | Choose the HDP or HDF version to use for this cluster. Blueprints available for this platform version will be populated under "Cluster Type" below. If you selected the HDF platform, refer to Creating HDF clusters for HDF cluster configuration tips.
    Cluster Type | Choose one of the default cluster configurations, or, if you have defined your own cluster configuration via Ambari blueprint, you can choose it here. For more information on default and custom blueprints, refer to Using custom blueprints.
    Flex Subscription | This option will appear if you have configured your deployment for a flex support subscription.
  4. On the Hardware and Storage page, for each host group provide the following information to define your cluster nodes and attached storage.

    To edit this section, click the edit icon. When done editing, click the save icon to save the changes.

    Parameter | Description
    Ambari Server | You must select one node for Ambari Server by clicking the On button. The "Instance Count" for that host group must be set to "1". If you are using one of the default blueprints, this is set by default.
    Instance Type | Select an instance type. For information about instance types on Azure, refer to Azure documentation.
    Instance Count | Enter the number of instances of a given type. Default is 1.
    Storage Type | Select the volume type. The options are: Locally-redundant storage, Geo-redundant storage, Premium locally-redundant storage. For more information about these options, refer to Azure documentation.
    Attached Volumes Per Instance | Enter the number of volumes attached per instance. Default is 1.
    Volume Size | Enter the size in GBs for each volume. Default is 100.
  5. On the Gateway Configuration page, you can access gateway configuration options.

    When creating a cluster, Cloudbreak installs and configures a gateway (powered by Apache Knox) to protect access to the cluster resources. By default, the gateway is enabled for Ambari; you can optionally enable it for other cluster services.

    For more information, refer to Configuring the Gateway documentation.

  6. On the Network page, provide the following to specify the networking resources that will be used for your cluster:

    Parameter | Description
    Select Network | Select the virtual network in which you would like your cluster to be provisioned. You can select an existing network or create a new network.
    Select Subnet | Select the subnet in which you would like your cluster to be provisioned. If you are using a new network, create a new subnet. If you are using an existing network, select an existing subnet.
    Subnet (CIDR) | If you selected to create a new subnet, you must define a valid CIDR for the subnet. Default is 10.0.0.0/16.

    Cloudbreak uses public IP addresses when communicating with cluster nodes.

  7. Define security groups for each host group. You can either create new security groups and define their rules or reuse existing security groups:

    Option | Description
    New Security Group | (Default) Creates a new security group with the rules that you defined:
    • A set of default rules is provided. You should review and adjust these default rules. If you do not make any modifications, default rules will be applied.
    • You may open ports by defining the CIDR, entering port range, selecting protocol and clicking +.
    • You may delete default or previously added rules using the delete icon.
    • If you don't want to use a security group, remove the default rules.
    Existing Security Groups | Allows you to select an existing security group that is already available in the selected provider region. This selection is disabled if no existing security groups are available in your chosen region.

    Automatic creation of network resources such as the network, subnet, and security group is provided for convenience. We strongly recommend that you review these options and, for production cluster deployments, use existing network resources that you have defined and validated to meet your enterprise requirements. For more information, refer to Restricting inbound access from Cloudbreak to cluster.

  8. On the Security page, provide the following parameters:

    Parameter | Description
    Cluster User | You can log in to the Ambari UI using this username. By default, this is set to admin.
    Password | You can log in to the Ambari UI using this password.
    Confirm Password | Confirm the password.
    New SSH public key | Check this option to specify a new public key and then enter the public key. You will use the matching private key to access your cluster nodes via SSH.
    Existing SSH public key | Select an existing public key. You will use the matching private key to access your cluster nodes via SSH. This is a default option as long as an existing SSH public key is available. This option cannot be used with Azure or Google Cloud.
  9. Click on Create Cluster to create a cluster.

  10. You will be redirected to the Cloudbreak dashboard, and a new tile representing your cluster will appear at the top of the page.
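
The cluster name rule in step 3 and the subnet CIDR requirement in step 6 can be checked locally before you open the wizard. The following is a minimal sketch, not Cloudbreak's own validation: the function names are hypothetical, and it assumes that "must start with a letter" means a lowercase letter, since the rule allows only lowercase letters, numbers, and hyphens.

```python
import ipaddress
import re

# Assumed reading of the naming rule: 5-40 characters in total, first character
# a lowercase letter, remaining characters lowercase letters, digits, or hyphens.
CLUSTER_NAME_RE = re.compile(r"[a-z][a-z0-9-]{4,39}")


def is_valid_cluster_name(name: str) -> bool:
    """Return True if the name satisfies the rule described in step 3."""
    return CLUSTER_NAME_RE.fullmatch(name) is not None


def is_valid_subnet_cidr(cidr: str) -> bool:
    """Return True if the string is an IPv4 network in CIDR notation.

    An explicit prefix length is required, and host bits must be zero
    (for example, 10.0.0.1/16 is rejected).
    """
    if "/" not in cidr:
        return False
    try:
        ipaddress.IPv4Network(cidr, strict=True)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for candidate in ("my-cluster-01", "Test", "1cluster"):
        print(candidate, is_valid_cluster_name(candidate))
    for candidate in ("10.0.0.0/16", "10.0.0.1/16", "10.0.0.0"):
        print(candidate, is_valid_subnet_cidr(candidate))
```

Whether the wizard itself rejects a CIDR with non-zero host bits is not stated in this document; the strict check here is only a conservative choice.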

Related links
Flex support subscription
Using custom blueprints
Default cluster security groups
Troubleshooting cluster creation
Azure regions (External)
CIDR (External)
General purpose Linux VM sizes (External)

Advanced cluster options

Click Advanced to view and enter additional configuration options.

Enable lifetime management

Check this option if you would like your cluster to be automatically terminated after a specific amount of time (defined as "Time to Live" in minutes).
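
Because "Time to Live" is entered in minutes, it can help to convert a planned lifetime into minutes and estimate when termination would occur. The sketch below is illustrative only: the helper names are hypothetical, and the moment from which Cloudbreak starts counting the lifetime is an assumption supplied by the caller.

```python
from datetime import datetime, timedelta, timezone


def ttl_minutes(days: int = 0, hours: int = 0, minutes: int = 0) -> int:
    """Convert a planned lifetime into the minutes value for "Time to Live"."""
    return days * 24 * 60 + hours * 60 + minutes


def expected_termination(start: datetime, ttl: int) -> datetime:
    """Estimate the termination time, assuming the lifetime is counted from `start`."""
    return start + timedelta(minutes=ttl)


if __name__ == "__main__":
    ttl = ttl_minutes(hours=8)
    start = datetime.now(timezone.utc)  # assumed starting point
    print("Time to Live (minutes):", ttl)
    print("Estimated termination:", expected_termination(start, ttl).isoformat())
```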

Tags

You can optionally add tags, which will help you find your cluster-related resources, such as VMs, in your cloud provider account.

By default, the following tags are created:

Tag | Description
cb-version | Cloudbreak version
Owner | Your Cloudbreak admin email.
cb-account-name | Your automatically generated Cloudbreak account name stored in the identity server.
cb-user-name | Your Cloudbreak admin email.

For more information, refer to Tagging resources.

Related links
Tagging resources

Choose image catalog

By default, Choose Image Catalog is set to the default image catalog that is provided with Cloudbreak. If you would like to use a different image catalog, you must first create and register it. For complete instructions, refer to Using custom images.

Related links
Using custom images

Choose image type

Cloudbreak supports the following types of images for launching clusters:

Image type | Description | Default images provided | Support for custom images
Prewarmed Image | By default, Cloudbreak launches clusters from prewarmed images. Prewarmed images include the operating system as well as Ambari and HDP/HDF. The Ambari and HDP/HDF version used by prewarmed images cannot be customized. | Yes | No
Base Image | Base images include default configuration and default tooling. These images include the operating system but do not include Ambari or HDP/HDF software. | Yes | Yes

By default, Cloudbreak uses the included default prewarmed images, which include the operating system, as well as Ambari and HDP/HDF packages installed. You can optionally select the base image option if you would like to:

Choose image

If you selected a custom image catalog under Choose image catalog, you can select an image from that catalog under Choose Image. For complete instructions, refer to Using custom images.

If you are trying to customize Ambari and HDP/HDF versions, you can ignore the Choose Image option; in this case, the default base image is used.

Ambari repo specification

If you would like to use a custom Ambari version, provide the following information:

Parameter | Description | Example
Version | Enter Ambari version. | 2.6.1.3
Repo Url | Provide a URL to the Ambari version repo that you would like to use. | http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.6.1.3
Repo Gpg Key Url | Provide a URL to the repo GPG key. Each stable RPM package that is published by CentOS Project is signed with a GPG signature. By default, yum and the graphical update tools will verify these signatures and refuse to install any packages that are not signed, or have an incorrect signature. | http://public-repo-1.hortonworks.com/ambari/centos6/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins

HDP/HDF repo specification

If you would like to use a custom HDP or HDF version, provide the following information:

Parameter | Description | Example
Stack | This is populated by default based on the "Platform Version" parameter. | HDP
Version | This is populated by default based on the "Platform Version" parameter. | 2.6
OS | Operating system. | centos7 (Azure, GCP, OpenStack) or amazonlinux (AWS)
Repository Version | Enter repository version. | 2.6.4.0-91
Version Definition File | Enter the URL of the VDF file. | http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.6.4.0/HDP-2.6.4.0-91.xml
(HDF only) MPack Url | (HDF only) Provide MPack URL. | http://public-repo-1.hortonworks.com/HDF/centos7/3.x/updates/3.1.1.0/tars/hdf_ambari_mp/hdf-ambari-mpack-3.1.1.0-35.tar.gz
Enable Ambari Server to download and install GPL Licensed LZO packages? | (Optional, only available if using Ambari 2.6.1.0 or newer) Use this option to enable LZO compression in your HDP/HDF cluster. LZO is a lossless data compression library that favors speed over compression ratio. Ambari does not install nor enable LZO compression libraries by default, and must be explicitly configured to do so. For more information, refer to Enabling LZO. | On

If you choose to use a base image with a custom Ambari and/or HDP/HDF version, Cloudbreak validates the information entered. When Cloudbreak detects that the information entered is incorrect, it displays a warning marked with a warning sign. You should review all the warnings before proceeding and make sure that the information that you entered is correct. If you choose to proceed in spite of the warnings, check "Ignore repository warnings".
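
Before entering a custom HDP/HDF repo specification, you may want to run a quick local consistency check on the values. The sketch below is not Cloudbreak's validation logic, which this document does not describe. The function name and the three heuristics (repository version begins with the stack version, the VDF URL is an http(s) URL ending in .xml, and the URL path contains the repository version) are assumptions derived only from the example values in the table above.

```python
from typing import List
from urllib.parse import urlparse


def check_stack_repo(version: str, repository_version: str, vdf_url: str) -> List[str]:
    """Return a list of human-readable warnings; an empty list means no issue was found.

    These are heuristic checks based on the documented example values,
    not rules taken from Cloudbreak.
    """
    warnings = []

    # Example: version "2.6" and repository version "2.6.4.0-91".
    if not repository_version.startswith(version + "."):
        warnings.append("Repository version does not begin with the stack version.")

    parsed = urlparse(vdf_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        warnings.append("VDF URL is not a well-formed http(s) URL.")
    if not parsed.path.endswith(".xml"):
        warnings.append("VDF URL does not point to an .xml file.")
    if repository_version not in parsed.path:
        warnings.append("VDF URL does not mention the repository version.")

    return warnings


if __name__ == "__main__":
    issues = check_stack_repo(
        version="2.6",
        repository_version="2.6.4.0-91",
        vdf_url="http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.6.4.0/HDP-2.6.4.0-91.xml",
    )
    print(issues or "No issues found")
```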

Related links
Using custom images

Root volume size

Use this option to increase the root volume size. Default is 30 GB. This option is useful if your custom image requires more space than the default 30 GB.

If you change the value of the root volume size, an osDisk with the given rootVolumeSize will be created for the instance automatically; however, you will have to manually resize the osDisk partition by using the steps provided in the Azure documentation.

Related links
How to: Resize Linux osDisk partition on Azure

Availability sets

To support fault tolerance for VMs, Azure uses the concept of availability sets. This allows two or more VMs to be mapped to multiple fault domains, each of which defines a group of virtual machines that share a common power source and a network switch. When adding VMs to an availability set, Azure automatically assigns each VM a fault domain. The SLA includes guarantees that during OS Patching in Azure or during maintenance operations, at least one VM belonging to a given fault domain will be available.

In Cloudbreak, an availability set is automatically configured during cluster creation for each non-Ambari host group whose "Instance Count" is set to 2 or larger. The assignment of fault domains is automated by Azure, so there is no option for this in the Cloudbreak UI.

Cloudbreak allows you to configure the availability set on the advanced Hardware and Storage page of the create cluster wizard by providing the following options for each host group:

Parameter | Description | Default
Availability Set Name | Choose a name for the availability set that will be created for the selected host group. | The following convention is used: "clustername-hostgroupname-as"
Fault Domain Count | The number of fault domains. | 2 or 3, depending on the setting supported by Azure
Update Domain Count | The number of update domains. This can be set to a number in the range 2-20. | 20

After the deployment is finished, you can check the layout of the VMs inside an availability set on Azure Portal. You will find the "Availability set" resources corresponding to the host groups inside the deployment's resource group.
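
The defaults described above can be summarized in a short model. This is a sketch of the documented rules only, not Cloudbreak code: the class and function names are hypothetical, and the fault domain count is passed in by the caller because the supported value (2 or 3) depends on Azure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HostGroup:
    name: str
    instance_count: int
    is_ambari_server: bool = False


@dataclass
class AvailabilitySet:
    name: str
    fault_domain_count: int
    update_domain_count: int = 20  # documented default


def default_availability_set(
    cluster_name: str,
    group: HostGroup,
    fault_domain_count: int = 2,
    update_domain_count: int = 20,
) -> Optional[AvailabilitySet]:
    """Return the default availability set for a host group, or None.

    Per the text above, a set is configured only for non-Ambari host groups
    with an instance count of 2 or larger.
    """
    if group.is_ambari_server or group.instance_count < 2:
        return None
    if fault_domain_count not in (2, 3):
        raise ValueError("Fault Domain Count must be 2 or 3")
    if not 2 <= update_domain_count <= 20:
        raise ValueError("Update Domain Count must be in the range 2-20")
    return AvailabilitySet(
        name=f"{cluster_name}-{group.name}-as",  # "clustername-hostgroupname-as"
        fault_domain_count=fault_domain_count,
        update_domain_count=update_domain_count,
    )


if __name__ == "__main__":
    groups = [
        HostGroup("master", 1, is_ambari_server=True),
        HostGroup("worker", 3),
        HostGroup("compute", 1),
    ]
    for g in groups:
        print(g.name, default_availability_set("mycluster", g))
```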

Cloud storage

If you would like to access ADLS or WASB from your cluster, you must configure access as described in Configuring access to ADLS or Configuring access to WASB.

Related links
Configuring access to ADLS
Configuring access to WASB

Recipes

This option allows you to select previously uploaded recipes (scripts that can be run before or after cluster deployment) for each host group. For more information on recipes, refer to Using custom scripts (recipes).

Related links
Using custom scripts (recipes)

Management packs

This option allows you to select previously uploaded management packs. For more information on management packs, refer to Using management packs.

Related links
Using management packs

External sources

You can register external sources with Cloudbreak, and then select and attach them during cluster creation. To register external sources with Cloudbreak, refer to:

Custom properties

This option allows you to set custom properties based on the template defined in your custom blueprint. For more information, refer to Set custom properties.

Related links
Set custom properties

Single sign-on (SSO)

This option allows you to configure the gateway to be the SSO identity provider.

This option is a technical preview.

For more information, refer to Configuring the Gateway documentation.

Don't create public IP

This option is available if you are creating a cluster in an existing network and subnet. Select this option if you don't want to use public IPs for the network.

Don't create new firewall rules

This option is available if you are creating a cluster in an existing network and subnet. Select this option if you don't want to create new firewall rules for the network.

Ambari server master key

The Ambari server master key is used to configure Ambari to encrypt database and Kerberos credentials that are retained by Ambari as part of the Ambari setup.

Enable Kerberos security

Select this option to enable Kerberos for your cluster. For information about available Kerberos options, refer to Enabling Kerberos security.

Related links
Enabling Kerberos security
Introduction to Microsoft Azure storage (External)