Cluster definitions

A cluster definition is a reusable cluster template in JSON format that can be used for creating multiple Data Hub clusters with identical cloud provider settings.

Data Hub includes a set of default cluster definitions for common data analytics and data engineering use cases, and allows you to save your own custom cluster definitions. While default cluster definitions can be used across all environments, custom cluster definitions are associated with one or more specific environments. Cluster definitions facilitate repeated cluster reuse and are therefore useful for creating automation around cluster creation.

The easiest way to create a custom cluster definition is to open the create cluster wizard, provide all the parameters, and then save the settings as a cluster definition. The option to save is available on the last page of the create cluster wizard.

Prior to creating your own cluster definitions, we recommend that you review the default cluster definitions to check if they meet your requirements. You can access the default cluster definitions by clicking Environments, then selecting an environment and clicking the Cluster Definitions tab. To view details of a cluster definition, click on its name. For each cluster template, you can access a graphical representation ("list view") and a raw JSON file ("raw view") of all cluster host groups and their components.