You can create an operational database in your registered CDP environment using
Cloudera Operational Database (COD).
Required role: You must be logged in to COD as an ODAdmin.
Understand the CDP environment and user management. For more information,
see the User Management and CDP Environments topics.
Set up an environment that provides credentials and cloud storage. For more
information, see Before you create an operational database cluster.
Ensure that you are authorized to create a database.
In the COD web interface, click Create Database.
Specify the location where you want to store the database.
Provide a name for the database in the Database
Name field.
From the list, select the CDP environment with which you want to associate
the database.
Click Next.
If an environment does not exist, you can create one by clicking
Create New Environment. For more information, see Register your first
environment.
Commission your database by defining its scale using a predefined Data Lake
template.
The template structures your database automatically, which saves you time and
cost. COD creates the predefined number of LITE or HEAVY gateway and master
nodes and a set of worker nodes, and adds additional functionalities to the
new database. If you need to modify the default number of nodes defined in the
template, you can do so after the database is created.
The available templates are Micro Duty, Light Duty, and Heavy Duty. By
default, Light Duty is selected.
You can create a small database using the Micro Duty template, which
consists of one Gateway node and one Worker node. In a Micro database, the
Gateway node also carries out the processes that otherwise run on the Master
or Leader nodes. Consider using a Micro cluster for testing and development
purposes.
Configure your database by selecting one of the following storage types:
Cloud Storage with Caching, Cloud Storage with Caching and Data Tiering,
Cloud Storage, or HDFS.
The storage type Cloud Storage with Caching is equivalent to using the
--storage-type CLOUD_WITH_EPHEMERAL option in CDP CLI when creating an
operational database.
The storage type Cloud Storage with Caching and Data Tiering resembles cloud storage with time-based priority caching: data within a specified time range is given a higher priority, whereas older data is more likely to be evicted. For more information on this storage type, see HBase Time-based Data Tiering using Persistent BucketCache.
You must have the COD_DATATIERING entitlement to use this storage type.
The storage type Cloud Storage, which resembles block storage, is equivalent
to using the --storage-type CLOUD option in CDP CLI when creating an
operational database.
The storage type HDFS is equivalent to using the --storage-type HDFS option
in CDP CLI when creating an operational database.
By default, Cloud Storage with Caching is
selected.
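The correspondence between the storage types in the web interface and the --storage-type values named above can be sketched as a small shell helper. This is only an illustration: the function name storage_type_flag is hypothetical, and because this topic does not list a CLI value for Cloud Storage with Caching and Data Tiering, the helper reports an error for that label instead of guessing one.

```shell
#!/bin/sh
# Hypothetical helper: map a COD web interface storage label to the
# --storage-type value that this topic names for CDP CLI.
storage_type_flag() {
  case "$1" in
    "Cloud Storage with Caching") echo "CLOUD_WITH_EPHEMERAL" ;;
    "Cloud Storage")              echo "CLOUD" ;;
    "HDFS")                       echo "HDFS" ;;
    *)
      # No CLI value is listed in this topic for any other label.
      echo "no --storage-type value listed for: $1" >&2
      return 1
      ;;
  esac
}

# Example: look up the value for the default storage type.
storage_type_flag "Cloud Storage with Caching"
```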
Check or update the settings for your database.
Check all the default settings for your database under the
Default tab.
Go to the Advanced tab if you need to modify any
of the default values.
The HDFS Volume Type option appears under
the Advanced tab only if you select
HDFS as the storage type in the
Configuration step.
If you disable the Autoscaling option on the Advanced tab, the
Worker Nodes and Compute Nodes options are hidden. Instead, a
Node Count option appears.
The minimum and maximum number of worker nodes vary for different
templates.
Micro Duty: Minimum node count: 1. Maximum node count: 5.
Light Duty: Minimum node count: 3. Maximum node count: 100.
Heavy Duty: Minimum node count: 3. Maximum node count: 800.
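If you script your deployments, the limits above can be checked before you submit a node count. The following is a minimal sketch that uses only the minimum and maximum values listed in this topic. The function name node_count_in_range and the lowercase template keywords (micro, light, heavy) are hypothetical and are not part of COD or CDP CLI.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when the node count lies within the
# range this topic lists for the given template.
# Usage: node_count_in_range <micro|light|heavy> <count>
node_count_in_range() {
  template=$1
  count=$2
  case "$template" in
    micro) min=1; max=5 ;;
    light) min=3; max=100 ;;
    heavy) min=3; max=800 ;;
    *)     return 2 ;;   # unknown template keyword
  esac
  [ "$count" -ge "$min" ] && [ "$count" -le "$max" ]
}

# Example: check a requested count for a Light Duty database.
if node_count_in_range light 10; then
  echo "node count accepted"
else
  echo "node count out of range"
fi
```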
Review the details before creating the database.
Click Show CLI Command to view the complete command that corresponds to
your settings. You can use this command to create the database using CDP CLI.
Alternatively, you can use the following sample command to create the
database using CDP CLI.
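As a rough sketch of what such a command might look like, the script below assembles a cdp opdb create-database command line and prints it for review instead of running it. The --storage-type option and its values come from this topic. The --environment-name, --database-name, and --scale-type options are assumptions that you should confirm against the output of Show CLI Command or the CLI help for your CDP CLI version. The names my-cdp-env and my-cod-db are placeholders, and build_create_cmd is a hypothetical helper.

```shell
#!/bin/sh
# Placeholder values; replace them with your own environment and database names.
ENV_NAME="my-cdp-env"
DB_NAME="my-cod-db"

# Hypothetical helper: print the command text without executing it.
# Arguments: environment name, database name, scale type, storage type.
# Only --storage-type is taken from this topic; the other options are
# assumptions to verify against your CDP CLI version.
build_create_cmd() {
  printf 'cdp opdb create-database --environment-name %s --database-name %s --scale-type %s --storage-type %s\n' \
    "$1" "$2" "$3" "$4"
}

# Print a command for a Light Duty database that uses Cloud Storage with Caching.
build_create_cmd "$ENV_NAME" "$DB_NAME" LIGHT CLOUD_WITH_EPHEMERAL
```

Printing the command first lets you compare it with the output of Show CLI Command before you run it.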