About Data Context

Data Contexts in Cloudera Manager are used to access data in a Cloudera Base on premises environment. In other words, a Data Context lets you share services from one cluster with another.

A Cloudera Base on premises cluster can have one or more Compute clusters. A Compute cluster uses a Data Context to connect to the Cloudera Base on premises cluster and access its data and metadata. The Data Context itself is merely a logical entity: it has no deployment of its own and does not run any cluster activities.

When you create a Data Context, you can view the list of services in the Cloudera Base on premises cluster that you can share with the Compute cluster. This list is a subset of the services in the cluster, because not all services can be shared through a Data Context.

Note the following:

  • A Data Context can only be used to connect a Cloudera Base on premises cluster with one or more Compute clusters. A Data Context cannot be used to connect two Cloudera Base on premises clusters.

  • You can connect your Compute cluster to a Cloudera Base on premises cluster only if a Data Context is available. Once connected, you can use the services available in the Cloudera Base on premises cluster, because the Data Context provides the metadata needed to connect the Virtual Private Cluster to the Cloudera Base on premises cluster.

  • The Compute cluster needs configuration files and some additional information to communicate with the Cloudera Base on premises cluster. Cloudera Manager manages all of these configuration files automatically and updates them as required whenever a service in the Data Context changes its configuration.
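
The relationships described above can be pictured with a small Python model. This is only an illustrative sketch, not the Cloudera Manager API: the class names, the function create_data_context, and the service names "storage", "metastore", and "messaging" are all hypothetical placeholders, and the set of shareable services shown here is invented for the example rather than taken from the product.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class BaseCluster:
    """Hypothetical stand-in for a Cloudera Base on premises cluster."""
    name: str
    services: set[str]    # every service running in the cluster
    shareable: set[str]   # placeholder: services that support sharing

    def shareable_services(self) -> set[str]:
        # Only a subset of the cluster's services can be shared.
        return self.services & self.shareable


@dataclass
class DataContext:
    """A purely logical entity: it only records what is shared from where."""
    name: str
    base: BaseCluster
    shared: frozenset[str]


@dataclass
class ComputeCluster:
    """Hypothetical stand-in for a Compute cluster."""
    name: str
    context: DataContext | None = None

    def usable_services(self) -> set[str]:
        # Without a Data Context, no Base cluster services are reachable.
        if self.context is None:
            return set()
        return set(self.context.shared)


def create_data_context(name: str, base: BaseCluster,
                        requested: set[str]) -> DataContext:
    """Build a Data Context, rejecting services that cannot be shared."""
    if not isinstance(base, BaseCluster):
        raise TypeError("a Data Context must point at a Base cluster")
    rejected = requested - base.shareable_services()
    if rejected:
        raise ValueError(f"not shareable: {sorted(rejected)}")
    return DataContext(name, base, frozenset(requested))


# One Base cluster, one Data Context, several Compute clusters.
base = BaseCluster(
    name="base-1",
    services={"storage", "metastore", "messaging"},
    shareable={"storage", "metastore"},  # invented for illustration
)
ctx = create_data_context("ctx-1", base, {"storage", "metastore"})
compute_a = ComputeCluster("compute-a", ctx)
compute_b = ComputeCluster("compute-b", ctx)
detached = ComputeCluster("compute-c")  # no Data Context attached
```

In this model, a single Data Context serves more than one Compute cluster, and a Compute cluster without a Data Context has no access to Base cluster services. The automatic distribution of configuration files by Cloudera Manager is deliberately left out of the sketch.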