Creating a Cluster Entity
Always specify a cluster entity before defining other elements in your data pipeline. The cluster entity defines where the data and the processes for your data pipeline are stored. For more information, see the cluster entity XSD.
To use the Falcon web UI to define a cluster entity:
1. At the top of the Falcon web UI page, click Cluster.
2. On the New Cluster page, specify the following values:

   Table 2.1. Cluster Entity Configuration Values

   Value -- Description
   Name -- Name of the cluster entity. This is not necessarily the actual cluster name.
   Colo and Description -- Name and description of the data center.
   Tags -- Metadata tagging.
   Access Control List -- Specify the HDFS access permissions.
   Interfaces -- Specify the interface types:
      readonly -- Required for distcp (distributed copy), which is used in replication.
      write -- Required to write to HDFS.
      execute -- Required to submit jobs to MapReduce.
      workflow -- Required. This interface submits Oozie jobs.
      messaging -- Required to send alerts.
      registry -- Required to register or deregister partitions in the Hive Metastore and to fetch events on partition availability.
   Properties -- Specify a name and value for each property.
   Location -- Specify HDFS locations for the staging, temp, and working directories. For more information, see Prerequisite Setup Steps.

3. Click Next to view a summary of your cluster entity definition. The XML file is displayed to the right of the summary. To change the XML directly, click Edit XML.
4. If you are satisfied with the cluster entity definition, click Save.
To verify that the cluster entity was created, enter the cluster entity name in the Search field of the Falcon web UI and press Enter. If the name appears in the search results, the cluster entity was created successfully. For more information, see Search For and Manage Data Pipeline Entities.
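
The XML shown next to the summary can also be drafted outside the web UI. The following Python sketch shows roughly how the values in Table 2.1 might map onto a cluster entity XML document. It is an illustration, not part of Falcon: build_cluster_entity is a hypothetical helper, the element and attribute names reflect our reading of the cluster entity XSD, and every host name, port, version, and path is a placeholder. Check the result against the XSD for your Falcon version before using it.

```python
import xml.etree.ElementTree as ET

# Namespace assumed from the cluster entity XSD; confirm it for your Falcon version.
FALCON_CLUSTER_NS = "uri:falcon:cluster:0.1"

# Interface types and location names listed in Table 2.1.
INTERFACE_TYPES = ("readonly", "write", "execute", "workflow", "messaging", "registry")
LOCATION_NAMES = ("staging", "temp", "working")


def build_cluster_entity(name, colo, description, interfaces, locations, acl,
                         tags=None, properties=None):
    """Return a cluster entity definition as an XML string (hypothetical helper).

    interfaces: {interface_type: (endpoint, version)}
    locations:  {location_name: hdfs_path}
    acl:        {"owner": ..., "group": ..., "permission": ...}
    """
    # Reject interface types that are not in the table.
    unknown = sorted(set(interfaces) - set(INTERFACE_TYPES))
    if unknown:
        raise ValueError(f"unknown interface types: {unknown}")
    # This sketch insists on all three directories named in the table.
    missing = [n for n in LOCATION_NAMES if n not in locations]
    if missing:
        raise ValueError(f"missing locations: {missing}")

    root = ET.Element("cluster", {
        "xmlns": FALCON_CLUSTER_NS,
        "name": name,
        "colo": colo,
        "description": description,
    })
    if tags:
        # Tags are written as comma-separated key=value pairs.
        ET.SubElement(root, "tags").text = ",".join(f"{k}={v}" for k, v in tags.items())

    ifaces = ET.SubElement(root, "interfaces")
    for itype in INTERFACE_TYPES:
        if itype in interfaces:
            endpoint, version = interfaces[itype]
            ET.SubElement(ifaces, "interface",
                          {"type": itype, "endpoint": endpoint, "version": version})

    locs = ET.SubElement(root, "locations")
    for lname in LOCATION_NAMES:
        ET.SubElement(locs, "location", {"name": lname, "path": locations[lname]})

    ET.SubElement(root, "ACL", {
        "owner": acl["owner"], "group": acl["group"], "permission": acl["permission"],
    })

    if properties:
        props = ET.SubElement(root, "properties")
        for key, value in properties.items():
            ET.SubElement(props, "property", {"name": key, "value": value})

    ET.indent(root)  # pretty-print; requires Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # All endpoints, versions, and paths below are placeholders.
    print(build_cluster_entity(
        name="primaryCluster",
        colo="datacenter-east",
        description="Example cluster entity",
        interfaces={
            "readonly": ("hftp://namenode.example.com:50070", "2.2.0"),
            "write": ("hdfs://namenode.example.com:8020", "2.2.0"),
            "execute": ("resourcemanager.example.com:8050", "2.2.0"),
            "workflow": ("http://oozie.example.com:11000/oozie/", "4.0.0"),
            "messaging": ("tcp://falcon.example.com:61616?daemon=true", "5.1.6"),
        },
        locations={
            "staging": "/apps/falcon/primaryCluster/staging",
            "temp": "/tmp",
            "working": "/apps/falcon/primaryCluster/working",
        },
        acl={"owner": "falcon", "group": "users", "permission": "0755"},
        tags={"owner": "data-platform"},
    ))
```

Running the script prints an XML document that you can compare with the XML the web UI displays for an equivalent definition. Differences between the two are a sign that the sketch's assumptions do not match your Falcon version.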