3. Configuring Oozie for Falcon

Falcon uses HCatalog for data availability notification when Hive tables are replicated. Make the following configuration changes to Oozie to enable Hive table replication in Falcon:

  1. Stop the Oozie service on all Falcon clusters.

  2. Copy each cluster's Hadoop conf directory to its own location. For example, if you have two clusters, copy one to /etc/hadoop/conf-1 and the other to /etc/hadoop/conf-2.

  3. In each oozie-site.xml file, modify the oozie.service.HadoopAccessorService.hadoop.configurations property, specifying for each cluster the NameNode and ResourceManager hosts, their RPC ports, and the matching Hadoop conf directory.

    For example, if Falcon connects to three clusters, specify:

    <property>
      <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
      <value>*=/etc/hadoop/conf,$NameNode1:$rpcPortNN=$hadoopConfDir1,$ResourceManager1:$rpcPortRM=$hadoopConfDir1,$NameNode2:$rpcPortNN=$hadoopConfDir2,$ResourceManager2:$rpcPortRM=$hadoopConfDir2,$NameNode3:$rpcPortNN=$hadoopConfDir3,$ResourceManager3:$rpcPortRM=$hadoopConfDir3</value>
      <description>
        Comma-separated AUTHORITY=HADOOP_CONF_DIR pairs, where AUTHORITY is the HOST:PORT of
        the Hadoop service (JobTracker, HDFS). The wildcard '*' configuration is
        used when there is no exact match for an authority. The HADOOP_CONF_DIR contains
        the relevant Hadoop *-site.xml files. If the path is relative, it is looked up within
        the Oozie configuration directory; the path can also be absolute (i.e., pointing
        to Hadoop client conf/ directories in the local filesystem).
      </description>
    </property>

  4. Restart the Oozie service on all Falcon clusters.
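
As an illustration of step 3, the fragment below sketches how the property might look for the two-cluster layout used in step 2 (/etc/hadoop/conf-1 and /etc/hadoop/conf-2). The host names (nn1.example.com, rm1.example.com, and so on) and the ports 8020 and 8050 are assumptions for the sake of the example, not values taken from this guide; substitute the hosts and RPC ports that each cluster reports in fs.defaultFS and yarn.resourcemanager.address.

```xml
<!-- Hypothetical two-cluster example; host names and ports are placeholders. -->
<property>
  <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
  <!-- One AUTHORITY=HADOOP_CONF_DIR pair per NameNode and ResourceManager,
       plus the '*' wildcard as the fallback when no authority matches. -->
  <value>*=/etc/hadoop/conf,nn1.example.com:8020=/etc/hadoop/conf-1,rm1.example.com:8050=/etc/hadoop/conf-1,nn2.example.com:8020=/etc/hadoop/conf-2,rm2.example.com:8050=/etc/hadoop/conf-2</value>
</property>
```

The value must not contain spaces around the commas, colons, or equals signs, and both entries for a given cluster point to that cluster's copied conf directory.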

