3. Configuring Oozie for Falcon

Falcon uses HCatalog for data availability notification when Hive tables are replicated. Make the following configuration changes to Oozie to enable Hive table replication in Falcon:

  1. Stop the Oozie service on all Falcon clusters.

    Execute these commands on the Oozie host machine.

    su $OOZIE_USER 
    /usr/lib/oozie/bin/oozie-stop.sh

    Where $OOZIE_USER is the Oozie user. For example, oozie.

  2. Copy each cluster's Hadoop conf directory to a different location. For example, if you have two clusters, copy one to /etc/hadoop/conf-1 and the other to /etc/hadoop/conf-2.
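
    As a rough sketch of this step, the shell function below copies one cluster's conf directory into a numbered destination such as /etc/hadoop/conf-1. It assumes the other clusters' conf directories have already been fetched to this host. The function name copy_cluster_conf and the staging paths in the usage comment are hypothetical and not part of Hadoop or Oozie.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Hadoop or Oozie): copy one cluster's
# Hadoop conf directory to <dest_root>/conf-<index>.
# Arguments: source directory, destination root, cluster index.
copy_cluster_conf() {
    src=$1
    dest_root=$2
    index=$3
    dest="$dest_root/conf-$index"

    if [ ! -d "$src" ]; then
        echo "source directory not found: $src" >&2
        return 1
    fi

    mkdir -p "$dest" || return 1
    # Copy the directory contents (dotfiles included), preserving attributes.
    cp -Rp "$src"/. "$dest"/
}

# Example usage (the /tmp/staging paths are assumptions):
#   copy_cluster_conf /tmp/staging/cluster1-conf /etc/hadoop 1
#   copy_cluster_conf /tmp/staging/cluster2-conf /etc/hadoop 2
```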

  3. For each oozie-site.xml file, modify the oozie.service.HadoopAccessorService.hadoop.configurations property so that it lists each cluster's NameNode and ResourceManager with their RPC ports, each mapped to that cluster's Hadoop conf directory.

    For example, if Falcon connects to three clusters, specify:

    <property>
          <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
          <value>*=/etc/hadoop/conf,$NameNode1:$rpcPortNN=$hadoopConfDir1,$ResourceManager1:$rpcPortRM=$hadoopConfDir1,$NameNode2:$rpcPortNN=$hadoopConfDir2,$ResourceManager2:$rpcPortRM=$hadoopConfDir2,$NameNode3:$rpcPortNN=$hadoopConfDir3,$ResourceManager3:$rpcPortRM=$hadoopConfDir3</value>
          <description>
              Comma separated AUTHORITY=HADOOP_CONF_DIR, where AUTHORITY is the HOST:PORT of
              the Hadoop service (JobTracker, HDFS). The wildcard '*' configuration is
              used when there is no exact match for an authority. The HADOOP_CONF_DIR contains
              the relevant Hadoop *-site.xml files. If the path is relative, it is looked up within
              the Oozie configuration directory; the path can also be absolute (i.e. to point
              to Hadoop client conf/ directories in the local filesystem).
          </description>
    </property>
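
    Because the value is a long comma-separated list, it is easy to drop a port or add a stray space. The sketch below assembles the value from individual AUTHORITY=HADOOP_CONF_DIR pairs and rejects any pair that lacks a HOST:PORT part. The function name build_hadoop_configurations, the host names, and the port numbers in the usage comment are illustrative assumptions, not values taken from your clusters.

```shell
#!/bin/sh
# Hypothetical helper: build the value for
# oozie.service.HadoopAccessorService.hadoop.configurations.
# First argument: conf directory for the '*' wildcard entry.
# Remaining arguments: HOST:PORT=HADOOP_CONF_DIR pairs.
build_hadoop_configurations() {
    default_conf=$1
    shift
    value="*=$default_conf"

    for pair in "$@"; do
        case $pair in
            # Require non-empty host, port, and directory parts.
            ?*:?*=?*) value="$value,$pair" ;;
            *)
                echo "expected HOST:PORT=HADOOP_CONF_DIR, got: $pair" >&2
                return 1
                ;;
        esac
    done

    printf '%s\n' "$value"
}

# Example usage (hosts and ports are assumptions):
#   build_hadoop_configurations /etc/hadoop/conf \
#       nn1.example.com:8020=/etc/hadoop/conf-1 \
#       rm1.example.com:8050=/etc/hadoop/conf-1
```

    Paste the printed string into the <value> element of the property.
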
  4. Add the following properties to the /etc/oozie/conf/oozie-site.xml file:

    <property>
          <name>oozie.service.ProxyUserService.proxyuser.falcon.hosts</name>
          <value>*</value>
    </property>
    <property>
          <name>oozie.service.ProxyUserService.proxyuser.falcon.groups</name>
          <value>*</value>
    </property>
    <property>
          <name>oozie.service.URIHandlerService.uri.handlers</name>
          <value>org.apache.oozie.dependency.FSURIHandler,org.apache.oozie.dependency.HCatURIHandler</value>
    </property>
    <property>
          <name>oozie.services.ext</name>
          <value>org.apache.oozie.service.JMSAccessorService,
    org.apache.oozie.service.PartitionDependencyManagerService,
    org.apache.oozie.service.HCatAccessorService
    </value>
    </property>
    <!-- Coord EL Functions Properties -->
    <property>
          <name>oozie.service.ELService.ext.functions.coord-job-submit-instances</name>
          <value>now=org.apache.oozie.extensions.OozieELExtensions#ph1_now_echo,
             today=org.apache.oozie.extensions.OozieELExtensions#ph1_today_echo,
             yesterday=org.apache.oozie.extensions.OozieELExtensions#ph1_yesterday_echo,
             currentMonth=org.apache.oozie.extensions.OozieELExtensions#ph1_currentMonth_echo,
             lastMonth=org.apache.oozie.extensions.OozieELExtensions#ph1_lastMonth_echo,
             currentYear=org.apache.oozie.extensions.OozieELExtensions#ph1_currentYear_echo,
             lastYear=org.apache.oozie.extensions.OozieELExtensions#ph1_lastYear_echo,
             formatTime=org.apache.oozie.coord.CoordELFunctions#ph1_coord_formatTime_echo,
             latest=org.apache.oozie.coord.CoordELFunctions#ph2_coord_latest_echo,
             future=org.apache.oozie.coord.CoordELFunctions#ph2_coord_future_echo
          </value>
    </property>
    <property>
          <name>oozie.service.ELService.ext.functions.coord-action-create-inst</name>
          <value>
             now=org.apache.oozie.extensions.OozieELExtensions#ph2_now_inst,
             today=org.apache.oozie.extensions.OozieELExtensions#ph2_today_inst,
             yesterday=org.apache.oozie.extensions.OozieELExtensions#ph2_yesterday_inst,
             currentMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_currentMonth_inst,
             lastMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_lastMonth_inst,
             currentYear=org.apache.oozie.extensions.OozieELExtensions#ph2_currentYear_inst,
             lastYear=org.apache.oozie.extensions.OozieELExtensions#ph2_lastYear_inst,
             latest=org.apache.oozie.coord.CoordELFunctions#ph2_coord_latest_echo,
             future=org.apache.oozie.coord.CoordELFunctions#ph2_coord_future_echo,
             formatTime=org.apache.oozie.coord.CoordELFunctions#ph2_coord_formatTime,
             user=org.apache.oozie.coord.CoordELFunctions#coord_user
          </value>
    </property>
    <property>
          <name>oozie.service.ELService.ext.functions.coord-action-start</name>
          <value>
             now=org.apache.oozie.extensions.OozieELExtensions#ph2_now,
             today=org.apache.oozie.extensions.OozieELExtensions#ph2_today,
             yesterday=org.apache.oozie.extensions.OozieELExtensions#ph2_yesterday,
             currentMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_currentMonth,
             lastMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_lastMonth,
             currentYear=org.apache.oozie.extensions.OozieELExtensions#ph2_currentYear,
             lastYear=org.apache.oozie.extensions.OozieELExtensions#ph2_lastYear,
             latest=org.apache.oozie.coord.CoordELFunctions#ph3_coord_latest,
             future=org.apache.oozie.coord.CoordELFunctions#ph3_coord_future,
             dataIn=org.apache.oozie.extensions.OozieELExtensions#ph3_dataIn,
             instanceTime=org.apache.oozie.coord.CoordELFunctions#ph3_coord_nominalTime,
             dateOffset=org.apache.oozie.coord.CoordELFunctions#ph3_coord_dateOffset,
             formatTime=org.apache.oozie.coord.CoordELFunctions#ph3_coord_formatTime,
             user=org.apache.oozie.coord.CoordELFunctions#coord_user
          </value>
    </property>
    <property>
          <name>oozie.service.ELService.ext.functions.coord-sla-submit</name>
          <value>
             instanceTime=org.apache.oozie.coord.CoordELFunctions#ph1_coord_nominalTime_echo_fixed,
             user=org.apache.oozie.coord.CoordELFunctions#coord_user
          </value>
    </property>
    <property>
          <name>oozie.service.ELService.ext.functions.coord-sla-create</name>
          <value>
             instanceTime=org.apache.oozie.coord.CoordELFunctions#ph2_coord_nominalTime,
             user=org.apache.oozie.coord.CoordELFunctions#coord_user
          </value>
    </property>
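
    After editing, a quick text check can confirm that none of the properties above was left out. The sketch below searches a given oozie-site.xml for each property name from this step. It is a plain string match rather than an XML parse, so it does not validate the values, and the function name check_falcon_oozie_properties is hypothetical.

```shell
#!/bin/sh
# Hypothetical check: report which property names from this step are
# missing from the given oozie-site.xml. Returns 0 when all are present.
check_falcon_oozie_properties() {
    site_file=$1
    missing=0

    for name in \
        oozie.service.ProxyUserService.proxyuser.falcon.hosts \
        oozie.service.ProxyUserService.proxyuser.falcon.groups \
        oozie.service.URIHandlerService.uri.handlers \
        oozie.services.ext \
        oozie.service.ELService.ext.functions.coord-job-submit-instances \
        oozie.service.ELService.ext.functions.coord-action-create-inst \
        oozie.service.ELService.ext.functions.coord-action-start \
        oozie.service.ELService.ext.functions.coord-sla-submit \
        oozie.service.ELService.ext.functions.coord-sla-create
    do
        # -F matches the name literally; -q suppresses output.
        if ! grep -qF "<name>$name</name>" "$site_file"; then
            echo "missing property: $name" >&2
            missing=1
        fi
    done

    return "$missing"
}

# Example usage:
#   check_falcon_oozie_properties /etc/oozie/conf/oozie-site.xml
```
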
  5. Copy the existing Oozie WAR file to /usr/lib/oozie/oozie.war.

    This ensures that everything already in the WAR file is still present after the update.

    su root
    cp $CATALINA_BASE/webapps/oozie.war /usr/lib/oozie/oozie.war 

    Where $CATALINA_BASE is the path for the Oozie web app. By default, $CATALINA_BASE is /var/lib/oozie/oozie-server.

  6. Add the Falcon EL extensions to Oozie.

    Copy the extension JAR files provided with the Falcon Server to a temporary directory on the Oozie server. For example, if your standalone Falcon Server is on the same machine as your Oozie server, you can copy the JAR files locally:

    mkdir /tmp/falcon-oozie-jars
    cp /usr/lib/falcon/oozie/ext/falcon-oozie-el-extension-0.5.0.2.1.5.0-695.jar \
    /tmp/falcon-oozie-jars/
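
    The JAR file name above is tied to one Falcon build. As a sketch of a version-independent variant, the function below copies every .jar file from an extension directory into a staging directory and fails if it finds none. The function name stage_falcon_jars is hypothetical; the directories in the usage comment are the ones used in this step.

```shell
#!/bin/sh
# Hypothetical helper: copy all JAR files from ext_dir into staging_dir.
# Returns non-zero if no JAR file is found or a copy fails.
stage_falcon_jars() {
    ext_dir=$1
    staging_dir=$2
    mkdir -p "$staging_dir" || return 1

    found=0
    for jar in "$ext_dir"/*.jar; do
        # When nothing matches, the pattern stays literal; skip it.
        [ -f "$jar" ] || continue
        cp "$jar" "$staging_dir"/ || return 1
        found=1
    done

    if [ "$found" -eq 0 ]; then
        echo "no JAR files found in $ext_dir" >&2
        return 1
    fi
}

# Example usage:
#   stage_falcon_jars /usr/lib/falcon/oozie/ext /tmp/falcon-oozie-jars
```
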
  7. Package the Oozie WAR file.

    su oozie  
    cd /usr/lib/oozie/bin 
    ./oozie-setup.sh prepare-war -d /tmp/falcon-oozie-jars
  8. Start the Oozie service on all Falcon clusters.

    Execute these commands on the Oozie host machine.

    su $OOZIE_USER 
    /usr/lib/oozie/bin/oozie-start.sh

    Where $OOZIE_USER is the Oozie user. For example, oozie.

