4. Configuring Oozie for Falcon

Falcon uses HCatalog for data availability notification when Hive tables are replicated. Make the following configuration changes to Oozie to enable Hive table replication in Falcon:

  1. Stop the Oozie service on all Falcon clusters. Run the following commands on the Oozie host machine.

    su $OOZIE_USER

    /usr/hdp/current/oozie-server/bin/oozie-stop.sh

    where $OOZIE_USER is the Oozie user, for example, oozie.

  2. Copy each cluster's Hadoop conf directory to a different location. For example, if you have two clusters, copy one to /etc/hadoop/conf-1 and the other to /etc/hadoop/conf-2.
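
    As a rough sketch of this step, the helper below copies one cluster's configuration directory into a numbered location. The function name stage_conf and its arguments are hypothetical, and the sketch assumes the cluster's configuration files have already been fetched to a local source directory.

```shell
#!/bin/sh
# Sketch only: stage one copy of a cluster's Hadoop conf directory.
# stage_conf is a hypothetical helper, not part of Hadoop or Oozie.

# stage_conf SRC_DIR INDEX [DEST_ROOT]
# Copies the contents of SRC_DIR to DEST_ROOT/conf-INDEX and prints the
# destination path. DEST_ROOT defaults to /etc/hadoop, which matches the
# conf-1 and conf-2 naming used in the example above.
stage_conf() {
    src=$1
    index=$2
    dest_root=${3:-/etc/hadoop}
    dest="$dest_root/conf-$index"
    mkdir -p "$dest" || return 1
    # "$src/." copies the directory contents, including dotfiles, into dest
    cp -R "$src/." "$dest/" || return 1
    printf '%s\n' "$dest"
}
```

    For instance, if the second cluster's files were fetched to a scratch directory such as /tmp/cluster2-conf (a hypothetical path), running stage_conf /tmp/cluster2-conf 2 would populate /etc/hadoop/conf-2.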

  3. In each oozie-site.xml file, modify the oozie.service.HadoopAccessorService.hadoop.configurations property so that the NameNode and ResourceManager of each cluster, together with their RPC ports, map to that cluster's copied configuration directory. For example, if Falcon connects to three clusters, specify:

    <property>
         <name>oozie.service.HadoopAccessorService.hadoop.configurations</name> 
         <value>*=/etc/hadoop/conf,$NameNode1:$rpcPortNN=$hadoopConfDir1,$ResourceManager1:$rpcPortRM=$hadoopConfDir1,$NameNode2:$rpcPortNN=$hadoopConfDir2,$ResourceManager2:$rpcPortRM=$hadoopConfDir2,$NameNode3:$rpcPortNN=$hadoopConfDir3,$ResourceManager3:$rpcPortRM=$hadoopConfDir3</value>
         <description>
              Comma separated AUTHORITY=HADOOP_CONF_DIR, where AUTHORITY is the HOST:PORT of
              the Hadoop service (JobTracker, HDFS). The wildcard '*' configuration is
              used when there is no exact match for an authority. The HADOOP_CONF_DIR contains
              the relevant Hadoop *-site.xml files. If the path is relative, it is looked up within
              the Oozie configuration directory; the path can also be absolute (i.e., pointing
              to Hadoop client conf/ directories in the local filesystem).
         </description>
    </property>
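
    Because the value above is long and easy to mistype, it can help to generate it. The following is a minimal sketch, assuming each cluster is described by a NameNode authority, a ResourceManager authority, and a configuration directory. The function name build_hadoop_configurations is hypothetical.

```shell
#!/bin/sh
# Sketch only: assemble the comma-separated AUTHORITY=HADOOP_CONF_DIR value.
# build_hadoop_configurations is a hypothetical helper, not an Oozie tool.

# build_hadoop_configurations DEFAULT_DIR [NAMENODE RESOURCEMANAGER CONF_DIR]...
# NAMENODE and RESOURCEMANAGER are HOST:PORT authorities. Both are mapped
# to the same CONF_DIR, and DEFAULT_DIR is used for the '*' wildcard entry.
build_hadoop_configurations() {
    value="*=$1"
    shift
    while [ "$#" -ge 3 ]; do
        value="$value,$1=$3,$2=$3"
        shift 3
    done
    printf '%s\n' "$value"
}
```

    With illustrative host names and ports (nn1.example.com:8020 and rm1.example.com:8050 are placeholders, not values from this guide), build_hadoop_configurations /etc/hadoop/conf nn1.example.com:8020 rm1.example.com:8050 /etc/hadoop/conf-1 prints *=/etc/hadoop/conf,nn1.example.com:8020=/etc/hadoop/conf-1,rm1.example.com:8050=/etc/hadoop/conf-1.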

  4. Add the following properties to the /etc/oozie/conf/oozie-site.xml file:

    <property>
         <name>oozie.service.ProxyUserService.proxyuser.falcon.hosts</name>
         <value>*</value>
    </property>
     
    <property>
         <name>oozie.service.ProxyUserService.proxyuser.falcon.groups</name> 
         <value>*</value>
    </property>
     
    <property>
         <name>oozie.service.URIHandlerService.uri.handlers</name> 
         <value>org.apache.oozie.dependency.FSURIHandler, org.apache.oozie.dependency.HCatURIHandler</value>
    </property>
     
    <property>
         <name>oozie.services.ext</name>
         <value>org.apache.oozie.service.JMSAccessorService, org.apache.oozie.service.PartitionDependencyManagerService,
         org.apache.oozie.service.HCatAccessorService</value>
    </property> 
    
    <!-- Coord EL Functions Properties -->
    
    <property>
         <name>oozie.service.ELService.ext.functions.coord-job-submit-instances</name>
         <value>now=org.apache.oozie.extensions.OozieELExtensions#ph1_now_echo,
             today=org.apache.oozie.extensions.OozieELExtensions#ph1_today_echo,
             yesterday=org.apache.oozie.extensions.OozieELExtensions#ph1_yesterday_echo,
             currentMonth=org.apache.oozie.extensions.OozieELExtensions#ph1_currentMonth_echo,
             lastMonth=org.apache.oozie.extensions.OozieELExtensions#ph1_lastMonth_echo,
             currentYear=org.apache.oozie.extensions.OozieELExtensions#ph1_currentYear_echo,
             lastYear=org.apache.oozie.extensions.OozieELExtensions#ph1_lastYear_echo,
             formatTime=org.apache.oozie.coord.CoordELFunctions#ph1_coord_formatTime_echo,
             latest=org.apache.oozie.coord.CoordELFunctions#ph2_coord_latest_echo,
             future=org.apache.oozie.coord.CoordELFunctions#ph2_coord_future_echo
         </value>
    </property>
     
    <property>
         <name>oozie.service.ELService.ext.functions.coord-action-create-inst</name>
         <value>now=org.apache.oozie.extensions.OozieELExtensions#ph2_now_inst,
             today=org.apache.oozie.extensions.OozieELExtensions#ph2_today_inst,
             yesterday=org.apache.oozie.extensions.OozieELExtensions#ph2_yesterday_inst,
             currentMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_currentMonth_inst,
             lastMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_lastMonth_inst,
             currentYear=org.apache.oozie.extensions.OozieELExtensions#ph2_currentYear_inst,
             lastYear=org.apache.oozie.extensions.OozieELExtensions#ph2_lastYear_inst,
             latest=org.apache.oozie.coord.CoordELFunctions#ph2_coord_latest_echo,
             future=org.apache.oozie.coord.CoordELFunctions#ph2_coord_future_echo,
             formatTime=org.apache.oozie.coord.CoordELFunctions#ph2_coord_formatTime,
             user=org.apache.oozie.coord.CoordELFunctions#coord_user
         </value>
    </property>
     
    <property>
         <name>oozie.service.ELService.ext.functions.coord-action-start</name>
         <value>
             now=org.apache.oozie.extensions.OozieELExtensions#ph2_now,
             today=org.apache.oozie.extensions.OozieELExtensions#ph2_today,
             yesterday=org.apache.oozie.extensions.OozieELExtensions#ph2_yesterday,
             currentMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_currentMonth,
             lastMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_lastMonth,
             currentYear=org.apache.oozie.extensions.OozieELExtensions#ph2_currentYear,
             lastYear=org.apache.oozie.extensions.OozieELExtensions#ph2_lastYear,
             latest=org.apache.oozie.coord.CoordELFunctions#ph3_coord_latest,
             future=org.apache.oozie.coord.CoordELFunctions#ph3_coord_future,
             dataIn=org.apache.oozie.extensions.OozieELExtensions#ph3_dataIn,
             instanceTime=org.apache.oozie.coord.CoordELFunctions#ph3_coord_nominalTime,
             dateOffset=org.apache.oozie.coord.CoordELFunctions#ph3_coord_dateOffset,
             formatTime=org.apache.oozie.coord.CoordELFunctions#ph3_coord_formatTime,
             user=org.apache.oozie.coord.CoordELFunctions#coord_user
         </value>
    </property>
     
    <property>
         <name>oozie.service.ELService.ext.functions.coord-sla-submit</name>
         <value>
             instanceTime=org.apache.oozie.coord.CoordELFunctions#ph1_coord_nominalTime_echo_fixed,
             user=org.apache.oozie.coord.CoordELFunctions#coord_user
         </value>
    </property>
     
    <property>
         <name>oozie.service.ELService.ext.functions.coord-sla-create</name>
         <value>
             instanceTime=org.apache.oozie.coord.CoordELFunctions#ph2_coord_nominalTime,
             user=org.apache.oozie.coord.CoordELFunctions#coord_user
         </value>
    </property>
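
    Since this step adds nine properties, a quick check for missing names can catch a copy-and-paste slip. The sketch below is a plain-text search rather than an XML parse, so it assumes each name is written as <name>...</name> on one line with no inner whitespace. The function name missing_oozie_props is hypothetical.

```shell
#!/bin/sh
# Sketch only: report which of the properties from this step are absent.
# missing_oozie_props is a hypothetical helper, not an Oozie tool.

# missing_oozie_props FILE
# Prints each expected property name that has no matching <name> element
# in FILE. Returns 0 when nothing is missing, 1 otherwise.
missing_oozie_props() {
    file=$1
    status=0
    for prop in \
        oozie.service.ProxyUserService.proxyuser.falcon.hosts \
        oozie.service.ProxyUserService.proxyuser.falcon.groups \
        oozie.service.URIHandlerService.uri.handlers \
        oozie.services.ext \
        oozie.service.ELService.ext.functions.coord-job-submit-instances \
        oozie.service.ELService.ext.functions.coord-action-create-inst \
        oozie.service.ELService.ext.functions.coord-action-start \
        oozie.service.ELService.ext.functions.coord-sla-submit \
        oozie.service.ELService.ext.functions.coord-sla-create
    do
        # -F treats the pattern as a fixed string, -q suppresses output
        if ! grep -qF "<name>$prop</name>" "$file"; then
            printf '%s\n' "$prop"
            status=1
        fi
    done
    return "$status"
}
```

    Running missing_oozie_props /etc/oozie/conf/oozie-site.xml before restarting Oozie should print nothing if every property from this step is present.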

  5. Copy the existing Oozie WAR file to /usr/hdp/current/oozie/oozie.war. This will ensure that all existing items in the WAR file are still present after the current update.

    su root

    cp $CATALINA_BASE/webapps/oozie.war /usr/hdp/current/oozie/oozie.war

    where $CATALINA_BASE is the path for the Oozie web app. By default, $CATALINA_BASE is:

    /var/lib/oozie/oozie-server

  6. Add the Falcon EL extensions to Oozie.

    Copy the extension JAR files provided with the Falcon Server to a temporary directory on the Oozie server. For example, if your standalone Falcon Server runs on the same machine as your Oozie server, you can copy the JAR files directly:

    mkdir /tmp/falcon-oozie-jars

    cp /usr/hdp/current/falcon/oozie/ext/falcon-oozie-el-extension-0.6.0.2.2.1.0-*.jar /tmp/falcon-oozie-jars/

  7. Package the Oozie WAR file as the Oozie user.

    su oozie

    cd /usr/hdp/current/oozie-server/bin

    ./oozie-setup.sh prepare-war -d /tmp/falcon-oozie-jars
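
    To confirm that the repackaged WAR picked up the Falcon extension, you can scan its file listing. This is a sketch under two assumptions: that unzip is installed, and that the prepared WAR is written to $CATALINA_BASE/webapps/oozie.war. The function name has_falcon_el_jar is hypothetical.

```shell
#!/bin/sh
# Sketch only: detect the Falcon EL extension JAR in an archive listing.
# has_falcon_el_jar is a hypothetical helper, not an Oozie tool.

# has_falcon_el_jar
# Reads an archive listing on standard input (for example, the output of
# "unzip -l") and succeeds if any entry ends with a
# falcon-oozie-el-extension-*.jar file name.
has_falcon_el_jar() {
    grep -q 'falcon-oozie-el-extension-[^/]*\.jar$'
}
```

    A possible invocation is unzip -l $CATALINA_BASE/webapps/oozie.war | has_falcon_el_jar && echo "extension present".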

  8. Start the Oozie service on all Falcon clusters. Run these commands on the Oozie host machine.

    su $OOZIE_USER

    /usr/hdp/current/oozie-server/bin/oozie-start.sh

    where $OOZIE_USER is the Oozie user, for example, oozie.
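
    The server can take a little while to come up after the start script returns. One way to wait for it is a small retry loop, sketched below. The function name wait_until is hypothetical, and the example URL in the note that follows assumes Oozie listens on its default port 11000 on the local host.

```shell
#!/bin/sh
# Sketch only: retry a command until it succeeds or attempts run out.
# wait_until is a hypothetical helper, not an Oozie tool.

# wait_until TRIES DELAY COMMAND [ARGS...]
# Runs COMMAND up to TRIES times, sleeping DELAY seconds between attempts.
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

    For example, wait_until 30 2 oozie admin -oozie http://localhost:11000/oozie -status polls the Oozie status command for up to about a minute. This relies on the assumption that the oozie client exits with a non-zero status while the server is unreachable.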

