MapReduceIndexerTool
MapReduceIndexerTool (MRIT) is a MapReduce batch job driver that takes a morphline and a set of input files, creates a set of Solr index shards from them, and writes the indexes to HDFS or the local file system in a flexible, scalable, and fault-tolerant manner. MRIT can also merge the output shards into a set of live, customer-facing Solr servers, typically a SolrCloud.
The indexer creates an offline index on HDFS in the output directory specified by the --output-dir parameter. If the --go-live parameter is specified, Solr merges the resulting offline index into the live running service. The Solr service must therefore have read access to the contents of the output directory to complete the go-live step. In an environment with restrictive permissions, such as one with an HDFS umask of 077, the Solr user may not be able to read the contents of the newly created directory. To address this issue, the indexer automatically applies HDFS ACLs that enable Solr to read the output directory contents. These ACLs are applied only if HDFS ACLs are enabled on the HDFS NameNode.
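To illustrate why a umask of 077 blocks the Solr user, the following minimal sketch computes the permission bits that result from the usual defaults (777 for new directories, 666 for new files) and checks whether anyone other than the owner can read them. The helper names are hypothetical and exist only for this illustration; the sketch does not call any HDFS API.

```python
# Minimal sketch: how a umask turns requested permission bits into effective ones.
# Helper names are hypothetical, for illustration only.

def apply_umask(requested_mode: int, umask: int) -> int:
    """Clear every permission bit that is set in the umask."""
    return requested_mode & ~umask & 0o777

def non_owner_can_read(mode: int) -> bool:
    """True if the group or 'other' class has the read bit."""
    return bool(mode & 0o044)

UMASK = 0o077

# New directories start from 777 and new files from 666 before the umask applies.
dir_mode = apply_umask(0o777, UMASK)
file_mode = apply_umask(0o666, UMASK)

if __name__ == "__main__":
    print(f"directory mode: {dir_mode:03o}, readable by non-owner: {non_owner_can_read(dir_mode)}")
    print(f"file mode:      {file_mode:03o}, readable by non-owner: {non_owner_can_read(file_mode)}")
```

With these modes only the owning user has any access, so a Solr service running as a different user needs an ACL entry to read the index files.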
The indexer updates ACLs only on the output directory and its contents. If the parent directories of the output directory do not grant the execute permission, the Solr service cannot access the output directory. Solr must therefore have execute permission, through either standard permissions or ACLs, on every parent directory of the output directory.
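Because the parent directories are not handled automatically, an administrator may need to grant access on them by hand. The sketch below lists the parent directories of an output path and builds one `hdfs dfs -setfacl` command per directory that grants execute (traverse) permission only. The user name `solr` and the example path are assumptions, and the helper names are hypothetical; the commands are only constructed, not run.

```python
# Minimal sketch: build setfacl commands for the parents of an output directory.
# The user name "solr" and the example path are assumptions for illustration.
import posixpath
from typing import List

def parent_dirs(output_dir: str) -> List[str]:
    """Return the ancestors of output_dir, top-most first, excluding the root."""
    current = posixpath.dirname(posixpath.normpath(output_dir))
    parents = []
    while current not in ("/", ""):
        parents.append(current)
        current = posixpath.dirname(current)
    return list(reversed(parents))

def setfacl_commands(output_dir: str, user: str = "solr") -> List[List[str]]:
    """Build one command per parent directory granting execute-only access."""
    acl_entry = f"user:{user}:--x"
    return [["hdfs", "dfs", "-setfacl", "-m", acl_entry, d] for d in parent_dirs(output_dir)]

if __name__ == "__main__":
    # Hypothetical output directory; substitute the value passed to --output-dir.
    for cmd in setfacl_commands("/user/etl/indexes/outdir"):
        print(" ".join(cmd))
```

Granting only `--x` lets Solr traverse the parent directories without being able to list or modify them. Review the generated commands before running them, since existing permissions may already be sufficient.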
