org.apache.hadoop.hive.ql.plan
Class TezWork

java.lang.Object
  extended by org.apache.hadoop.hive.ql.plan.AbstractOperatorDesc
      extended by org.apache.hadoop.hive.ql.plan.TezWork
All Implemented Interfaces:
Serializable, Cloneable, OperatorDesc

public class TezWork
extends AbstractOperatorDesc

TezWork. This class encapsulates all the work objects that can be executed in a single Tez job. Currently it is essentially a tree with MapWork at the leaves and ReduceWork in all other nodes.

See Also:
Serialized Form

Nested Class Summary
 class TezWork.Dependency
           
 
Constructor Summary
TezWork(String name)
           
 
Method Summary
 void add(BaseWork w)
          add creates a new node in the graph without any connections
 void addAll(BaseWork[] bws)
          add all nodes in the array without any connections
 void addAll(Collection<BaseWork> c)
          add all nodes in the collection without any connections
 String[] configureJobConfAndExtractJars(org.apache.hadoop.mapred.JobConf jobConf)
          Calls configureJobConf on instances of work that are part of this TezWork.
 void connect(BaseWork a, BaseWork b, TezEdgeProperty edgeProp)
          connect adds an edge between a and b.
 void disconnect(BaseWork a, BaseWork b)
          disconnect removes an edge between a and b.
 List<BaseWork> getAllWork()
          getAllWork returns a topologically sorted list of BaseWork
 Collection<BaseWork> getAllWorkUnsorted()
           
 List<BaseWork> getChildren(BaseWork work)
          getChildren returns all the nodes with edges leading out of work
 Map<String,List<TezWork.Dependency>> getDependencyMap()
           
 TezEdgeProperty getEdgeProperty(BaseWork a, BaseWork b)
          returns the edge property of the edge connecting work a and b
 TezEdgeProperty.EdgeType getEdgeType(BaseWork a, BaseWork b)
           
 Set<BaseWork> getLeaves()
          getLeaves returns all nodes that do not have a child
 String getName()
           
 List<BaseWork> getParents(BaseWork work)
          getParents returns all the nodes with edges leading into work
 Set<BaseWork> getRoots()
          getRoots returns all nodes that do not have a parent.
 Map<String,BaseWork> getWorkMap()
          getWorkMap returns a map of "vertex name" to BaseWork
 void remove(BaseWork work)
          remove removes a node from the graph and removes all edges with work as start or end point.
 
Methods inherited from class org.apache.hadoop.hive.ql.plan.AbstractOperatorDesc
clone, getOpTraits, getStatistics, setOpTraits, setStatistics, setVectorMode
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TezWork

public TezWork(String name)
Method Detail

getName

public String getName()

getWorkMap

public Map<String,BaseWork> getWorkMap()
getWorkMap returns a map of "vertex name" to BaseWork


getAllWork

public List<BaseWork> getAllWork()
getAllWork returns a topologically sorted list of BaseWork
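
As an illustration of what "topologically sorted" means for this list (every node appears after all of its parents), here is a minimal sketch using Kahn's algorithm. It is not taken from the Hive source: the class TopoSortSketch, its sort method, and the use of plain strings in place of BaseWork are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: vertices are plain strings, not BaseWork objects.
class TopoSortSketch {

    // 'children' maps each vertex to the vertices its outgoing edges lead to.
    static List<String> sort(Map<String, List<String>> children) {
        // Count incoming edges for every vertex that appears anywhere.
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String v : children.keySet()) {
            inDegree.putIfAbsent(v, 0);
        }
        for (List<String> targets : children.values()) {
            for (String c : targets) {
                inDegree.merge(c, 1, Integer::sum);
            }
        }

        // Vertices without parents can be emitted immediately.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : inDegree.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.poll();
            order.add(v);
            // Emitting v satisfies one incoming edge of each of its children.
            for (String c : children.getOrDefault(v, Collections.emptyList())) {
                if (inDegree.merge(c, -1, Integer::sum) == 0) {
                    ready.add(c);
                }
            }
        }

        // Any vertex left over sits on a cycle, so no valid order exists.
        if (order.size() != inDegree.size()) {
            throw new IllegalStateException("graph contains a cycle");
        }
        return order;
    }
}
```

A caller that walks the returned list front to back can process each node knowing that all of its parents were already handled. The cycle check is a choice made for this sketch; how TezWork itself reacts to a cycle is not stated here.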


getAllWorkUnsorted

public Collection<BaseWork> getAllWorkUnsorted()

addAll

public void addAll(Collection<BaseWork> c)
add all nodes in the collection without any connections


addAll

public void addAll(BaseWork[] bws)
add all nodes in the array without any connections


add

public void add(BaseWork w)
add creates a new node in the graph without any connections


disconnect

public void disconnect(BaseWork a,
                       BaseWork b)
disconnect removes an edge between a and b. Both a and b have to be in the graph. If there is no matching edge no change happens.


getRoots

public Set<BaseWork> getRoots()
getRoots returns all nodes that do not have a parent.


getLeaves

public Set<BaseWork> getLeaves()
getLeaves returns all nodes that do not have a child


getParents

public List<BaseWork> getParents(BaseWork work)
getParents returns all the nodes with edges leading into work


getChildren

public List<BaseWork> getChildren(BaseWork work)
getChildren returns all the nodes with edges leading out of work


remove

public void remove(BaseWork work)
remove removes a node from the graph and removes all edges with work as start or end point. No change to the graph if the node doesn't exist.
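
The node and edge operations documented on this page (add, connect, disconnect, getRoots, getLeaves, getParents, getChildren, remove) can be modeled with two adjacency maps. The sketch below is a hypothetical illustration of the documented behavior only, not the Hive implementation: WorkGraphSketch uses strings in place of BaseWork, omits TezEdgeProperty, and throws IllegalArgumentException for a missing node, which is an assumption since this page does not say what happens in that case.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the documented graph semantics; not Hive code.
class WorkGraphSketch {
    // Outgoing and incoming edges are tracked separately per node.
    private final Map<String, Set<String>> children = new LinkedHashMap<>();
    private final Map<String, Set<String>> parents = new LinkedHashMap<>();

    // add creates a node without any connections.
    void add(String w) {
        children.putIfAbsent(w, new LinkedHashSet<>());
        parents.putIfAbsent(w, new LinkedHashSet<>());
    }

    // connect adds an edge from a to b; both must already be in the graph.
    void connect(String a, String b) {
        requirePresent(a);
        requirePresent(b);
        children.get(a).add(b);
        parents.get(b).add(a);
    }

    // disconnect removes the edge from a to b; no change if there is none.
    void disconnect(String a, String b) {
        requirePresent(a);
        requirePresent(b);
        children.get(a).remove(b);
        parents.get(b).remove(a);
    }

    // remove drops the node and every edge touching it; unknown nodes are ignored.
    void remove(String w) {
        if (!children.containsKey(w)) {
            return;
        }
        for (String c : children.remove(w)) {
            parents.get(c).remove(w);
        }
        for (String p : parents.remove(w)) {
            children.get(p).remove(w);
        }
    }

    Set<String> getRoots() { return nodesWithNoEdges(parents); }

    Set<String> getLeaves() { return nodesWithNoEdges(children); }

    List<String> getParents(String w) {
        requirePresent(w);
        return new ArrayList<>(parents.get(w));
    }

    List<String> getChildren(String w) {
        requirePresent(w);
        return new ArrayList<>(children.get(w));
    }

    private static Set<String> nodesWithNoEdges(Map<String, Set<String>> edges) {
        Set<String> out = new LinkedHashSet<>();
        for (Map.Entry<String, Set<String>> e : edges.entrySet()) {
            if (e.getValue().isEmpty()) {
                out.add(e.getKey());
            }
        }
        return out;
    }

    // Assumption for this sketch: a missing node is reported with an exception.
    private void requirePresent(String w) {
        if (!children.containsKey(w)) {
            throw new IllegalArgumentException("node not in graph: " + w);
        }
    }
}
```

In this model, a node that has just been added is both a root and a leaf, and removing a node can turn its former parents into leaves and its former children into roots.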


getEdgeType

public TezEdgeProperty.EdgeType getEdgeType(BaseWork a,
                                            BaseWork b)

getEdgeProperty

public TezEdgeProperty getEdgeProperty(BaseWork a,
                                       BaseWork b)
returns the edge property of the edge connecting work a and b


getDependencyMap

public Map<String,List<TezWork.Dependency>> getDependencyMap()

configureJobConfAndExtractJars

public String[] configureJobConfAndExtractJars(org.apache.hadoop.mapred.JobConf jobConf)
Calls configureJobConf on each instance of work that is part of this TezWork. The passed job configuration is then used to extract the "tmpjars" entries added by these calls, so that Tez can add them to the job in the proper Tez way. This is a very hacky approach, but there is currently no good way to get these JARs: both the storage handler interface and the HBase code would have to change to return the list directly (right now they add to tmpjars). This will hopefully happen in 0.14.

Parameters:
jobConf - Job configuration.
Returns:
List of files added to tmpjars by storage handlers.
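
The general idea of extracting the JARs that were added to "tmpjars" can be sketched as a before-and-after comparison of that property. The code below is only one plausible way to do this and is not the Hive implementation: TmpJarsSketch and its methods are hypothetical, a Map<String,String> stands in for JobConf, and each Consumer stands in for a work object's configureJobConf call.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical sketch; a plain map stands in for the Hadoop JobConf.
class TmpJarsSketch {
    static final String TMPJARS = "tmpjars";

    // Splits a comma-separated property value, skipping blank entries.
    static List<String> split(String value) {
        List<String> out = new ArrayList<>();
        if (value == null) {
            return out;
        }
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }

    // Runs every configurator, then reports the jars that appeared in tmpjars.
    static String[] configureAndExtract(Map<String, String> conf,
                                        List<Consumer<Map<String, String>>> works) {
        // Remember what was listed before any work touched the configuration.
        Set<String> before = new HashSet<>(split(conf.get(TMPJARS)));

        for (Consumer<Map<String, String>> work : works) {
            work.accept(conf);
        }

        // Keep only the new entries, in order, without duplicates.
        List<String> added = new ArrayList<>();
        for (String jar : split(conf.get(TMPJARS))) {
            if (!before.contains(jar) && !added.contains(jar)) {
                added.add(jar);
            }
        }
        return added.toArray(new String[0]);
    }
}
```

Entries that were already present before the call are treated as not belonging to the storage handlers, which is a simplification made for this sketch.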

connect

public void connect(BaseWork a,
                    BaseWork b,
                    TezEdgeProperty edgeProp)
connect adds an edge between a and b. Both nodes have to be added prior to calling connect.



Copyright © 2014 The Apache Software Foundation. All rights reserved.