Class TezWork

  extended by org.apache.hadoop.hive.ql.plan.AbstractOperatorDesc
      extended by org.apache.hadoop.hive.ql.plan.TezWork
All Implemented Interfaces:
Serializable, Cloneable, OperatorDesc

public class TezWork
extends AbstractOperatorDesc

This class encapsulates all the work objects that can be executed in a single Tez job. Currently it is essentially a tree with MapWork at the leaves and ReduceWork in all other nodes.

See Also:
Serialized Form

Nested Class Summary
 class TezWork.Dependency
Constructor Summary
TezWork(String name)
Method Summary
 void add(BaseWork w)
          add creates a new node in the graph without any connections
 void addAll(BaseWork[] bws)
          add all nodes in the array without any connections
 void addAll(Collection<BaseWork> c)
          add all nodes in the collection without any connections
 String[] configureJobConfAndExtractJars(org.apache.hadoop.mapred.JobConf jobConf)
          Calls configureJobConf on instances of work that are part of this TezWork.
 void connect(BaseWork a, BaseWork b, TezEdgeProperty edgeProp)
          connect adds an edge between a and b.
 void disconnect(BaseWork a, BaseWork b)
          disconnect removes an edge between a and b.
 List<BaseWork> getAllWork()
          getAllWork returns a topologically sorted list of BaseWork
 Collection<BaseWork> getAllWorkUnsorted()
 List<BaseWork> getChildren(BaseWork work)
          getChildren returns all the nodes with edges leading out of work
 Map<String,List<TezWork.Dependency>> getDependencyMap()
 TezEdgeProperty getEdgeProperty(BaseWork a, BaseWork b)
          returns the edge property connecting work a and b
 TezEdgeProperty.EdgeType getEdgeType(BaseWork a, BaseWork b)
 Set<BaseWork> getLeaves()
          getLeaves returns all nodes that do not have a child
 String getName()
 List<BaseWork> getParents(BaseWork work)
          getParents returns all the nodes with edges leading into work
 Set<BaseWork> getRoots()
          getRoots returns all nodes that do not have a parent.
 Map<String,BaseWork> getWorkMap()
          getWorkMap returns a map of "vertex name" to BaseWork
 void remove(BaseWork work)
          remove removes a node from the graph and removes all edges with work as start or end point.
Methods inherited from class org.apache.hadoop.hive.ql.plan.AbstractOperatorDesc
clone, getOpTraits, getStatistics, setOpTraits, setStatistics, setVectorMode
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail


public TezWork(String name)
Method Detail


public String getName()


public Map<String,BaseWork> getWorkMap()
getWorkMap returns a map of "vertex name" to BaseWork


public List<BaseWork> getAllWork()
getAllWork returns a topologically sorted list of BaseWork
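
The page does not say which algorithm produces the topological order. As a minimal sketch of what such an order means, the example below applies Kahn's algorithm to a graph keyed by vertex name. The class TopoSortSketch, its method, and the vertex names are hypothetical stand-ins and are not part of Hive.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; not Hive's TezWork implementation.
final class TopoSortSketch {

    // Returns the vertices so that every parent appears before its children.
    // 'children' maps each vertex name to the names its outgoing edges lead to.
    static List<String> topologicalOrder(Map<String, List<String>> children) {
        // Count incoming edges for every vertex.
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String v : children.keySet()) {
            inDegree.putIfAbsent(v, 0);
        }
        for (List<String> outs : children.values()) {
            for (String c : outs) {
                inDegree.merge(c, 1, Integer::sum);
            }
        }

        // Vertices without incoming edges can be emitted immediately.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : inDegree.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.remove();
            order.add(v);
            // Emitting v releases each child whose last pending parent was v.
            for (String c : children.getOrDefault(v, Collections.<String>emptyList())) {
                if (inDegree.merge(c, -1, Integer::sum) == 0) {
                    ready.add(c);
                }
            }
        }
        if (order.size() != inDegree.size()) {
            throw new IllegalStateException("graph contains a cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> children = new LinkedHashMap<>();
        children.put("A", Arrays.asList("C"));
        children.put("B", Arrays.asList("C"));
        children.put("C", Arrays.asList("D"));
        children.put("D", Collections.<String>emptyList());
        System.out.println(topologicalOrder(children)); // prints [A, B, C, D]
    }
}
```

A caller that walks such a list can process each vertex knowing that all of its parents have already been visited.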


public Collection<BaseWork> getAllWorkUnsorted()


public void addAll(Collection<BaseWork> c)
add all nodes in the collection without any connections


public void addAll(BaseWork[] bws)
add all nodes in the array without any connections


public void add(BaseWork w)
add creates a new node in the graph without any connections


public void disconnect(BaseWork a,
                       BaseWork b)
disconnect removes an edge between a and b. Both a and b have to be in the graph. If there is no matching edge no change happens.


public Set<BaseWork> getRoots()
getRoots returns all nodes that do not have a parent.


public Set<BaseWork> getLeaves()
getLeaves returns all nodes that do not have a child


public List<BaseWork> getParents(BaseWork work)
getParents returns all the nodes with edges leading into work


public List<BaseWork> getChildren(BaseWork work)
getChildren returns all the nodes with edges leading out of work


public void remove(BaseWork work)
remove removes a node from the graph and removes all edges with work as start or end point. No change to the graph if the node doesn't exist.


public TezEdgeProperty.EdgeType getEdgeType(BaseWork a,
                                            BaseWork b)


public TezEdgeProperty getEdgeProperty(BaseWork a,
                                       BaseWork b)
returns the edge property connecting work a and b


public Map<String,List<TezWork.Dependency>> getDependencyMap()


public String[] configureJobConfAndExtractJars(org.apache.hadoop.mapred.JobConf jobConf)
Calls configureJobConf on each work instance that is part of this TezWork. Uses the passed job configuration to extract the "tmpjars" entries added by these calls, so that Tez can add them to the job in the proper Tez way. This is a hacky approach, but there is currently no good way to get these JARs: both the storage handler interface and the HBase code would have to change to return the list directly (right now they add to tmpjars). This will hopefully happen in 0.14.

Parameters:
jobConf - Job configuration.
Returns:
List of files added to tmpjars by storage handlers.
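
The extraction described above can be sketched without Hadoop on the classpath. In this example a plain Map stands in for JobConf and each Consumer stands in for one work object's configureJobConf call. The class TmpJarsSketch, its helper methods, and the JAR names are hypothetical. Treating "tmpjars" as a comma-separated list, and restoring the previous value afterwards, are assumptions of this sketch rather than documented behavior of the method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-in for illustration: a Map plays the role of JobConf and
// each Consumer plays the role of one work object's configureJobConf call.
final class TmpJarsSketch {
    static final String TMPJARS = "tmpjars";

    // Models a storage handler appending a JAR to the comma-separated list.
    static void appendJar(Map<String, String> conf, String jar) {
        conf.merge(TMPJARS, jar, (old, add) -> old.isEmpty() ? add : old + "," + add);
    }

    // Runs every configure step and returns only the JARs those steps added.
    static String[] configureAndExtractJars(Map<String, String> conf,
                                            List<Consumer<Map<String, String>>> works) {
        String previous = conf.remove(TMPJARS);   // set earlier entries aside
        for (Consumer<Map<String, String>> work : works) {
            work.accept(conf);                    // may append to tmpjars
        }
        String added = conf.remove(TMPJARS);      // whatever the steps appended
        if (previous != null) {
            conf.put(TMPJARS, previous);          // assumption: restore the old value
        }
        return (added == null || added.isEmpty()) ? new String[0] : added.split(",");
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(TMPJARS, "existing.jar");
        List<Consumer<Map<String, String>>> works = new ArrayList<>();
        works.add(c -> appendJar(c, "handler-a.jar"));
        works.add(c -> appendJar(c, "handler-b.jar"));
        // prints [handler-a.jar, handler-b.jar]
        System.out.println(Arrays.toString(configureAndExtractJars(conf, works)));
    }
}
```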


public void connect(BaseWork a,
                    BaseWork b,
                    TezEdgeProperty edgeProp)
connect adds an edge between a and b. Both nodes have to be added prior to calling connect.
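
The graph semantics documented on this page (add, connect, disconnect, remove, getRoots, getLeaves, getParents, getChildren) can be illustrated with a small self-contained model. This is a sketch only: WorkGraphSketch is a hypothetical class that uses plain vertex names instead of BaseWork and omits edge properties. Throwing IllegalArgumentException for a vertex that was never added is an assumption, since the page states only that both nodes have to be in the graph.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration; vertices are plain names, not BaseWork.
final class WorkGraphSketch {
    private final Map<String, Set<String>> children = new LinkedHashMap<>();
    private final Map<String, Set<String>> parents = new LinkedHashMap<>();

    // Adds an unconnected vertex; adding an existing vertex changes nothing.
    void add(String w) {
        children.putIfAbsent(w, new LinkedHashSet<>());
        parents.putIfAbsent(w, new LinkedHashSet<>());
    }

    // Adds the edge a -> b; both vertices must already be in the graph.
    void connect(String a, String b) {
        requirePresent(a);
        requirePresent(b);
        children.get(a).add(b);
        parents.get(b).add(a);
    }

    // Removes the edge a -> b; a missing edge is a no-op.
    void disconnect(String a, String b) {
        requirePresent(a);
        requirePresent(b);
        children.get(a).remove(b);
        parents.get(b).remove(a);
    }

    // Removes a vertex and every edge touching it; unknown vertices are ignored.
    void remove(String w) {
        if (!children.containsKey(w)) {
            return;
        }
        for (String c : children.remove(w)) {
            parents.get(c).remove(w);
        }
        for (String p : parents.remove(w)) {
            children.get(p).remove(w);
        }
    }

    Set<String> getRoots() { return verticesWithNo(parents); }

    Set<String> getLeaves() { return verticesWithNo(children); }

    List<String> getParents(String w) {
        requirePresent(w);
        return new ArrayList<>(parents.get(w));
    }

    List<String> getChildren(String w) {
        requirePresent(w);
        return new ArrayList<>(children.get(w));
    }

    // Collects the vertices whose adjacency set in the given map is empty.
    private static Set<String> verticesWithNo(Map<String, Set<String>> adjacency) {
        Set<String> result = new LinkedHashSet<>();
        for (Map.Entry<String, Set<String>> e : adjacency.entrySet()) {
            if (e.getValue().isEmpty()) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    private void requirePresent(String w) {
        if (!children.containsKey(w)) {
            throw new IllegalArgumentException("not in graph: " + w);
        }
    }

    public static void main(String[] args) {
        WorkGraphSketch g = new WorkGraphSketch();
        g.add("A");
        g.add("B");
        g.add("C");
        g.connect("A", "C");
        g.connect("B", "C");
        System.out.println(g.getRoots());   // prints [A, B]
        System.out.println(g.getLeaves());  // prints [C]
        g.disconnect("B", "C");
        System.out.println(g.getLeaves());  // prints [B, C]
        g.remove("A");
        System.out.println(g.getRoots());   // prints [B, C]
    }
}
```

Edge direction follows connect(a, b): a vertex with no incoming edge is a root and a vertex with no outgoing edge is a leaf.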


Copyright © 2014 The Apache Software Foundation. All rights reserved.