Maximum Heartbeat Interval |
The maximum heartbeat interval, in milliseconds, between the Application Master and the ResourceManager (RM). |
tez.am.am-rm.heartbeat.interval-ms.max
|
250 millisecond(s) |
tez.am.am-rm.heartbeat.interval-ms.max
|
true |
Maximum Timeout to Hold Idle Containers |
The maximum amount of time to hold on to a container if no task can be assigned to it immediately. Only active when reuse is enabled. |
tez.am.container.idle.release-timeout-max.millis
|
20 second(s) |
tez.am.container.idle.release-timeout-max.millis
|
true |
Minimum Timeout to Hold Idle Containers |
The minimum amount of time to hold on to a container that is idle. Only active when reuse is enabled. |
tez.am.container.idle.release-timeout-min.millis
|
10 second(s) |
tez.am.container.idle.release-timeout-min.millis
|
true |
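To illustrate how the two idle-container timeouts might interact, the sketch below picks a hold time for an idle container somewhere between the minimum and maximum values. The helper name and the uniform random choice are assumptions made for illustration, not a statement of how the Tez scheduler is implemented; the defaults mirror the 10-second and 20-second values listed above.

```python
import random

# Defaults from the two entries above, expressed in milliseconds.
IDLE_RELEASE_TIMEOUT_MIN_MS = 10_000
IDLE_RELEASE_TIMEOUT_MAX_MS = 20_000


def pick_idle_release_delay_ms(min_ms=IDLE_RELEASE_TIMEOUT_MIN_MS,
                               max_ms=IDLE_RELEASE_TIMEOUT_MAX_MS,
                               rng=None):
    """Hypothetical helper: choose how long to hold an idle container.

    Assumes the hold time is drawn uniformly from [min_ms, max_ms].
    """
    if min_ms < 0 or max_ms < min_ms:
        raise ValueError("expected 0 <= min_ms <= max_ms")
    rng = rng or random.Random()
    return rng.randint(min_ms, max_ms)


if __name__ == "__main__":
    print(pick_idle_release_delay_ms())
```

Under this reading, widening the gap between the two values spreads container releases over a longer window, while setting them equal gives a fixed hold time.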
Enable Container Reuse |
Specifies whether containers should be reused. |
tez.am.container.reuse.enabled
|
true |
tez.am.container.reuse.enabled
|
true |
Timeout Before Container Reuse |
The amount of time to wait before assigning a container to the next level of locality. Locality levels fall back in the order NODE, then RACK, then NON_LOCAL. |
tez.am.container.reuse.locality.delay-allocation-millis
|
250 millisecond(s) |
tez.am.container.reuse.locality.delay-allocation-millis
|
true |
Enable Container Reuse for Non-Local Tasks |
Whether to reuse containers for non-local tasks. Active only if reuse is enabled. |
tez.am.container.reuse.non-local-fallback.enabled
|
false |
tez.am.container.reuse.non-local-fallback.enabled
|
true |
Enable Container Reuse for Rack Local Tasks |
Whether to reuse containers for rack local tasks. Active only if reuse is enabled. |
tez.am.container.reuse.rack-fallback.enabled
|
true |
tez.am.container.reuse.rack-fallback.enabled
|
true |
Tez Application Master Default Command Line Options |
Cluster default Java options for the Tez Application Master process. These will be prepended to the properties specified via tez.am.launch.cmd-opts. |
tez.am.launch.cluster-default.cmd-opts
|
-server -Djava.net.preferIPv4Stack=true |
tez.am.launch.cluster-default.cmd-opts
|
true |
Tez Application Master Command Line Options |
Java options for the Tez Application Master process. The Xmx value is derived based on tez.am.resource.memory.mb and is 80% of the value by default. Used only if the value is not specified explicitly by the DAG definition. |
tez.am.launch.cmd-opts
|
-XX:+PrintGCDetails -verbose:gc -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp |
tez.am.launch.cmd-opts
|
true |
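As a worked example of the 80% rule described above, the sketch below derives a heap size from a container memory setting. The helper name and the rounding down to whole megabytes are assumptions for illustration; only the 0.8 fraction and the 2 GiB default of tez.am.resource.memory.mb come from this reference.

```python
def derive_xmx_mb(container_memory_mb, heap_fraction=0.8):
    """Hypothetical helper: heap size as a fraction of container memory.

    Rounds down to a whole number of megabytes (an assumption).
    """
    if container_memory_mb <= 0:
        raise ValueError("container_memory_mb must be positive")
    return int(container_memory_mb * heap_fraction)


if __name__ == "__main__":
    am_memory_mb = 2 * 1024  # 2 GiB default of tez.am.resource.memory.mb
    print(f"-Xmx{derive_xmx_mb(am_memory_mb)}m")  # prints "-Xmx1638m"
```

The same arithmetic applies to any container size: the remaining 20% is left as headroom for non-heap memory used by the JVM process.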
Tez Application Master Environment Settings |
Additional execution environment entries for the Tez Application Master. This is not an additive property. You must preserve the original value if you want to have access to native libraries. Used only if the value is not specified explicitly by the DAG definition. |
tez.am.launch.env
|
LD_LIBRARY_PATH=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native |
tez.am.launch.env
|
true |
Log level for Application Masters |
Root Logging level passed to the Tez Application Master. |
tez.am.log.level
|
INFO |
tez.am.log.level
|
true |
Number of Recovery Runs |
Specifies the total number of times the Application Master will run in case recovery is triggered. |
tez.am.max.app.attempts
|
2 |
tez.am.max.app.attempts
|
true |
Maximum Task Attempts |
The maximum number of allowed task attempt failures on a node before it gets marked as blacklisted. |
tez.am.maxtaskfailures.per.node
|
10 |
tez.am.maxtaskfailures.per.node
|
true |
Tez Application Master Memory |
The amount of memory to be used by the Application Master. Used only if the value is not specified explicitly by the DAG definition. |
tez.am.resource.memory.mb
|
2 GiB |
tez.am.resource.memory.mb
|
true |
History URL Template |
Template to generate the History URL for a particular Tez Application. The template replaces __APPLICATION_ID__ with the actual applicationId and __HISTORY_URL_BASE__ with the value from the tez.tez-ui.history-url.base config property.
tez.am.tez-ui.history-url.template
|
__HISTORY_URL_BASE__?viewPath=%2F%23%2Ftez-app%2F__APPLICATION_ID__ |
tez.am.tez-ui.history-url.template
|
true |
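The placeholder substitution described above can be sketched as plain string replacement. The function name, the base URL, and the application ID below are hypothetical values chosen for illustration; only the template string and the two placeholder names come from this reference.

```python
# Default template from the entry above.
TEMPLATE = "__HISTORY_URL_BASE__?viewPath=%2F%23%2Ftez-app%2F__APPLICATION_ID__"


def build_history_url(template, history_url_base, application_id):
    """Hypothetical helper: fill in the two documented placeholders."""
    return (template
            .replace("__HISTORY_URL_BASE__", history_url_base)
            .replace("__APPLICATION_ID__", application_id))


if __name__ == "__main__":
    # Both arguments are made-up example values.
    print(build_history_url(TEMPLATE,
                            "http://tez-ui.example.com:8080/tez-ui",
                            "application_1700000000000_0001"))
```

Note that the default template already URL-encodes the path separators (`%2F`) and the fragment marker (`%23`), so the substituted values are inserted as-is.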
Tez Application Master View ACLs |
Application Master view ACLs. This allows the specified users/groups to view the status of the Application Master and all DAGs that run within this Application Master. Value format: Comma separated list of users, followed by whitespace, followed by a comma separated list of groups. |
tez.am.view-acls
|
* |
tez.am.view-acls
|
false |
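The value format described above (users, then whitespace, then groups) can be sketched with a small parser. The function names are hypothetical, and treating a lone `*` as "everyone" is an assumption based on the default value shown; this is not the parser Tez itself uses.

```python
import re


def _split_names(part):
    """Split a comma separated list, dropping empty entries."""
    return [name.strip() for name in part.split(",") if name.strip()]


def parse_view_acls(value):
    """Hypothetical helper: return (allow_all, users, groups).

    Assumes a lone "*" grants access to everyone.
    """
    if value.strip() == "*":
        return True, [], []
    # Split once on the first run of whitespace: users first, groups second.
    pieces = re.split(r"\s+", value, maxsplit=1)
    users = _split_names(pieces[0])
    groups = _split_names(pieces[1]) if len(pieces) > 1 else []
    return False, users, groups


if __name__ == "__main__":
    print(parse_view_acls("alice,bob hadoop,etl"))
```

With this format, a value that starts with whitespace names groups only, since the user list before the separator is empty.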
Tez Additional Classpath |
Specify additional classpath information to be used for Tez AM and all containers. |
tez.cluster.additional.classpath.prefix
|
|
tez.cluster.additional.classpath.prefix
|
false |
Maximum Number of Counters |
The number of allowed counters for the executing DAG. |
tez.counters.max
|
10000 |
tez.counters.max
|
true |
Maximum Counter Groups |
The number of allowed counter groups for the executing DAG. |
tez.counters.max.groups
|
3000 |
tez.counters.max.groups
|
true |
Whether to generate debug artifacts |
Generate debug artifacts such as a text representation of the submitted DAG plan. |
tez.generate.debug.artifacts
|
false |
tez.generate.debug.artifacts
|
true |
Grouped Split Maximum Size |
Upper bound on the size (in bytes) of a grouped split, to avoid generating excessively large splits. |
tez.grouping.max-size
|
1 GiB |
tez.grouping.max-size
|
true |
Grouped Split Minimum Size |
Lower bound on the size (in bytes) of a grouped split, to avoid generating too many splits. |
tez.grouping.min-size
|
16 MiB |
tez.grouping.min-size
|
true |
Queue Capacity Multiplier |
The multiplier for available queue capacity when determining number of tasks for a Vertex. 1.7 with 100% queue available implies generating a number of tasks roughly equal to 170% of the available containers on the queue. |
tez.grouping.split-waves
|
1.7 |
tez.grouping.split-waves
|
true |
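The multiplier arithmetic described above can be checked with a short sketch. The function name and the choice to round to the nearest whole task are assumptions for illustration; the description only says the result is "roughly equal" to the scaled container count.

```python
def estimate_task_count(available_containers, split_waves=1.7):
    """Hypothetical helper: approximate task count for a vertex.

    Scales the containers available on the queue by the wave multiplier
    and rounds to the nearest whole task (an assumption).
    """
    if available_containers < 0 or split_waves <= 0:
        raise ValueError("expected non-negative containers and a positive multiplier")
    return round(available_containers * split_waves)


if __name__ == "__main__":
    # 100 available containers with the default multiplier of 1.7.
    print(estimate_task_count(100))
```

A multiplier above 1 means some tasks run in a second wave, which can smooth out stragglers because faster containers pick up the extra work.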
Tez history events directory |
Directory where the proto logger writes history events. This should generally be the sys.db database directory. |
tez.history.logging.proto-base-dir
|
/warehouse/tablespace/managed/hive/sys.db |
tez.history.logging.proto-base-dir
|
true |
DAGs per Group |
DAGs per group. |
tez.history.logging.timeline-cache-plugin.old-num-dags-per-group
|
5 |
tez.history.logging.timeline-cache-plugin.old-num-dags-per-group
|
true |
Enable Intermediate Data Compression |
Whether intermediate data should be compressed or not. |
tez.runtime.compress
|
true |
tez.runtime.compress
|
true |
Codec for Compressing Intermediate Data |
The codec to be used if compressing intermediate data. Only applicable if tez.runtime.compress is enabled. |
tez.runtime.compress.codec
|
org.apache.hadoop.io.compress.SnappyCodec |
tez.runtime.compress.codec
|
true |
Publish Configuration Information |
Whether to publish configuration information to History logger. |
tez.runtime.convert.user-payload.to.history-text
|
false |
tez.runtime.convert.user-payload.to.history-text
|
true |
Sort Buffer Size |
The size of the sort buffer when output needs to be sorted. |
tez.runtime.io.sort.mb
|
272 MiB |
tez.runtime.io.sort.mb
|
true |
Enable Accessing the Local Files Directly |
If the shuffle input is on the local host, bypass the HTTP fetch and access the files directly. |
tez.runtime.optimize.local.fetch
|
true |
tez.runtime.optimize.local.fetch
|
true |
Pipeline Sorter Sort Threads |
Tez runtime pipelined sorter sort threads. |
tez.runtime.pipelined.sorter.sort.threads
|
2 |
tez.runtime.pipelined.sorter.sort.threads
|
true |
Fraction of Memory to Retain Shuffled Data |
Fraction (0-1) of the available memory which can be used to retain shuffled data. |
tez.runtime.shuffle.fetch.buffer.percent
|
0.6 |
tez.runtime.shuffle.fetch.buffer.percent
|
true |
Maximum Percent of Shuffle Segment |
This property determines the maximum size of a shuffle segment which can be fetched to memory. Fraction (0-1) of shuffle memory (after applying tez.runtime.shuffle.fetch.buffer.percent). |
tez.runtime.shuffle.memory.limit.percent
|
0.25 |
tez.runtime.shuffle.memory.limit.percent
|
true |
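The two shuffle fractions above compose multiplicatively, which the following sketch works through. The function name and the 1000 MB figure are hypothetical; what counts as "available memory" in a running task is not specified here and is left as an input.

```python
def shuffle_memory_limits(available_memory_mb,
                          fetch_buffer_percent=0.6,
                          memory_limit_percent=0.25):
    """Hypothetical helper: return (shuffle_buffer_mb, max_segment_mb).

    shuffle_buffer_mb: memory that may retain shuffled data.
    max_segment_mb: largest single segment that may be fetched to memory.
    """
    for fraction in (fetch_buffer_percent, memory_limit_percent):
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("fractions must be between 0 and 1")
    shuffle_buffer_mb = available_memory_mb * fetch_buffer_percent
    max_segment_mb = shuffle_buffer_mb * memory_limit_percent
    return shuffle_buffer_mb, max_segment_mb


if __name__ == "__main__":
    # Made-up figure: 1000 MB of available memory with the default fractions.
    print(shuffle_memory_limits(1000))
```

With the defaults, a single in-memory segment is capped at 0.6 × 0.25 = 15% of the available memory; anything larger would not be fetched to memory under this reading.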
Buffer Size for Unordered Output |
The size of the buffer when output does not need to be sorted. |
tez.runtime.unordered.output.buffer.size-mb
|
100 MiB |
tez.runtime.unordered.output.buffer.size-mb
|
true |
Timeout for Application Master for a Task |
Time (in seconds) for which the Tez Application Master should wait for a DAG to be submitted before shutting down. |
tez.session.am.dag.submit.timeout.secs
|
5 minute(s) |
tez.session.am.dag.submit.timeout.secs
|
true |
Timeout for Application Master to Come up |
Time (in seconds) to wait for Application Master to come up when trying to submit a DAG from the client. |
tez.session.client.timeout.secs
|
-1 second(s) |
tez.session.client.timeout.secs
|
true |
ScatterGather Connection Maximum Fraction of Tasks |
In case of a ScatterGather connection, once this fraction of source tasks have completed, all tasks on the current vertex can be scheduled. Number of tasks ready for scheduling on the current vertex scales linearly between min-fraction and max-fraction. |
tez.shuffle-vertex-manager.max-src-fraction
|
0.4 |
tez.shuffle-vertex-manager.max-src-fraction
|
true |
ScatterGather Connection Minimum Fraction of Tasks |
In case of a ScatterGather connection, the fraction of source tasks which should complete before tasks for the current vertex are scheduled. |
tez.shuffle-vertex-manager.min-src-fraction
|
0.2 |
tez.shuffle-vertex-manager.min-src-fraction
|
true |
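The linear scaling between the minimum and maximum source fractions can be sketched as a simple interpolation. The function name and the exact behavior at the two boundaries are assumptions for illustration; the descriptions above only state that scheduling scales linearly between the two fractions.

```python
def schedulable_fraction(completed_src_fraction,
                         min_fraction=0.2,
                         max_fraction=0.4):
    """Hypothetical helper: share of the current vertex's tasks ready to schedule.

    Returns 0.0 below min_fraction, 1.0 at or above max_fraction,
    and a linear interpolation in between.
    """
    if not 0.0 <= min_fraction <= max_fraction <= 1.0:
        raise ValueError("expected 0 <= min_fraction <= max_fraction <= 1")
    if completed_src_fraction >= max_fraction:
        return 1.0
    if completed_src_fraction < min_fraction:
        return 0.0
    # Here min_fraction <= completed < max_fraction, so the divisor is non-zero.
    return (completed_src_fraction - min_fraction) / (max_fraction - min_fraction)


if __name__ == "__main__":
    for done in (0.1, 0.2, 0.3, 0.4):
        print(done, schedulable_fraction(done))
```

Under this sketch, with the defaults, about half of the downstream tasks become schedulable once 30% of the source tasks have completed.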
TEZ Staging directory |
The staging dir used while submitting DAGs. |
tez.staging-dir
|
/tmp/$user.name/staging |
tez.staging-dir
|
true |
Heartbeat Interval |
Time interval at which task counters are sent to the Application Master. |
tez.task.am.heartbeat.counter.interval-ms.max
|
4 second(s) |
tez.task.am.heartbeat.counter.interval-ms.max
|
true |
Generate Counters on a Per-Edge Basis |
Whether to generate counters on a per-edge basis for a Tez DAG. Helpful for in-depth analysis. |
tez.task.generate.counters.per.io
|
true |
tez.task.generate.counters.per.io
|
true |
Maximum Time Between Tasks |
The maximum amount of time, in milliseconds, to wait before a task asks an Application Master for another task. |
tez.task.get-task.sleep.interval-ms.max
|
200 millisecond(s) |
tez.task.get-task.sleep.interval-ms.max
|
true |
Tez Task Default Command Line Options |
Cluster default Java options for tasks. These will be prepended to the properties specified via tez.task.launch.cmd-opts. |
tez.task.launch.cluster-default.cmd-opts
|
-server -Djava.net.preferIPv4Stack=true |
tez.task.launch.cluster-default.cmd-opts
|
true |
Tez Task Command Line Options |
Java options for tasks. The Xmx value is derived based on tez.task.resource.memory.mb and is 80% of this value by default. Used only if the value is not specified explicitly by the DAG definition. |
tez.task.launch.cmd-opts
|
-XX:+PrintGCDetails -verbose:gc -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp |
tez.task.launch.cmd-opts
|
true |
Tez Task Environment Settings |
Additional execution environment entries for Tez tasks. This is not an additive property. You must preserve the original value if you want to have access to native libraries. Used only if the value is not specified explicitly by the DAG definition. |
tez.task.launch.env
|
LD_LIBRARY_PATH=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native |
tez.task.launch.env
|
true |
Maximum Number of Events in a Heartbeat |
Maximum number of events to fetch from the Application Master by the tasks in a single heartbeat. |
tez.task.max-events-per-heartbeat
|
500 |
tez.task.max-events-per-heartbeat
|
true |
Tez Task Memory |
The amount of memory to be used by launched tasks. Used only if the value is not specified explicitly by the DAG definition. |
tez.task.resource.memory.mb
|
1536 MiB |
tez.task.resource.memory.mb
|
true |
Tez UI URL Base |
The base of the Tez UI URL. |
tez.tez-ui.history-url.base
|
|
tez.tez-ui.history-url.base
|
false |
Use Hadoop Libs |
When true, the deployment relies on Hadoop JARs being available on all nodes of the cluster. |
tez.use.cluster.hadoop-libs
|
false |
tez.use.cluster.hadoop-libs
|
true |
Enable Yarn Timeline-Service |
Timeline service version we're currently using. |
yarn.timeline-service.enabled
|
false |
yarn.timeline-service.enabled
|
true |
YARN Service |
Name of the YARN service that this Tez service instance depends on.
|
|
yarn_service
|
true |