If your cluster properties file specifies IS_TEZ=yes (use Tez for Hive), perform the following steps after HDP deployment:
Open the command prompt with the hadoop account:
runas /user:hadoop cmd
Make a Tez application directory in HDFS:
%HADOOP_HOME%\bin\hdfs dfs -mkdir /apps/tez
Give the owner full access and all other users read and execute access:
%HADOOP_HOME%\bin\hdfs dfs -chmod -R 755 /apps/tez
Change the owner of the directory to hadoop:
%HADOOP_HOME%\bin\hdfs dfs -chown -R hadoop:users /apps/tez
Copy the Tez home directory on the local machine into the HDFS /apps/tez directory:
%HADOOP_HOME%\bin\hdfs dfs -put %TEZ_HOME%\* /apps/tez
Remove the Tez configuration directory from the HDFS Tez application directory:
%HADOOP_HOME%\bin\hdfs dfs -rm -r -skipTrash /apps/tez/conf
Ensure that the following properties are set in the %HIVE_HOME%\conf\hive-site.xml file:
Table 4.1. Required properties
| Property | Default Value | Description |
|---|---|---|
| hive.auto.convert.join.noconditionaltask | true | Specifies whether Hive converts common JOIN statements into MAPJOIN statements. A JOIN statement is converted if this property is enabled and the combined size of n-1 of the tables or partitions in an n-way join is smaller than the size specified by the hive.auto.convert.join.noconditionaltask.size property. |
| hive.auto.convert.join.noconditionaltask.size | 10000000 (10 MB) | Specifies the size threshold, in bytes, used to decide whether Hive converts a JOIN statement into a MAPJOIN statement. This property is ignored unless hive.auto.convert.join.noconditionaltask is enabled. |
| hive.optimize.reducededuplication.min.reducer | 4 | Specifies the minimum reducer parallelism required before Hive merges two MapReduce jobs. Merging a MapReduce job with parallelism 100 into one with parallelism 1 can hurt query performance even though the number of jobs is reduced, so the optimization is disabled if the number of reducers is less than the specified value. |
| hive.tez.container.size | -1 | Specifies the size, in MB, of the containers that Tez spawns. With the default of -1, Tez uses the container size of map tasks (mapreduce.map.memory.mb). Use this property to override that value. |
| hive.tez.java.opts | n/a | Specifies the Java options for Tez tasks. By default, Tez uses the Java options from map tasks. Use this property to override that value; set it to the same value as mapreduce.map.java.opts. |
Adjust the settings above to your environment where appropriate; hive-default.xml.template contains examples of the properties.
To verify that the installation process succeeded, run smoke tests for Tez and Hive.
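Before running the smoke tests, it can help to confirm that hive-site.xml really contains every property from Table 4.1. The following is a minimal sketch of such a check, not part of HDP: the function names are hypothetical, and the sample values for hive.tez.container.size and hive.tez.java.opts are placeholders that you would replace with values matching your own MapReduce settings.

```python
import xml.etree.ElementTree as ET

# Property names taken from Table 4.1.
REQUIRED_PROPERTIES = [
    "hive.auto.convert.join.noconditionaltask",
    "hive.auto.convert.join.noconditionaltask.size",
    "hive.optimize.reducededuplication.min.reducer",
    "hive.tez.container.size",
    "hive.tez.java.opts",
]

# Illustrative hive-site.xml content. The first three values are the defaults
# from Table 4.1; the last two are placeholders (assumptions), not recommendations.
SAMPLE_HIVE_SITE = """
<configuration>
  <property>
    <name>hive.auto.convert.join.noconditionaltask</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.auto.convert.join.noconditionaltask.size</name>
    <value>10000000</value>
  </property>
  <property>
    <name>hive.optimize.reducededuplication.min.reducer</name>
    <value>4</value>
  </property>
  <property>
    <name>hive.tez.container.size</name>
    <value>1024</value>
  </property>
  <property>
    <name>hive.tez.java.opts</name>
    <value>-Xmx820m</value>
  </property>
</configuration>
"""


def read_properties(root):
    """Return a dict of property name -> value from a <configuration> element."""
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_properties(root, required=REQUIRED_PROPERTIES):
    """Return the required property names that are absent or have an empty value."""
    props = read_properties(root)
    return [name for name in required if not props.get(name)]


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE_HIVE_SITE)
    missing = missing_properties(root)
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All Table 4.1 properties are set.")
```

To check a real installation instead of the embedded sample, load the file with `ET.parse(path).getroot()`, where `path` points to %HIVE_HOME%\conf\hive-site.xml, and pass the result to `missing_properties`. The check only confirms that each property is present and non-empty; it does not judge whether the values suit your cluster.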