BUG-10248: java.lang.ClassCastException while running a join query
Problem: A java.lang.ClassCastException is thrown when a self join is done on two or more columns of different data types. For example:
join tab1.a = tab1.a join tab1.b=tab1.b
where a and b have different data types, for example a is a double and b is a string. Here b cannot be cast to a double; Hive should not attempt to use the same serialization for both columns.
Workaround: Set hive.auto.convert.join.noconditionaltask.size to a value such that the joins are split across multiple tasks.
BUG-5512: MapReduce task from Hive dynamic partitioning query is killed.
Problem: When using the Hive script to create and populate the partitioned table dynamically, the following error is reported in the TaskTracker log file:
TaskTree [pid=30275,tipID=attempt_201305041854_0350_m_000000_0] is running beyond memory-limits. Current usage : 1619562496bytes. Limit : 1610612736bytes. Killing task.
TaskTree [pid=30275,tipID=attempt_201305041854_0350_m_000000_0] is running beyond memory-limits. Current usage : 1619562496bytes. Limit : 1610612736bytes. Killing task.
Dump of the process-tree for attempt_201305041854_0350_m_000000_0 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 30275 20786 30275 30275 (java) 2179 476 1619562496 190241 /usr/jdk64/jdk1.6.0_31/jre/bin/java ...
Workaround: Disable all the memory settings by setting the value of the following properties to -1 in the mapred-site.xml file on the JobTracker and TaskTracker host machines in your cluster:
mapred.cluster.map.memory.mb = -1
mapred.cluster.reduce.memory.mb = -1
mapred.job.map.memory.mb = -1
mapred.job.reduce.memory.mb = -1
mapred.cluster.max.map.memory.mb = -1
mapred.cluster.max.reduce.memory.mb = -1
To change these values using the UI, use the instructions provided here to update these properties.
BUG-5221: Hive Windowing test Ordering_1 fails
Problem: While executing the following query:
select s, avg(d) over (partition by i order by f, b) from over100k;
the following error is reported in the Hive log file:
FAILED: SemanticException Range based Window Frame can have only 1 Sort Key
Workaround: The workaround is to use the following query:
select s, avg(d) over (partition by i order by f, b rows unbounded preceding) from over100k;
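The workaround query adds a row-based window frame in place of the default range-based frame, which, per the error message, accepts only one sort key. As a sketch, assuming the standard shorthand applies in this Hive version, the same query can be written with the frame spelled out in full:

```sql
-- Same as the workaround query, with the row-based frame written out.
-- Note: a row-based frame counts physical rows, so rows that tie on (f, b)
-- may get different running averages than a range-based frame would give.
select s, avg(d) over (partition by i order by f, b
                       rows between unbounded preceding and current row)
from over100k;
```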
BUG-5220: Hive Windowing test OverWithExpression_3 fails
Problem: While executing the following query:
select s, i, avg(d) over (partition by s order by i) / 10.0 from over100k;
the following error is reported in the Hive log file:
NoViableAltException(15@[129:7: ( ( ( KW_AS )? identifier ) | ( KW_AS LPAREN identifier ( COMMA identifier )* RPAREN ) )?])
    at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
    at org.antlr.runtime.DFA.predict(DFA.java:116)
    at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2298)
    at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1042)
    at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:779)
    at org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:30649)
    at org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:28851)
    at org.apache.hadoop.hive.ql.parse.HiveParser.regular_body(HiveParser.java:28766)
    at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatement(HiveParser.java:28306)
    at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:28100)
    at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1213)
    at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:928)
    at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:190)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:418)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
    at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
    at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:712)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
FAILED: ParseException line 1:53 cannot recognize input near '/' '10.0' 'from' in selection target
Workaround: The workaround is to use the following query:
select s, i, avg(d) / 10.0 over (partition by s order by i) from over100k;
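Another possible approach, which is not part of the documented workaround and is offered here only as an untested sketch, is to compute the windowed average in a subquery and do the division in the outer query, so that no arithmetic follows the over clause in the select list. The aliases t and avg_d are hypothetical names chosen for illustration:

```sql
-- Inner query: windowed average only, aliased so the outer query can use it.
-- Outer query: plain arithmetic on the aliased column.
select t.s, t.i, t.avg_d / 10.0
from (
  select s, i, avg(d) over (partition by s order by i) as avg_d
  from over100k
) t;
```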
Problem: While using indexes in Hive, the following error is reported:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
Problem: A partition column of datatype int in a Hive table is able to accept string entries. For example,
CREATE TABLE tab1 (id1 int,id2 string) PARTITIONED BY(month string,day int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ;
In the above example, the partition column day, of datatype int, also accepts string entries during data insertion.
Workaround: Avoid adding string values to int fields.
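One way to follow this workaround, sketched here under the assumption that rows are loaded from a staging table using dynamic partitioning, is to cast the partition value to int and skip rows where the cast fails. In Hive, casting a non-numeric string to int yields NULL. The table tab1_staging and its column raw_day are hypothetical names, not taken from the original report:

```sql
-- Enable dynamic partitioning for this session.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Load only rows whose day value is a valid integer; the partition
-- columns (month, day) come last in the select list.
INSERT INTO TABLE tab1 PARTITION (month, day)
SELECT id1, id2, month, CAST(raw_day AS INT) AS day
FROM tab1_staging
WHERE CAST(raw_day AS INT) IS NOT NULL;
```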