5.2. Known Issues for Hive - Hortonworks Data Platform

BUG-10248: java.lang.ClassCastException while running a join query

Problem: A java.lang.ClassCastException is thrown when a self join is performed on two or more columns of different data types. For example, the query joins on tab1.a = tab1.a and tab1.b = tab1.b, where a is a double and b is a string. Because b cannot be cast to a double, the query fails. Hive should not have attempted to use the same serialization for both columns.

Workaround: Set hive.auto.convert.join.noconditionaltask.size to a value such that the joins are split across multiple tasks.
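As a rough sketch, the property can be set for the current Hive session before running the join. The value below is purely illustrative and is not taken from this document. It assumes that lowering the threshold is what keeps the joins from being merged into a single map-join task, so choose a value smaller than the combined size of the tables being joined.

```sql
-- Illustrative value only (assumption): pick a size, in bytes, small enough
-- that the joined tables no longer fit under the threshold together.
SET hive.auto.convert.join.noconditionaltask.size=1000000;

-- Print the value currently in effect for this session.
SET hive.auto.convert.join.noconditionaltask.size;
```

The setting applies only to the session in which it is issued.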

BUG-5221: Hive Windowing test Ordering_1 fails

Problem: While executing the following query:

select s, avg(d) over (partition by i order by f, b) from over100k;

the following error is reported in the Hive log file:

FAILED: SemanticException Range based Window Frame can have only 1 Sort Key

Workaround: Use the following query, which specifies the window frame in rows explicitly:

select s, avg(d) over (partition by i order by f, b rows unbounded preceding) from over100k;

BUG-5220: Hive Windowing test OverWithExpression_3 fails

Problem: While executing the following query:

select s, i, avg(d) over (partition by s order by i) / 10.0 from over100k;

the following error is reported in the Hive log file:

NoViableAltException(15@[129:7: ( ( ( KW_AS )? identifier ) | ( KW_AS LPAREN identifier ( COMMA identifier )* RPAREN ) )?])
	at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
	at org.antlr.runtime.DFA.predict(DFA.java:116)
	at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2298)
	at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1042)
	at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:779)
	at org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:30649)
	at org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:28851)
	at org.apache.hadoop.hive.ql.parse.HiveParser.regular_body(HiveParser.java:28766)
	at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatement(HiveParser.java:28306)
	at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:28100)
	at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1213)
	at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:928)
	at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:190)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:418)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
	at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
	at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:712)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
FAILED: ParseException line 1:53 cannot recognize input near '/' '10.0' 'from' in selection target

Workaround: Use the following query:

select s, i, avg(d) / 10.0 over (partition by s order by i) from over100k;

BUG-5512: MapReduce task from Hive dynamic partitioning query is killed.

Problem: When a Hive script is used to create and populate a partitioned table dynamically, the following error is reported in the TaskTracker log file:

TaskTree [pid=30275,tipID=attempt_201305041854_0350_m_000000_0] is running beyond memory-limits. Current usage : 1619562496bytes. Limit : 1610612736bytes. Killing task.
TaskTree [pid=30275,tipID=attempt_201305041854_0350_m_000000_0] is running beyond memory-limits. Current usage : 1619562496bytes. Limit : 1610612736bytes. Killing task.
Dump of the process-tree for attempt_201305041854_0350_m_000000_0 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 30275 20786 30275 30275 (java) 2179 476 1619562496 190241 /usr/jdk64/jdk1.6.0_31/jre/bin/java ...

Workaround: Disable all the memory settings by setting the following properties to -1 in the mapred-site.xml file on the JobTracker and TaskTracker host machines in your cluster:

mapred.cluster.map.memory.mb = -1
mapred.cluster.reduce.memory.mb = -1
mapred.job.map.memory.mb = -1
mapred.job.reduce.memory.mb = -1
mapred.cluster.max.map.memory.mb = -1
mapred.cluster.max.reduce.memory.mb = -1

To change these values using the UI, use the instructions provided here to update these properties.

BUG-4714: Hive Server 2 Concurrency Failure (create_index.q).

Problem: While using indexes in Hive, the following error is reported:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask

BUG-2131, HIVE-5297: Partition in Hive table that is of data type 'int' is able to accept 'string' entries

Problem: A partition column of data type int in a Hive table accepts string entries. For example:

CREATE TABLE tab1 (id1 int, id2 string) PARTITIONED BY (month string, day int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

In the above example, the partition column day, which is of data type int, also accepts string entries during data insertion.

Workaround: Avoid inserting string values into int partition columns.
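A minimal sketch of what following this workaround might look like for the tab1 table defined above. The partition values 'jan' and 1 are hypothetical and chosen only for illustration. Because Hive does not reject a string here, the check has to be made by whoever writes the statement.

```sql
-- Hypothetical partition values. day is declared as int, so give it an
-- unquoted integer literal.
ALTER TABLE tab1 ADD PARTITION (month='jan', day=1);

-- To be avoided: a non-numeric string for the int partition column,
-- for example day='first'. Hive accepts it because of this issue.
```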