
Apache Hive Known Issues

— UDF in_file() does not accept arguments of type CHAR or VARCHAR

Bug: HIVE-6637

Severity: Low

Workaround: Cast the argument to type String.
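For example, a minimal sketch assuming a table t with a VARCHAR column v and a hypothetical lookup file /tmp/allowed_values.txt:
SELECT in_file(CAST(v AS STRING), '/tmp/allowed_values.txt') FROM t;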

— UDF translate() does not accept arguments of type CHAR or VARCHAR

Bug: HIVE-6622

Severity: Low

Workaround: Cast the argument to type String.
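For example, a minimal sketch assuming a table t with a CHAR column c (the 'abc'/'xyz' character mappings are illustrative only):
SELECT translate(CAST(c AS STRING), 'abc', 'xyz') FROM t;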

— UDF printf() does not accept arguments of type CHAR or VARCHAR

Bug: HIVE-6620

Severity: Low

Workaround: Cast the argument to type String.
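For example, a minimal sketch assuming a table t with a VARCHAR column name and an INT column id:
SELECT printf('%s-%d', CAST(name AS STRING), id) FROM t;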

— Select * fails on Parquet tables with the map data type

Bug: HIVE-6575

Severity: Low

Workaround: Use the map's column name in the SELECT statement.
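For example, a minimal sketch assuming a hypothetical Parquet table parquet_tbl with an id column and a map column properties; list the columns explicitly instead of using *:
SELECT id, properties FROM parquet_tbl;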

— Hive's Timestamp type cannot be stored in Parquet

Tables containing timestamp columns cannot use the Parquet storage format.

Bug: HIVE-6394

Severity: Low

Workaround: Use a different file format.
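For example, a minimal sketch of a hypothetical table that keeps its TIMESTAMP column out of Parquet by using SequenceFile storage instead:
CREATE TABLE events (id INT, event_time TIMESTAMP)
STORED AS SEQUENCEFILE;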

— Hive's Decimal type cannot be stored in Parquet and Avro

Tables containing decimal columns cannot use the Parquet or Avro storage formats.

Bug: HIVE-6367 and HIVE-5823

Severity: Low

Workaround: Use a different file format.
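For example, a minimal sketch of a hypothetical table that stores its DECIMAL column as a text file rather than Parquet or Avro:
CREATE TABLE payments (id INT, amount DECIMAL)
STORED AS TEXTFILE;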

— Hive creates an invalid table if you specify more than one partition with alter table

Hive (in all known versions from 0.7) allows you to configure multiple partitions with a single alter table command, but the configuration it creates is invalid for both Hive and Impala.

Bug: None

Severity: Medium

Workaround:

Correct results can be obtained by configuring each partition with its own ALTER TABLE command in either Hive or Impala. For example, the following:
ALTER TABLE page_view ADD PARTITION (dt='2008-08-08', country='us') location '/path/to/us/part080808'
                          PARTITION (dt='2008-08-09', country='us') location '/path/to/us/part080809';
should be replaced with:
ALTER TABLE page_view ADD PARTITION (dt='2008-08-08', country='us') location '/path/to/us/part080808';
ALTER TABLE page_view ADD PARTITION (dt='2008-08-09', country='us') location '/path/to/us/part080809';

— PostgreSQL 9.0+ requires additional configuration

The Hive metastore will not start if you use a version of PostgreSQL later than 9.0 with its default configuration. You will see output similar to the following in the log:
Caused by: javax.jdo.JDODataStoreException: Error executing JDOQL query
"SELECT "THIS"."TBL_NAME" AS NUCORDER0 FROM "TBLS" "THIS" LEFT OUTER JOIN "DBS" "THIS_DATABASE_NAME" ON "THIS"."DB_ID" = "THIS_DATABASE_NAME"."DB_ID"
WHERE "THIS_DATABASE_NAME"."NAME" = ? AND (LOWER("THIS"."TBL_NAME") LIKE ? ESCAPE '\\' ) ORDER BY NUCORDER0 " : ERROR: invalid escape string
Hint: Escape string must be empty or one character..
NestedThrowables:
org.postgresql.util.PSQLException: ERROR: invalid escape string
 Hint: Escape string must be empty or one character.
 at org.datanucleus.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:313)
 at org.datanucleus.jdo.JDOQuery.execute(JDOQuery.java:252)
 at org.apache.hadoop.hive.metastore.ObjectStore.getTables(ObjectStore.java:759)
 ... 28 more
Caused by: org.postgresql.util.PSQLException: ERROR: invalid escape string
 Hint: Escape string must be empty or one character.
 at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2096)
 at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1829)
 at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
 at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:510)
 at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:386)
 at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:271)
 at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
 at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
 at org.datanucleus.store.rdbms.SQLController.executeStatementQuery(SQLController.java:457)
 at org.datanucleus.store.rdbms.query.legacy.SQLEvaluator.evaluate(SQLEvaluator.java:123)
 at org.datanucleus.store.rdbms.query.legacy.JDOQLQuery.performExecute(JDOQLQuery.java:288)
 at org.datanucleus.store.query.Query.executeQuery(Query.java:1657)
 at org.datanucleus.store.rdbms.query.legacy.JDOQLQuery.executeQuery(JDOQLQuery.java:245)
 at org.datanucleus.store.query.Query.executeWithArray(Query.java:1499)
 at org.datanucleus.jdo.JDOQuery.execute(JDOQuery.java:243)
 ... 29 more 

The problem is caused by a backward-incompatible change in the default value of the standard_conforming_strings property. Versions up to PostgreSQL 9.0 defaulted to off, but starting with version 9.1 the default is on.

Bug: None

Severity: Low

Workaround: As the administrator user, use the following command to turn standard_conforming_strings off:
ALTER DATABASE <hive_db_name> SET standard_conforming_strings = off; 

— Queries spawned from MapReduce jobs in MRv1 fail if mapreduce.framework.name is set to yarn

Queries spawned from MapReduce jobs fail in MRv1 with a null pointer exception (NPE) if /etc/mapred/conf/mapred-site.xml has the following:
<property>
 <name>mapreduce.framework.name</name>
 <value>yarn</value>
</property> 

Bug: None

Severity: High

Resolution: Use workaround

Workaround: Remove the mapreduce.framework.name property from mapred-site.xml.

— Commands run against an Oracle-backed metastore may fail

Commands run against an Oracle-backed metastore fail with the following error:
javax.jdo.JDODataStoreException Incompatible data type for column TBLS.VIEW_EXPANDED_TEXT : was CLOB (datastore),
but type expected was LONGVARCHAR (metadata). Please check that the type in the datastore and the type specified in the MetaData are consistent.

This error may occur if the metastore is run on top of an Oracle database with the configuration property datanucleus.validateColumns set to true.

Bug: None

Severity: Low

Workaround: Set datanucleus.validateColumns=false in the hive-site.xml configuration file.
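For example, add the following property to hive-site.xml (shown in the same form as the other configuration snippets on this page):
<property>
  <name>datanucleus.validateColumns</name>
  <value>false</value>
</property>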

— Hive, Pig, and Sqoop 1 fail in MRv1 tarball installation because /usr/bin/hbase sets HADOOP_MAPRED_HOME to MR2

This problem affects tarball installations only.

Bug: None

Severity: High

Workaround: If you are using MRv1, edit the following line in /etc/default/hadoop from
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce 
to
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce 

In addition, /usr/lib/hadoop-mapreduce must not be included in HADOOP_CLASSPATH.

— Hive Web Interface not supported

Cloudera no longer supports the Hive Web Interface because of inconsistent upstream maintenance of this project.

Bug: DISTRO-77

Severity: Low

Workaround: Use Hue and Beeswax instead of the Hive Web Interface.

— Hive may need additional configuration to work in a federated HDFS cluster

Hive jobs normally move data from a temporary directory to a warehouse directory during execution. Hive uses /tmp as its temporary directory by default, and users usually configure /user/hive/warehouse/ as the warehouse directory. Under Federated HDFS, /tmp and /user are configured as ViewFS mount tables, and so the Hive job will actually try to move data between two ViewFS mount tables. Federated HDFS does not support this, and the job will fail with the following error:
Failed with exception Renames across Mount points not supported 

Bug: None

Severity: Low

Anticipated Resolution: No software fix planned; use the workaround.

Workaround: Modify /etc/hive/conf/hive-site.xml to allow the temporary directory and warehouse directory to use the same ViewFS mount table. For example, if the warehouse directory is /user/hive/warehouse, add the following property to /etc/hive/conf/hive-site.xml so both directories use the ViewFS mount table for /user.
<property>
  <name>hive.exec.scratchdir</name>
  <value>/user/${user.name}/tmp</value>
</property> 

— Cannot create archive partitions with external HAR (Hadoop Archive) tables

ALTER TABLE ... ARCHIVE PARTITION is not supported on external tables.

Bug: None

Severity: Low

Workaround: None

— Setting hive.optimize.skewjoin to true causes long-running queries to fail

Bug: None

Severity: Low

Workaround: None

— JDBC - executeUpdate does not return the number of rows modified

Contrary to the JDBC documentation, the executeUpdate method always returns zero.

Severity: Low

Workaround: None

HCatalog Known Issues

  Note: As of CDH 5, HCatalog is part of Apache Hive.

— Hive's DECIMAL data type cannot be mapped to Pig via HCatalog

HCatalog does not recognize the DECIMAL data type.

Bug: None

Severity: Low

Workaround: None

— Job submission using WebHCatalog might not work correctly

Bug: None

Severity: Low

Workaround: Cloudera recommends using the Oozie REST interface to submit jobs, as it's a more mature and capable tool.

— WebHCatalog does not work in a Kerberos-secured Federated cluster

Bug: None

Severity: Low

Resolution: None planned

Workaround: None
