Release Notes

Hive

This release provides Hive 1.2.1 and Hive 2.1.0 in addition to the following patches:

Hive 1.2.1 Apache patches:

  • HIVE-7224: Set incremental printing to true by default in Beeline.

  • HIVE-7239: Fix bug in HiveIndexedInputFormat implementation that causes incorrect query result when input backed by Sequence/RC files.

  • HIVE-9941: sql std authorization on partitioned table: truncate and insert.

  • HIVE-10562: Add versioning/format mechanism to NOTIFICATION_LOG entries, expand MESSAGE size.

  • HIVE-10924: add support for MERGE statement.

  • HIVE-11030: Enhance storage layer to create one delta file per write.

  • HIVE-11293: HiveConnection.setAutoCommit(true) throws exception.

  • HIVE-11594: Analyze Table for column names with embedded spaces.

  • HIVE-11616: DelegationTokenSecretManager reuses the same objectstore, which has concurrency issues.

  • HIVE-11935: Race condition in HiveMetaStoreClient: isCompatibleWith and close.

  • HIVE-12077: MSCK Repair table should fix partitions in batches.

  • HIVE-12594: X lock on partition should not conflict with S lock on DB.

  • HIVE-12664: Bug in reduce deduplication optimization causing ArrayOutOfBoundException.

  • HIVE-12968: genNotNullFilterForJoinSourcePlan: needs to merge predicates into the multi-AND.

  • HIVE-13014: RetryingMetaStoreClient is retrying too aggressively.

  • HIVE-13083: Writing HiveDecimal to ORC can wrongly suppress present stream.

  • HIVE-13185: orc.ReaderImp.ensureOrcFooter() method fails on small text files with IndexOutOfBoundsException.

  • HIVE-13423: Handle the overflow case for decimal datatype for sum().

  • HIVE-13527: Using deprecated APIs in HBase client causes zookeeper connection leaks.

  • HIVE-13539: HiveHFileOutputFormat searching the wrong directory for HFiles.

  • HIVE-13756: Map failure attempts to delete reducer _temporary dir on pig multi-query.

  • HIVE-13836: DbNotifications giving an error = Invalid state. Transaction has already started.

  • HIVE-13872: Queries failing with java.lang.ClassCastException when vectorization is enabled.

  • HIVE-13936: Add streaming support for row_number.

  • HIVE-13966: DbNotificationListener: can lose DDL operation notifications.

  • HIVE-14037: java.lang.ClassNotFoundException for the jar in hive.reloadable.aux.jars.path in mapreduce.

  • HIVE-14170: Beeline IncrementalRows should buffer rows and incrementally re-calculate width if TableOutputFormat is used.

  • HIVE-14229: the jars in hive.aux.jar.paths are not added to session classpath.

  • HIVE-14251: Union All of different types resolves to incorrect data.

  • HIVE-14278: Migrate TestHadoop20SAuthBridge.java from Unit3 to Unit4.

  • HIVE-14279: fix mvn test TestHiveMetaStore.testTransactionalValidation.

  • HIVE-14290: Refactor HIVE-14054 to use Collections#newSetFromMap.

  • HIVE-14375: hcatalog-pig-adaptor pom.xml uses joda-time 2.2 instead of ${joda.version} that uses 2.8.1.

  • HIVE-14399: Fix test flakiness of org.apache.hive.hcatalog.listener.TestDbNotificationListener.cleanupNotifs.

  • HIVE-14436: Hive 1.2.1/Hitting "ql.Driver: FAILED: IllegalArgumentException Error: , expected at the end of 'decimal(9'" after enabling hive.optimize.skewjoin and with MR engine.

  • HIVE-14445: upgrade maven surefire to 2.19.1.

  • HIVE-14457: Partitions in encryption zone are still trashed though an exception is returned.

  • HIVE-14519: Multi insert query bug.

  • HIVE-14520: We should set a timeout for the blocking calls in TestMsgBusConnection.

  • HIVE-14591: HS2 is shut down unexpectedly during the startup time.

  • HIVE-14607: ORC split generation failed with exception: java.lang.ArrayIndexOutOfBoundsException: 1.

  • HIVE-14659: OutputStream won't close if caught exception in function unparseExprForValuesClause in SemanticAnalyzer.java.

  • HIVE-14690: Query fail when hive.exec.parallel=true, with conflicting session dir.

  • HIVE-14693: Some partitions will be left out when partition number is the multiple of the option hive.msck.repair.batch.size.

  • HIVE-14715: Hive throws NumberFormatException with query with Null value.

  • HIVE-14762: Add logging while removing scratch space.

  • HIVE-14773: NPE aggregating column statistics for date column in partitioned table.

  • HIVE-14774: Canceling query using Ctrl-C in beeline might lead to stale locks.

  • HIVE-14805: Subquery inside a view will have the object in the subquery as the direct input.

  • HIVE-14837: JDBC: standalone jar is missing hadoop core dependencies.

  • HIVE-14865: Fix comments after HIVE-14350.

  • HIVE-14922: Add perf logging for post job completion steps.

  • HIVE-14924: MSCK REPAIR table with single threaded is throwing null pointer exception.

  • HIVE-14928: Analyze table no scan mess up schema.

  • HIVE-14929: Adding JDBC test for query cancellation scenario.

  • HIVE-14935: Add tests for beeline force option.

  • HIVE-14943: Base Implementation (merge statement).

  • HIVE-14948: properly handle special characters in identifiers.

  • HIVE-14959: Fix DISTINCT with windowing when CBO is enabled/disabled.

  • HIVE-14966: Backport: JDBC: HiveConnection never saves HTTP cookies.

  • HIVE-14992: Relocate several common libraries in hive jdbc uber jar.

  • HIVE-14993: make WriteEntity distinguish writeType.

  • HIVE-15002: HiveSessionImpl#executeStatementInternal may leave locks in an inconsistent state.

  • HIVE-15010: Make LockComponent aware if it's part of dynamic partition operation.

  • HIVE-15060: Remove the autoCommit warning from beeline.

  • HIVE-15099: PTFOperator.PTFInvocation didn't properly reset the input partition.

  • HIVE-15124: Fix OrcInputFormat to use reader's schema for include boolean array.

  • HIVE-15137: metastore add partitions background thread should use current username.

  • HIVE-15151: Bootstrap support for replv2.

  • HIVE-15178: ORC stripe merge may produce many MR jobs and no merge if split size is small.

  • HIVE-15180: Extend JSONMessageFactory to store additional information about metadata objects on different table events.

  • HIVE-15231: query on view with CTE and alias fails with table not found error.

  • HIVE-15232: Add notification events for functions and indexes.

  • HIVE-15284: Add junit test to test replication scenarios.

  • HIVE-15291: Comparison of timestamp fails if only date part is provided.

  • HIVE-15294: Capture additional metadata to replicate a simple insert at destination.

  • HIVE-15307: Hive MERGE: "when matched then update" allows invalid column names.

  • HIVE-15322: Skipping "hbase mapredcp" in hive script for certain services.

  • HIVE-15327: Outerjoin might produce wrong result depending on joinEmitInterval value.

  • HIVE-15332: REPL LOAD & DUMP support for incremental CREATE_TABLE/ADD_PTN.

  • HIVE-15333: Add a FetchTask to REPL DUMP plan for reading dump uri, last repl id as ResultSet.

  • HIVE-15355: Concurrency issues during parallel moveFile due to HDFSUtils.setFullFileStatus.

  • HIVE-15365: Add new methods to MessageFactory API (corresponding to the ones added in JSONMessageFactory).

  • HIVE-15366: REPL LOAD & DUMP support for incremental INSERT events.

  • HIVE-15426: Fix order guarantee of event executions for REPL LOAD.

  • HIVE-15437: avro tables join fails when - tbl join tbl_postfix.

  • HIVE-15448: ChangeManager.

  • HIVE-15466: REPL LOAD & DUMP support for incremental DROP_TABLE/DROP_PTN.

  • HIVE-15469: Fix REPL DUMP/LOAD DROP_PTN so it works on non-string-ptn-key tables.

  • HIVE-15472: JDBC: Standalone jar is missing ZK dependencies.

  • HIVE-15473: Progress Bar on Beeline client.

  • HIVE-15478: Add file + checksum list for create table/partition during notification creation (whenever relevant).

  • HIVE-15522: REPL LOAD & DUMP support for incremental ALTER_TABLE/ALTER_PTN including renames.

  • HIVE-15525: Hooking ChangeManager to "drop table", "drop partition".

  • HIVE-15534: Update db/table repl.last.id at the end of REPL LOAD of a batch of events.

  • HIVE-15542: NPE in StatsUtils::getColStatistics when all values in DATE column are NULL.

  • HIVE-15550: fix arglist logging in schematool.

  • HIVE-15551: memory leak in directsql for mysql+bonecp specific initialization.

  • HIVE-15569: failures in RetryingHMSHandler.

  • HIVE-15579: Support HADOOP_PROXY_USER for secure impersonation in hive metastore client.

  • HIVE-15588: Vectorization: Fix deallocation of scratch columns in VectorUDFCoalesce, etc to prevent wrong reuse.

  • HIVE-15589: Flaky org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testHeartbeater.

  • HIVE-15668: change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword.

  • HIVE-15684: Wrong posBigTable used in VectorMapJoinOuterFilteredOperator.

  • HIVE-15714: backport HIVE-11985 (and HIVE-12601) to branch-1.

  • HIVE-15717: JDBC: Implement rowDeleted, rowInserted and rowUpdated to return false.

  • HIVE-15752: MSCK should add output WriteEntity for table in semantic analysis.

  • HIVE-15755: NullPointerException on invalid table name in ON clause of Merge statement.

  • HIVE-15774: Ensure DbLockManager backward compatibility for non-ACID resources.

  • HIVE-15803: msck can hang when nested partitions are present.

  • HIVE-15830: Allow additional ACLs for tez jobs.

  • HIVE-15839: Don't force cardinality check if only WHEN NOT MATCHED is specified.

  • HIVE-15840: Webhcat test TestPig_5 failing with Pig on Tez at check for percent complete of job.

  • HIVE-15846: CNF error without hadoop jars, Relocate more dependencies (e.g. org.apache.zookeeper) for JDBC uber jar.

  • HIVE-15846: Relocate more dependencies (e.g. org.apache.zookeeper) for JDBC uber jar.

  • HIVE-15847: In Progress update refreshes seem slow.

  • HIVE-15848: count or sum distinct incorrect when hive.optimize.reducededuplication set to true.

  • HIVE-15851: SHOW COMPACTIONS doesn't show JobID.

  • HIVE-15871: Add cross join check in SQL MERGE stmt.

  • HIVE-15871: enable cardinality check by default.

  • HIVE-15872: The PERCENTILE_APPROX UDAF does not work with empty set.

  • HIVE-15879: Fix HiveMetaStoreChecker.checkPartitionDirs method.

  • HIVE-15889: Some tasks still run after hive cli is shutdown.

  • HIVE-15891: Detect query rewrite scenario for UPDATE/DELETE/MERGE and fail fast.

  • HIVE-15917: incorrect error handling from BackgroundWork can cause beeline query to hang.

  • HIVE-15935: ACL is not set in ATS data.

  • HIVE-15936: ConcurrentModificationException in ATSHook.

  • HIVE-15941: Fix o.a.h.hive.ql.exec.tez.TezTask compilation issue with tez master.

  • HIVE-15950: Make DbTxnManager use Metastore client consistently with caller.

  • HIVE-15970: Merge statement implementation clashes with AST rewrite.

  • HIVE-15999: Fix flakiness in TestDbTxnManager2.

  • HIVE-16014: HiveMetastoreChecker should use hive.metastore.fshandler.threads instead of hive.mv.files.thread for pool size.

  • HIVE-16028: Fail UPDATE/DELETE/MERGE queries when Ranger authorization manager is used.

  • HIVE-16045: Print progress bar along with operation log.

  • HIVE-16050: Regression: Union of null with non-null.

  • HIVE-16070: fix nonReserved list in IdentifiersParser.g.

  • HIVE-16086: Fix HiveMetaStoreChecker.checkPartitionDirsSingleThreaded method.

  • HIVE-16090: Addendum to HIVE-16014.

  • HIVE-16102: Grouping sets do not conform to SQL standard.

  • HIVE-16114: NullPointerException in TezSessionPoolManager when getting the session.

  • HIVE-16160: OutOfMemoryError: GC overhead limit exceeded on Hs2 longevity tests.

  • HIVE-16170: Exclude relocation of org.apache.hadoop.security.* in the JDBC standalone jar.

  • HIVE-16172: Switch to a fairness lock to synchronize HS2 thrift client.

  • HIVE-16175: Possible race condition in InstanceCache.

  • HIVE-16181: Make logic for hdfs directory location extraction more generic, in webhcat test driver.

Hive 2.1.0 Apache patches:

  • HIVE-9941: sql std authorization on partitioned table: truncate and insert.

  • HIVE-12492: Inefficient join ordering in TPCDS query19 causing 50-70% slowdown.

  • HIVE-14214: ORC Schema Evolution and Predicate Push Down do not work together (no rows returned).

  • HIVE-14278: Migrate TestHadoop23SAuthBridge.java from Unit3 to Unit4.

  • HIVE-14360: Starting BeeLine after using !save, there is an error logged: "Error setting configuration: conf".

  • HIVE-14362: Support explain analyze in Hive.

  • HIVE-14367: Estimated size for constant nulls is 0.

  • HIVE-14405: Have tests log to the console along with hive.log.

  • HIVE-14432: LLAP signing unit test may be timing-dependent.

  • HIVE-14445: upgrade maven surefire to 2.19.1.

  • HIVE-14612: org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout failure.

  • HIVE-14655: LLAP input format should escape the query string being passed to getSplits().

  • HIVE-14929: Adding JDBC test for query cancellation scenario.

  • HIVE-14935: Add tests for beeline force option.

  • HIVE-14959: Fix DISTINCT with windowing when CBO is enabled/disabled.

  • HIVE-15002: HiveSessionImpl#executeStatementInternal may leave locks in an inconsistent state.

  • HIVE-15069: Optimize MetaStoreDirectSql:: aggrColStatsForPartitions during query compilation.

  • HIVE-15084: Flaky test: TestMiniTezCliDriver:explainanalyze_1, 2, 3, 4, 5.

  • HIVE-15099: PTFOperator.PTFInvocation didn't properly reset the input partition.

  • HIVE-15570: LLAP: Exception in HostAffinitySplitLocationProvider when running in container mode.

  • HIVE-15668: change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword.

  • HIVE-15789: Vectorization: limit reduce vectorization to 32Mb chunks.

  • HIVE-15799: LLAP: rename VertorDeserializeOrcWriter.

  • HIVE-15809: Typo in the PostgreSQL database name for druid service.

  • HIVE-15830: Allow additional ACLs for tez jobs.

  • HIVE-15847: In Progress update refreshes seem slow.

  • HIVE-15848: count or sum distinct incorrect when hive.optimize.reducededuplication set to true.

  • HIVE-15851: SHOW COMPACTIONS doesn't show JobID.

  • HIVE-15872: The PERCENTILE_APPROX UDAF does not work with empty set.

  • HIVE-15874: Invalid position alias in Group By when CBO failed.

  • HIVE-15877: Upload dependency jars for druid storage handler.

  • HIVE-15879: Fix HiveMetaStoreChecker.checkPartitionDirs method.

  • HIVE-15884: Optimize not between for vectorization.

  • HIVE-15903: Compute table stats when user computes column stats.

  • HIVE-15928: Druid/Hive integration: Parallelization of Select queries in Druid handler.

  • HIVE-15935: ACL is not set in ATS data.

  • HIVE-15938: position alias in order by fails for union queries.

  • HIVE-15941: Fix o.a.h.hive.ql.exec.tez.TezTask compilation issue with tez master.

  • HIVE-15948: Failing test: TestCliDriver, TestSparkCliDriver join31.

  • HIVE-15951: Make sure base persist directory is unique and deleted.

  • HIVE-15955: Provide additional explain plan info to facilitate display of runtime filtering and lateral joins.

  • HIVE-15958: LLAP: Need to check why 1000s of ipc threads are created.

  • HIVE-15959: LLAP: fix headroom calculation and move it to daemon.

  • HIVE-15969: Failures in TestRemoteHiveMetaStore, TestSetUGIOnOnlyServer.

  • HIVE-15971: LLAP: logs urls should use daemon container id instead of fake container id.

  • HIVE-15991: Flaky Test: TestEncryptedHDFSCliDriver encryption_join_with_different_encryption_keys.

  • HIVE-15994: Grouping function error when grouping sets are not specified.

  • HIVE-15999: Fix flakiness in TestDbTxnManager2.

  • HIVE-16002: Correlated IN subquery with aggregate asserts in sq_count_check UDF.

  • HIVE-16005: miscellaneous small fixes to help with llap debuggability.

  • HIVE-16010: incorrect conf.set in TezSessionPoolManager.

  • HIVE-16012: BytesBytes hash table - better capacity exhaustion handling.

  • HIVE-16013: Fragments without locality can stack up on nodes.

  • HIVE-16014: HiveMetastoreChecker should use hive.metastore.fshandler.threads instead of hive.mv.files.thread for pool size.

  • HIVE-16015: LLAP: some Tez INFO logs are too noisy II.

  • HIVE-16015: Modify Hive log settings to integrate with tez reduced logging.

  • HIVE-16018: Add more information for DynamicPartitionPruningOptimization.

  • HIVE-16020: LLAP: Reduce IPC connection misses.

  • HIVE-16022: BloomFilter check not showing up in MERGE statement queries.

  • HIVE-16023: Bad stats estimation in TPCH Query 12.

  • HIVE-16028: Fail UPDATE/DELETE/MERGE queries when Ranger authorization manager is used.

  • HIVE-16033: LLAP: Use PrintGCDateStamps for gc logging.

  • HIVE-16034: Hive/Druid integration: Fix type inference for Decimal DruidOutputFormat.

  • HIVE-16040: union column expansion should take aliases from the leftmost branch.

  • HIVE-16045: Print progress bar along with operation log.

  • HIVE-16050: Regression: Union of null with non-null.

  • HIVE-16054: AMReporter should use application token instead of ugi.getCurrentUser.

  • HIVE-16065: Vectorization: Wrong Key/Value information used by Vectorizer.

  • HIVE-16067: LLAP: send out container complete messages after a fragment completes.

  • HIVE-16068: BloomFilter expectedEntries not always using NDV when it's available during runtime filtering.

  • HIVE-16070: fix nonReserved list in IdentifiersParser.g.

  • HIVE-16072: LLAP: Add some additional jvm metrics for hadoop-metrics2.

  • HIVE-16078: improve abort checking in Tez/LLAP.

  • HIVE-16082: Allow user to change number of listener thread in LlapTaskCommunicator.

  • HIVE-16086: Fix HiveMetaStoreChecker.checkPartitionDirsSingleThreaded method.

  • HIVE-16090: Addendum to HIVE-16014.

  • HIVE-16094: queued containers may timeout if they don't get to run for a long time.

  • HIVE-16097: minor fixes to metrics and logs in LlapTaskScheduler.

  • HIVE-16098: Describe table doesn't show stats for partitioned tables.

  • HIVE-16102: Grouping sets do not conform to SQL standard.

  • HIVE-16103: LLAP: Scheduler timeout monitor never stops with slot nodes.

  • HIVE-16104: LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately.

  • HIVE-16114: NullPointerException in TezSessionPoolManager when getting the session.

  • HIVE-16115: Stop printing progress info from operation logs with beeline progress bar.

  • HIVE-16122: NPE Hive Druid split introduced by HIVE-15928.

  • HIVE-16132: DataSize stats don't seem correct in semijoin opt branch.

  • HIVE-16133: Footer cache in Tez AM can take too much memory.

  • HIVE-16135: Vectorization: unhandled constant type for scalar argument.

  • HIVE-16137: Default value of hive config hive.auto.convert.join.hashtable.max.entries should be set to 40m instead of 4m.

  • HIVE-16140: Stabilize few randomly failing tests.

  • HIVE-16142: ATSHook NPE via LLAP.

  • HIVE-16150: LLAP: HiveInputFormat:getRecordReader: Fix log statements to reduce memory pressure.

  • HIVE-16154: Determine when dynamic runtime filtering should be disabled.

  • HIVE-16160: OutOfMemoryError: GC overhead limit exceeded on Hs2 longevity tests.

  • HIVE-16161: Standalone hive jdbc jar throws ClassNotFoundException.

  • HIVE-16167: Remove transitive dependency on mysql connector jar.

  • HIVE-16168: log links should use the NM nodeId port instead of web port.

  • HIVE-16170: Exclude relocation of org.apache.hadoop.security.* in the JDBC standalone jar.

  • HIVE-16172: Switch to a fairness lock to synchronize HS2 thrift client.

  • HIVE-16175: Possible race condition in InstanceCache.

  • HIVE-16180: LLAP: Native memory leak in EncodedReader.

  • HIVE-16190: Support expression in merge statement.

  • HIVE-16211: MERGE statement failing with ClassCastException.

  • HIVE-16215: counter recording for text cache may not fully work.

  • HIVE-16229: Wrong result for correlated scalar subquery with aggregate.

  • HIVE-16236: BuddyAllocator fragmentation - short-term fix.

  • HIVE-16238: LLAP: reset/end has to be invoked for o.a.h.hive.q.io.orc.encoded.EncodedReaderImpl.

  • HIVE-16245: Vectorization: Does not handle non-column key expressions in MERGEPARTIAL mode.

  • HIVE-16260: Remove parallel edges of semijoin with map joins.

  • HIVE-16274: Support tuning of NDV of columns using lower/upper bounds.

  • HIVE-16278: LLAP: metadata cache may incorrectly decrease memory usage in mem manager.

  • HIVE-16282: Semijoin: Disable slow-start for the bloom filter aggregate task.

  • HIVE-16298: Add config to specify multi-column joins have correlated columns.

  • HIVE-16305: Additional Datanucleus ClassLoaderResolverImpl leaks causing HS2 OOM.

  • HIVE-16310: Get the output operators of Reducesink when vectorization is on.

  • HIVE-16318: LLAP cache: address some issues in 2.2/2.3.

  • HIVE-16319: Fix NPE in ShortestJobFirstComparator.

  • HIVE-16323: HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204.

  • HIVE-16325: Sessions are not restarted properly after the configured interval.