Hive
This release provides Hive 1.2.1 and Hive 2.1.0 in addition to the following patches:
Hive 1.2.1 Apache patches:
HIVE-7224: Set incremental printing to true by default in Beeline.
HIVE-7239: Fix bug in HiveIndexedInputFormat implementation that causes incorrect query result when input backed by Sequence/RC files.
HIVE-9941: sql std authorization on partitioned table: truncate and insert.
HIVE-10562: Add versioning/format mechanism to NOTIFICATION_LOG entries, expand MESSAGE size.
HIVE-10924: add support for MERGE statement.
HIVE-11030: Enhance storage layer to create one delta file per write.
HIVE-11293: HiveConnection.setAutoCommit(true) throws exception.
HIVE-11594: Analyze Table for column names with embedded spaces.
HIVE-11616: DelegationTokenSecretManager reuses the same objectstore, which has concurrency issues.
HIVE-11935: Race condition in HiveMetaStoreClient: isCompatibleWith and close.
HIVE-12077: MSCK Repair table should fix partitions in batches.
HIVE-12594: X lock on partition should not conflict with S lock on DB.
HIVE-12664: Bug in reduce deduplication optimization causing ArrayOutOfBoundException.
HIVE-12968: genNotNullFilterForJoinSourcePlan: needs to merge predicates into the multi-AND.
HIVE-13014: RetryingMetaStoreClient is retrying too aggressively.
HIVE-13083: Writing HiveDecimal to ORC can wrongly suppress present stream.
HIVE-13185: orc.ReaderImp.ensureOrcFooter() method fails on small text files with IndexOutOfBoundsException.
HIVE-13423: Handle the overflow case for decimal datatype for sum().
HIVE-13527: Using deprecated APIs in HBase client causes zookeeper connection leaks.
HIVE-13539: HiveHFileOutputFormat searching the wrong directory for HFiles.
HIVE-13756: Map failure attempts to delete reducer _temporary dir on pig multi-query.
HIVE-13836: DbNotifications giving an error = Invalid state. Transaction has already started.
HIVE-13872: Queries failing with java.lang.ClassCastException when vectorization is enabled.
HIVE-13936: Add streaming support for row_number.
HIVE-13966: DbNotificationListener: can lose DDL operation notifications.
HIVE-14037: java.lang.ClassNotFoundException for the jar in hive.reloadable.aux.jars.path in mapreduce.
HIVE-14170: Beeline IncrementalRows should buffer rows and incrementally re-calculate width if TableOutputFormat is used.
HIVE-14229: the jars in hive.aux.jar.paths are not added to session classpath.
HIVE-14251: Union All of different types resolves to incorrect data.
HIVE-14278: Migrate TestHadoop20SAuthBridge.java from Unit3 to Unit4.
HIVE-14279: fix mvn test TestHiveMetaStore.testTransactionalValidation.
HIVE-14290: Refactor HIVE-14054 to use Collections#newSetFromMap.
HIVE-14375: hcatalog-pig-adaptor pom.xml uses joda-time 2.2 instead of ${joda.version} that uses 2.8.1.
HIVE-14399: Fix test flakiness of org.apache.hive.hcatalog.listener.TestDbNotificationListener.cleanupNotifs.
HIVE-14436: Hive 1.2.1/Hitting "ql.Driver: FAILED: IllegalArgumentException Error: , expected at the end of 'decimal(9'" after enabling hive.optimize.skewjoin and with MR engine.
HIVE-14445: upgrade maven surefire to 2.19.1.
HIVE-14457: Partitions in encryption zone are still trashed though an exception is returned.
HIVE-14519: Multi insert query bug.
HIVE-14520: We should set a timeout for the blocking calls in TestMsgBusConnection.
HIVE-14591: HS2 is shut down unexpectedly during the startup time.
HIVE-14607: ORC split generation failed with exception: java.lang.ArrayIndexOutOfBoundsException: 1.
HIVE-14659: OutputStream won't close if caught exception in function unparseExprForValuesClause in SemanticAnalyzer.java.
HIVE-14690: Query fail when hive.exec.parallel=true, with conflicting session dir.
HIVE-14693: Some partitions will be left out when partition number is the multiple of the option hive.msck.repair.batch.size.
HIVE-14715: Hive throws NumberFormatException with query with Null value.
HIVE-14762: Add logging while removing scratch space.
HIVE-14773: NPE aggregating column statistics for date column in partitioned table.
HIVE-14774: Canceling query using Ctrl-C in beeline might lead to stale locks.
HIVE-14805: Subquery inside a view will have the object in the subquery as the direct input.
HIVE-14837: JDBC: standalone jar is missing hadoop core dependencies.
HIVE-14865: Fix comments after HIVE-14350.
HIVE-14922: Add perf logging for post job completion steps.
HIVE-14924: MSCK REPAIR table with single threaded is throwing null pointer exception.
HIVE-14928: Analyze table no scan mess up schema.
HIVE-14929: Adding JDBC test for query cancellation scenario.
HIVE-14935: Add tests for beeline force option.
HIVE-14943: Base Implementation (merge statement).
HIVE-14948: properly handle special characters in identifiers.
HIVE-14959: Fix DISTINCT with windowing when CBO is enabled/disabled.
HIVE-14966: Backport: JDBC: HiveConnection never saves HTTP cookies.
HIVE-14992: Relocate several common libraries in hive jdbc uber jar.
HIVE-14993: make WriteEntity distinguish writeType.
HIVE-15002: HiveSessionImpl#executeStatementInternal may leave locks in an inconsistent state.
HIVE-15010: Make LockComponent aware if it's part of dynamic partition operation.
HIVE-15060: Remove the autoCommit warning from beeline.
HIVE-15099: PTFOperator.PTFInvocation didn't properly reset the input partition.
HIVE-15124: Fix OrcInputFormat to use reader's schema for include boolean array.
HIVE-15137: metastore add partitions background thread should use current username.
HIVE-15151: Bootstrap support for replv2.
HIVE-15178: ORC stripe merge may produce many MR jobs and no merge if split size is small.
HIVE-15180: Extend JSONMessageFactory to store additional information about metadata objects on different table events.
HIVE-15231: query on view with CTE and alias fails with table not found error.
HIVE-15232: Add notification events for functions and indexes.
HIVE-15284: Add junit test to test replication scenarios.
HIVE-15291: Comparison of timestamp fails if only date part is provided.
HIVE-15294: Capture additional metadata to replicate a simple insert at destination.
HIVE-15307: Hive MERGE: "when matched then update" allows invalid column names.
HIVE-15322: Skipping "hbase mapredcp" in hive script for certain services.
HIVE-15327: Outerjoin might produce wrong result depending on joinEmitInterval value.
HIVE-15332: REPL LOAD & DUMP support for incremental CREATE_TABLE/ADD_PTN.
HIVE-15333: Add a FetchTask to REPL DUMP plan for reading dump uri, last repl id as ResultSet.
HIVE-15355: Concurrency issues during parallel moveFile due to HDFSUtils.setFullFileStatus.
HIVE-15365: Add new methods to MessageFactory API (corresponding to the ones added in JSONMessageFactory).
HIVE-15366: REPL LOAD & DUMP support for incremental INSERT events.
HIVE-15426: Fix order guarantee of event executions for REPL LOAD.
HIVE-15437: avro tables join fails when - tbl join tbl_postfix.
HIVE-15448: ChangeManager.
HIVE-15466: REPL LOAD & DUMP support for incremental DROP_TABLE/DROP_PTN.
HIVE-15469: Fix REPL DUMP/LOAD DROP_PTN so it works on non-string-ptn-key tables.
HIVE-15472: JDBC: Standalone jar is missing ZK dependencies.
HIVE-15473: Progress Bar on Beeline client.
HIVE-15478: Add file + checksum list for create table/partition during notification creation (whenever relevant).
HIVE-15522: REPL LOAD & DUMP support for incremental ALTER_TABLE/ALTER_PTN including renames.
HIVE-15525: Hooking ChangeManager to "drop table", "drop partition".
HIVE-15534: Update db/table repl.last.id at the end of REPL LOAD of a batch of events.
HIVE-15542: NPE in StatsUtils::getColStatistics when all values in DATE column are NULL.
HIVE-15550: fix arglist logging in schematool.
HIVE-15551: memory leak in directsql for mysql+bonecp specific initialization.
HIVE-15569: failures in RetryingHMSHandler.
HIVE-15579: Support HADOOP_PROXY_USER for secure impersonation in hive metastore client.
HIVE-15588: Vectorization: Fix deallocation of scratch columns in VectorUDFCoalesce, etc to prevent wrong reuse.
HIVE-15589: Flaky org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testHeartbeater.
HIVE-15668: change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword.
HIVE-15684: Wrong posBigTable used in VectorMapJoinOuterFilteredOperator.
HIVE-15714: backport HIVE-11985 (and HIVE-12601) to branch-1.
HIVE-15717: JDBC: Implement rowDeleted, rowInserted and rowUpdated to return false.
HIVE-15752: MSCK should add output WriteEntity for table in semantic analysis.
HIVE-15755: NullPointerException on invalid table name in ON clause of Merge statement.
HIVE-15774: Ensure DbLockManager backward compatibility for non-ACID resources.
HIVE-15803: msck can hang when nested partitions are present.
HIVE-15830: Allow additional ACLs for tez jobs.
HIVE-15839: Don't force cardinality check if only WHEN NOT MATCHED is specified.
HIVE-15840: Webhcat test TestPig_5 failing with Pig on Tez at check for percent complete of job.
HIVE-15846: CNF error without hadoop jars, Relocate more dependencies (e.g. org.apache.zookeeper) for JDBC uber jar.
HIVE-15846: Relocate more dependencies (e.g. org.apache.zookeeper) for JDBC uber jar.
HIVE-15847: In Progress update refreshes seem slow.
HIVE-15848: count or sum distinct incorrect when hive.optimize.reducededuplication set to true.
HIVE-15851: SHOW COMPACTIONS doesn't show JobID.
HIVE-15871: Add cross join check in SQL MERGE stmt.
HIVE-15871: enable cardinality check by default.
HIVE-15872: The PERCENTILE_APPROX UDAF does not work with empty set.
HIVE-15879: Fix HiveMetaStoreChecker.checkPartitionDirs method.
HIVE-15889: Some tasks still run after hive cli is shutdown.
HIVE-15891: Detect query rewrite scenario for UPDATE/DELETE/MERGE and fail fast.
HIVE-15917: incorrect error handling from BackgroundWork can cause beeline query to hang.
HIVE-15935: ACL is not set in ATS data.
HIVE-15936: ConcurrentModificationException in ATSHook.
HIVE-15941: Fix o.a.h.hive.ql.exec.tez.TezTask compilation issue with tez master.
HIVE-15950: Make DbTxnManager use Metastore client consistently with caller.
HIVE-15970: Merge statement implementation clashes with AST rewrite.
HIVE-15999: Fix flakiness in TestDbTxnManager2.
HIVE-16014: HiveMetastoreChecker should use hive.metastore.fshandler.threads instead of hive.mv.files.thread for pool size.
HIVE-16028: Fail UPDATE/DELETE/MERGE queries when Ranger authorization manager is used.
HIVE-16045: Print progress bar along with operation log.
HIVE-16050: Regression: Union of null with non-null.
HIVE-16070: fix nonReserved list in IdentifiersParser.g.
HIVE-16086: Fix HiveMetaStoreChecker.checkPartitionDirsSingleThreaded method.
HIVE-16090: Addendum to HIVE-16014.
HIVE-16102: Grouping sets do not conform to SQL standard.
HIVE-16114: NullPointerException in TezSessionPoolManager when getting the session.
HIVE-16160: OutOfMemoryError: GC overhead limit exceeded on Hs2 longevity tests.
HIVE-16170: Exclude relocation of org.apache.hadoop.security.* in the JDBC standalone jar.
HIVE-16172: Switch to a fairness lock to synchronize HS2 thrift client.
HIVE-16175: Possible race condition in InstanceCache.
HIVE-16181: Make logic for hdfs directory location extraction more generic, in webhcat test driver.
Hive 2.1.0 Apache patches:
HIVE-9941: sql std authorization on partitioned table: truncate and insert.
HIVE-12492: Inefficient join ordering in TPCDS query19 causing 50-70% slowdown.
HIVE-14214: ORC Schema Evolution and Predicate Push Down do not work together (no rows returned).
HIVE-14278: Migrate TestHadoop23SAuthBridge.java from Unit3 to Unit4.
HIVE-14360: Starting BeeLine after using !save, there is an error logged: "Error setting configuration: conf".
HIVE-14362: Support explain analyze in Hive.
HIVE-14367: Estimated size for constant nulls is 0.
HIVE-14405: Have tests log to the console along with hive.log.
HIVE-14432: LLAP signing unit test may be timing-dependent.
HIVE-14445: upgrade maven surefire to 2.19.1.
HIVE-14612: org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout failure.
HIVE-14655: LLAP input format should escape the query string being passed to getSplits().
HIVE-14929: Adding JDBC test for query cancellation scenario.
HIVE-14935: Add tests for beeline force option.
HIVE-14959: Fix DISTINCT with windowing when CBO is enabled/disabled.
HIVE-15002: HiveSessionImpl#executeStatementInternal may leave locks in an inconsistent state.
HIVE-15069: Optimize MetaStoreDirectSql:: aggrColStatsForPartitions during query compilation.
HIVE-15084: Flaky test: TestMiniTezCliDriver:explainanalyze_1, 2, 3, 4, 5.
HIVE-15099: PTFOperator.PTFInvocation didn't properly reset the input partition.
HIVE-15570: LLAP: Exception in HostAffinitySplitLocationProvider when running in container mode.
HIVE-15668: change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword.
HIVE-15789: Vectorization: limit reduce vectorization to 32Mb chunks.
HIVE-15799: LLAP: rename VertorDeserializeOrcWriter.
HIVE-15809: Typo in the PostgreSQL database name for druid service.
HIVE-15830: Allow additional ACLs for tez jobs.
HIVE-15847: In Progress update refreshes seem slow.
HIVE-15848: count or sum distinct incorrect when hive.optimize.reducededuplication set to true.
HIVE-15851: SHOW COMPACTIONS doesn't show JobID.
HIVE-15872: The PERCENTILE_APPROX UDAF does not work with empty set.
HIVE-15874: Invalid position alias in Group By when CBO failed.
HIVE-15877: Upload dependency jars for druid storage handler.
HIVE-15879: Fix HiveMetaStoreChecker.checkPartitionDirs method.
HIVE-15884: Optimize not between for vectorization.
HIVE-15903: Compute table stats when user computes column stats.
HIVE-15928: Druid/Hive integration: Parallelization of Select queries in Druid handler.
HIVE-15935: ACL is not set in ATS data.
HIVE-15938: position alias in order by fails for union queries.
HIVE-15941: Fix o.a.h.hive.ql.exec.tez.TezTask compilation issue with tez master.
HIVE-15948: Failing test: TestCliDriver, TestSparkCliDriver join31.
HIVE-15951: Make sure base persist directory is unique and deleted.
HIVE-15955: Provide additional explain plan info to facilitate display of runtime filtering and lateral joins.
HIVE-15958: LLAP: Need to check why 1000s of ipc threads are created.
HIVE-15959: LLAP: fix headroom calculation and move it to daemon.
HIVE-15969: Failures in TestRemoteHiveMetaStore, TestSetUGIOnOnlyServer.
HIVE-15971: LLAP: logs urls should use daemon container id instead of fake container id.
HIVE-15991: Flaky Test: TestEncryptedHDFSCliDriver encryption_join_with_different_encryption_keys.
HIVE-15994: Grouping function error when grouping sets are not specified.
HIVE-15999: Fix flakiness in TestDbTxnManager2.
HIVE-16002: Correlated IN subquery with aggregate asserts in sq_count_check UDF.
HIVE-16005: miscellaneous small fixes to help with llap debuggability.
HIVE-16010: incorrect conf.set in TezSessionPoolManager.
HIVE-16012: BytesBytes hash table - better capacity exhaustion handling.
HIVE-16013: Fragments without locality can stack up on nodes.
HIVE-16014: HiveMetastoreChecker should use hive.metastore.fshandler.threads instead of hive.mv.files.thread for pool size.
HIVE-16015: LLAP: some Tez INFO logs are too noisy II.
HIVE-16015: Modify Hive log settings to integrate with tez reduced logging.
HIVE-16018: Add more information for DynamicPartitionPruningOptimization.
HIVE-16020: LLAP: Reduce IPC connection misses.
HIVE-16022: BloomFilter check not showing up in MERGE statement queries.
HIVE-16023: Bad stats estimation in TPCH Query 12.
HIVE-16028: Fail UPDATE/DELETE/MERGE queries when Ranger authorization manager is used.
HIVE-16033: LLAP: Use PrintGCDateStamps for gc logging.
HIVE-16034: Hive/Druid integration: Fix type inference for Decimal DruidOutputFormat.
HIVE-16040: union column expansion should take aliases from the leftmost branch.
HIVE-16045: Print progress bar along with operation log.
HIVE-16050: Regression: Union of null with non-null.
HIVE-16054: AMReporter should use application token instead of ugi.getCurrentUser.
HIVE-16065: Vectorization: Wrong Key/Value information used by Vectorizer.
HIVE-16067: LLAP: send out container complete messages after a fragment completes.
HIVE-16068: BloomFilter expectedEntries not always using NDV when it's available during runtime filtering.
HIVE-16070: fix nonReserved list in IdentifiersParser.g.
HIVE-16072: LLAP: Add some additional jvm metrics for hadoop-metrics2.
HIVE-16078: improve abort checking in Tez/LLAP.
HIVE-16082: Allow user to change number of listener thread in LlapTaskCommunicator.
HIVE-16086: Fix HiveMetaStoreChecker.checkPartitionDirsSingleThreaded method.
HIVE-16090: Addendum to HIVE-16014.
HIVE-16094: queued containers may timeout if they don't get to run for a long time.
HIVE-16097: minor fixes to metrics and logs in LlapTaskScheduler.
HIVE-16098: Describe table doesn't show stats for partitioned tables.
HIVE-16102: Grouping sets do not conform to SQL standard.
HIVE-16103: LLAP: Scheduler timeout monitor never stops with slot nodes.
HIVE-16104: LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately.
HIVE-16114: NullPointerException in TezSessionPoolManager when getting the session.
HIVE-16115: Stop printing progress info from operation logs with beeline progress bar.
HIVE-16122: NPE Hive Druid split introduced by HIVE-15928.
HIVE-16132: DataSize stats don't seem correct in semijoin opt branch.
HIVE-16133: Footer cache in Tez AM can take too much memory.
HIVE-16135: Vectorization: unhandled constant type for scalar argument.
HIVE-16137: Default value of hive config hive.auto.convert.join.hashtable.max.entries should be set to 40m instead of 4m.
HIVE-16140: Stabilize few randomly failing tests.
HIVE-16142: ATSHook NPE via LLAP.
HIVE-16150: LLAP: HiveInputFormat:getRecordReader: Fix log statements to reduce memory pressure.
HIVE-16154: Determine when dynamic runtime filtering should be disabled.
HIVE-16160: OutOfMemoryError: GC overhead limit exceeded on Hs2 longevity tests.
HIVE-16161: Standalone hive jdbc jar throws ClassNotFoundException.
HIVE-16167: Remove transitive dependency on mysql connector jar.
HIVE-16168: log links should use the NM nodeId port instead of web port.
HIVE-16170: Exclude relocation of org.apache.hadoop.security.* in the JDBC standalone jar.
HIVE-16172: Switch to a fairness lock to synchronize HS2 thrift client.
HIVE-16175: Possible race condition in InstanceCache.
HIVE-16180: LLAP: Native memory leak in EncodedReader.
HIVE-16190: Support expression in merge statement.
HIVE-16211: MERGE statement failing with ClassCastException.
HIVE-16215: counter recording for text cache may not fully work.
HIVE-16229: Wrong result for correlated scalar subquery with aggregate.
HIVE-16236: BuddyAllocator fragmentation - short-term fix.
HIVE-16238: LLAP: reset/end has to be invoked for o.a.h.hive.q.io.orc.encoded.EncodedReaderImpl.
HIVE-16245: Vectorization: Does not handle non-column key expressions in MERGEPARTIAL mode.
HIVE-16260: Remove parallel edges of semijoin with map joins.
HIVE-16274: Support tuning of NDV of columns using lower/upper bounds.
HIVE-16278: LLAP: metadata cache may incorrectly decrease memory usage in mem manager.
HIVE-16282: Semijoin: Disable slow-start for the bloom filter aggregate task.
HIVE-16298: Add config to specify multi-column joins have correlated columns.
HIVE-16305: Additional Datanucleus ClassLoaderResolverImpl leaks causing HS2 OOM.
HIVE-16310: Get the output operators of Reducesink when vectorization is on.
HIVE-16318: LLAP cache: address some issues in 2.2/2.3.
HIVE-16319: Fix NPE in ShortestJobFirstComparator.
HIVE-16323: HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204.
HIVE-16325: Sessions are not restarted properly after the configured interval.