Release Notes

Hive

This release provides Hive 1.2.1 and Hive 2.1.0 in addition to the following patches:

Hive 1.2.1 Apache patches:

  • HIVE-4577: hive CLI can't handle hadoop dfs command with space and quotes.

  • HIVE-6990: Direct SQL fails when the explicit schema setting is different from the default one.

  • HIVE-10319: Hive CLI startup takes a long time with a large number of databases.

  • HIVE-10495: Hive index creation code throws NPE if index table is null.

  • HIVE-10616: TypeInfoUtils doesn't handle DECIMAL with just precision specified.

  • HIVE-11481: hive incorrectly set extended ACLs for unnamed group for new databases/tables with inheritPerms enabled.

  • HIVE-11721: non-ascii characters shows improper with insert into.

  • HIVE-12207: Query fails when non-ascii characters are used in string literals.

  • HIVE-14389: Beeline should not output query and prompt to stdout.

  • HIVE-14864: Distcp is not called from MoveTask when src is a directory.

  • HIVE-15294: Capture additional metadata to replicate a simple insert at destination.

  • HIVE-15519: BitSet not computed properly for ColumnBuffer subset.

  • HIVE-15587: Using ChangeManager to copy files in ReplCopyTask.

  • HIVE-16164: Provide mechanism for passing HMS notification ID between transactional and non-transactional listeners.

  • HIVE-16272: support for drop function in incremental replication.

  • HIVE-16323: HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204.

  • HIVE-16591: DR for function Binaries on HDFS.

  • HIVE-16642: New Events created as part of replv2 potentially break replv1.

  • HIVE-16644: Hook Change Manager to Insert Overwrite.

  • HIVE-16684: Bootstrap REPL DUMP shouldn't fail when table is dropped after fetching the table names.

  • HIVE-16686: repl invocations of distcp needs additional handling.

  • HIVE-16703: Hive may add the same file to the session and vertex in Tez.

  • HIVE-16706: Bootstrap REPL DUMP shouldn't fail when a partition is dropped/renamed when dump in progress.

  • HIVE-16727: REPL DUMP for insert event shouldn't fail if the table is already dropped.

  • HIVE-16750: Support change management for rename table/partition.

  • HIVE-16785: Ensure replication actions are idempotent if any series of events are applied again.

  • HIVE-16808: WebHCat statusdir parameter doesn't properly handle Unicode characters when using relative path.

  • HIVE-16813: Incremental REPL LOAD should load the events in the same sequence as it is dumped.

  • HIVE-16866: existing available UDF is used in TestReplicationScenariosAcrossInstances#testDropFunctionIncrementalReplication.

  • HIVE-16892: Move creation of _files from ReplCopyTask to analysis phase for bootstrap replication.

  • HIVE-16893: move replication dump related work in semantic analysis phase to execution phase using a task.

  • HIVE-16895: Multi-threaded execution of bootstrap dump of partitions.

  • HIVE-16896: move replication load related work in semantic analysis phase to execution phase using a task.

  • HIVE-16901: Distcp optimization - One distcp per ReplCopyTask.

  • HIVE-16918: Skip ReplCopyTask distcp for _metadata copying. Also enable -pb for distcp.

  • HIVE-16973: Fetching of Delegation tokens.

  • HIVE-17005: Ensure REPL DUMP and REPL LOAD are authorized properly.

  • HIVE-17021: Support replication of concatenate operation.

  • HIVE-17047: Allow table property to be populated to jobConf to make FixedLengthInputFormat work.

  • HIVE-17068: HCatalog: Add parquet support.

  • HIVE-17085: ORC file merge/concatenation should do full schema check.

  • HIVE-17113: Duplicate bucket files can get written to table by runaway task - backported to hive1.

  • HIVE-17144: export of temporary tables not working and it seems to be using distcp rather than filesystem copy.

  • HIVE-17144: export table query failing for temporary table.

  • HIVE-17208: Repl dump should pass in db/table information to authorization API.

  • HIVE-17212: Dynamic add partition by insert shouldn't generate INSERT event.

  • HIVE-17254: Skip updating AccessTime of recycled files in ReplChangeManager.

  • HIVE-17289: EXPORT and IMPORT shouldn't perform distcp with doAs privileged user.

  • HIVE-17301: Make JSONMessageFactory.getTObj method thread safe.

Hive 2.1.0 Apache patches:

  • HIVE-4577: hive CLI can't handle hadoop dfs command with space and quotes.

  • HIVE-6990: Direct SQL fails when the explicit schema setting is different from the default one.

  • HIVE-10495: Hive index creation code throws NPE if index table is null.

  • HIVE-10616: TypeInfoUtils doesn't handle DECIMAL with just precision specified.

  • HIVE-11481: hive incorrectly set extended ACLs for unnamed group for new databases/tables with inheritPerms enabled.

  • HIVE-14214: ORC Schema Evolution and Predicate Push Down do not work together (no rows returned).

  • HIVE-14389: Beeline should not output query and prompt to stdout.

  • HIVE-14864: Distcp is not called from MoveTask when src is a directory.

  • HIVE-15081: RetryingMetaStoreClient.getProxy(HiveConf, Boolean) doesn't match constructor of HiveMetaStoreClient.

  • HIVE-15294: Capture additional metadata to replicate a simple insert at destination.

  • HIVE-15471: LLAP fails to start with NPE in application log.

  • HIVE-15519: BitSet not computed properly for ColumnBuffer subset.

  • HIVE-15587: Using ChangeManager to copy files in ReplCopyTask.

  • HIVE-15724: getPrimaryKeys and getForeignKeys in metastore does not normalize db and table name.

  • HIVE-16164: Provide mechanism for passing HMS notification ID between transactional and non-transactional listeners.

  • HIVE-16272: support for drop function in incremental replication.

  • HIVE-16323: HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204.

  • HIVE-16571: HiveServer2: Prefer LIFO over round-robin for Tez session reuse.

  • HIVE-16591: DR for function Binaries on HDFS.

  • HIVE-16642: New Events created as part of replv2 potentially break replv1.

  • HIVE-16644: Hook Change Manager to Insert Overwrite.

  • HIVE-16654: Optimize a combination of avg(), sum(), count(distinct) etc.

  • HIVE-16671: LLAP IO: BufferUnderflowException may happen in very rare(?) cases due to ORC end-of-CB estimation.

  • HIVE-16683: Backport of ORC-125 to fix incorrect handling of future WriterVersions in ORC.

  • HIVE-16684: Bootstrap REPL DUMP shouldn't fail when table is dropped after fetching the table names.

  • HIVE-16686: repl invocations of distcp needs additional handling.

  • HIVE-16690: NPE during analyze column stats.

  • HIVE-16703: Hive may add the same file to the session and vertex in Tez.

  • HIVE-16706: Bootstrap REPL DUMP shouldn't fail when a partition is dropped/renamed when dump in progress.

  • HIVE-16727: REPL DUMP for insert event shouldn't fail if the table is already dropped.

  • HIVE-16750: Support change management for rename table/partition.

  • HIVE-16761: LLAP IO: SMB joins fail elevator.

  • HIVE-16775: Fix HiveFilterAggregateTransposeRule when filter is always false.

  • HIVE-16776: Strange cast behavior for table backed by druid.

  • HIVE-16785: Ensure replication actions are idempotent if any series of events are applied again.

  • HIVE-16788: ODBC call SQLForeignKeys leads to NPE if you use PK arguments rather than FK arguments.

  • HIVE-16797: Enhance HiveFilterSetOpTransposeRule to remove union branches.

  • HIVE-16804: Semijoin hint : Needs support for target table.

  • HIVE-16808: WebHCat statusdir parameter doesn't properly handle Unicode characters when using relative path.

  • HIVE-16809: Improve filter condition for correlated subqueries.

  • HIVE-16813: Incremental REPL LOAD should load the events in the same sequence as it is dumped.

  • HIVE-16837: MetadataOnly optimizer conflicts with count distinct rewrite.

  • HIVE-16838: Improve plans for subqueries with non-equi co-related predicates.

  • HIVE-16847: LLAP queue order issue.

  • HIVE-16848: NPE during CachedStore refresh.

  • HIVE-16864: add validation to stream position search in LLAP IO.

  • HIVE-16866: existing available UDF is used in TestReplicationScenariosAcrossInstances#testDropFunctionIncrementalReplication.

  • HIVE-16867: Extend shared scan optimizer to reuse computation from other operators.

  • HIVE-16871: CachedStore.get_aggr_stats_for has side effect.

  • HIVE-16892: Move creation of _files from ReplCopyTask to analysis phase for bootstrap replication.

  • HIVE-16893: move replication dump related work in semantic analysis phase to execution phase using a task.

  • HIVE-16895: Multi-threaded execution of bootstrap dump of partitions.

  • HIVE-16896: move replication load related work in semantic analysis phase to execution phase using a task.

  • HIVE-16901: Distcp optimization - One distcp per ReplCopyTask.

  • HIVE-16918: Skip ReplCopyTask distcp for _metadata copying. Also enable -pb for distcp.

  • HIVE-16926: LlapTaskUmbilicalExternalClient should not start new umbilical server for every fragment request.

  • HIVE-16947: Semijoin Reduction : Task cycle created due to multiple semijoins in conjunction with hashjoin.

  • HIVE-16965: SMB join may produce incorrect results.

  • HIVE-16973: Fetching of Delegation tokens.

  • HIVE-16985: LLAP IO: enable SMB join in elevator after the former is fixed.

  • HIVE-16996: Add HLL as an alternative to FM sketch to compute stats.

  • HIVE-17005: Ensure REPL DUMP and REPL LOAD are authorized properly.

  • HIVE-17007: NPE introduced by HIVE-16871.

  • HIVE-17021: Support replication of concatenate operation.

  • HIVE-17066: Query78 filter wrong estimation is generating bad plan causing query failures.

  • HIVE-17073: Incorrect result with vectorization and SharedWorkOptimizer.

  • HIVE-17083: Merge credentials in DagUtils instead of overwriting.

  • HIVE-17085: ORC file merge/concatenation should do full schema check.

  • HIVE-17091: "Timed out getting readerEvents" error from external LLAP client.

  • HIVE-17093: LLAP ssl configs need to be localized to talk to a wire encrypted hdfs.

  • HIVE-17095: Long chain repl loads do not complete in a timely fashion.

  • HIVE-17097: Fix SemiJoinHint parsing in SemanticAnalyzer.

  • HIVE-17113: Duplicate bucket files can get written to table by runaway task.

  • HIVE-17137: Fix javolution conflict.

  • HIVE-17144: export of temporary tables not working and it seems to be using distcp rather than filesystem copy.

  • HIVE-17144: export table query failing for temporary table.

  • HIVE-17172: add ordering checks to DiskRangeList.

  • HIVE-17208: Repl dump should pass in db/table information to authorization API.

  • HIVE-17209: ObjectCacheFactory should return null when tez shared object registry is not setup.

  • HIVE-17212: Dynamic add partition by insert shouldn't generate INSERT event.

  • HIVE-17254: Skip updating AccessTime of recycled files in ReplChangeManager.

  • HIVE-17281: LLAP external client not properly handling KILLED notification that occurs when a fragment is rejected.

  • HIVE-17283: Enable parallel edges of semijoin along with mapjoins.

  • HIVE-17289: EXPORT and IMPORT shouldn't perform distcp with doAs privileged user.

  • HIVE-17301: Make JSONMessageFactory.getTObj method thread safe.

HDP 2.6.1 provided Hive 1.2.1 and Hive 2.1.0 in addition to the following patches:

Hive 1.2.1 Apache patches:

  • HIVE-11976: Extend CBO rules to being able to apply rules only once on a given operator.

  • HIVE-12657: selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8.

  • HIVE-12958: Make embedded Jetty server more configurable.

  • HIVE-13652: Import table change order of dynamic partitions.

  • HIVE-14204: Optimize loading dynamic partitions.

  • HIVE-14210: ExecDriver should call jobclient.close() to trigger cleanup.

  • HIVE-14743: ArrayIndexOutOfBoundsException - HBASE-backed views' query with JOINs.

  • HIVE-15556: Replicate views.

  • HIVE-15642: Replicate Insert Overwrites, Dynamic Partition Inserts and Loads.

  • HIVE-15646: Column level lineage is not available for table Views.

  • HIVE-15754: exchange partition is not generating notifications.

  • HIVE-15766: DBNotificationlistener leaks JDOPersistenceManager.

  • HIVE-15792: Hive should raise SemanticException when LPAD/RPAD pad character's length is 0.

  • HIVE-15947: Enhance Templeton service job operations reliability.

  • HIVE-15993: Hive REPL STATUS is not returning last event ID.

  • HIVE-16006: Incremental REPL LOAD Inserts doesn't operate on the target database if name differs from source database.

  • HIVE-16060: GenericUDTFJSONTuple's json cache could overgrow beyond its limit.

  • HIVE-16119: HiveMetaStoreChecker: remove singleThread logic duplication.

  • HIVE-16171: Support replication of truncate table.

  • HIVE-16186: REPL DUMP shows last event ID of the database even if we use LIMIT option.

  • HIVE-16193: Hive show compactions not reflecting the correct status of the application.

  • HIVE-16197: Incremental insert into a partitioned table doesn't get replicated.

  • HIVE-16225: Memory leak in webhcat service (FileSystem CACHE entries).

  • HIVE-16254: metadata for values temporary tables for INSERTs are getting replicated during bootstrap.

  • HIVE-16266: Enable function metadata to be written during bootstrap.

  • HIVE-16267: Enable bootstrap function metadata to be loaded in repl load.

  • HIVE-16268: enable incremental repl dump to handle functions metadata.

  • HIVE-16269: enable incremental function dump to be loaded via repl load.

  • HIVE-16287: Alter table partition rename with location - moves partition back to hive warehouse.

  • HIVE-16290: Stats: StatsRulesProcFactory::evaluateComparator estimates are wrong when minValue == filterValue.

  • HIVE-16291: Hive fails when unions a parquet table with itself.

  • HIVE-16299: MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions.

  • HIVE-16321: Possible deadlock in metastore with Acid enabled.

  • HIVE-16347: HiveMetastoreChecker should skip listing partitions which are not valid when hive.msck.path.validation is set to skip or ignore.

  • HIVE-16372: Enable DDL statement for non-native tables (add/remove table properties).

  • HIVE-16427: Fix multi-insert query and write qtests.

  • HIVE-16461: DagUtils checks local resource size on the remote fs.

  • HIVE-16473: Hive-on-Tez may fail to write to an HBase table.

  • HIVE-16488: Support replicating into existing db if the db is empty.

  • HIVE-16497: FileUtils.isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file system operations should be impersonated.

  • HIVE-16530: Add HS2 operation logs and improve logs for REPL commands.

  • HIVE-16567: parquet: tolerate when metadata is not set.

  • HIVE-16673: Create table as select does not check ownership of the location.

  • HIVE-16678: Truncate on temporary table fails with table not found error.

  • HIVE-16710: Make MAX_MS_TYPENAME_LENGTH configurable.

Hive 2.1.0 Apache patches:

  • HIVE-11133: Support hive.explain.user for Spark.

  • HIVE-13652: Import table change order of dynamic partitions.

  • HIVE-13673: LLAP: handle case where no service instance is found on the host specified in the input split.

  • HIVE-14052: Cleanup structures when external clients use LLAP.

  • HIVE-14731: Cross product running with 1 reducer even when it's fed by 4 mappers and 1 reducer.

  • HIVE-14743: ArrayIndexOutOfBoundsException - HBASE-backed views' query with JOINs.

  • HIVE-15231: query on view with CTE and alias fails with table not found error.

  • HIVE-15556: Replicate views.

  • HIVE-15642: Replicate Insert Overwrites, Dynamic Partition Inserts and Loads.

  • HIVE-15702: Test timeout : TestDerbyConnector.

  • HIVE-15708: Upgrade calcite version to 1.1.

  • HIVE-15754: exchange partition is not generating notifications.

  • HIVE-15766: DBNotificationlistener leaks JDOPersistenceManager.

  • HIVE-15792: Hive should raise SemanticException when LPAD/RPAD pad character's length is 0.

  • HIVE-15964: LLAP: Llap IO codepath not getting invoked due to file column id mismatch.

  • HIVE-15993: Hive REPL STATUS is not returning last event ID.

  • HIVE-16006: Incremental REPL LOAD Inserts doesn't operate on the target database if name differs from source database.

  • HIVE-16044: LLAP: Shuffle Handler keep-alive connections are closed from the server side.

  • HIVE-16053: Remove newRatio from llap JAVA_OPTS_BASE.

  • HIVE-16060: GenericUDTFJSONTuple's json cache could overgrow beyond its limit.

  • HIVE-16119: HiveMetaStoreChecker: remove singleThread logic duplication.

  • HIVE-16120: Use jvm temporary tmp dir by default.

  • HIVE-16123: Let user pick the granularity of bucketing and max in row memory.

  • HIVE-16124: Drop the segments data as soon it is pushed to HDFS.

  • HIVE-16171: Support replication of truncate table.

  • HIVE-16186: REPL DUMP shows last event ID of the database even if we use LIMIT option.

  • HIVE-16193: Hive show compactions not reflecting the correct status of the application.

  • HIVE-16197: Incremental insert into a partitioned table doesn't get replicated.

  • HIVE-16219: metastore notification_log contains serialized message with non functional fields.

  • HIVE-16249: With column stats, mergejoin.q throws NPE.

  • HIVE-16254: metadata for values temporary tables for INSERTs are getting replicated during bootstrap.

  • HIVE-16266: Enable function metadata to be written during bootstrap.

  • HIVE-16267: Enable bootstrap function metadata to be loaded in repl load.

  • HIVE-16268: enable incremental repl dump to handle functions metadata.

  • HIVE-16269: enable incremental function dump to be loaded via repl load.

  • HIVE-16276: Fix NoSuchMethodError: com.amazonaws.services.s3.transfer.TransferManagerConfiguration.setMultipartUploadThreshold.

  • HIVE-16287: Alter table partition rename with location - moves partition back to hive warehouse.

  • HIVE-16290: Stats: StatsRulesProcFactory::evaluateComparator estimates are wrong when minValue == filterValue.

  • HIVE-16291: Hive fails when unions a parquet table with itself.

  • HIVE-16296: use LLAP executor count to configure reducer auto-parallelism.

  • HIVE-16299: MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions.

  • HIVE-16321: Possible deadlock in metastore with Acid enabled.

  • HIVE-16330: Improve plans for scalar subquery.

  • HIVE-16341: Tez Task Execution Summary has incorrect input record counts on some operators.

  • HIVE-16347: HiveMetastoreChecker should skip listing partitions which are not valid when hive.msck.path.validation is set to skip or ignore.

  • HIVE-16371: Add bitmap selection strategy for druid storage handler.

  • HIVE-16372: Enable DDL statement for non-native tables.

  • HIVE-16380: removing global test dependency of jsonassert.

  • HIVE-16385: StatsNoJobTask could exit early before all partitions have been processed.

  • HIVE-16386: Add debug logging to describe why runtime filtering semijoins are removed.

  • HIVE-16390: LLAP IO should take job config into account; also LLAP config should load defaults.

  • HIVE-16403: LLAP UI shows the wrong number of executors.

  • HIVE-16413: Create table as select does not check ownership of the location.

  • HIVE-16421: Runtime filtering breaks user-level explain.

  • HIVE-16423: Add hints for semijoin.

  • HIVE-16427: Fix multi-insert query and write qtests.

  • HIVE-16436: Response times in 'Task Execution Summary' at the end of the job is not correct.

  • HIVE-16441: De-duplicate semijoin branches in n-way joins.

  • HIVE-16444: ATSHook should log AppID/DagID for Tez.

  • HIVE-16448: Vectorization: Vectorized order_null.q fails with deserialize EOF exception below TEZ ReduceRecordSource.processVectorGroup.

  • HIVE-16457: vector_order_null.q failing in hive.

  • HIVE-16461: DagUtils checks local resource size on the remote fs.

  • HIVE-16462: Vectorization: Enabling hybrid grace disables specialization of all reduce side joins.

  • HIVE-16473: Hive-on-Tez may fail to write to an HBase table.

  • HIVE-16482: Druid Ser/Des need to use dimension output name.

  • HIVE-16485: Enable outputName for RS operator in explain formatted.

  • HIVE-16488: Support replicating into existing db if the db is empty.

  • HIVE-16497: FileUtils.isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file system operations should be impersonated.

  • HIVE-16503: LLAP: Oversubscribe memory for noconditional task size.

  • HIVE-16518: Insert override for druid does not replace all existing segments.

  • HIVE-16519: Fix exception thrown by checkOutputSpecs.

  • HIVE-16520: Cache hive metadata in metastore.

  • HIVE-16523: VectorHashKeyWrapper hash code for strings is not so good.

  • HIVE-16530: Add HS2 operation logs and improve logs for REPL commands.

  • HIVE-16533: Vectorization: Avoid evaluating empty groupby keys.

  • HIVE-16545: LLAP: bug in arena size determination logic.

  • HIVE-16546: LLAP: Fail map join tasks if hash table memory exceeds threshold.

  • HIVE-16547: LLAP: may not unlock buffers in some cases.

  • HIVE-16550: Semijoin Hints should be able to skip the optimization if needed.

  • HIVE-16553: Change default value for hive.tez.bigtable.minsize.semijoin.reduction.

  • HIVE-16568: Support complex types in external LLAP InputFormat.

  • HIVE-16576: Fix encoding of intervals when fetching select query candidates from druid.

  • HIVE-16578: Semijoin Hints should use column name, if provided for partition key check.

  • HIVE-16579: CachedStore: improvements to partition col stats caching and cache column stats for unpartitioned table.

  • HIVE-16581: bug in HIVE-1652.

  • HIVE-16586: Fix Unit test failures when CachedStore is enabled.

  • HIVE-16588: Resource leak by druid http client.

  • HIVE-16598: LlapServiceDriver - create directories and warn of errors.

  • HIVE-16599: NPE in runtime filtering cost when handling SMB Joins.

  • HIVE-16602: Implement shared scans with Tez.

  • HIVE-16610: Semijoin Hint : Should be able to handle more than one hint per alias.

  • HIVE-16628: Fix query25 when it uses a mix of MergeJoin and MapJoin.

  • HIVE-16633: username for ATS data shall always be the uid who submit the job.

  • HIVE-16634: LLAP Use a pool of connections to a single A.

  • HIVE-16635: Progressbar: Use different timeouts for running queries.

  • HIVE-16637: Improve end-of-data checking for LLAP input format.

  • HIVE-16639: LLAP: Derive shuffle thread counts and keep-alive.

  • HIVE-16651: LlapProtocolClientProxy stack trace when using llap input format.

  • HIVE-16652: LlapInputFormat: Seeing "output error" WARN message.

  • HIVE-16655: LLAP: Avoid preempting fragments before they enter th.

  • HIVE-16673: Test for HIVE-1641.

  • HIVE-16678: Truncate on temporary table fails with table not found error.

  • HIVE-16690: Configure Tez cartesian product edge based on LLAP cluster size.

  • HIVE-16691: Add test for more datatypes for LlapInputFormat.

  • HIVE-16692: LLAP: Keep alive connection in shuffle handler.

  • HIVE-16702: Use LazyBinarySerDe for LLAP InputFormat.

  • HIVE-16710: Make MAX_MS_TYPENAME_LENGTH configurable.

  • HIVE-16717: Extend shared scan optimizer to handle partitions.

  • HIVE-16724: increase session timeout for LLAP ZK token manager.

  • HIVE-16737: LLAP: Shuffle handler TCP listen queue overflows.

  • HIVE-16742: cap the number of reducers for LLAP at the configured value.

  • HIVE-16751: Hive-Druid Storagehandler: Tests failed as there is output-diff for query on timestamp datatype.

  • HIVE-16776: Strange cast behavior for table backed by druid.

  • HIVE-16777: LLAP: Use separate tokens and UGI instances when an external client is used.

  • HIVE-16779: cachedStore leak PersistenceManager resources.

HDP 2.6.0 provided Hive 1.2.1 and Hive 2.1.0 in addition to the following patches:

Hive 1.2.1 Apache patches:

  • HIVE-10562: Add versioning/format mechanism to NOTIFICATION_LOG entries, expand MESSAGE size.

  • HIVE-10924: add support for MERGE statement.

  • HIVE-11030: Enhance storage layer to create one delta file per write.

  • HIVE-11293: HiveConnection.setAutoCommit(true) throws exception.

  • HIVE-11594: Analyze Table for column names with embedded spaces.

  • HIVE-11616: DelegationTokenSecretManager reuses the same objectstore, which has concurrency issues.

  • HIVE-11935: Race condition in HiveMetaStoreClient: isCompatibleWith and close.

  • HIVE-12077: MSCK Repair table should fix partitions in batches.

  • HIVE-12594: X lock on partition should not conflict with S lock on DB.

  • HIVE-12664: Bug in reduce deduplication optimization causing ArrayOutOfBoundException.

  • HIVE-12968: genNotNullFilterForJoinSourcePlan: needs to merge predicates into the multi-AND.

  • HIVE-13014: RetryingMetaStoreClient is retrying too aggressively.

  • HIVE-13083: Writing HiveDecimal to ORC can wrongly suppress present stream.

  • HIVE-13185: orc.ReaderImp.ensureOrcFooter() method fails on small text files with IndexOutOfBoundsException.

  • HIVE-13423: Handle the overflow case for decimal datatype for sum().

  • HIVE-13527: Using deprecated APIs in HBase client causes zookeeper connection leaks.

  • HIVE-13539: HiveHFileOutputFormat searching the wrong directory for HFiles.

  • HIVE-13756: Map failure attempts to delete reducer _temporary dir on pig multi-query.

  • HIVE-13836: DbNotifications giving an error = Invalid state. Transaction has already started.

  • HIVE-13872: Queries failing with java.lang.ClassCastException when vectorization is enabled.

  • HIVE-13936: Add streaming support for row_number.

  • HIVE-13966: DbNotificationListener: can lose DDL operation notifications.

  • HIVE-14037: java.lang.ClassNotFoundException for the jar in hive.reloadable.aux.jars.path in mapreduce.

  • HIVE-14170: Beeline IncrementalRows should buffer rows and incrementally re-calculate width if TableOutputFormat is used.

  • HIVE-14229: the jars in hive.aux.jar.paths are not added to session classpath.

  • HIVE-14251: Union All of different types resolves to incorrect data.

  • HIVE-14278: Migrate TestHadoop20SAuthBridge.java from Unit3 to Unit.

  • HIVE-14279: fix mvn test TestHiveMetaStore.testTransactionalValidation.

  • HIVE-14290: Refactor HIVE-14054 to use Collections#newSetFromMap.

  • HIVE-14375: hcatalog-pig-adaptor pom.xml uses joda-time 2.2 instead of ${joda.version} that uses 2.8.1.

  • HIVE-14399: Fix test flakiness of org.apache.hive.hcatalog.listener.TestDbNotificationListener.cleanupNotifs.

  • HIVE-14436: Hive 1.2.1/Hitting "ql.Driver: FAILED: IllegalArgumentException Error: , expected at the end of 'decimal(9'" after enabling hive.optimize.skewjoin and with MR engine.

  • HIVE-14445: upgrade maven surefire to 2.19.1.

  • HIVE-14457: Partitions in encryption zone are still trashed though an exception is returned.

  • HIVE-14519: Multi insert query bug.

  • HIVE-14520: We should set a timeout for the blocking calls in TestMsgBusConnection.

  • HIVE-14591: HS2 is shut down unexpectedly during the startup time.

  • HIVE-14607: ORC split generation failed with exception: java.lang.ArrayIndexOutOfBoundsException: 1.

  • HIVE-14659: OutputStream won't close if caught exception in function unparseExprForValuesClause in SemanticAnalyzer.java.

  • HIVE-14690: Query fail when hive.exec.parallel=true, with conflicting session dir.

  • HIVE-14693: Some partitions will be left out when partition number is the multiple of the option hive.msck.repair.batch.size.

  • HIVE-14715: Hive throws NumberFormatException with query with Null value.

  • HIVE-14762: Add logging while removing scratch space.

  • HIVE-14773: NPE aggregating column statistics for date column in partitioned table.

  • HIVE-14774: Canceling query using Ctrl-C in beeline might lead to stale locks.

  • HIVE-14805: Subquery inside a view will have the object in the subquery as the direct input.

  • HIVE-14837: JDBC: standalone jar is missing hadoop core dependencies.

  • HIVE-14865: Fix comments after HIVE-14350.

  • HIVE-14922: Add perf logging for post job completion steps.

  • HIVE-14924: MSCK REPAIR table with single threaded is throwing null pointer exception.

  • HIVE-14928: Analyze table no scan mess up schema.

  • HIVE-14929: Adding JDBC test for query cancellation scenario.

  • HIVE-14935: Add tests for beeline force option.

  • HIVE-14943: Base Implementation (merge statement).

  • HIVE-14948: properly handle special characters in identifiers.

  • HIVE-14959: Fix DISTINCT with windowing when CBO is enabled/disabled.

  • HIVE-14966: Backport: JDBC: HiveConnection never saves HTTP cookies.

  • HIVE-14992: Relocate several common libraries in hive jdbc uber jar.

  • HIVE-14993: make WriteEntity distinguish writeType.

  • HIVE-15002: HiveSessionImpl#executeStatementInternal may leave locks in an inconsistent state.

  • HIVE-15010: Make LockComponent aware if it's part of dynamic partition operation.

  • HIVE-15060: Remove the autoCommit warning from beeline.

  • HIVE-15099: PTFOperator.PTFInvocation didn't properly reset the input partition.

  • HIVE-15124: Fix OrcInputFormat to use reader's schema for include boolean array.

  • HIVE-15137: metastore add partitions background thread should use current username.

  • HIVE-15151: Bootstrap support for replv2.

  • HIVE-15178: ORC stripe merge may produce many MR jobs and no merge if split size is small.

  • HIVE-15180: Extend JSONMessageFactory to store additional information about metadata objects on different table events.

  • HIVE-15231: query on view with CTE and alias fails with table not found error.

  • HIVE-15232: Add notification events for functions and indexes.

  • HIVE-15284: Add junit test to test replication scenarios.

  • HIVE-15291: Comparison of timestamp fails if only date part is provided.

  • HIVE-15294: Capture additional metadata to replicate a simple insert at destination.

  • HIVE-15307: Hive MERGE: "when matched then update" allows invalid column names.

  • HIVE-15322: Skipping "hbase mapredcp" in hive script for certain services.

  • HIVE-15327: Outerjoin might produce wrong result depending on joinEmitInterval value.

  • HIVE-15332: REPL LOAD & DUMP support for incremental CREATE_TABLE/ADD_PTN.

  • HIVE-15333: Add a FetchTask to REPL DUMP plan for reading dump uri, last repl id as ResultSet.

  • HIVE-15355: Concurrency issues during parallel moveFile due to HDFSUtils.setFullFileStatus.

  • HIVE-15365: Add new methods to MessageFactory API (corresponding to the ones added in JSONMessageFactory).

  • HIVE-15366: REPL LOAD & DUMP support for incremental INSERT events.

  • HIVE-15426: Fix order guarantee of event executions for REPL LOAD.

  • HIVE-15437: avro tables join fails when - tbl join tbl_postfix.

  • HIVE-15448: ChangeManager.

  • HIVE-15466: REPL LOAD & DUMP support for incremental DROP_TABLE/DROP_PTN.

  • HIVE-15469: Fix REPL DUMP/LOAD DROP_PTN so it works on non-string-ptn-key tables.

  • HIVE-15472: JDBC: Standalone jar is missing ZK dependencies.

  • HIVE-15473: Progress Bar on Beeline client.

  • HIVE-15478: Add file + checksum list for create table/partition during notification creation (whenever relevant).

  • HIVE-15522: REPL LOAD & DUMP support for incremental ALTER_TABLE/ALTER_PTN including renames.

  • HIVE-15525: Hooking ChangeManager to "drop table", "drop partition".

  • HIVE-15534: Update db/table repl.last.id at the end of REPL LOAD of a batch of events.

  • HIVE-15542: NPE in StatsUtils::getColStatistics when all values in DATE column are NULL.

  • HIVE-15550: fix arglist logging in schematool.

  • HIVE-15551: memory leak in directsql for mysql+bonecp specific initialization.

  • HIVE-15569: failures in RetryingHMSHandler.

  • HIVE-15579: Support HADOOP_PROXY_USER for secure impersonation in hive metastore client.

  • HIVE-15588: Vectorization: Fix deallocation of scratch columns in VectorUDFCoalesce, etc to prevent wrong reuse.

  • HIVE-15589: Flaky org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testHeartbeater.

  • HIVE-15668: change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword.

  • HIVE-15684: Wrong posBigTable used in VectorMapJoinOuterFilteredOperator.

  • HIVE-15714: backport HIVE-11985 (and HIVE-12601) to branch-1.

  • HIVE-15717: JDBC: Implement rowDeleted, rowInserted and rowUpdated to return false.

  • HIVE-15752: MSCK should add output WriteEntity for table in semantic analysis.

  • HIVE-15755: NullPointerException on invalid table name in ON clause of Merge statement.

  • HIVE-15774: Ensure DbLockManager backward compatibility for non-ACID resources.

  • HIVE-15803: msck can hang when nested partitions are present.

  • HIVE-15830: Allow additional ACLs for tez jobs.

  • HIVE-15839: Don't force cardinality check if only WHEN NOT MATCHED is specified.

  • HIVE-15840: Webhcat test TestPig_5 failing with Pig on Tez at check for percent complete of job.

  • HIVE-15846: CNF error without hadoop jars, Relocate more dependencies (e.g. org.apache.zookeeper) for JDBC uber jar.

  • HIVE-15846: Relocate more dependencies (e.g. org.apache.zookeeper) for JDBC uber jar.

  • HIVE-15847: In Progress update refreshes seem slow.

  • HIVE-15848: count or sum distinct incorrect when hive.optimize.reducededuplication set to true.

  • HIVE-15851: SHOW COMPACTIONS doesn't show JobID.

  • HIVE-15871: Add cross join check in SQL MERGE stmt.

  • HIVE-15871: enable cardinality check by default.

  • HIVE-15872: The PERCENTILE_APPROX UDAF does not work with empty set.

  • HIVE-15879: Fix HiveMetaStoreChecker.checkPartitionDirs method.

  • HIVE-15889: Some tasks still run after hive cli is shutdown.

  • HIVE-15891: Detect query rewrite scenario for UPDATE/DELETE/MERGE and fail fast.

  • HIVE-15917: incorrect error handling from BackgroundWork can cause beeline query to hang.

  • HIVE-15935: ACL is not set in ATS data.

  • HIVE-15936: ConcurrentModificationException in ATSHook.

  • HIVE-15941: Fix o.a.h.hive.ql.exec.tez.TezTask compilation issue with tez master.

  • HIVE-15950: Make DbTxnManager use Metastore client consistently with caller.

  • HIVE-15970: Merge statement implementation clashes with AST rewrite.

  • HIVE-15999: Fix flakiness in TestDbTxnManager2.

  • HIVE-16014: HiveMetastoreChecker should use hive.metastore.fshandler.threads instead of hive.mv.files.thread for pool size.

  • HIVE-16028: Fail UPDATE/DELETE/MERGE queries when Ranger authorization manager is used.

  • HIVE-16045: Print progress bar along with operation log.

  • HIVE-16050: Regression: Union of null with non-null.

  • HIVE-16070: fix nonReserved list in IdentifiersParser.g.

  • HIVE-16086: Fix HiveMetaStoreChecker.checkPartitionDirsSingleThreaded method.

  • HIVE-16090: Addendum to HIVE-16014.

  • HIVE-16102: Grouping sets do not conform to SQL standard.

  • HIVE-16114: NullPointerException in TezSessionPoolManager when getting the session.

  • HIVE-16160: OutOfMemoryError: GC overhead limit exceeded on Hs2 longevity tests.

  • HIVE-16170: Exclude relocation of org.apache.hadoop.security.* in the JDBC standalone jar.

  • HIVE-16172: Switch to a fairness lock to synchronize HS2 thrift client.

  • HIVE-16175: Possible race condition in InstanceCache.

  • HIVE-16181: Make logic for hdfs directory location extraction more generic, in webhcat test driver.

  • HIVE-7224: Set incremental printing to true by default in Beeline.

  • HIVE-7239: Fix bug in HiveIndexedInputFormat implementation that causes incorrect query result when input backed by Sequence/RC files.

  • HIVE-9941: sql std authorization on partitioned table: truncate and insert.

Hive 2.1.0 Apache patches:

  • HIVE-9941: sql std authorization on partitioned table: truncate and insert.

  • HIVE-12492: Inefficient join ordering in TPCDS query19 causing 50-70% slowdown.

  • HIVE-14278: Migrate TestHadoop23SAuthBridge.java from JUnit3 to JUnit4.

  • HIVE-14360: Starting BeeLine after using !save, there is an error logged: "Error setting configuration: conf".

  • HIVE-14362: Support explain analyze in Hive.

  • HIVE-14367: Estimated size for constant nulls is 0.

  • HIVE-14405: Have tests log to the console along with hive.log.

  • HIVE-14432: LLAP signing unit test may be timing-dependent.

  • HIVE-14445: upgrade maven surefire to 2.19.1.

  • HIVE-14612: org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout failure.

  • HIVE-14655: LLAP input format should escape the query string being passed to getSplits().

  • HIVE-14929: Adding JDBC test for query cancellation scenario.

  • HIVE-14935: Add tests for beeline force option.

  • HIVE-14959: Fix DISTINCT with windowing when CBO is enabled/disabled.

  • HIVE-15002: HiveSessionImpl#executeStatementInternal may leave locks in an inconsistent state.

  • HIVE-15069: Optimize MetaStoreDirectSql::aggrColStatsForPartitions during query compilation.

  • HIVE-15084: Flaky test: TestMiniTezCliDriver:explainanalyze_1, 2, 3, 4, 5.

  • HIVE-15099: PTFOperator.PTFInvocation didn't properly reset the input partition.

  • HIVE-15570: LLAP: Exception in HostAffinitySplitLocationProvider when running in container mode.

  • HIVE-15668: change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword.

  • HIVE-15789: Vectorization: limit reduce vectorization to 32Mb chunks.

  • HIVE-15799: LLAP: rename VertorDeserializeOrcWriter.

  • HIVE-15809: Typo in the PostgreSQL database name for druid service.

  • HIVE-15830: Allow additional ACLs for tez jobs.

  • HIVE-15847: In Progress update refreshes seem slow.

  • HIVE-15848: count or sum distinct incorrect when hive.optimize.reducededuplication set to true.

  • HIVE-15851: SHOW COMPACTIONS doesn't show JobID.

  • HIVE-15872: The PERCENTILE_APPROX UDAF does not work with empty set.

  • HIVE-15874: Invalid position alias in Group By when CBO failed.

  • HIVE-15877: Upload dependency jars for druid storage handler.

  • HIVE-15879: Fix HiveMetaStoreChecker.checkPartitionDirs method.

  • HIVE-15884: Optimize not between for vectorization.

  • HIVE-15903: Compute table stats when user computes column stats.

  • HIVE-15928: Druid/Hive integration: Parallelization of Select queries in Druid handler.

  • HIVE-15935: ACL is not set in ATS data.

  • HIVE-15938: position alias in order by fails for union queries.

  • HIVE-15941: Fix o.a.h.hive.ql.exec.tez.TezTask compilation issue with tez master.

  • HIVE-15948: Failing test: TestCliDriver, TestSparkCliDriver join31.

  • HIVE-15951: Make sure base persist directory is unique and deleted.

  • HIVE-15955: Provide additional explain plan info to facilitate display of runtime filtering and lateral joins.

  • HIVE-15958: LLAP: Need to check why 1000s of ipc threads are created.

  • HIVE-15959: LLAP: fix headroom calculation and move it to daemon.

  • HIVE-15969: Failures in TestRemoteHiveMetaStore, TestSetUGIOnOnlyServer.

  • HIVE-15971: LLAP: logs urls should use daemon container id instead of fake container id.

  • HIVE-15991: Flaky Test: TestEncryptedHDFSCliDriver encryption_join_with_different_encryption_keys.

  • HIVE-15994: Grouping function error when grouping sets are not specified.

  • HIVE-15999: Fix flakiness in TestDbTxnManager2.

  • HIVE-16002: Correlated IN subquery with aggregate asserts in sq_count_check UDF.

  • HIVE-16005: miscellaneous small fixes to help with llap debuggability.

  • HIVE-16010: incorrect conf.set in TezSessionPoolManager.

  • HIVE-16012: BytesBytes hash table - better capacity exhaustion handling.

  • HIVE-16013: Fragments without locality can stack up on nodes.

  • HIVE-16014: HiveMetastoreChecker should use hive.metastore.fshandler.threads instead of hive.mv.files.thread for pool size.

  • HIVE-16015: LLAP: some Tez INFO logs are too noisy II.

  • HIVE-16015: Modify Hive log settings to integrate with tez reduced logging.

  • HIVE-16018: Add more information for DynamicPartitionPruningOptimization.

  • HIVE-16020: LLAP: Reduce IPC connection misses.

  • HIVE-16022: BloomFilter check not showing up in MERGE statement queries.

  • HIVE-16023: Bad stats estimation in TPCH Query 12.

  • HIVE-16028: Fail UPDATE/DELETE/MERGE queries when Ranger authorization manager is used.

  • HIVE-16033: LLAP: Use PrintGCDateStamps for gc logging.

  • HIVE-16034: Hive/Druid integration: Fix type inference for Decimal DruidOutputFormat.

  • HIVE-16040: union column expansion should take aliases from the leftmost branch.

  • HIVE-16045: Print progress bar along with operation log.

  • HIVE-16050: Regression: Union of null with non-null.

  • HIVE-16054: AMReporter should use application token instead of ugi.getCurrentUser.

  • HIVE-16065: Vectorization: Wrong Key/Value information used by Vectorizer.

  • HIVE-16067: LLAP: send out container complete messages after a fragment completes.

  • HIVE-16068: BloomFilter expectedEntries not always using NDV when it's available during runtime filtering.

  • HIVE-16070: fix nonReserved list in IdentifiersParser.g.

  • HIVE-16072: LLAP: Add some additional jvm metrics for hadoop-metrics2.

  • HIVE-16078: improve abort checking in Tez/LLAP.

  • HIVE-16082: Allow user to change number of listener thread in LlapTaskCommunicator.

  • HIVE-16086: Fix HiveMetaStoreChecker.checkPartitionDirsSingleThreaded method.

  • HIVE-16090: Addendum to HIVE-16014.

  • HIVE-16094: queued containers may timeout if they don't get to run for a long time.

  • HIVE-16097: minor fixes to metrics and logs in LlapTaskScheduler.

  • HIVE-16098: Describe table doesn't show stats for partitioned tables.

  • HIVE-16102: Grouping sets do not conform to SQL standard.

  • HIVE-16103: LLAP: Scheduler timeout monitor never stops with slot nodes.

  • HIVE-16104: LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately.

  • HIVE-16114: NullPointerException in TezSessionPoolManager when getting the session.

  • HIVE-16115: Stop printing progress info from operation logs with beeline progress bar.

  • HIVE-16122: NPE Hive Druid split introduced by HIVE-15928.

  • HIVE-16132: DataSize stats don't seem correct in semijoin opt branch.

  • HIVE-16133: Footer cache in Tez AM can take too much memory.

  • HIVE-16135: Vectorization: unhandled constant type for scalar argument.

  • HIVE-16137: Default value of hive config hive.auto.convert.join.hashtable.max.entries should be set to 40m instead of 4m.

  • HIVE-16140: Stabilize few randomly failing tests.

  • HIVE-16142: ATSHook NPE via LLAP.

  • HIVE-16150: LLAP: HiveInputFormat:getRecordReader: Fix log statements to reduce memory pressure.

  • HIVE-16154: Determine when dynamic runtime filtering should be disabled.

  • HIVE-16160: OutOfMemoryError: GC overhead limit exceeded on Hs2 longevity tests.

  • HIVE-16161: Standalone hive jdbc jar throws ClassNotFoundException.

  • HIVE-16167: Remove transitive dependency on mysql connector jar.

  • HIVE-16168: log links should use the NM nodeId port instead of web port.

  • HIVE-16170: Exclude relocation of org.apache.hadoop.security.* in the JDBC standalone jar.

  • HIVE-16172: Switch to a fairness lock to synchronize HS2 thrift client.

  • HIVE-16175: Possible race condition in InstanceCache.

  • HIVE-16180: LLAP: Native memory leak in EncodedReader.

  • HIVE-16190: Support expression in merge statement.

  • HIVE-16211: MERGE statement failing with ClassCastException.

  • HIVE-16215: counter recording for text cache may not fully work.

  • HIVE-16229: Wrong result for correlated scalar subquery with aggregate.

  • HIVE-16236: BuddyAllocator fragmentation - short-term fix.

  • HIVE-16238: LLAP: reset/end has to be invoked for o.a.h.hive.q.io.orc.encoded.EncodedReaderImpl.

  • HIVE-16245: Vectorization: Does not handle non-column key expressions in MERGEPARTIAL mode.

  • HIVE-16260: Remove parallel edges of semijoin with map joins.

  • HIVE-16274: Support tuning of NDV of columns using lower/upper bounds.

  • HIVE-16278: LLAP: metadata cache may incorrectly decrease memory usage in mem manager.

  • HIVE-16282: Semijoin: Disable slow-start for the bloom filter aggregate task.

  • HIVE-16298: Add config to specify multi-column joins have correlated columns.

  • HIVE-16305: Additional Datanucleus ClassLoaderResolverImpl leaks causing HS2 OOM.

  • HIVE-16310: Get the output operators of Reducesink when vectorization is on.

  • HIVE-16318: LLAP cache: address some issues in 2.2/2.3.

  • HIVE-16319: Fix NPE in ShortestJobFirstComparator.

  • HIVE-16323: HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204.

  • HIVE-16325: Sessions are not restarted properly after the configured interval.