Table 1.12. Apache HBase
Apache JIRA | |
Hortonworks Bug ID | BUG-42355 |
Description |
After moving an application from HDP 2.2 to HDP 2.3, ACLs do not appear to function the same way. Workaround: Set the following in hbase-site.xml:
<property>
  <name>hbase.security.access.early_out</name>
  <value>false</value>
</property> |
Apache JIRA | HBASE-13330, HBASE-13647 |
Hortonworks Bug ID | BUG-36817 |
Description | test_IntegrationTestRegionReplicaReplication[IntegrationTestRegionReplicaReplication] fails with READ FAILURES |
Apache JIRA | |
Hortonworks Bug ID | BUG-39322 |
Description |
The HBase bulk load process is a MapReduce job that typically runs under the user ID that owns the source data. HBase data files created as a result of the job are then bulk-loaded into HBase RegionServers. During this process, the RegionServers move the bulk-loaded files out of the user's directory and rename them under the HBase root directory. Workaround: Run the MapReduce job as the hbase user.
|
Apache JIRA | HBASE-13832, HDFS-8510 |
Hortonworks Bug ID | BUG-40536 |
Description |
When a rolling upgrade is performed for HDFS, the HBase Master can sometimes run out of DataNodes on which to keep its write pipeline active. When this occurs, the HBase Master aborts after a few attempts to keep the pipeline going. Workaround: See the Note below; a hedged configuration sketch also follows this entry.
Note: There is a window of time during the rolling upgrade of HDFS when the HBase Master might be working with just one node; if that node fails, the WAL data might be lost. In practice, this is an extremely rare situation. Alternatively, the HBase Master can be turned off during the rolling upgrade of HDFS to avoid the issue entirely; if this strategy is taken, client DDL operations and RegionServer failures cannot be handled during this time. As a final alternative, if the HBase Master fails during the rolling upgrade of HDFS, it can be started manually. |
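One common mitigation for write-pipeline failures during rolling upgrades is to relax the HDFS client's DataNode replacement policy; the sketch below uses the standard dfs.client.block.write.replace-datanode-on-failure.best-effort property, but treat it as an assumption rather than the documented procedure for this bug.
<!-- hdfs-site.xml (HBase Master's HDFS client): a hedged sketch only.
     best-effort=true lets the client keep writing even when a
     replacement DataNode cannot be found for the pipeline. -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
  <value>true</value>
</property>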
Apache JIRA | |
Hortonworks Bug ID | BUG-42186 |
Description |
The HDP 2.3 HBase installation needs the MapReduce classpath modified for HBase functions to work. Clusters that have Phoenix enabled place the following configuration in hbase-site.xml: Property: hbase.rpc.controllerfactory.class, Value: org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory. This property points to a class found only in the phoenix-server JAR. To resolve this class at run time for the affected MapReduce jobs, the JAR must be part of the MapReduce classpath. Workaround: Update the mapreduce.application.classpath property in the mapred-site.xml file to point to the /usr/hdp/current/phoenix-client/phoenix-server.jar file, as sketched below. |
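A minimal sketch of that mapred-site.xml change; the existing-entries placeholder stands in for whatever classpath values the cluster already uses, so append rather than replace.
<!-- mapred-site.xml: append the Phoenix server JAR to the existing
     MapReduce application classpath. -->
<property>
  <name>mapreduce.application.classpath</name>
  <value>...existing classpath entries...:/usr/hdp/current/phoenix-client/phoenix-server.jar</value>
</property>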
Table 1.13. Apache Hive
Apache JIRA | HIVE-11587 |
Hortonworks Bug ID | BUG-42500 |
Description |
Hive Hybrid Grace MapJoin can cause OutOfMemory issues. Hybrid Grace MapJoin is a new feature in HDP 2.3 (Hive 1.2). MapJoin joins two tables, holding the smaller one in memory; Hybrid Grace MapJoin spills parts of the small table to disk when the map join does not fit in memory at run time. There is currently a bug in the code that can cause this implementation to use too much memory, resulting in an OutOfMemory error. This applies to the Tez execution engine only. Workaround: Turn off Hybrid Grace MapJoin by setting a property in hive-site.xml, as sketched below.
|
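In Hive 1.2 the feature is controlled by the hive.mapjoin.hybridgrace.hashtable property, which is assumed here to be the setting the note refers to:
<!-- hive-site.xml: disable Hybrid Grace MapJoin (property name assumed;
     hive.mapjoin.hybridgrace.hashtable defaults to true in Hive 1.2). -->
<property>
  <name>hive.mapjoin.hybridgrace.hashtable</name>
  <value>false</value>
</property>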
Apache JIRA | HIVE-11110 |
Hortonworks Bug ID | BUG-39988 |
Description | CBO: Default partition filter is from MetaStore query causing TPC-DS to regress by 3x. |
Apache JIRA | |
Hortonworks Bug ID | BUG-39412 |
Description |
Users should not use Setting |
Apache JIRA | HIVE-10978 |
Hortonworks Bug ID | BUG-39282 |
Description |
When HDFS is encrypted (data-at-rest encryption is enabled) and the Hadoop Trash feature is enabled, DROP TABLE and DROP PARTITION have unexpected behavior. (The Hadoop Trash feature is enabled by setting fs.trash.interval to a value greater than 0 in core-site.xml.) When Trash is enabled, the data file for the table should be "moved" to the Trash bin, but if the table is inside an Encryption Zone, this "move" operation is not allowed. Workaround: There are two ways to work around this issue: 1. Use PURGE, as in DROP TABLE ... PURGE. This skips the Trash bin even if Trash is enabled. 2. Set fs.trash.interval to 0 before issuing the drop; note that this disables Trash for the entire cluster. An illustrative snippet follows this entry. |
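A short illustration of both workarounds; the table name is a placeholder.
-- Workaround 1: bypass the Trash bin for a single drop.
DROP TABLE my_encrypted_table PURGE;

<!-- Workaround 2 (core-site.xml): disable Trash cluster-wide. The value
     is the number of minutes between Trash checkpoints; 0 turns Trash off. -->
<property>
  <name>fs.trash.interval</name>
  <value>0</value>
</property>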
Apache JIRA | |
Hortonworks Bug ID | BUG-38785 |
Description |
With RHEL7, the Workaround: Create your own directory (such as
If you wish to mount the |
Apache JIRA | |
Hortonworks Bug ID | BUG-37042 |
Description |
Limitations when using the timestamp.formats SerDe parameter. Two issues involve the timestamp.formats SerDe parameter (a hypothetical usage example follows this entry):
|
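For context, timestamp.formats is a LazySimpleSerDe property (HIVE-9298) that accepts a comma-separated list of accepted timestamp patterns. A hypothetical usage, with the table name and pattern purely illustrative:
-- Hypothetical example; not taken from the original note.
CREATE TABLE ts_demo (ts TIMESTAMP)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES ("timestamp.formats" = "yyyy-MM-dd'T'HH:mm:ss");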
Table 1.14. Apache Oozie
Apache JIRA | OOZIE-2311 |
Hortonworks Bug ID | BUG-39265 |
Description | An NPE in the Oozie logs while running feed replication tests causes jobs to fail. |
Table 1.15. Apache Ranger
Apache JIRA | RANGER-577 |
Hortonworks Bug ID | BUG-38054 |
Description | Ranger should not change Hive config if authorization is disabled |
Table 1.16. Apache Slider
Apache JIRA | SLIDER-909 |
Hortonworks Bug ID | BUG-40682 |
Description | Slider HBase app package fails in secure cluster with wire-encryption on |
Table 1.17. Apache Spark
Apache JIRA | |
Hortonworks Bug ID | BUG-41644, BUG-41484 |
Description | Apache and custom Spark builds need an HDP-specific configuration. For details, see the Troubleshooting Spark section: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_spark-quickstart/content/ch_troubleshooting-spark-quickstart.html |
Apache JIRA | |
Hortonworks Bug ID | BUG-38046 |
Description |
Spark ATS is missing the Kill event. If a running Spark application is killed in the YARN ATS ( |
Apache JIRA | |
Hortonworks Bug ID | BUG-39468 |
Description |
When accessing an HDFS file from pyspark, the HADOOP_CONF_DIR environment variable must be set. For example:
export HADOOP_CONF_DIR=/etc/hadoop/conf
[hrt_qa@ip-172-31-42-188 spark]$ pyspark
>>> lines = sc.textFile("hdfs://ip-172-31-42-188.ec2.internal:8020/tmp/PySparkTest/file-01")
.......
If HADOOP_CONF_DIR is not set properly, you might receive the following error:
Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
|
Apache JIRA | |
Hortonworks Bug ID | BUG-39674 |
Description | Spark does not yet support wire encryption, dynamic executor allocation, SparkR, GraphX, Spark Streaming, IPython, or Zeppelin. |
Table 1.18. Apache Tez
Apache JIRA | |
Hortonworks Bug ID | BUG-40608 |
Description |
The Tez UI View/Download link fails if the URL does not match the cookie. Workaround: The Tez UI View/Download link works if the browser accesses a URL that matches the cookie. Example: the MapReduce JHS cookie is set with an external IP address. If a user clicks the link from inside the cluster, the URL will differ from the cookie's, and the request will fail with a |
Table 1.19. Apache YARN
Apache JIRA | YARN-2194 |
Hortonworks Bug ID | BUG-39424 |
Description | NM fails to come up with the error "Not able to enforce cpu weights; cannot write to cgroup." |
Apache JIRA | |
Hortonworks Bug ID | BUG-39756 |
Description | The NM web UI drops the ?user.name parameter when redirecting the URL to the MapReduce JHS. |
Apache JIRA | |
Hortonworks Bug ID | BUG-35942 |
Description |
Users must manually configure ZooKeeper security with ResourceManager High Availability. Currently, the default value of yarn.resourcemanager.zk-acl (world:anyone:rwcda) allows anyone to read and write the RM state store. To make it more secure, we can rely on Kerberos to do the authentication for us: configure SASL authentication so that only a Kerberos-authenticated user can access the ZKRMStateStore. A hedged configuration sketch follows this entry.
ZooKeeper Configuration. Note: This step of securing ZooKeeper needs to be done only once for the HDP cluster. If it has already been done to secure HBase, for example, you do not need to repeat these ZooKeeper steps, provided Apache YARN ResourceManager High Availability uses the same ZooKeeper.
Apache YARN Configuration. The following applies to HDP 2.2 and HDP 2.3. Note: All nodes that launch the ResourceManager (active or standby) should make these changes.
HDFS Configuration. Note: This applies to HDP 2.1, 2.2, and 2.3.
|
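As a rough sketch of a SASL-based setup for the ZKRMStateStore: the property names below are standard ZooKeeper and Hadoop ones, but treat the exact values as assumptions rather than the documented procedure.
# zoo.cfg (ZooKeeper server, once per cluster): enable SASL authentication.
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider

<!-- yarn-site.xml (on each ResourceManager node): restrict the RM
     state store ACL to the SASL-authenticated rm principal. -->
<property>
  <name>yarn.resourcemanager.zk-acl</name>
  <value>sasl:rm:rwcda</value>
</property>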
Table 1.20. HDFS and Cloud Deployment
Apache JIRA | HADOOP-11618, HADOOP-12304 |
Hortonworks Bug ID | BUG-42065 |
Description |
HDP 2.3: A non-HDFS file system cannot be set as the default. This prevents S3, WASB, and GCS from serving as the default file system. HDP cannot be configured to use an external file system, such as Azure WASB, Amazon S3, or Google Cloud Storage, as the default file system. The default file system is configured in core-site.xml using the fs.defaultFS property; only HDFS can be configured as the default. These external file systems can be configured for access as optional file systems, just not as the default, as illustrated below. |
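A minimal core-site.xml sketch (host name and bucket are placeholders): HDFS stays the default while external stores are reached through their own URIs.
<!-- core-site.xml: the default file system must stay on HDFS. -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>
<!-- External stores remain reachable via explicit URIs, for example
     s3a://my-bucket/path or wasb://container@account.blob.core.windows.net/path. -->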
Table 1.21. Upgrade
Apache JIRA | HDFS-8782 |
Hortonworks Bug ID | BUG-41215 |
Description |
The upgrade to the block ID-based DataNode storage layout delays DataNode registration. When upgrading from a pre-HDP-2.2 release, a DataNode with many disks, or with blocks that have random block IDs, can take a long time (potentially hours) to upgrade its storage directories. The DataNode does not register with the NameNode until it finishes upgrading the storage directory. |
Apache JIRA | |
Hortonworks Bug ID | BUG-32401 |
Description | Rolling upgrade/downgrade should not be used if truncate is turned on. Workaround: Before starting a rolling upgrade or downgrade process, turn truncate off. |