This topic lists the known issues and technical limitations for Ozone in Cloudera Runtime 7.3.2, its service packs, and cumulative
hotfixes.
Known issues identified in Cloudera Runtime 7.3.2
- CDPD-63350: Force deleting an FSO bucket and its
contents by running rb --force from the AWS S3 API
fails
- 7.3.2
- Force deleting an FSO bucket and its contents
by running rb --force from the AWS S3 API might
fail with an error, because the S3 client sends an individual delete request
for each key in the bucket. Individual keys or directories might or might
not be deleted, depending on the availability of leaf elements. The bucket
can be deleted only after all of its keys and directories are cleaned up.
If some keys are not deleted, the bucket is not deleted.
Sample error message:
# aws s3 rm s3://buck-fso --recursive
delete: s3://buck-fso/dir1/
delete: s3://buck-fso/dir1/dir2/
delete: s3://buck-fso/dir3/dir4/dir5/
delete: s3://buck-fso/dir3/dir4/
delete failed: s3://buck-fso/dir3/
# Rerun the same command again
# aws s3 rm s3://buck-fso --recursive
delete: s3://buck-fso/dir1/
delete: s3://buck-fso/dir3/
# To Confirm
# ozone sh key list s3v/buck-fso
[ ]
- Run the rb --force command
multiple times to completely clean up the keys and directories.
- Apache JIRA:
HDDS-9637
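The rerun workaround above can be scripted. The following is a minimal sketch, not a Cloudera-provided tool: retry_until_ok is a hypothetical helper name, and the sketch assumes that the delete command exits with a non-zero status while some keys remain.

```shell
# retry_until_ok MAX_ATTEMPTS CMD [ARGS...]
# Hypothetical helper: reruns CMD until it exits 0 or MAX_ATTEMPTS is reached.
# Assumes CMD signals an incomplete delete through a non-zero exit status.
retry_until_ok() {
  local max_attempts=$1
  shift
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      echo "succeeded on attempt $attempt"
      return 0
    fi
    echo "attempt $attempt failed" >&2
    attempt=$((attempt + 1))
  done
  # Give up: the caller should inspect the bucket manually.
  return 1
}
```

For example, retry_until_ok 5 aws s3 rb s3://buck-fso --force reruns the force delete up to five times. Any endpoint or credential options that your S3 gateway requires are omitted here and must be added to the command.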
- CDPD-98751: Mismatched Replicas tab in the Recon UI
fails to display containers with inconsistent replica checksums
- 7.3.2
- In the Ozone Recon UI, the Mismatched
Replicas tab does not update or display containers when one
or more replicas have differing checksums. Instead, they are displayed in
the Under-Replicated tab.
- Check the replica in the
Under-Replicated tab, or cross-check it
against the API response.
- CDPD-97512: Mismatch in Open Key count between
Overview page and OM DB Insight in the Recon UI
- 7.3.2
- In the Ozone Recon UI, the Open Key count on
the Overview page (Summary section) may not match the
count on the tab. Users might see different Open Key values for the same
cluster across these two views.
- Use the tab for the most accurate Open Key count.
- CDPD-97376: Container replication counts mismatch
between new and old Container pages in the Recon UI
- 7.3.2
- In the Ozone Recon UI, Container replication
counts—including Under-Replicated, Over-Replicated, and
Mis-Replicated—differ between the new and old Container pages for the same
cluster.
- Refer to the old Container page to view accurate
replication counts.
- CDPD-97311: Incorrect Creation
Time and Modification Time displayed
on the Namespace Usage page in the Recon UI
- 7.3.2
- In the Ozone Recon UI, the Namespace
Usage page displays incorrect Created and Modified
timestamps. These values do not accurately reflect the actual creation or
last modification times or dates of the namespace, resulting in inaccurate
metadata information.
- Retrieve the correct timestamp values directly
from the API response.
- CDPD-97312: Mismatch between cluster State Container
count and Container Summary totals
- 7.3.2
- Ozone Recon does not sync all container states
and can have discrepancies between the Storage Container Manager (SCM)
container count and the Recon container count due to QUASI_CLOSED and other
container states.
- There is no workaround in Ozone Recon. Use the
ozone admin container report CLI command to get the correct
container counts for the various states in the Ozone cluster.
- CDPD-99248: After a CDH upgrade, Ozone encounters a
failure while executing the Finalize Upgrade for SCM on role
Storage Container command
- 7.3.2
- When finalizing an Ozone upgrade for the first
time from Cloudera Manager, the finalize command
Finalize Upgrade for SCM on role Storage Container
may fail with the following message in
stderr:
"Invalid response from Storage Container Manager.
Current finalization status is: FINALIZATION_IN_PROGRESS"
Despite this, upgrade finalization is still running on the Storage Container
Manager (SCM).
- Although the Cloudera Manager command fails, finalization continues on SCM, as the message
indicates. Use the ozone admin scm
finalizationstatus command to check the status of SCM finalization, and
wait for it to complete.
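The wait described above can be sketched as a polling loop. This is an illustration only: wait_while_in_progress is a hypothetical helper name, and it assumes that the status command output contains the string FINALIZATION_IN_PROGRESS while finalization is running, as the stderr message above suggests. Verify the actual output on your cluster before relying on it.

```shell
# wait_while_in_progress INTERVAL_SECONDS MAX_CHECKS CMD [ARGS...]
# Hypothetical helper: reruns CMD while its output contains
# FINALIZATION_IN_PROGRESS (assumed status string, taken from the error
# message), sleeping INTERVAL_SECONDS between checks.
# Any other output is printed and treated as "no longer in progress",
# so the caller must still read it to confirm that finalization succeeded.
wait_while_in_progress() {
  local interval=$1
  local max_checks=$2
  shift 2
  local check=1
  local status
  while [ "$check" -le "$max_checks" ]; do
    status=$("$@" 2>&1) || true
    case "$status" in
      *FINALIZATION_IN_PROGRESS*) sleep "$interval" ;;
      *) printf '%s\n' "$status"; return 0 ;;
    esac
    check=$((check + 1))
  done
  echo "still in progress after $max_checks checks" >&2
  return 1
}
```

For example, wait_while_in_progress 30 40 ozone admin scm finalizationstatus checks every 30 seconds for up to 20 minutes and prints the first status that no longer reports progress.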
- CDPD-98892: File Size
Distribution bucket size range calculation is not
correct
- 7.3.2
- When a volume contains a large number of buckets,
the Ozone Recon UI might display an incorrect file size distribution
bucket size range in the File Size
Distribution chart on the Insights
page.
- None
- Apache JIRA:
HDDS-14827
- CDPD-93116: Ozone client hangs for approximately five
minutes intermittently when the disk is full
- 7.3.2
- When the Ozone client writes data to a
DataNode and the DataNode raises an exception because of a disk-full condition
or another error, the client hangs for approximately five minutes because it
continuously retries the write until the pipeline over the DataNode is
closed.
- Control request retries with the following
client-side configuration:
Table 1.
| Configuration | Recommended value |
| hdds.ratis.raft.client.rpc.request.timeout | 30s |
| hdds.ratis.client.multilinear.random.retry.policy | 1s, 1 |
| hdds.ratis.client.exponential.backoff.max.sleep | 5s |
| hdds.ratis.client.exponential.backoff.base.sleep | 1s |
| hdds.ratis.client.exponential.backoff.max.retries | 2 |
- Apache JIRA:
HDDS-14040
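As an illustration, the recommended values from Table 1 can be rendered as Hadoop-style property entries. The sketch below assumes the properties are applied through the client-side ozone-site.xml. Where you set them (for example, through a Cloudera Manager configuration snippet) depends on your deployment. The function names emit_property and emit_client_retry_properties are hypothetical.

```shell
# emit_property NAME VALUE
# Prints one Hadoop-style <property> element.
emit_property() {
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# Prints the client-side retry settings with the recommended values
# from Table 1. Paste the output inside the <configuration> element of
# the client ozone-site.xml (assumed location for these properties).
emit_client_retry_properties() {
  emit_property hdds.ratis.raft.client.rpc.request.timeout "30s"
  emit_property hdds.ratis.client.multilinear.random.retry.policy "1s, 1"
  emit_property hdds.ratis.client.exponential.backoff.max.sleep "5s"
  emit_property hdds.ratis.client.exponential.backoff.base.sleep "1s"
  emit_property hdds.ratis.client.exponential.backoff.max.retries "2"
}
```

Running emit_client_retry_properties prints five property elements, one for each row of the table.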
Known issues identified before Cloudera Runtime 7.3.2
Known issues identified before Cloudera Runtime 7.3.2 include only
unresolved issues from previous releases that continue to affect the Cloudera Runtime 7.3.2 base release.
- CDPD-91562:
test_validate_certs_configs configuration is failing
with the maximum lifetime validation
- 7.3.2, 7.3.1.600
- In daylight saving time zones, the autogenerated Ozone
certificate duration might differ from the expected duration. This
discrepancy is minor, because the default certificate duration is 365 days
or five years, depending on the Ozone component.
- CDPD-54885: Ozone Prometheus does not work with
TLS
- 7.3.2, 7.3.1 and its higher versions
- The Prometheus service shipped by Ozone does not
support TLS mode, so Prometheus cannot gather metrics from Ozone
endpoints when TLS is enabled.
- Go to and update the Ozone Prometheus Endpoint
Token with any random string. This makes a token available
in the process directory of the Ozone endpoint in plaintext, which
Prometheus can use to get around the TLS limitation.
- CDPD-56684: Keys get deleted when you do not have
permission on volume
- 7.3.2, 7.3.1 and its higher versions
- When a volume is deleted, Ozone recursively deletes the
buckets and keys inside it and only then deletes the volume. The volume
delete ACL check is performed only at the end, so you might
delete all the data inside the volume without having delete permission on
the volume.
- CDPD-50610: Large file uploads are slow with OPEN and
stream data approach
- 7.3.2, 7.3.1 and its higher versions
- The Hue file browser uses the append operation for large
files. This API is not supported by Ozone in 7.1.9, so large file
uploads can be slow or time out in the browser.
- Use the native Ozone client to upload large files
instead of the Hue file browser.
- OPSAPS-66469: Ozone-site.xml is missing if the host
does not contain HDFS roles
- 7.3.2, 7.3.1 and its higher versions
- The client-side ozone-site.xml
(/etc/hadoop/conf/ozone-site.xml) is not generated
by Cloudera Manager if the host does not have any HDFS
role. As a result, Ozone commands issued from that host fail because
they cannot find the service name to host name mapping. The error message is
similar to this:
# ozone sh volume list o3://ozoneabc
23/03/06 18:46:15 WARN ha.OMProxyInfo: OzoneManager address ozoneabc:9862 for serviceID null remains unresolved for node ID null Check your ozone-site.xml file to ensure ozone manager addresses are configured properly.
- Add the HDFS gateway role on that host.
- CDPD-74016: Running ozone sh token print on a token generated
using the ozone dtutil command results in a Null Pointer
Exception.
- Print the token generated by dtutil using
dtutil.
- CDPD-74013: Ozone dtutil get token fails when using the o3 or
ofs schemes.
- Use ozone sh token get to get a token for the Ozone
file system.
- CDPD-63144: Key rename inside the FSO bucket fails and
displays the Failed to get parent dir error. This happens
when running Impala workloads with Ozone.
- None.
- CDPD-74331: Key put fails and displays the Failed to
write chunk error when there is a volume failure during
configuration.
- None.
- CDPD-74475: The YCSB test with HBase on Ozone shows degraded
performance.
- 7.3.2, 7.3.1 and its higher versions
- None.
- CDPD-74884: Exclusive size of snapshot is always 0 when you
run the info command on the Ozone snapshots. It is a
statistics issue and does not impact the functionality of Ozone
snapshot.
- 7.3.2, 7.3.1 and its higher versions
- The
exclusiveSize and
exclusiveReplicationSize stats presented by
snapshot info are 0 even though the snapshot contains
exclusive keys or files that are not present in other snapshots.
- None.
- Apache JIRA:
HDDS-11528
- CDPD-75042: AWS CLI rm or
delete command fails to delete all files and
directories on the Ozone FSO bucket. It only deletes the leaf node.
- 7.3.2, 7.3.1 and its higher versions
- None.
- CDPD-75204: NameNode restart fails after
dfs.namenode.fs-limits.max-component-length is set
to a lower value and existing data exceeds the new length
limit.
- 7.3.2, 7.3.1 and its higher versions
- Increase the value of the
dfs.namenode.fs-limits.max-component-length
parameter and restart the NameNode.
- CDPD-75635: Ozone write fails intermittently as SCM remains
in safemode.
- 7.3.2, 7.3.1 and its higher versions
- Wait for SCM to come out of safe mode, or
exit safe mode through the CLI options.
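The wait can be sketched with the upstream Apache Ozone CLI. This sketch assumes that your version provides the ozone admin safemode wait subcommand with a -t timeout in seconds, and the ozone admin safemode exit subcommand. Confirm both with ozone admin safemode --help before use. The function name wait_for_safemode_exit and the OZONE_CMD variable are hypothetical.

```shell
# wait_for_safemode_exit [TIMEOUT_SECONDS]
# Hypothetical helper: blocks until SCM leaves safe mode or the timeout
# expires. Assumes "ozone admin safemode wait -t SECONDS" exits 0 once
# SCM is out of safe mode and non-zero on timeout.
# OZONE_CMD can point at a different ozone binary; it defaults to "ozone".
wait_for_safemode_exit() {
  local ozone_cmd=${OZONE_CMD:-ozone}
  local timeout=${1:-300}
  if "$ozone_cmd" admin safemode wait -t "$timeout"; then
    echo "SCM is out of safe mode"
    return 0
  fi
  # Do not force the exit automatically: leaving safe mode early is an
  # operator decision.
  echo "SCM is still in safe mode after ${timeout}s" >&2
  echo "To force the exit, run: $ozone_cmd admin safemode exit" >&2
  return 1
}
```

For example, wait_for_safemode_exit 600 waits up to ten minutes before a write workload is retried. The helper only prints the forced-exit command and never runs it.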