Documentation Errata in Cloudera Runtime 7.1.7 SP1
Review the major enhancements, changes, additions, and corrections to the components in Cloudera Runtime 7.1.7 SP1, and learn how the new improvements benefit you.
HWC Secure access mode: As part of the Cloudera Runtime 7.1.7 SP1 release, Hive Warehouse Connector (HWC) introduces the secure access mode that offers fine-grained access control (FGAC) column masking and row filtering to secure managed (ACID), or even external, Hive table data that you query from Spark. Secure access mode requires you to set up an HDFS staging location to temporarily store Hive files that users need to read from Spark. For details, see Reading data through HWC secure access mode.
- You can now create a username for Hue that is up to 150 characters long. The previous restriction of 30 characters for the Hue username no longer applies.
- You may not be able to use the pip command in CDP releases 7.1.7 and above and may see the following error when using pip in a command: “ImportError: cannot import name chardet”. For a workaround, see Unable to use pip command in CDP.
A new tool, kudu master unsafe_rebuild, is added to reconstruct the master catalog from tablet metadata collected from tablet servers. This can be used in emergencies to restore access to tables when all masters are unavailable.
Ranger has added support for the GCP Cloud and CipherTrust HSMs and enhanced the encryption algorithms supported for the Luna HSM. For details, see Integrating Components for Encrypting Data at Rest.
Platform Support Enhancements
New DB Versions: MariaDB 10.5 and PostgreSQL 14. For more information, see Cloudera Support Matrix.
Streams Replication Manager
- The SRM Driver can now write the origin offset into the record header
- SRM now supports a diagnostic feature in which the source offset of each replicated record is written into the record headers. The feature can be turned on by setting copy.source.offset.in.header.enabled to true. When enabled, the source offset is written into a header named mm2-source-offset in binary format. The schema of the header payload is available in the connect:mirror-client package; the class name is org.apache.kafka.connect.mirror.SourceOffsets. This feature is only recommended for diagnostic purposes, as the header change increases the size of the replica topic.
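As a rough illustration of how a diagnostic consumer might check replicated records for this header, the sketch below looks up mm2-source-offset in a record's header list. It assumes headers arrive as a list of (key, bytes) pairs, which is how common Python Kafka clients expose them. The function name and the sample header lists are hypothetical. The payload is returned undecoded, because its binary schema is defined by the SourceOffsets class and is not reproduced here.

```python
from typing import Iterable, Optional, Tuple

# Header name taken from the release note above.
SOURCE_OFFSET_HEADER = "mm2-source-offset"


def find_source_offset_header(
    headers: Optional[Iterable[Tuple[str, bytes]]],
) -> Optional[bytes]:
    """Return the raw payload of the source-offset header, or None if absent.

    Hypothetical helper: the payload is left undecoded because its schema
    (org.apache.kafka.connect.mirror.SourceOffsets) is not reproduced here.
    """
    for key, value in headers or []:
        if key == SOURCE_OFFSET_HEADER:
            return value
    return None


# Hypothetical header lists; the bytes are placeholders, not a real payload.
with_header = [("trace-id", b"abc"), (SOURCE_OFFSET_HEADER, b"\x00\x01")]
without_header = [("trace-id", b"abc")]

print(find_source_offset_header(with_header) is not None)
print(find_source_offset_header(without_header) is not None)
```

A check like this only confirms whether the feature is active for a given record. Decoding the payload would require the schema from the connect:mirror-client package.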
- SRM now waits for latest offset syncs and does not set the consumer offset into the future
- The MirrorCheckpointConnector now checks the latest message in the offset sync topic at startup, and does not emit a checkpoint message until it has read every message from the beginning of the topic up to and including that latest message.
As a part of this improvement, a new configuration property, emit.checkpoints.end.offset.protection, is introduced. When this property is enabled, the MirrorCheckpointTask checks the end offset of the replicated topic before emitting a checkpoint and caps the replicated offset at that value. With this behavior enabled, SRM no longer encounters an issue where, in certain situations, the replicated offset could be higher than the end offset of the replicated topic, producing a negative lag. The property is enabled by default, but can be configured using the Streams Replication Manager's Replication Configs property.
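The effect of the end offset protection can be sketched as a simple cap on the offset to be checkpointed. This is a minimal illustration of the behavior described above, not SRM's actual implementation. The function names and the sample offsets are hypothetical.

```python
def protected_checkpoint_offset(translated_offset: int, replica_end_offset: int) -> int:
    """Cap the offset to be checkpointed at the replicated topic's end offset."""
    return min(translated_offset, replica_end_offset)


def consumer_lag(end_offset: int, committed_offset: int) -> int:
    """Lag as usually computed: end offset minus committed offset."""
    return end_offset - committed_offset


# Hypothetical situation: the translated offset is ahead of the replica topic.
translated = 120
replica_end = 100

unprotected_lag = consumer_lag(replica_end, translated)
protected_lag = consumer_lag(
    replica_end, protected_checkpoint_offset(translated, replica_end)
)

print(unprotected_lag)  # negative: the offset points past the end of the topic
print(protected_lag)    # zero: the offset was capped at the end offset
```

When the translated offset is already at or below the end offset, the cap leaves it unchanged, so the protection only alters checkpoints that would otherwise point into the future.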