Functional adjustments and behavioral updates for Atlas are introduced in Cloudera Runtime 7.3.2, its service packs, and cumulative hotfixes.
Cloudera Runtime 7.3.2 introduces functional adjustments and behavioral updates for Atlas, and includes all service packs and cumulative hotfixes from 7.3.1.100 through 7.3.1.706. For a comprehensive record of all functional adjustments in Cloudera Runtime 7.3.1.x, see Behavioral Changes.
Cloudera Runtime 7.3.2
- Summary: Automatic purging of soft-deleted entities is introduced
- Previous behavior: Previously, only the API call DELETE /api/atlas/admin/purge was available to manually purge soft-deleted entities. Additionally, the DELETE /api/atlas/v2/entity/guid/{{guid}} API call could not delete the column lineage entities of Hive, Impala, and Spark process entities. This could lead to sparse graphs, resulting in reduced query performance.
- New behavior: A built-in auto-purge mechanism is introduced that purges deleted entities in two stages. The first stage is a soft-delete at each Atlas startup. In the second stage, soft-deleted process entities are purged based on a cron job. For more information, see Atlas Auto-Purging overview.
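For readers who script against the entity endpoint mentioned above, the following is a minimal sketch of how a DELETE request for a single entity GUID could be assembled. The host, port, and GUID are placeholders, the helper name build_entity_delete_request is hypothetical, and authentication is omitted. The request is only built, not sent.

```python
from urllib.parse import quote
from urllib.request import Request


def build_entity_delete_request(base_url: str, guid: str) -> Request:
    """Build (but do not send) a DELETE request for one entity GUID.

    Hypothetical helper for illustration; not part of any Atlas client.
    """
    # Path taken from the release note: /api/atlas/v2/entity/guid/{guid}
    url = f"{base_url.rstrip('/')}/api/atlas/v2/entity/guid/{quote(guid, safe='')}"
    return Request(url, method="DELETE", headers={"Accept": "application/json"})


# Placeholder values, assumed for illustration only.
request = build_entity_delete_request(
    "https://atlas.example.com:31443",
    "00000000-0000-0000-0000-000000000000",
)
print(request.get_method(), request.full_url)
```

Sending the request would additionally require credentials and TLS settings appropriate to the cluster, which are outside the scope of this sketch.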
- Summary: Entity attributes details and sparkPlanDescription are no longer sent in the Spark process entity
- Previous behavior: The spark_process entity attributes details and sparkPlanDescription were populated with query plan details, which can contain a large amount of text, often megabytes in size. This amount of data can incur unnecessary processing costs.
- New behavior: The atlas.spark.plan.enabled property is set to false by default. Set it to true to send the details and sparkPlanDescription attributes in the Spark process entity. When these attributes are not sent, the cost of processing a large amount of data in Atlas is avoided.
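The effect of the flag on the entity payload can be pictured with a small sketch. This is an illustration of the behavior described above, not the connector's actual code: the function spark_process_attributes and its parameters are hypothetical, and using the same plan text for both attributes is a simplifying assumption.

```python
def spark_process_attributes(name: str, query_plan: str, plan_enabled: bool = False) -> dict:
    """Return the attributes that would be sent for a spark_process entity.

    plan_enabled stands in for atlas.spark.plan.enabled (default: false).
    Hypothetical helper; the real payload contains more attributes.
    """
    attributes = {"name": name}
    if plan_enabled:
        # Only sent when the flag is explicitly turned on.
        attributes["details"] = query_plan
        attributes["sparkPlanDescription"] = query_plan
    return attributes


large_plan = "== Physical Plan ==\n" + "x" * 1_000_000  # roughly 1 MB of plan text

default_payload = spark_process_attributes("etl_job", large_plan)
full_payload = spark_process_attributes("etl_job", large_plan, plan_enabled=True)

print(sorted(default_payload))
print(sorted(full_payload))
```

With the default setting, the payload omits the two plan attributes entirely, so Atlas never has to process the large plan text.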