Tracking dataset versions [Technical Preview]

Cloudera Data Visualization supports dataset versioning, which helps you track changes, collaborate on dataset development, and maintain the integrity of your datasets over time.

With dataset versioning, every modification to a dataset automatically creates a new version. This allows you to:

  • Track how data structures and configurations evolve.
  • Restore previous versions if errors are introduced or a rollback is needed.
  • Manage collaborative changes more confidently, with the ability to reinstate earlier states.

To use this feature, you must first enable dataset version control in the site settings. For instructions on how to enable and configure dataset version control, see Managing version control site settings.

  1. On the main navigation bar, click DATA.
    The Data view opens, displaying the Datasets tab.
  2. Locate your dataset by searching or scrolling.
  3. Click the dataset name to open the Dataset Detail page.
  4. Click Version Control in the side navigation to view version history.

    The Version Control page shows the details of the current dataset version and a history of previous versions (if available).

    Current version

    The current version is either the latest saved version after a dataset modification, or a previous version that was reinstated.

    This version reflects the dataset’s current structure, content, and configuration. The first version of a dataset is always the current (active) version. It remains current, even if unnamed, until a new version replaces it.

    Previous versions

    Each dataset modification (for example, editing a dataset field, the data model, or the time model, or adding new segments) creates a new version of the dataset. A snapshot of the dataset’s state before the change is saved as a previous version, and the modified dataset becomes the current version.

    You can:
    • Sort versions by any column by clicking its header.
    • Filter versions using the search bar.
    • Delete all previous versions in bulk, or delete individual versions for finer control.

    Named and unnamed versions: New versions are unnamed by default and display a timestamp as the name. Unnamed versions are subject to automatic deletion based on the site settings. To preserve a version, you must assign it a name. To name a dataset version, click the icon next to its name and enter a custom name.

  5. Optional: You can revert to a previous version.
    1. On the Version Control page, locate the previous version that you want to reinstate as the current version and click the icon for that version.
    2. In the Change Dataset Versions modal window, review the details.
    3. Click Resume to make the selected version active, or click Cancel to discard the action.
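
The versioning behavior described above can be pictured with a small conceptual model. This is an illustrative Python sketch only, not the Cloudera Data Visualization implementation or API. The names VersionedDataset, Version, modify, name_version, reinstate, and max_unnamed are hypothetical. The sketch also makes two assumptions that the product may not share: that the retention setting is a simple cap on the number of unnamed versions, and that reinstating a version saves the replaced state as a new previous version.

```python
from copy import deepcopy
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Version:
    state: dict
    created: datetime
    name: Optional[str] = None  # None means the version is unnamed

    @property
    def label(self) -> str:
        # Unnamed versions display their timestamp as the name.
        return self.name or self.created.isoformat(timespec="seconds")


@dataclass
class VersionedDataset:
    state: dict                 # the current version
    max_unnamed: int = 2        # hypothetical stand-in for the site retention setting
    previous: list = field(default_factory=list)  # oldest first

    def _snapshot(self) -> None:
        # Save the state as it was before a change, then apply retention.
        now = datetime.now(timezone.utc)
        self.previous.append(Version(deepcopy(self.state), now))
        self._prune()

    def _prune(self) -> None:
        # Assumption: only unnamed versions are removed, oldest first.
        unnamed = [v for v in self.previous if v.name is None]
        excess = len(unnamed) - self.max_unnamed
        drop = {id(v) for v in unnamed[:excess]} if excess > 0 else set()
        self.previous = [v for v in self.previous if id(v) not in drop]

    def modify(self, **changes) -> None:
        # Every modification creates a new current version.
        self._snapshot()
        self.state.update(changes)

    def name_version(self, index: int, name: str) -> None:
        # Naming a version protects it from automatic deletion.
        self.previous[index].name = name

    def reinstate(self, index: int) -> None:
        target = deepcopy(self.previous[index].state)
        self._snapshot()  # assumption: the replaced state is kept as a previous version
        self.state = target


ds = VersionedDataset(state={"fields": ["id"]})
ds.modify(fields=["id", "region"])
ds.name_version(0, "baseline")          # preserve the original state
ds.modify(fields=["id", "region", "sales"])
ds.modify(segment="EMEA")
ds.modify(segment="APAC")               # the oldest unnamed version is pruned here
ds.reinstate(0)                         # roll back to "baseline"
print(ds.state)
print([v.label for v in ds.previous])
```

In this model the named "baseline" version survives every pruning pass, while unnamed versions age out once the cap is exceeded, which mirrors why naming a version is the way to preserve it.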