Datasets in Data Visualization

Datasets are the foundation and starting point for visualizing your data. They are defined on the connections to your data and provide access to the specific tables in the data store.

A dataset is the logical representation of the data you want to use to build visuals. It is a logical pointer to a physical table or a defined structure in your data source. Datasets may represent the contents of a single data table or a data matrix from several tables that may be in different data stores on the same connection.

Other than providing access to data, datasets enhance data access and use in many ways, including (but not limited to):

  • Table joins allow you to supplement the primary data with information from various other data sources. For more information, see Data modeling.

  • Derived fields/attributes support flexible expressions, both for dimensions and for aggregates. For more information, see Creating calculated fields.

  • Hiding fields enables you to eliminate the fields that are unnecessary to the business use case or to obscure sensitive data without affecting the base tables. For more information, see Hiding dataset fields from applications.

  • Changing data types of the field attributes often helps you to deal with data types, or to ensure that numeric codes (like event ids) are processed correctly. For more information, see Changing data type.

  • Changing the default aggregation of fields at the dataset level prevents common mistakes when building visuals. For more information, see Changing field aggregation.

  • Providing user-friendly names for native columns or derived attributes often makes the visuals more accessible and saves some of the efforts of applying aliases to each field of the visual. For more information, see Automatically renaming dataset fields and Custom renaming dataset fields.