Profiler data testing
Note the following important information about the data that the profiler services have been tested with.
Test data for Compute Cluster enabled environments
The following dataset has been validated and works as expected for Compute Cluster enabled environments.
- Supported asset types:
  - External Tables:
    - CSV
    - Parquet
    - Iceberg
    - Avro
    - ORC
  - Managed Tables:
    - CSV
    - Parquet
    - Iceberg
    - Avro
    - ORC
- Scheduled profiling results:
  - Total data tested:
    - Total data volume: 5 TB
    - Total table count: 7700
  - Tested tables:
    - 600 Parquet tables
    - 6500 CSV tables
    - 300 Iceberg tables
    - 150 Avro tables
    - 150 ORC tables
- On-demand profiling results:
  - Parquet table: 250 GB, sample (17 GB)
    - Statistics Collector profiler:
      - 30 executors / 9 mins
      - 20 executors / 14 mins
      - 10 executors / 30 mins
    - Data Compliance profiler:
      - 30 executors / 13 mins
      - 20 executors / 18 mins
      - 10 executors / 35 mins
  - CSV table: 430 GB, sample (28 GB)
    - Statistics Collector profiler:
      - 30 executors / 23 mins
      - 20 executors / 40 mins
      - 10 executors / 86 mins
    - Data Compliance profiler:
      - 30 executors / 42 mins
      - 20 executors / 63 mins
      - 10 executors / 70 mins
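
The figures above can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the product: the names SCHEDULED_TABLE_COUNTS, ON_DEMAND_MINUTES, total_tables, and speedup_vs_baseline are hypothetical and chosen only for illustration. It sums the tested table counts and expresses each on-demand run time as a speedup relative to the 10-executor run, using only the numbers listed in this section.

```python
# Illustrative only: all names are hypothetical; the numbers are copied
# from the scheduled and on-demand profiling results listed above.

# Tested tables per format in the scheduled profiling run.
SCHEDULED_TABLE_COUNTS = {
    "Parquet": 600,
    "CSV": 6500,
    "Iceberg": 300,
    "Avro": 150,
    "ORC": 150,
}

# (table format, profiler) -> {executor count: run time in minutes}
ON_DEMAND_MINUTES = {
    ("Parquet", "Statistics Collector"): {30: 9, 20: 14, 10: 30},
    ("Parquet", "Data Compliance"): {30: 13, 20: 18, 10: 35},
    ("CSV", "Statistics Collector"): {30: 23, 20: 40, 10: 86},
    ("CSV", "Data Compliance"): {30: 42, 20: 63, 10: 70},
}


def total_tables(counts):
    """Return the sum of the per-format table counts."""
    return sum(counts.values())


def speedup_vs_baseline(minutes_by_executors, baseline=10):
    """Return {executors: baseline_minutes / minutes}, rounded to 2 decimals."""
    base = minutes_by_executors[baseline]
    return {
        executors: round(base / minutes, 2)
        for executors, minutes in sorted(minutes_by_executors.items())
    }


if __name__ == "__main__":
    print("Total tables:", total_tables(SCHEDULED_TABLE_COUNTS))
    for (table, profiler), timings in ON_DEMAND_MINUTES.items():
        print(table, "/", profiler, "->", speedup_vs_baseline(timings))
```

The speedup values only summarize the listed measurements. They are not a prediction of how a different dataset or cluster will scale.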