How To Format CSV Workload Files
Typically, CSV workload files are extracted from logs or databases and, at a minimum, must have a column that contains the full text of queries. CSV files must be plain text files. For more details about using CSV files with Navigator Optimizer, see the FAQs about formatting upload files.
However, Navigator Optimizer can produce better results if uploaded CSV files also contain query identifiers and elapsed query execution times. If you also add additional columns that contain attributes you are interested in analyzing, you can use the Workload Slice & Dice Feature.
Columns in CSV Files that Navigator Optimizer Recognizes by Default
By default, Navigator Optimizer recognizes three types of information in plain text CSV files. Set the Navigator Optimizer column type label in the Map Columns for the CSV File page of the Upload wizard.
|Columns that contain...||Navigator Optimizer expects...||Set Navigator Optimizer column type label to...||Required/Optional|
|Query identifiers||Alphanumeric strings||QUERY ID||Optional|
|Elapsed query execution times||Numeric values||ELAPSED TIME||Optional|
|Full text of queries||
Strings that contain the full text of SQL or Hive queries enclosed in double quotation marks. See Handling Double Quotation Marks and Commas in Query Text for details.
Handling Double Quotation Marks and Commas in Query Text
Double Quotation Marks within Queries
If queries contain double quotation marks (" ") within the query text, change the double quotation marks to single quotation marks. For example, if you have:
select emps.id from emps where emps.name = "Joe" group by emps.mgr, emps.id;
Enclose the query text with double quotation marks and change the double quotation marks around Joe to single quotation marks:
"select emps.id from emps where emps.name = 'Joe' group by emps.mgr, emps.id;"
Commas within Queries
If queries contain commas (,) within the query text, enclose the query text with double quotation marks to differentiate from the commas used as delimiters in the CSV file. For example, if you have:
select orders.key, orders.id from orders where orders.price < 9999
Enclose the query text with double quotation marks to ensure the commas are parsed as part of the query text and not confused as delimiters:
"select orders.key, orders.id from orders where orders.price < 9999"
Column Headings Are Required and Must Be Unique
The first row of your CSV files must contain column headings and each column heading must be unique for the CSV file. Navigator Optimizer assumes that the first row of values in a CSV file contains column headings and does not include those values in its analysis. If each column heading within a CSV file is not unique, the file cannot be processed by Navigator Optimizer.
You can include additional columns that you require in your workload and set tags for them so Navigator Optimizer displays the columns with meaningful names. For example, you might need to identify the user that ran the query, or the report or application for which the query was run. In this example, you might set the column tags to “USER,” “REPORTNAME,” and “APPLICATION” in the Map Columns for the CSV File page of the Upload wizard. These tags enable Slice and Dice, allowing you to view information about each subset of queries in the Navigator Optimizer Dashboard.
Uploading Multiple CSV Files in .tar Files
If you upload multiple CSV files in an archive file, make sure that each CSV file in the archive contains the same columns.