Importing the passenger data

Use Hue to import data which is not present in your data warehouse, but is available on your computer or cloud storage such as Amazon S3.

In this exercise, the passenger data is not already available in Hue or in CDP but instead is available on your computer in a CSV file. Import the passenger data into Hue from your computer as follows:

  1. Go to Importer and click .. at the end of the Path field.


    A pop-up window is dispalyed where you can choose a file.
  2. Type s3a:// in the address text box and press enter.
    The S3 buckets associated with the CDW environment are displayed. You can narrow down the list of results using the search option.


  3. Navigate to the bucket in which you want to upload the file and click Upload a file.
  4. Select the CSV file that you want to import into Hue.
    Hue displays the preview of the table along with the format.
    Hue automatically detects the field separator, record separator, and the quote character from the CSV file. If you want to override a specific setting, select a different value from the drop-down list.


  5. Click Next.
    On this page under DESTINATION, enter airline_ontime_orc.unique_tickets in the Name field.
    The unique_tickets table is created in the airline_ontime_orc database.
  6. Select Format as Text.
  7. Expand Extras and select the Store in Default location option.


  8. Change all the “bigint” column types to “int” under FIELDS and click Submit.
    The CREATE TABLE query is triggered.


    Hue displays the logs and opens the Table Browser from which you can view the newly created table when the operation completes successfully.


  9. Run the following query to find passengers with international connecting flights:
    -- Query to explore passenger manifest data:  do we have international connecting flights?
    SELECT 
    *
    FROM 
       `airline_ontime_orc`.unique_tickets a, 
       `airline_ontime_orc`.flights o, 
       `airline_ontime_orc`.flights d,
       `airline_ontime_orc`.airports oa, 
       `airline_ontime_orc`.airports da  
    WHERE
       a.leg1flightnum = o.flightnum
       AND a.leg1uniquecarrier = o.uniquecarrier 
       AND a.leg1origin = o.origin 
       AND a.leg1dest = o.dest 
       AND a.leg1month = o.month 
       AND a.leg1dayofmonth = o.dayofmonth
       AND a.leg1dayofweek = o.`dayofweek` 
       AND a.leg2flightnum = d.flightnum
       AND a.leg2uniquecarrier = d.uniquecarrier 
       AND a.leg2origin = d.origin 
       AND a.leg2dest = d.dest 
       AND a.leg2month = d.month 
       AND a.leg2dayofmonth = d.dayofmonth
       AND a.leg2dayofweek = d.`dayofweek` 
       AND d.origin = oa.iata 
       AND d.dest = da.iata 
       AND oa.country <> da.country 
    ;