Google BigQuery

Learn how to configure Google BigQuery as a metadata source for Cloudera Octopai.

Tool Permissions Prerequisites

A Google Cloud Service Account (SA) that Cloudera Octopai uses to access BigQuery.

How to establish a Google Cloud Service Account

Creating a Google Cloud Service Account (SA)

  1. Open the Google Cloud Console and navigate to your desired project.
  2. Access the IAM & Admin menu and select 'Service Accounts'.
  3. Click on 'Create Service Account' to initiate the creation process.
  4. Assign a unique, identifiable name to your Service Account. Upon clicking 'Create and Continue', an Identity and Access Management (IAM) principal will be instantiated automatically.
  5. From the 'roles' dropdown menu, select both 'BigQuery Data Viewer' and 'BigQuery Job User' to assign necessary permissions to your Service Account.
  6. Complete the Service Account creation process by clicking 'Done'.
  7. Access the newly created Service Account and navigate to the 'KEYS' tab.
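
For teams that script their project setup, steps 3 to 5 above have a command-line counterpart. The sketch below only builds the gcloud argument lists and prints them; it does not run anything. It assumes the gcloud CLI is installed and authenticated, and the project ID and account name in the usage lines are hypothetical placeholders. The console procedure above remains the documented path.

```python
from __future__ import annotations

import shlex


def build_gcloud_commands(project_id: str, sa_name: str) -> list[list[str]]:
    """Build gcloud commands that create a Service Account and grant it
    the two roles named in step 5. Nothing is executed here."""
    # Service Account emails follow this fixed pattern in Google Cloud.
    email = f"{sa_name}@{project_id}.iam.gserviceaccount.com"

    commands = [
        ["gcloud", "iam", "service-accounts", "create", sa_name,
         f"--project={project_id}"],
    ]
    # Role IDs for 'BigQuery Data Viewer' and 'BigQuery Job User'.
    for role in ("roles/bigquery.dataViewer", "roles/bigquery.jobUser"):
        commands.append([
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member=serviceAccount:{email}",
            f"--role={role}",
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for command in build_gcloud_commands("my-project", "octopai-reader"):
        print(shlex.join(command))
```

Review the printed commands before running them; granting roles at the project level affects every dataset in that project.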

Generating a Service Account Key

  • Click on 'ADD KEY', then 'Create new key'. Make sure to select the JSON format. The key will automatically be downloaded to your local system.
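
Before pointing Cloudera Octopai at the downloaded key, it can help to confirm that the file is a complete Service Account key. The following is a minimal sketch that uses only the Python standard library; the function name and the choice of which fields to check are assumptions made for illustration, not part of Cloudera Octopai.

```python
import json
from pathlib import Path

# Fields that a Google Service Account JSON key file normally contains.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_key_file(key_path):
    """Return a list of problems found in the key file (empty if none)."""
    path = Path(key_path)
    if not path.is_file():
        return [f"file not found: {path}"]
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]

    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if not data.get(name)]
    # A user or OAuth client file would carry a different 'type' value.
    if data.get("type") and data["type"] != "service_account":
        problems.append(f"unexpected type: {data['type']}")
    return problems


# Example call (hypothetical path):
# print(check_key_file(r"C:\keys\octopai-bigquery.json"))
```

An empty list means the file has the expected shape; it does not prove that the key is still active or that the roles were granted.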

Configuring Cloudera Octopai with BigQuery

  1. Open Cloudera Octopai and create a new metadata source, choosing 'BigQuery' as the source type.
  2. Assign a descriptive name to this metadata source for easy reference in the future.
  3. Input the Project ID associated with your Google Cloud project. This can be found within the downloaded JSON file (under the 'project_id' field) or in the project selector on the Google Cloud Console.
  4. In the 'Key Path' field, specify the file path where your downloaded JSON key file is stored.
  5. Save your settings and initiate the connection by clicking 'Save and Run'.
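
The two values requested in steps 3 and 4 can both be derived from the key file. This sketch reads the 'project_id' field and resolves the key's absolute path so they can be copied into the form. The function name and the dictionary keys (which mirror the field labels above) are illustrative assumptions; Cloudera Octopai does not read this output directly.

```python
import json
from pathlib import Path


def octopai_form_values(key_path):
    """Return the Project ID and absolute Key Path for the metadata source form."""
    path = Path(key_path).resolve()
    data = json.loads(path.read_text(encoding="utf-8"))
    return {
        "Project ID": data["project_id"],  # step 3
        "Key Path": str(path),             # step 4
    }


# Example call (hypothetical path):
# print(octopai_form_values(r"C:\keys\octopai-bigquery.json"))
```

The key file must be readable from the machine where the Cloudera Octopai Client runs, so resolve the path on that machine.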

Setting up Google BigQuery Metadata Source

Metadata sources are configured in the Cloudera Octopai Client.

How to verify the extracted Metadata File

Access the Cloudera Octopai Target Folder (TGT)

  1. Go to the TGT Folder located on the Server where the Cloudera Octopai Client is installed. By default: C:\Program Files (x86)\Octopai\Service\TGT
  2. Open the zip file whose name contains your connector name.
  3. Verify its contents: check the quantity and quality of the inner files.
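
The verification in steps 2 and 3 can be partly automated. The sketch below counts the files in the archive, lists any that are empty, and runs the standard library's integrity check. It makes no assumption about what files a healthy extraction contains, so treat it as a first pass only; the zip file name in the usage comment is a hypothetical placeholder.

```python
import zipfile


def summarize_zip(zip_path):
    """Summarize an extraction archive: file count, empty files, CRC check."""
    with zipfile.ZipFile(zip_path) as archive:
        # testzip() returns the name of the first corrupt member, or None.
        first_corrupt = archive.testzip()
        files = [info for info in archive.infolist() if not info.is_dir()]
        return {
            "file_count": len(files),
            "empty_files": [info.filename for info in files
                            if info.file_size == 0],
            "first_corrupt_file": first_corrupt,
        }


# Example call (hypothetical file name inside the default TGT folder):
# print(summarize_zip(
#     r"C:\Program Files (x86)\Octopai\Service\TGT\my_connector.zip"))
```

A file count of zero, or a long list of empty files, usually points back to the permissions or connection settings.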

Troubleshooting

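If an error occurs during the extraction:

  • Check that the Service Account has the 'BigQuery Data Viewer' and 'BigQuery Job User' roles.
  • Send the log, along with the connector number and name, to Cloudera Support. Logs are located in: C:\Program Files (x86)\Octopai\Service\log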
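
To find which logs to attach, the most recently modified files in the log folder are usually the relevant ones. This sketch lists them newest first. It does not assume any log file naming scheme, and the default limit of five files is an arbitrary choice.

```python
from pathlib import Path

# Default log folder, as stated above.
DEFAULT_LOG_DIR = r"C:\Program Files (x86)\Octopai\Service\log"


def newest_logs(log_dir=DEFAULT_LOG_DIR, limit=5):
    """Return up to `limit` files from log_dir, most recently modified first."""
    files = [p for p in Path(log_dir).iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


# Example call on the server where the Cloudera Octopai Client is installed:
# for log_file in newest_logs():
#     print(log_file)
```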