GreenPlum

Learn how to configure GreenPlum as a metadata source for Cloudera Octopai.

Tool Permissions Prerequisites

To enable remote access so that the Cloudera Octopai Client can perform the metadata extraction, add the following line to the pg_hba.conf file located under /data/master/gpseg-1:
# TYPE  DATABASE      USER          ADDRESS                    METHOD
host    databasename  octopai-user  octopai-client-ip-address  md5

Replace "databasename", "octopai-user", and "octopai-client-ip-address" with the actual values for your database and client. Specify the client address in CIDR notation, for example with a /32 suffix for a single IPv4 host. This line allows the Cloudera Octopai Client to access the specified database using md5 password authentication.
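The entry can also be generated rather than typed by hand. The following is a minimal Python sketch, not part of the Octopai tooling: the function name build_hba_line and the sample values (sales, octopai_user, 192.0.2.10) are hypothetical. It relies only on the standard ipaddress module to turn a bare host address into CIDR form.

```python
import ipaddress


def build_hba_line(database: str, user: str, client_ip: str, method: str = "md5") -> str:
    """Build one pg_hba.conf 'host' record (hypothetical helper, for illustration)."""
    # ip_network() rejects malformed addresses and renders a bare host
    # address in CIDR form, e.g. 192.0.2.10 -> 192.0.2.10/32.
    network = ipaddress.ip_network(client_ip, strict=True)
    return f"host  {database}  {user}  {network}  {method}"


if __name__ == "__main__":
    # Sample values only; substitute your own database, user, and client IP.
    print(build_hba_line("sales", "octopai_user", "192.0.2.10"))
```

After editing pg_hba.conf, the server configuration typically has to be reloaded before the new entry takes effect.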

Ensure the following prerequisites are met:

  • The PostgreSQL ODBC driver (psqlODBC) is installed on the machine running the Cloudera Octopai Client.
  • The server port is open for each PostgreSQL database connection.
  • An existing or new PostgreSQL user (OCTOPAI_USER) is available for each connection, with SELECT permission granted on the following dictionary tables:
    • information_schema.tables
    • information_schema.views
    • information_schema.columns
    • pg_catalog.pg_proc
    • pg_catalog.pg_namespace
    • pg_catalog.pg_attribute
    • pg_catalog.pg_constraint
    • pg_catalog.pg_class
    • pg_catalog.pg_database
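The SELECT grants for the dictionary tables listed above can be sketched as generated SQL. This is an illustrative Python sketch under assumptions: the role name octopai_user and the helper grant_statements are hypothetical, and the statements must still be run by a database administrator. On many PostgreSQL-based systems these catalog objects may already be readable by all roles, in which case the grants are redundant but harmless.

```python
# Dictionary tables taken from the prerequisites list.
DICTIONARY_TABLES = [
    "information_schema.tables",
    "information_schema.views",
    "information_schema.columns",
    "pg_catalog.pg_proc",
    "pg_catalog.pg_namespace",
    "pg_catalog.pg_attribute",
    "pg_catalog.pg_constraint",
    "pg_catalog.pg_class",
    "pg_catalog.pg_database",
]


def grant_statements(role: str = "octopai_user") -> list:
    """Return one GRANT SELECT statement per dictionary table (hypothetical helper)."""
    # Simple safety check: accept only plain identifiers, since the role
    # name is interpolated into SQL text without quoting.
    if not role.isidentifier():
        raise ValueError(f"unsupported role name: {role!r}")
    return [f"GRANT SELECT ON {table} TO {role};" for table in DICTIONARY_TABLES]


if __name__ == "__main__":
    # Print the statements so they can be reviewed before being executed.
    for statement in grant_statements():
        print(statement)
```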

How to set up the permissions

psqlodbc - PostgreSQL ODBC driver

Setting up GreenPlum Metadata Source

Metadata sources are configured on the Cloudera Octopai Client.

How to verify the extracted Metadata File

Access the Cloudera Octopai Target Folder (TGT)

  1. Go to the TGT folder on the server where the Cloudera Octopai Client is installed. The default location is C:\Program Files (x86)\Octopai\Service\TGT
  2. Open the zip file named after the connector.
  3. Verify its contents: check the quantity and quality of the files inside.
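The verification steps above can be partly automated. The following Python sketch uses only the standard zipfile module; the helper name summarize_extract is hypothetical, and it checks only basic properties (archive integrity, file count, empty files), not whether the metadata itself is correct.

```python
import zipfile


def summarize_extract(zip_source):
    """Return (file_count, empty_file_names) for an extracted-metadata zip.

    zip_source may be a path or a file-like object (hypothetical helper).
    """
    with zipfile.ZipFile(zip_source) as archive:
        # testzip() returns the name of the first corrupt member, or None.
        corrupt = archive.testzip()
        if corrupt is not None:
            raise ValueError(f"corrupt member in archive: {corrupt}")
        files = [info for info in archive.infolist() if not info.is_dir()]
        empty = [info.filename for info in files if info.file_size == 0]
    return len(files), empty
```

For example, calling summarize_extract with the full path of a connector zip file in the TGT folder returns the number of files in the archive and the names of any zero-byte files, which are worth inspecting first.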

Troubleshoot

If an error occurs during the extraction:

  • Check the permissions.
  • Send the log, together with the connector number and name, to Cloudera Support. Logs are located in C:\Program Files (x86)\Octopai\Service\log