Create Atlas entity type definitions

As an administrator, you must create the Atlas entity type definitions and enable Atlas in Cloudera Manager before submitting a Flink job, so that Atlas can be used for metadata management.

The Flink Atlas integration requires that users running Flink jobs have the privileges to write to the ATLAS_HOOK topic. Make sure that the ATLAS_HOOK topic has either public or publish privileges in Ranger. If this is not set, metadata collection cannot be performed.
  1. Copy and paste the following command:
    curl -k -u <atlas_admin>:<atlas_admin_pwd> --location --request \
  2. Replace <atlas_admin> and <atlas_admin_pwd> with your workload username and password.
  3. Navigate to Management Console > Environments, and select the environment where you have created your cluster.
  4. Click Data Lake.
  5. Click Endpoints.
  6. Click on the copy icon beside the Atlas endpoint.
  7. Replace <atlas_endpoint_url> with the copied Atlas endpoint.
    POST '<atlas_endpoint_url>/v2/types/typedefs' \
  8. Copy and paste the entity type definitions.
    --header 'Content-Type: application/json' \
    --data-raw '{
        "enumDefs": [],
        "structDefs": [],
        "classificationDefs": [],
        "entityDefs": [
            {
                "name": "flink_application",
                "superTypes": [
                    "Process"
                ],
                "serviceType": "flink",
                "typeVersion": "1.0",
                "attributeDefs": [
                    {
                        "name": "id",
                        "typeName": "string",
                        "cardinality": "SINGLE",
                        "isIndexable": true,
                        "isOptional": false,
                        "isUnique": true
                    },
                    {
                        "name": "startTime",
                        "typeName": "date",
                        "cardinality": "SINGLE",
                        "isIndexable": false,
                        "isOptional": true,
                        "isUnique": false
                    },
                    {
                        "name": "endTime",
                        "typeName": "date",
                        "cardinality": "SINGLE",
                        "isIndexable": false,
                        "isOptional": true,
                        "isUnique": false
                    },
                    {
                        "name": "conf",
                        "typeName": "map<string,string>",
                        "cardinality": "SINGLE",
                        "isIndexable": false,
                        "isOptional": true,
                        "isUnique": false
                    },
                    {
                        "name": "inputs",
                        "typeName": "array<string>",
                        "cardinality": "LIST",
                        "isIndexable": false,
                        "isOptional": false,
                        "isUnique": false
                    },
                    {
                        "name": "outputs",
                        "typeName": "array<string>",
                        "cardinality": "LIST",
                        "isIndexable": false,
                        "isOptional": false,
                        "isUnique": false
                    }
                ]
            }
        ],
        "relationshipDefs": []
    }'

    A short video also shows how to create the Atlas entity type definitions.
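Because the curl command in steps 1 to 8 is shown in pieces, it can help to see it assembled in one place. The following is a minimal sketch, not an official script: the function names (`atlas_url`, `post_typedefs`, `get_flink_typedef`) and the idea of saving the JSON body to a file are assumptions made for illustration. The second request assumes that the copied endpoint exposes the standard Atlas v2 `types/entitydef/name/<name>` path, which can be used to check that `flink_application` was registered.

```shell
#!/bin/sh
# Sketch only: function names and the JSON file argument are illustrative
# assumptions, not part of the product.

# Join the copied Atlas endpoint with an API path without doubling the slash.
# $1 = Atlas endpoint copied from the Endpoints tab, $2 = API path
atlas_url() {
    printf '%s/%s\n' "${1%/}" "${2#/}"
}

# POST the entity type definitions stored in a JSON file
# (the same body that follows --data-raw in the steps above).
# $1 = Atlas endpoint, $2 = workload user name, $3 = path to the JSON file
# Only the user name is passed to -u, so curl prompts for the password.
post_typedefs() {
    curl -k -u "$2" --location --request POST \
        "$(atlas_url "$1" v2/types/typedefs)" \
        --header 'Content-Type: application/json' \
        --data-binary @"$3"
}

# Fetch the flink_application definition to confirm that it now exists.
# $1 = Atlas endpoint, $2 = workload user name
get_flink_typedef() {
    curl -k -u "$2" --location \
        "$(atlas_url "$1" v2/types/entitydef/name/flink_application)"
}
```

After sourcing the file, a call such as `post_typedefs '<atlas_endpoint_url>' '<atlas_admin>' typedefs.json` followed by `get_flink_typedef '<atlas_endpoint_url>' '<atlas_admin>'` would send the definitions and then read them back. Here `typedefs.json` is a hypothetical file holding the JSON from step 8.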

  9. Navigate to Management Console > Environments, and select the environment where you have created your cluster.
  10. Click on the Streaming Analytics cluster.
  11. Select Cloudera Manager UI from the Services.
  12. Select Flink from the list of clusters.
  13. Click Configuration.
  14. Search for enable atlas in the search bar.
  15. Make sure that Atlas is enabled in the configurations.

    A short video also shows how to enable Atlas for Flink in Cloudera Manager.