Create Atlas entity type definitions

As an administrator, you must create the Atlas entity type definitions before submitting a Flink job, and enable Atlas in Cloudera Manager to use Atlas for metadata management.

The Flink Atlas integration requires that users running Flink jobs have the privileges to write to the ATLAS_HOOK Kafka topic. You need to make sure that the ATLAS_HOOK has publish privileges in Ranger. In case this is not set, the metadata collection cannot be performed. For more information, see the Atlas documentation and the Grant permission for the ATLAS_HOOK topic documentation.
  1. Copy and paste the following command:
    curl -k -u <atlas_admin>:<atlas_admin_pwd> --location --request
  2. Provide your workload username and password as atlas admin.
  3. Navigate to Management Console > Environments, and select the environment where you have created your cluster.
  4. Click Data Lake.
  5. Click Endpoints.
  6. Click on the copy icon beside the Atlas endpoint.
  7. Replace the atlas server URL with the copied Atlas endpoint.
    POST '<atlas_endpoint_url>/v2/types/typedefs'
  8. Copy and paste the entity type definitions.
    --header 'Content-Type: application/json' \
    --data-raw '{
        "enumDefs": [],
        "structDefs": [],
        "classificationDefs": [],
        "entityDefs": [
            {
                "name": "flink_application",
                "superTypes": [
                    "Process"
                ],
                "serviceType": "flink",
                "typeVersion": "1.0",
                "attributeDefs": [
                    {
                        "name": "id",
                        "typeName": "string",
                        "cardinality": "SINGLE",
                        "isIndexable": true,
                        "isOptional": false,
                        "isUnique": true
                    },
                    {
                        "name": "startTime",
                        "typeName": "date",
                        "cardinality": "SINGLE",
                        "isIndexable": false,
                        "isOptional": true,
                        "isUnique": false
                    },
                    {
                        "name": "endTime",
                        "typeName": "date",
                        "cardinality": "SINGLE",
                        "isIndexable": false,
                        "isOptional": true,
                        "isUnique": false
                    },
                    {
                        "name": "conf",
                        "typeName": "map<string,string>",
                        "cardinality": "SINGLE",
                        "isIndexable": false,
                        "isOptional": true,
                        "isUnique": false
                    }
                ]
            }
        ],
        "relationshipDefs": []
    }'

    The following short video shows how you can create the Atlas entity type definitons:

  9. Navigate to Management Console > Environments, and select the environment where you have created your cluster.
  10. Click on the Streaming Analytics cluster.
  11. Select Cloudera Manager UI from the Services.
  12. Select Flink from the list of clusters.
  13. Click Configuration.
  14. Search for enable atlas in the search bar.
  15. Make sure that Atlas is enabled in the configurations.

    The following short video shows how you can enable Atlas for Flink in Cloudera Manager: