Known issues in Flow Management

Learn about the known issues and limitations in Flow Management clusters, their impact on functionality, and any available workarounds to mitigate these issues.

7.3.1.400

NiFi 1.28.1 with Cloudera Flow Management 2.2.9.400

There are no known issues in this release.

NiFi 2.3.0 with Cloudera Flow Management 4.2.1.400

NiFi service fails to start after upgrading to 7.3.1.400 from an earlier 7.3.1 version due to missing flow.json.gz file

When upgrading a Data Hub cluster from version 7.3.1.0, 7.3.1.100, 7.3.1.200, or 7.3.1.300 to 7.3.1.400 with NiFi 2, the upgrade process may fail, leaving NiFi instances in an unhealthy state and preventing the NiFi service from starting.

The issue occurs when only a flow.xml.gz file is present in the affected clusters, as the upgrade process expects a flow.json.gz file, the default format in NiFi 2.

This causes the post-upgrade validation script to fail with an error:
ERROR: please provide correct path to flow.json.gz file, current one is empty or invalid: "/hadoopfs/fs1/working-dir/flow.json.gz"

Resolve the issue using the following steps:

  1. Stop the NiFi service.
  2. Check the contents of the NiFi working directory (/hadoopfs/fs1/working-dir in the error above) on each NiFi node.
  3. If only a flow.xml.gz file is present, rename it to flow.json.gz on all NiFi nodes.
  4. Start the NiFi service.
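
Steps 2 and 3 can be sketched as a small shell helper to run on each NiFi node while the NiFi service is stopped. The function name rename_flow_file is hypothetical, and the default directory is taken from the error message above, so adjust it if your working directory differs.

```shell
# Hypothetical helper: rename flow.xml.gz to flow.json.gz in one working directory.
# Assumption: the default path comes from the error message and may differ on your cluster.
rename_flow_file() {
  local dir="${1:-/hadoopfs/fs1/working-dir}"
  if [ -f "$dir/flow.json.gz" ]; then
    # A flow.json.gz file already exists, so nothing needs to be renamed.
    echo "flow.json.gz already present in $dir, nothing to do"
    return 0
  fi
  if [ -f "$dir/flow.xml.gz" ]; then
    # Only flow.xml.gz is present: rename it as described in step 3.
    mv "$dir/flow.xml.gz" "$dir/flow.json.gz" || return 1
    echo "renamed flow.xml.gz to flow.json.gz in $dir"
    return 0
  fi
  echo "neither flow.xml.gz nor flow.json.gz found in $dir" >&2
  return 1
}

# Usage on a NiFi node:
#   rename_flow_file /hadoopfs/fs1/working-dir
```

The helper leaves the directory untouched when a flow.json.gz file already exists, so it can be run on every node without checking first.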

7.3.1.0

NiFi 1.28.1 with Cloudera Flow Management 2.2.9

CFM-4331: HBase 1.1.2 components incompatible with JDK 17

HBase 1.1.2 components are not compatible with JDK 17.

To ensure full functionality and compatibility:
  1. Upgrade HBase 1.1.2 components to their corresponding versions in HBase 2.
  2. Upgrade your HBase servers.

If upgrading the servers is not feasible, the HBase 2 client can still interact with HBase 1 servers, but compatibility is limited. Basic functionality works, but new features introduced in the HBase 2 client are not supported when interacting with an HBase 1 server.

Unused NiFi configuration values

The following NiFi configuration values are no longer in use. They are still visible in the UI, but they are obsolete and have no effect on functionality.
  • nifi.nar.hotfix.provider.file.list.identifier
  • nifi.nar.hotfix.provider.location.identifier
  • nifi.nar.hotfix.provider.last.modification.identifier
  • nifi.nar.hotfix.provider.directory.identifier
  • nifi.nar.hotfix.provider.date.time.format
  • nifi.nar.hotfix.provider.proxy.user
  • nifi.nar.hotfix.provider.proxy.password
  • nifi.nar.hotfix.provider.proxy.server
  • nifi.nar.hotfix.provider.proxy.server.port
  • nifi.nar.hotfix.provider.connect.timeout
  • nifi.nar.hotfix.provider.read.timeout
  • nifi.nar.hotfix.provider.nar.location
  • nifi.nar.hotfix.provider.poll.interval
  • nifi.nar.hotfix.provider.implementation
  • nifi.nar.hotfix.provider.user.name
  • nifi.nar.hotfix.provider.password
  • nifi.nar.hotfix.provider.base.url
  • nifi.nar.hotfix.provider.required.version
  • nifi.nar.hotfix.provider.enabled

Unable to view NiFi or NiFi Registry user interface after upgrade due to authorization provider change

After upgrading Flow Management Data Hub clusters to Cloudera on cloud 7.3.1 (or 7.2.18), you may encounter an issue where the NiFi or NiFi Registry user interface is inaccessible, displaying the following error:

Unable to view the user interface

In versions prior to 7.2.18, NiFi group authorization relied on the host’s SSSD configuration to synchronize groups using the SHELL user group provider. Starting in Cloudera on cloud 7.2.18, the SHELL user group provider is deprecated, and newly deployed clusters default to the LDAP user group provider. The impacted components are NiFi and NiFi Registry.

To resolve this issue in upgraded clusters, you must manually reconfigure the authorization provider to use LDAP. A script is available to automate the configuration update for both NiFi and NiFi Registry.

Follow the steps below to manually configure LDAP on your Flow Management Data Hub cluster:

  1. Identify the management node of the Flow Management cluster and copy the Fully Qualified Domain Name (FQDN).
  2. SSH into the management node.
  3. Copy the script content provided below and save it to a file on the management node.
  4. Set executable permissions on the script file: chmod 755 [***script_name***].sh
  5. Run the script passing the management node’s FQDN as an argument by using the following command: ./[***script_name***].sh FQDN_OF_MANAGEMENT_NODE
  6. When prompted, enter your Cloudera username and password to authorize the changes.

After completing these steps, NiFi and/or NiFi Registry will be configured to use the LDAP user group provider.

The script includes two functions, nifi and nifiregistry, which configure the LDAP user group provider for their respective services. Running the script updates NiFi and/or NiFi Registry to use LDAP, resolving the “Unable to view the user interface” error.
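
Before running the script, it may help to confirm that the management node has what the script relies on: the curl, jq, and awk commands and a readable /etc/cloudera-scm-server/cm.settings file. The following pre-flight check is a minimal sketch and is not part of the script itself; the function name check_prereqs is hypothetical.

```shell
# Hypothetical pre-flight check, separate from the LDAP configuration script.
# Usage: check_prereqs [SETTINGS_FILE] [COMMAND...]
# Defaults mirror what the script uses: cm.settings plus curl, jq, and awk.
check_prereqs() {
  local settings="${1:-/etc/cloudera-scm-server/cm.settings}"
  local missing=0 cmd
  if [ "$#" -gt 0 ]; then
    shift
  fi
  if [ "$#" -eq 0 ]; then
    set -- curl jq awk
  fi
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing required command: $cmd" >&2
      missing=1
    fi
  done
  if [ ! -r "$settings" ]; then
    echo "cannot read $settings (run this on the management node)" >&2
    missing=1
  fi
  return "$missing"
}

# Usage on the management node:
#   check_prereqs && echo "ready to run the script"
```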

#!/bin/bash
clear
#init incoming variables
GREEN="\033[1;32m"
ORANGE="\033[38;2;255;165;0m"
RESET="\033[0m"
RED="\033[1;31m"
# Use the FQDN passed as the first argument; fall back to this host's name
CM_HOST="${1:-$(hostname)}"

#Get my auth
echo -ne "${ORANGE}User Name: " # Braces separate the variable name from the text that follows
read -rs USERNAME # -r keeps backslashes in the input intact
echo -ne "\nEnter Password: $RESET"
read -rs PASSWORD
echo #to prevent weird need to hit enter twice
AUTH="$USERNAME:$PASSWORD"

# Get my LDAP info

# Extract password and LDAP info from cm.settings
if [[ ! -f /etc/cloudera-scm-server/cm.settings ]]; then
  echo -ne "${RED}Error: File /etc/cloudera-scm-server/cm.settings does not exist." \
"\nMust be on Management node of DataHub\n${RESET}"
  exit 1
fi
LDAP_URL=$(awk '/setsettings LDAP_URL/ {print $NF}' /etc/cloudera-scm-server/cm.settings)
LDAP_BIND_DN=$(awk '/setsettings LDAP_BIND_DN/ {print $NF}' /etc/cloudera-scm-server/cm.settings)
LDAP_BIND_PW=$(awk '/setsettings LDAP_BIND_PW/ {print $NF}' /etc/cloudera-scm-server/cm.settings)
LDAP_USER_SEARCH_BASE=$(awk '/setsettings LDAP_USER_SEARCH_BASE/ {print $NF}' /etc/cloudera-scm-server/cm.settings)
LDAP_GROUP_SEARCH_BASE=$(awk '/setsettings LDAP_GROUP_SEARCH_BASE/ {print $NF}' /etc/cloudera-scm-server/cm.settings)


# Get the CM API version. An empty or long response usually means a bad host, user, or password, so exit with an error.
CM_API=$(curl -s -k -u "$AUTH" "https://$CM_HOST:7183/api/version")
if [[ -z "$CM_API" || ${#CM_API} -gt 4 ]]; then # A valid reply is a short version string
    echo -ne "$RED Error! Most likely bad credentials or host. Response:\n\n$CM_API $RESET\n"
    exit 1
fi
CM_HOST_API_URL="https://$CM_HOST:7183/api/$CM_API"
CM_CLUSTER_NAME=$(curl -s -k -u "$AUTH" -X GET "$CM_HOST_API_URL/clusters?clusterType=any&view=SUMMARY" |\
jq -r '.items[].name')


nifi ()
{
  SERVICE="nifi-NIFI-BASE"

  mapfile -t CM_ROLES < <(curl --header "Content-Type: application/json" --silent --insecure  --request GET \
  "$CM_HOST_API_URL/clusters/$CM_CLUSTER_NAME/services/$SERVICE/roleConfigGroups" \
  -u "$AUTH" | jq -r '.items[].name' | grep -v "GATEWAY")

  cat > .cloudera-$SERVICE.json <<- EOF
  {"items":[
    {"name":"nifi.ldap.url","value":"$LDAP_URL"},
    {"name":"nifi.ldap.manager.dn","value":"$LDAP_BIND_DN"},
    {"name":"nifi.ldap.manager.password","value":"$LDAP_BIND_PW"},
    {"name":"nifi.ldap.user.search.base","value":"$LDAP_USER_SEARCH_BASE"},
    {"name":"xml.authorizers.userGroupProvider.ldap-user-group-provider.property.Group Search Base","value":"$LDAP_GROUP_SEARCH_BASE"},
    {"name":"nifi.ldap.enabled","value":"true"},
    {"name":"xml.authorizers.userGroupProvider.shell-user-group-provider.enabled","value":"false"},
    {"name":"nifi.ldap.authentication.strategy","value":"LDAPS"},
    {"name":"nifi.ldap.tls.protocol","value":"TLS"},
    {"name":"nifi.ldap.tls.keystore.type","value":"jks"},
    {"name":"nifi.ldap.tls.truststore.type","value":"jks"},
    {"name":"nifi.ldap.tls.keystore","value":"\${nifi.security.keystore}"},
    {"name":"nifi.ldap.tls.truststore","value":"\${nifi.security.truststore}"},
    {"name":"xml.authorizers.userGroupProvider.ldap-user-group-provider.property.Group Object Class","value":"top"},
    {"name":"xml.authorizers.userGroupProvider.ldap-user-group-provider.property.User Group Name Attribute","value":"memberOf"},
    {"name":"xml.authorizers.userGroupProvider.ldap-user-group-provider.property.User Identity Attribute","value":"uid"},
    {"name":"xml.authorizers.userGroupProvider.ldap-user-group-provider.property.Group Name Attribute","value":"cn"},
    {"name":"xml.authorizers.userGroupProvider.composite-user-group-provider.property.User Group Provider 2","value":"ldap-user-group-provider"},
    {"name":"staging/login-identity-providers.xml_role_safety_valve","value":"<property><name>xml.loginIdentityProviders.provider.ldap-provider.property.TLS - Keystore Password</name><value>\${nifi.security.keystorePasswd}</value></property><property><name>xml.loginIdentityProviders.provider.ldap-provider.property.TLS - Truststore Password</name><value>\${nifi.security.truststorePasswd}</value></property>"},
    {"name":"staging/authorizers.xml_role_safety_valve","value":"<property><name>xml.authorizers.userGroupProvider.ldap-user-group-provider.property.TLS - Keystore Password</name><value>\${nifi.security.keystorePasswd}</value></property><property><name>xml.authorizers.userGroupProvider.ldap-user-group-provider.property.TLS - Truststore Password</name><value>\${nifi.security.truststorePasswd}</value></property>"}
  ]}
EOF
updateService
}

updateService ()
{
  echo -ne "\n$GREEN Calling CM api to change $SERVICE config\n $RESET"
  for role in "${CM_ROLES[@]}"; do
    echo -ne "$ORANGE \n Updating role $role *** $RESET\n"
    curl -s --header "Content-Type: application/json" --insecure  --request PUT --data "@.cloudera-$SERVICE.json" \
    -u "$AUTH" "$CM_HOST_API_URL/clusters/$CM_CLUSTER_NAME/services/$SERVICE/roleConfigGroups/$role/config" > /dev/null
  done
  curl -s --header "Content-Type: application/json" --insecure  --request POST \
  -u "$AUTH" "$CM_HOST_API_URL/clusters/$CM_CLUSTER_NAME/services/$SERVICE/commands/restart" > /dev/null
  rm -f ".cloudera-$SERVICE.json"
  echo -ne "\n$GREEN Configured $ORANGE ldap-user-group provider $GREEN on SERVICE: $ORANGE $SERVICE $RESET\n\n\n"
}

nifiregistry ()
{
  SERVICE="nifiregistry"
  mapfile -t CM_ROLES < <(curl --header "Content-Type: application/json" --silent --insecure  --request GET \
    "$CM_HOST_API_URL/clusters/$CM_CLUSTER_NAME/services/$SERVICE/roleConfigGroups" \
    -u "$AUTH" | jq -r '.items[].name' | grep -v "GATEWAY")
  cat > .cloudera-$SERVICE.json <<- EOF
  {"items":[
    {"name":"nifi.registry.ldap.url","value":"$LDAP_URL"},
    {"name":"nifi.registry.ldap.manager.dn","value":"$LDAP_BIND_DN"},
    {"name":"nifi.registry.ldap.manager.password","value":"$LDAP_BIND_PW"},
    {"name":"nifi.registry.ldap.user.search.base","value":"$LDAP_USER_SEARCH_BASE"},
    {"name":"xml.authorizers.userGroupProvider.ldap-user-group-provider.property.Group Search Base","value":"$LDAP_GROUP_SEARCH_BASE"},
    {"name":"nifi.registry.ldap.enabled","value":"true"},
    {"name":"xml.authorizers.userGroupProvider.shell-user-group-provider.enabled","value":"false"},
    {"name":"nifi.registry.ldap.authentication.strategy","value":"LDAPS"},
    {"name":"nifi.registry.ldap.tls.protocol","value":"TLS"},
    {"name":"nifi.registry.ldap.tls.keystore.type","value":"jks"},
    {"name":"nifi.registry.ldap.tls.truststore.type","value":"jks"},
    {"name":"nifi.registry.ldap.tls.keystore","value":"\${nifi.security.keystore}"},
    {"name":"nifi.registry.ldap.tls.truststore","value":"\${nifi.security.truststore}"},
    {"name":"xml.authorizers.userGroupProvider.ldap-user-group-provider.property.Group Object Class","value":"top"},
    {"name":"xml.authorizers.userGroupProvider.ldap-user-group-provider.property.User Group Name Attribute","value":"memberOf"},
    {"name":"xml.authorizers.userGroupProvider.ldap-user-group-provider.property.User Identity Attribute","value":"uid"},
    {"name":"xml.authorizers.userGroupProvider.ldap-user-group-provider.property.Group Name Attribute","value":"cn"},
    {"name":"xml.authorizers.userGroupProvider.composite-user-group-provider.property.User Group Provider 2","value":"ldap-user-group-provider"},
    {"name":"staging/identity-providers.xml_role_safety_valve","value":"<property><name>xml.loginIdentityProviders.provider.ldap-provider.property.TLS - Keystore Password</name><value>\${nifi.security.keystorePasswd}</value></property><property><name>xml.loginIdentityProviders.provider.ldap-provider.property.TLS - Truststore Password</name><value>\${nifi.security.truststorePasswd}</value></property>"},
    {"name":"staging/authorizers.xml_role_safety_valve","value":"<property><name>xml.authorizers.userGroupProvider.ldap-user-group-provider.property.TLS - Keystore Password</name><value>\${nifi.security.keystorePasswd}</value></property><property><name>xml.authorizers.userGroupProvider.ldap-user-group-provider.property.TLS - Truststore Password</name><value>\${nifi.security.truststorePasswd}</value></property>"}
  ]}
EOF
updateService
}
nifi
nifiregistry
echo -ne "$GREEN\nRestarting $ORANGE NiFi / NiFi Registry $RESET\n"
sleep 10
exit 0

PutIcebergCDC processor error: Unable to specify server’s Kerberos Principal name

When using the PutIcebergCDC processor, you may encounter an error if the Hadoop Configuration Resources property specified for the Catalog Service includes only the standard Hadoop configuration files from the Cloudera environment (/etc/hadoop/conf/core-site.xml, /etc/hadoop/conf/ssl-client.xml, and /etc/hive/conf/hive-site.xml). The error message states:
Failed to specify server’s Kerberos principal name.

To resolve this issue, add the hdfs-site.xml file to the Hadoop Configuration Resources property of the PutIcebergCDC processor’s Catalog Service.
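
The Hadoop Configuration Resources property takes a comma-separated list of files. As a minimal sketch, the helper below checks that each file is readable on the node and prints the value to paste into the property. The function name build_hadoop_conf_resources is hypothetical, and the /etc/hadoop/conf/hdfs-site.xml location in the usage comment is an assumption, so confirm the actual path on your cluster.

```shell
# Hypothetical helper: join configuration files into a comma-separated
# property value, failing if any file is not readable on this node.
build_hadoop_conf_resources() {
  local out="" f
  for f in "$@"; do
    if [ ! -r "$f" ]; then
      echo "not readable: $f" >&2
      return 1
    fi
    # Append with a comma separator, except before the first entry.
    out="${out:+$out,}$f"
  done
  printf '%s\n' "$out"
}

# Usage (the hdfs-site.xml path is an assumption):
#   build_hadoop_conf_resources \
#     /etc/hadoop/conf/core-site.xml \
#     /etc/hadoop/conf/ssl-client.xml \
#     /etc/hive/conf/hive-site.xml \
#     /etc/hadoop/conf/hdfs-site.xml
```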

Incomplete Ranger policy for NiFi metrics in Cloudera Manager

Cloudera Manager does not accurately reflect NiFi metrics for the NiFi service due to incomplete Flow NiFi access policies in Ranger. The required 'nifi' group is not included in the access policies, resulting in restricted access to the metrics data.

To ensure that Cloudera Manager accurately reflects the NiFi metrics for the NiFi service:
  1. Log in to Ranger and navigate to the Flow NiFi access policies.
  2. Add the 'nifi' group to the relevant access policies to ensure that Cloudera Manager can access the metrics data.
  3. Confirm and save the updated policies.

InferAvroSchema may fail when inferring schema for JSON data

In Apache NiFi 1.17, the dependency on Apache Avro was upgraded to 1.11.0. However, the InferAvroSchema processor takes its Avro version from the hadoop-libraries NAR it depends on, and the resulting mismatch causes a NoSuchMethodError exception.

Having well-defined schemas ensures consistent behavior, allows for proper schema versioning, and prevents downstream systems from generating errors because of unexpected schema changes. In addition, schema inference is not always fully accurate and can be an expensive operation in terms of performance.

Use the ExtractRecordSchema processor with the proper Reader to infer the Avro schema for your data.

NiFi 2.0.0 with Cloudera Flow Management 4.2.1

Processors using OpenAI library may not work

When using Flow Management clusters, several processors relying on the OpenAI library are not functional due to compatibility issues caused by OpenAI API changes. The affected processors use an outdated OpenAI library version that is no longer supported. The impacted processors are:
  • PutChroma
  • QueryChroma
  • PromptChatGPT
  • PutOpenSearchVector
  • QueryOpenSearchVector
  • PutPinecone
  • QueryPinecone
  • PutQdrant
  • QueryQdrant

These processors require an updated OpenAI library version (1.56.2 or later) to function correctly.

To restore functionality for the impacted processors, follow these steps:
  1. Update the OpenAI library version in the associated Python code to version 1.56.2.

    1. Locate the processor's .py file.

      For example: /opt/cloudera/parcels/CFM-4.0.0.0-382/NIFI-2/python/extensions/openai/PromptChatGPT.py

    2. Find "openai==1.9.0" and replace it with "openai==1.56.2".

  2. Navigate to the NiFi work directory and delete the folder for the affected processors.

    The directory path for an affected processor typically follows this structure: /var/lib/nifi/python_artifacts/extensions/<ProcessorName>/<Version>

    For example, for the PromptChatGPT processor, the path would be: /var/lib/nifi/python_artifacts/extensions/PromptChatGPT/2.0.0.4.0.0.0-382

    So in this case, delete the entire PromptChatGPT folder, including its version folder.

  3. Restart the NiFi service to apply the changes.
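
Steps 1 and 2 can be sketched in shell as follows. Both function names are hypothetical. The version strings come from the steps above, while the file and directory paths are passed in as arguments because they vary with the installed parcel and processor version.

```shell
# Hypothetical helper: replace the openai==1.9.0 pin with openai==1.56.2
# in one processor .py file. Fails if the old pin is not found.
bump_openai_pin() {
  local py_file="$1" tmp rc
  if ! grep -q 'openai==1\.9\.0' "$py_file"; then
    echo "no openai==1.9.0 pin found in $py_file" >&2
    return 1
  fi
  tmp=$(mktemp) || return 1
  # Write through a temporary file, then copy back so the original
  # file keeps its ownership and permissions.
  sed 's/openai==1\.9\.0/openai==1.56.2/g' "$py_file" > "$tmp" && cat "$tmp" > "$py_file"
  rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Hypothetical helper: delete the artifacts folder of one processor,
# including its version folder.
clear_processor_artifacts() {
  local base="$1" processor="$2"
  if [ -z "$base" ] || [ -z "$processor" ]; then
    echo "usage: clear_processor_artifacts BASE_DIR PROCESSOR_NAME" >&2
    return 1
  fi
  rm -rf -- "$base/$processor"
}

# Usage, with the example paths from the steps above:
#   bump_openai_pin /opt/cloudera/parcels/CFM-4.0.0.0-382/NIFI-2/python/extensions/openai/PromptChatGPT.py
#   clear_processor_artifacts /var/lib/nifi/python_artifacts/extensions PromptChatGPT
```

Repeat both calls for each affected processor on every NiFi node, then restart the NiFi service.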

Invalid Python version

Because an invalid Python version is defined for the NiFi service, Python API-based processors (such as PromptChatGPT and QueryPinecone) remain invalid: the NiFi service cannot download their dependencies. To resolve the issue, change the Python version set in the nifi.python.command property.

  1. Go to your cluster in Cloudera Manager.
  2. Select NiFi from the list of services.
  3. Select Configuration.
  4. Review the value defined for the nifi.python.command property.
  5. Change the value to python3.11 if the current value is python3.9.
  6. Click Save changes.
  7. Stop the NiFi service.
  8. Delete the /hadoopfs/fs4/working-dir/python_artifacts directory from all NiFi nodes.
  9. Restart the NiFi service.
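
The node-level parts of this procedure can be sketched in shell: confirming that the python3.11 command resolves on a NiFi node before changing the property, and deleting the python_artifacts directory in step 8. Both function names are hypothetical, and the default directory is the one named in step 8.

```shell
# Hypothetical helper: report whether the given Python command is on PATH.
check_python_command() {
  local cmd="${1:-python3.11}"
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd found at $(command -v "$cmd")"
  else
    echo "$cmd not found on this node" >&2
    return 1
  fi
}

# Hypothetical helper: delete the python_artifacts directory (step 8).
# Run it only while the NiFi service is stopped.
remove_python_artifacts() {
  local dir="${1:-/hadoopfs/fs4/working-dir/python_artifacts}"
  if [ ! -d "$dir" ]; then
    echo "$dir does not exist, nothing to delete"
    return 0
  fi
  rm -rf -- "$dir"
}

# Usage on each NiFi node:
#   check_python_command python3.11
#   remove_python_artifacts
```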

PutIcebergCDC processor error: Unable to specify server’s Kerberos Principal name

When using the PutIcebergCDC processor, you may encounter an error if the Hadoop Configuration Resources property specified for the Catalog Service includes only the standard Hadoop configuration files from the Cloudera environment (/etc/hadoop/conf/core-site.xml, /etc/hadoop/conf/ssl-client.xml, and /etc/hive/conf/hive-site.xml). The error message states: Failed to specify server’s Kerberos principal name.

To resolve this issue, add the hdfs-site.xml file to the Hadoop Configuration Resources property of the PutIcebergCDC processor’s Catalog Service.

NiFi service fails to start after upgrading from 7.3.1.0 to a higher version due to missing flow.json.gz file

When upgrading a Data Hub cluster from version 7.3.1.0 to 7.3.1.100, 7.3.1.200, or 7.3.1.300 with NiFi 2, the upgrade process may fail, leaving NiFi instances in an unhealthy state and preventing the NiFi service from starting.

The issue occurs because the upgrade process expects a flow.json.gz file (the default for NiFi 2), but the affected clusters only contain a flow.xml.gz file. This mismatch causes the post-upgrade validation script to fail with the following error:
ERROR: please provide correct path to flow.json.gz file, current one is empty or invalid: "/hadoopfs/fs1/working-dir/flow.json.gz"

Resolve the issue using the following steps:

  1. Stop the NiFi service.
  2. Replace the existing NiFi CSD JAR on the Cloudera Manager node with a patched version. Contact Cloudera Support to obtain the correct file.
  3. Restart Cloudera Manager Services.
  4. Rename the existing flow.xml.gz file to flow.json.gz on all NiFi nodes.
  5. Start the NiFi service.

This issue was addressed in Cloudera on cloud 7.3.1.400 with Cloudera Flow Management 2.2.9.400.