Troubleshooting for RAZ-enabled AWS environment

This section includes common errors that might occur while using a RAZ-enabled AWS environment and the steps to resolve the issues.

Why does the AccessDeniedException error appear? How do I resolve it?

Complete error snippet:
org.apache.hadoop.hive.ql.metadata.HiveException: java.nio.file.AccessDeniedException: 
s3a://abc-eu-central/abc/hive: getFileStatus on s3a://abc-eu-central/abc/hive: 
com.amazonaws.services.signer.model.AccessDeniedException: Ranger result: DENIED, Audit: 
[AuditInfo={auditId={null} accessType={read} result={NOT_DETERMINED} policyId={-1} policyVersion={null} }], 
Username: hive/demo-dh-master0.abc.xcu2-8y8x.dev.cldr.work@abc.XCU2-8Y8X.DEV.CLDR.WORK (Service: null; 
Status Code: 403; Error Code: AccessDeniedException; Request ID: f137c7b2-4448-4599-b75e-d96ae9308c5b; 
Proxy: null):AccessDeniedException

Cause: This error appears if the hive user does not have the required S3 policy.

Solution: To resolve this error, add the required policy for the user. For more information, see Ranger policy options for RAZ-enabled AWS environment.

Why does the Permission denied error appear? How do I resolve it?

Complete error snippet:
Error: Error while compiling statement: 
FAILED: HiveAccessControlException Permission denied: user [csso_abc] does not have 
[READ] privilege on [s3a://abc-eu-central/abc/hive] (state=42000,code=40000)

Cause: This error appears if you did not add the required Hive URL authorization policy to the end user.

Solution: To resolve this error, add the required policy for the user. For more information, see Ranger policy options for RAZ-enabled AWS environment

Why does the S3 access in hive/spark/mapreduce fails despite having an "allow" policy error message appear? How do I resolve it?

Complete error snippet:
S3 access in hive/spark/mapreduce fails despite having an "allow" policy defined for the path 
with ‘java.nio.file.AccessDeniedException: s3a://sshiv-cdp-bucket/data/exttable/000000_0: getFileStatus on 
s3a://sshivalingamurthy-cdp-bucket/data/exttable/000000_0: com.amazonaws.services.signer.model.AccessDeniedException: 
Ranger result: NOT_DETERMINED, Audit: [AuditInfo={auditId={a6904660-1a0f-3149-8a0b-c0792aec3e19} accessType={read} 
result={NOT_DETERMINED} policyId={-1} policyVersion={null} }] (Service: null; Status Code: 403; Error Code: AccessDeniedException; 
Request ID: null):AccessDeniedException’
Cause: In some cases, RAZ S3 requires a non-recursive READ access on the parent S3 path. This error appears when the non-recursive READ access is not provided to the parent S3 path. To resolve this issue, perform the following steps:
  1. On the Ranger > Audit page, track the user and the path where authorization failed.

  2. Add the missing Ranger policy to the end user.
The following sample image shows the Access tab on the Ranger > Audit page where you can track the user and the path:
The image shows the Access tab on the Ranger Audit page where you can track the user and the path.

The following error appears after the job fails. How do I resolve this issue?

Complete error snippet:
2021-06-01 00:00:18,872 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         
- Shutting YarnJobClusterEntrypoint down with application status FAILED. Diagnostics java.nio.file.AccessDeniedException: 
s3a://jon-s3-raz/data/flink/araujo-flink/ha/application_1622436888784_0031/blob: getFileStatus on 
s3a://jon-s3-raz/data/flink/araujo-flink/ha/application_1622436888784_0031/blob: com.amazonaws.services.signer.model.AccessDeniedException: 
Ranger result: DENIED, Audit: [AuditInfo={auditId={84aba1b3-82b5-3d0c-a05b-8f5512e0fd36} accessType={read} result={NOT_DETERMINED} 
policyId={-1} policyVersion={null} }], Username: srv_kafka-client@PM-AWS-R.A465-9Q4K.CLOUDERA.SITE (Service: null; Status Code: 403; 
Error Code: AccessDeniedException; Request ID: null; Proxy: null):AccessDeniedException

Cause: This error appears if there is no policy granting the necessary access on the /data/flink/araujo-flink/ha path.

Solution: To resolve this issue, add the policy as described in Configuring Ranger policies for Flink.

What do I do to display the Thread ID in logs for RAZ?

When you enable the DEBUG level for RAZ, a detailed verbose log is generated. It is cumbersome and tedious to identify the issue pertaining to a specific RAZ authorization request; therefore, Thread ID is good information to capture in the debug log.

To capture Thread ID in the debug log, perform the following steps:

  • In Cloudera Manager, go to the Ranger RAZ service on the Configuration tab.
  • Search for Ranger Raz Server Logging Advanced Configuration Snippet (Safety Valve) property and enter the following information:

    log4j.logger.org.apache.ranger.raz=DEBUG

    log4j.appender.RFA.layout.ConversionPattern=[%p] %d{yy/MM/dd HH:mm:ss,SSS} [THREAD ID=%t]
              [CLASS=(%C:%L)] %m%
    n
  • Search for Ranger Raz Server Max Log Size. Enter 200 and choose MiB.
  • Click Save Changes.
  • Restart the Ranger RAZ service.

The service prints the debug log details with Thread ID. To debug an issue, you can filter the logs based on the Thread ID.

The sample snippet shows the log generated with Thread ID:
The image shows the shows the log generated with Thread ID.

What do I do when a long-running job consistently fails with the expired token error?

Sample error snippet - “Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: The provided token has expired. (Service: Amazon S3; Status Code: 400; Error Code: ExpiredToken; Request ID: xxxx; S3 Extended Request ID:xxxx...“

This issue appears when the S3 Security token has expired for the s3 call.

To resolve this issue, perform the following steps:
  1. Log in to the CDP Management Console as an Administrator and go to your environment.
  2. From the Data Lake tab, open Cloudera Manager.
  3. Go to the Clusters > HDFS service > Configurations tab.
  4. Search for the core-site.xml file corresponding to the required Data Hub cluster.
  5. Open the file and enter fs.s3a.signature.cache.max.size=0 to disable the signature caching in Ranger RAZ.
  6. Save and close the file.
  7. On the Home > Status tab, click Actions to the right of the cluster name and select Deploy Client Configuration.
  8. Click Deploy Client Configuration.
  9. Restart the HDFS service.