Post-upgrade - Ozone Gateway validation
If you are using CDE, you must validate that the Ozone Gateway is working as expected after upgrading CDP Private Cloud Data Services. This procedure applies to upgrades from both 1.5.2 and 1.5.3 to 1.5.4.
Run the following commands to list the types of logs that are available for a job run, and then to retrieve a log of one of those types.
Command 1
cde run logs --id <run_id> --show-types --vcluster-endpoint <job_api_url> --cdp-endpoint <cdp_control_plane_endpoint> --tls-insecure
For example,
cde run logs --id 8 --show-types --vcluster-endpoint https://76fsk4rz.cde-fmttv45d.apps.apps.shared-rke-dev-01.kcloud.cloudera.com/dex/api/v1 --cdp-endpoint https://console-cdp-keshaw.apps.shared-rke-dev-01.kcloud.cloudera.com --tls-insecure
Log:
TYPE | ENTITY | STREAM | ENTITY DEFAULT |
---|---|---|---|
driver/stderr | Driver | stderr | True |
driver/stdout | Driver | stdout | False |
executor_1/stderr | Executor 1 | stderr | True |
executor_2/stdout | Executor 2 | stdout | False |
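If you want to pick the driver log types out of this listing in a script, the table can be parsed as plain text. The following is a minimal Python sketch, assuming the output is pipe-separated exactly as in the example above; the function name `parse_log_types` and the `SAMPLE` string are illustrative and not part of the CDE CLI, and the real output layout may differ between versions.

```python
# Sample copied from the example output above; real output may differ.
SAMPLE = """\
TYPE | ENTITY | STREAM | ENTITY DEFAULT |
---|---|---|---|
driver/stderr | Driver | stderr | True |
driver/stdout | Driver | stdout | False |
executor_1/stderr | Executor 1 | stderr | True |
executor_2/stdout | Executor 2 | stdout | False |
"""


def parse_log_types(table_text):
    """Parse a pipe-separated table into a list of dicts keyed by header name."""
    header = None
    rows = []
    for line in table_text.splitlines():
        # Drop the leading/trailing pipe, then split into trimmed cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if not any(cells):
            continue  # blank line
        if all(set(c) <= set("-: ") for c in cells):
            continue  # separator row such as ---|---|---|---|
        if header is None:
            header = cells  # first real row is the header
            continue
        rows.append(dict(zip(header, cells)))
    return rows


rows = parse_log_types(SAMPLE)
# Keep only the log types that belong to the driver.
driver_types = [r["TYPE"] for r in rows if r["ENTITY"] == "Driver"]
print(driver_types)
```

Any of the values in the TYPE column can then be passed to the `--type` option shown in Command 2.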
Command 2
cde run logs --id <run_id> --type <log_type> --vcluster-endpoint <job_api_url> --cdp-endpoint <cdp_control_plane_endpoint> --tls-insecure
For example,
cde run logs --id 8 --type driver/stderr --vcluster-endpoint https://76fsk4rz.cde-fmttv45d.apps.apps.shared-rke-dev-01.kcloud.cloudera.com/dex/api/v1 --cdp-endpoint https://console-cdp-keshaw.apps.shared-rke-dev-01.kcloud.cloudera.com --tls-insecure
Log:
Setting spark.hadoop.yarn.resourcemanager.principal to hive
23/05/22 09:27:28 INFO SparkContext: Running Spark version 3.2.3.1.20.7172000.0-38
23/05/22 09:27:28 INFO ResourceUtils: ==============================================================
23/05/22 09:27:28 INFO ResourceUtils: No custom resources configured for spark.driver.
23/05/22 09:27:28 INFO ResourceUtils: ==============================================================
23/05/22 09:27:28 INFO SparkContext: Submitted application: PythonPi
23/05/22 09:27:28 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
23/05/22 09:27:29 INFO ResourceProfile: Limiting resource is cpus at 1 tasks per executor
23/05/22 09:27:29 INFO ResourceProfileManager: Added ResourceProfile id: 0
23/05/22 09:27:29 INFO SecurityManager: Changing view acls to: sparkuser,cdpuser1
23/05/22 09:27:29 INFO SecurityManager: Changing modify acls to: sparkuser,cdpuser1
23/05/22 09:27:29 INFO SecurityManager: Changing view acls groups to:
23/05/22 09:27:29 INFO SecurityManager: Changing modify acls groups to:
23/05/22 09:27:29 INFO SecurityManager: SecurityManager: authentication enabled; ui acls disabled; users with view permissions: Set(sparkuser, cdpuser1); groups with view permissions: Set(); users with modify permissions: Set(sparkuser, cdpuser1); groups with modify permissions: Set()
..................
.................
- If you can see the driver pod logs, then Ozone Gateway is working as expected and you can go ahead with the upgrade.
- If the logs do not appear, restart the Ozone Gateway and retrieve the Spark job's driver log again to check whether the Ozone Gateway is healthy.
- If you still cannot retrieve the Spark job's driver log, contact Cloudera Support.
- For more information about configuring the CDE CLI, see Using the Cloudera Data Engineering command line interface.
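The check described above can be scripted. The following is a minimal Python sketch that wraps the two commands from this procedure. It assumes that the `cde` binary is on the PATH, that the driver log is written to standard output, and that a zero exit code with non-empty output means the log was retrieved. The exit-code behavior is an assumption, not something documented here, and the helper names (`build_logs_command`, `run_cli`, `driver_log_available`) are hypothetical.

```python
import subprocess


def build_logs_command(run_id, vcluster_endpoint, cdp_endpoint, log_type=None):
    """Build the argument list for `cde run logs`, using only the flags shown above."""
    cmd = ["cde", "run", "logs", "--id", str(run_id)]
    if log_type is None:
        cmd.append("--show-types")      # Command 1: list available log types
    else:
        cmd += ["--type", log_type]     # Command 2: fetch one log type
    cmd += [
        "--vcluster-endpoint", vcluster_endpoint,
        "--cdp-endpoint", cdp_endpoint,
        "--tls-insecure",
    ]
    return cmd


def run_cli(cmd):
    """Run the command and return (exit code, captured stdout)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout


def driver_log_available(run_id, vcluster_endpoint, cdp_endpoint, runner=run_cli):
    """Return True if the driver/stderr log could be fetched and is non-empty.

    Assumption: exit code 0 plus non-empty output indicates success.
    """
    cmd = build_logs_command(run_id, vcluster_endpoint, cdp_endpoint,
                             log_type="driver/stderr")
    code, output = runner(cmd)
    return code == 0 and bool(output.strip())
```

For example, calling `driver_log_available(8, "<job_api_url>", "<cdp_control_plane_endpoint>")` with your own endpoints returns `False` when no driver log comes back, which corresponds to the case where you restart the Ozone Gateway and try again. The `runner` parameter exists only so that the check can be exercised without a live cluster.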