CDP resource roles and other prerequisites

To get started in Cloudera Data Warehouse (CDW), your data must conform to supported compression codecs, and you must obtain CDP resource roles to grant users access to a private cloud environment. Users can then get started on CDW tasks, such as activating the environment from CDW.

Unsupported compression

CDW does not support LZO compression because of the licensing of the LZO library. You cannot query tables that use LZO compression in Virtual Warehouses, which use the CDW Impala or Hive LLAP engines.

CDP resource roles

Required role: PowerUser

The following CDP resource roles are associated with the CDW service. A CDP PowerUser must assign these roles to users who require access to the Database Catalogs and Virtual Warehouses that are associated with specific environments. After a PowerUser grants these roles to users and groups, those users and groups have access to the Database Catalogs and Virtual Warehouses that are associated with the environment.
  • DWAdmin: This role enables users or groups to grant a CDP user or group the ability to activate, terminate, launch, stop, or update services in Database Catalogs and Virtual Warehouses.
  • DWUser: This role enables users or groups to view and use CDW clusters (Virtual Warehouses) that are associated with specific environments.

Requirements for Hue

Hue in CDW requires WebHDFS to be enabled on the CDP Private Cloud Base cluster. Worker nodes for both Embedded Container Service (ECS) and OpenShift Container Platform (OCP) must have access to the WebHDFS (HTTPFS) port 14000.
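
One way to confirm the port requirement from a worker node is a plain TCP connection test. The following is a minimal sketch, not a Cloudera-provided tool: the function name can_reach is hypothetical, and it only checks that the port accepts connections, not that WebHDFS itself is enabled or healthy.

```python
import socket


def can_reach(host, port=14000, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection resolves the host name and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run from each ECS or OCP worker node, a call such as can_reach("httpfs.example.com") returns False if the port is blocked or the host cannot be resolved. The host name here is a placeholder for the Base cluster host that serves WebHDFS (HTTPFS).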

Recommended HAProxy timeout for HA deployments

If you have enabled High Availability (HA) for CDP Private Cloud Data Services on ECS or OCP, then set the HAProxy timeout values to 10 minutes or more, depending on how long your queries run. A higher timeout value is needed so that long-running queries do not time out.
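
To review an existing HAProxy configuration against this recommendation, the timeout directives can be parsed and compared with the 10-minute floor. The sketch below is illustrative only. It assumes that the relevant directives are "timeout client" and "timeout server", which this document does not specify, and the helper names to_ms and short_timeouts are hypothetical. It relies on the HAProxy time format, in which a value is an integer with an optional unit suffix (us, ms, s, m, h, d) and defaults to milliseconds.

```python
import re

# HAProxy time units expressed in milliseconds; a bare number means milliseconds.
_UNIT_MS = {"us": 0.001, "ms": 1, "s": 1000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}


def to_ms(value):
    """Convert an HAProxy time value such as '10m' or '30000' to milliseconds."""
    match = re.fullmatch(r"(\d+)(us|ms|s|m|h|d)?", value)
    if match is None:
        raise ValueError(f"unrecognized HAProxy time value: {value!r}")
    return int(match.group(1)) * _UNIT_MS[match.group(2) or "ms"]


def short_timeouts(config_text, minimum_ms=600_000, names=("client", "server")):
    """Return (name, value) pairs for timeout directives below minimum_ms.

    The default names are an assumption; adjust them to the timeouts
    that apply to your deployment.
    """
    found = []
    for line in config_text.splitlines():
        # Drop trailing comments, then split the directive into words.
        parts = line.split("#", 1)[0].split()
        if len(parts) >= 3 and parts[0] == "timeout" and parts[1] in names:
            if to_ms(parts[2]) < minimum_ms:
                found.append((parts[1], parts[2]))
    return found
```

For example, passing the text of a configuration that contains "timeout client 50s" and "timeout server 10m" reports only the client timeout as too short. Raise minimum_ms if your queries run longer than 10 minutes.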