Cloudera resource roles and other prerequisites
To get started in Cloudera Data Warehouse, your data must conform to supported compression codecs, and you must obtain Cloudera resource roles to grant users access to a private cloud environment. Users can then get started on tasks, such as activating the environment from Cloudera Data Warehouse.
Unsupported compression
Cloudera Data Warehouse does not support LZO compression because of the licensing of the LZO library. You cannot query LZO-compressed tables in Virtual Warehouses, which use the Cloudera Data Warehouse Impala or Hive LLAP engines.
Cloudera resource roles
Required role: PowerUser
Cloudera Data Warehouse uses the following resource roles:
- DWAdmin: This role enables users or groups to grant a Cloudera user or group the ability to activate, terminate, launch, stop, or update services in Database Catalogs and Virtual Warehouses.
- DWUser: This role enables users or groups to view and use Cloudera Data Warehouse clusters (Virtual Warehouses) that are associated with specific environments.
Requirements for Hue
Hue in Cloudera Data Warehouse requires WebHDFS to be enabled on the Cloudera Base on premises cluster. Worker nodes for both Embedded Container Service (ECS) and OpenShift Container Platform (OCP) must have access to the WebHDFS (HTTPFS) port 14000.
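One way to confirm the port requirement is to attempt a TCP connection from a worker node. The following is a minimal sketch, not a Cloudera-provided tool: the function name can_reach is hypothetical, and a successful connection shows only that the port is reachable, not that WebHDFS is enabled or correctly configured.

```python
import socket


def can_reach(host: str, port: int = 14000, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        # create_connection resolves the host name and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run it on each ECS or OCP worker node, for example can_reach("httpfs.example.internal"), where the host name is a placeholder for the host that serves WebHDFS (HTTPFS) in your base cluster.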
Recommended HAProxy timeout for HA deployments
If you have enabled High Availability (HA) for Cloudera Data Services on premises on ECS or OCP, set the HAProxy timeout values to 10 minutes or more, depending on how long your queries run. A higher timeout value prevents long-running queries from timing out.
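The recommendation above can be checked against an existing HAProxy configuration. The sketch below assumes that the relevant settings are the standard HAProxy directives timeout client and timeout server; this section does not name specific directives, so adjust the names argument to match your deployment. The function name short_timeouts is hypothetical. The parser follows the HAProxy time format, in which a value without a unit suffix is in milliseconds.

```python
import re

# HAProxy time units in milliseconds; a bare number means milliseconds.
_UNIT_MS = {"us": 0.001, "ms": 1, "s": 1000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}

# Matches lines such as "timeout client 50s", with an optional trailing comment.
_TIMEOUT_RE = re.compile(r"^\s*timeout\s+(\S+)\s+(\d+)(us|ms|s|m|h|d)?\s*(?:#.*)?$")


def short_timeouts(cfg_text, names=("client", "server"), minimum_ms=600_000):
    """Return {name: value_ms} for the named timeouts set below minimum_ms.

    The default minimum of 600,000 ms corresponds to 10 minutes.
    """
    found = {}
    for line in cfg_text.splitlines():
        m = _TIMEOUT_RE.match(line)
        if m and m.group(1) in names:
            value_ms = int(m.group(2)) * _UNIT_MS[m.group(3) or "ms"]
            if value_ms < minimum_ms:
                found[m.group(1)] = value_ms
    return found
```

For example, a configuration that contains "timeout client 50s" and "timeout server 10m" yields {"client": 50000}, which flags only the client timeout. The sketch does not distinguish between configuration sections: if the same timeout is set below the minimum in more than one section, only the last such value is reported.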