Update Cloudera Data Warehouse on premises log configuration to point to Ozone
This topic describes how to configure Cloudera Data Warehouse on premises to store logs on Ozone.
To configure Cloudera Data Warehouse on premises and the underlying OpenShift cluster to store Hive and Impala logs on Ozone, you must gather some information and prepare a block of code that you will insert into the Virtual Warehouse ConfigMap on the OpenShift pod. These preliminary steps are described in the following section.
Before you start updating the configuration, get the following information and prepare the block of code for the Virtual Warehouse ConfigMap:
- Get the Cloudera Data Warehouse namespace for your Virtual Warehouse:
  - From the Management Console home page, click Data Warehouse in the left menu. The Overview page of the Cloudera Data Warehouse on premises service opens.
  - In the right-most column of the page, locate the Virtual Warehouse you want to configure log storage for, and note its Cloudera Data Warehouse namespace, which starts with "compute-".
- Prepare the code block that must be pasted into the OpenShift ConfigMap:
Here is an example:
    <match **>
      @type s3
      @log_level debug
      aws_key_id <access-id>
      aws_sec_key <sec-key>
      s3_bucket <bucket-name>
      s3_endpoint <ozone-s3-gateway-endpoint>
      ssl_verify_peer false
      s3_object_key_format "<warehouse_prefix>/warehouse/tablespace/external/hive/sys.db/logs/dt=%Y-%m-%d/${path_tag}/%{time_slice}_${unique_file_key}.log.%{file_extension}"
      time_slice_format %Y-%m-%d-%H-%M
      store_as gzip
      auto_create_bucket false
      check_apikey_on_start false
      force_path_style true
      check_bucket false
      check_object false
      <buffer path_tag, unique_file_key, time, warehouse>
        @type file
        path /tmp/fluentd-buffers/%{unique_file_key}-s3.buffer
        timekey 900 # minute precision for time_slice_format to have minute in file name
        timekey_use_utc true
        chunk_limit_size 265m
        flush_mode interval
        flush_interval "900s"
        flush_thread_count 8
        flush_at_shutdown true
      </buffer>
      <format>
        @type single_value
        message_key log
        add_newline true
      </format>
    </match>
In the above code block example:
  - <bucket-name> indicates the name of the Ozone bucket used for storing the Cloudera Data Warehouse on premises logs.
  - <ozone-s3-gateway-endpoint> indicates the endpoint of the Ozone S3 Gateway. Get this value from the Ozone S3 Gateway Web UI page of Cloudera Manager.
  - <access-id> and <sec-key> are the AWS access credentials for the Ozone S3 Gateway. Get these values by using the kinit -kt and the ozone s3 getsecret commands on the Cloudera Base on premises OpenShift cluster.
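
The values described above can also be collected from a shell. The following is a minimal sketch, not an official procedure. It assumes that the oc CLI is installed and logged in to the OpenShift cluster, and that the output of ozone s3 getsecret consists of awsAccessKey=... and awsSecret=... lines. The function names list_vw_namespaces and secret_field, and the <keytab-file> and <principal> placeholders, are hypothetical and used only for illustration.

```shell
# Sketch only: helpers for collecting the values used in the <match> block.

# Print the namespaces whose names start with "compute-".
# Assumption: `oc` is installed and logged in to the OpenShift cluster.
list_vw_namespaces() {
  oc get namespaces -o name | sed 's|^namespace/||' | grep '^compute-'
}

# Read key=value lines on stdin (assumed format of `ozone s3 getsecret`
# output) and print the value of the key given as the first argument.
secret_field() {
  sed -n "s/^$1=//p"
}

# Example usage on a cluster host (placeholders are hypothetical):
#   kinit -kt <keytab-file> <principal>
#   creds="$(ozone s3 getsecret)"
#   access_id="$(printf '%s\n' "$creds" | secret_field awsAccessKey)"
#   sec_key="$(printf '%s\n' "$creds" | secret_field awsSecret)"
```

The extracted access_id and sec_key values correspond to the <access-id> and <sec-key> placeholders in the code block. Keeping the parsing in a small function avoids copying the secret by hand, which is where trailing whitespace or truncated characters tend to creep in.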