Enabling browsing Ozone from Hue on Cloudera Base on premises
Hue can read and write files on the Ozone filesystem, similar to S3 or ADLS. To
access Ozone from Hue, you must add configurations to the Hue Service Advanced
Configuration Snippet (Safety Valve) for hue_safety_valve.ini field in Cloudera Manager.
You can perform this task on any Hue instance in any
environment in which you want to enable the Ozone File Browser. If multiple Hue
instances exist within the same cluster, completing this task on one instance per
environment is sufficient.
Ensure that the Ozone HttpFS Gateway role is running in a healthy state.
Go to Cloudera Manager > Ozone > Configuration and add the following entries in the HttpFS Gateway
Advanced Configuration Snippet (Safety Valve) for
ozone-conf/httpfs-site.xml field:
Field name    Value
Name          httpfs.proxyuser.[***PRINCIPAL-NAME***].hosts
Value         *
Name          httpfs.proxyuser.[***PRINCIPAL-NAME***].groups
Value         *
Replace [***PRINCIPAL-NAME***] with the
actual Kerberos principal name. The hive
principal is the default principal required for communication between Ozone
and Hue. If the principal name was changed during installation (for example, to a
custom principal for the Hive service), use the modified principal name
instead.
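The two proxy-user entries above can also be sketched as XML properties, which may be convenient if you enter them through an XML view of the safety valve. This is an illustration only: the helper name proxyuser_properties is hypothetical, and hive is used because it is the default principal named in the text.

```python
from xml.sax.saxutils import escape


def proxyuser_properties(principal: str) -> str:
    """Render the two httpfs.proxyuser entries for one principal as XML."""
    if not principal or any(ch.isspace() for ch in principal):
        raise ValueError("principal must be a non-empty name without whitespace")
    entries = []
    for suffix in ("hosts", "groups"):
        name = f"httpfs.proxyuser.{principal}.{suffix}"
        entries.append(
            "<property>\n"
            f"  <name>{escape(name)}</name>\n"
            "  <value>*</value>\n"
            "</property>"
        )
    return "\n".join(entries)


# "hive" is the default principal; substitute a custom principal if one is used.
print(proxyuser_properties("hive"))
```

If the principal was customized during installation, pass that name instead of hive so that both property names change together.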
Obtain the following values from the Ozone service. You need them to construct the
fs_defaultfs and webhdfs_url URLs:
HttpFS Gateway host name (Gateway node)
Ozone HttpFS Gateway HTTP Web UI Port
(ozone.httpfs.http-port)
The default port is 9778. Ensure that this port (or whichever port is
configured for that HttpFS instance) on the Gateway node where the
Ozone HttpFS Gateway is installed is reachable from the Hue node.
Ozone Service ID (ozone.service.id).
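As a rough sketch, the three values above combine into the fs_defaultfs and webhdfs_url settings as follows. The function names, the example host name, and the example service ID are hypothetical. The https scheme is an assumption; use http if the gateway is not TLS-enabled. The port_reachable helper is a plain TCP connection test meant to be run from the Hue node; it does not verify that the service behind the port is the HttpFS Gateway.

```python
import socket


def ozone_urls(service_id: str, httpfs_host: str, httpfs_port: int = 9778,
               scheme: str = "https") -> dict:
    """Build the fs_defaultfs and webhdfs_url values from the Ozone service details."""
    if not 0 < httpfs_port < 65536:
        raise ValueError("port out of range")
    return {
        "fs_defaultfs": f"ofs://{service_id}",
        "webhdfs_url": f"{scheme}://{httpfs_host}:{httpfs_port}/webhdfs/v1",
    }


def port_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical example values; replace them with the values from your cluster.
    for key, value in ozone_urls("ozone1", "httpfs-gw.example.com").items():
        print(f"{key}={value}")
    print("reachable:", port_reachable("httpfs-gw.example.com", 9778))
```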
Log in to Cloudera Manager as an Administrator.
Go to Clusters > Hue > Configuration and add the following lines in the Hue Service
Advanced Configuration Snippet (Safety Valve) for
hue_safety_valve.ini field:
This optional configuration extends the default data push
timeout in Ozone from 120 seconds to 300 seconds. This can enhance
performance when transferring larger data chunks to Ozone,
especially over slower network connections.
This configuration is necessary because Ozone does not support
chunked uploads or the `/append` HttpFS API. Therefore, files must
be uploaded to Ozone as a single, complete chunk.
[[ozone]]
[[[default]]]
# [***SERVICE-ID***] is the ozone.service.id value
fs_defaultfs=ofs://[***SERVICE-ID***]
webhdfs_url=https://[***OZONE-HTTPFS-HOST***]:[***OZONE-HTTPFS-PORT***]/webhdfs/v1
ssl_cert_ca_verify=true
security_enabled=true
This configuration enables you to browse objects within Ozone
DB buckets from Hue.
[hadoop]
upload_chunk_size=2147483648
This configuration increases the default file upload chunk size
from 64 MB to 2 GB, which is the maximum supported by the Django
code used by Hue when uploading files.
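The arithmetic behind the upload_chunk_size value can be checked with a few lines. This sketch assumes binary units (1 MB = 1024 * 1024 bytes), which is consistent with 2147483648 being exactly 2 * 1024^3. The helper fits_in_one_chunk is hypothetical and only illustrates the size limit.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

default_chunk = 64 * MIB      # the 64 MB default named in the text
configured_chunk = 2 * GIB    # the value set in upload_chunk_size


def fits_in_one_chunk(file_size_bytes: int, chunk_size: int = configured_chunk) -> bool:
    """Return True if a file of this size can be sent as a single chunk."""
    return 0 <= file_size_bytes <= chunk_size


print(default_chunk)                      # 67108864
print(configured_chunk)                   # 2147483648
print(configured_chunk // default_chunk)  # 32
```

With this setting, a file larger than 2 GB would not fit in one chunk, which matters because the upload has to go to Ozone as a single chunk.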
[[database]]
port=0
options={"threaded":true}
This configuration is required only if the metastore is set on
an external Oracle database.
This configuration enables ZooKeeper to handle HiveServer2
failovers.
Click Save Changes.
Log in to Hue as an Administrator on any one instance within the
environment.
Click your username in the lower-left corner of the interface, and select
Administer Users.
Navigate to the Groups tab, select the default
group, and ensure that the filebrowser.ofs_access:Access to OFS from filebrowser
and filepicker permission is selected.
Click Update group to save the changes.
Restart the Hue service.
After you configure the Hue safety valve and restart the
Hue service, the Ozone file browser can take 10 to 15 minutes to
appear in the Hue web interface, as shown in the following image.