HDFS namespace mapping to Ozone
Understanding how HDFS directories map to Ozone volumes, buckets, and keys is an essential part of planning the migration. In addition to how you perform the mapping, the number of Ozone elements to create for the migrated data is an important consideration.
Ozone volumes and HDFS mapping
You can map Ozone volumes to HDFS directories in more than one way. One approach is to create a volume called ‘/user’ in Ozone and create buckets that map to the second-level HDFS directories. This approach ensures that existing HDFS paths in Hive or Spark applications can migrate to Ozone by merely adding the ofs://<service-id> prefix.
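As a minimal sketch of this approach, the following commands create the volume and a bucket with the Ozone shell; the bucket name analyst stands in for a second-level HDFS directory and is hypothetical, as is the service ID ozone1. An existing HDFS path such as hdfs://<nameservice>/user/analyst/data would then become ofs://ozone1/user/analyst/data:
ozone sh volume create /user
ozone sh bucket create /user/analyst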
Number of Ozone volumes, buckets, and keys for the migration
Depending on the scale of your data, consider creating Ozone elements in the following ranges:
- Volumes in the range of 1-20
- Buckets per volume in the range of 1-1000
- Keys under /volume/bucket in the range of one to a few hundred million
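To check how many volumes, buckets, and keys already exist before creating more, you can list them with the Ozone shell; the volume and bucket names below are hypothetical:
ozone sh volume list
ozone sh bucket list /user
ozone sh key list /user/analyst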
Using OFS to abstract volumes and buckets
The Ozone Filesystem (OFS) connector abstracts volumes and buckets away from the filesystem client and works with the Ozone back end to create them internally. For example, the following command creates the volume and the bucket, and then automatically creates the two directories, provided you have the required permissions:
hadoop fs -mkdir -p ofs://ozone1/spark-volume/spark-bucket/application1/instance1
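To confirm that OFS created the volume and bucket implicitly, you can inspect them with the Ozone shell or list the new path back through OFS; the service ID ozone1 matches the example above:
ozone sh volume info /spark-volume
ozone sh bucket info /spark-volume/spark-bucket
hadoop fs -ls ofs://ozone1/spark-volume/spark-bucket/application1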