Cloudera Runtime 7.2.14 introduces “Storefile Tracking” (SFT), an optional feature
delivered through the Cloudera Operational Database (COD) service.
Cloudera worked with the Apache HBase project to deliver the first version of this feature
through HBASE-26067, and ships it as part of CDP.
When HBase data is stored in S3, COD can dynamically scale the number of workers based on the
compute resources required, rather than on the number of workers required to host the data in HDFS.
To deliver this ability in a reasonable timeframe, Cloudera built HBOSS (HBase Object Store
Semantics). Storefile tracking is the next evolution of HBase on S3, and it no
longer requires the HBOSS solution. The storefile tracking feature for HBase with S3 prevents
unwanted I/O due to renames on S3. With HDFS, a rename is a constant-time operation, but on S3 a
rename requires a full copy of the file. Because of this, using S3 doubles the I/O costs for
HBase operations such as compactions, flushes, and snapshot-based operations. The storefile tracking
feature removes the reliance on renames for S3-backed HBase data, which should make S3 behave
more like HDFS.
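
The doubling of I/O described above can be illustrated with a small Python sketch. It is a toy model, not HBase or S3 code: the `ToyObjectStore` class, the key names, and the one-line-per-file manifest are hypothetical stand-ins chosen for illustration, and the manifest does not represent the actual format used by storefile tracking. The sketch assumes only what the text states, namely that a rename on S3 costs a full copy of the file.

```python
from dataclasses import dataclass, field


@dataclass
class ToyObjectStore:
    """Toy S3-like store (hypothetical): a rename is a full copy plus a delete."""
    objects: dict = field(default_factory=dict)
    bytes_written: int = 0

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data
        self.bytes_written += len(data)

    def rename(self, src: str, dst: str) -> None:
        # No constant-time rename: the whole object is written again.
        self.put(dst, self.objects[src])
        del self.objects[src]


def commit_with_rename(store: ToyObjectStore, name: str, data: bytes) -> str:
    """Write to a temporary key, then rename into the final location."""
    tmp_key = f".tmp/{name}"
    final_key = f"cf/{name}"
    store.put(tmp_key, data)
    store.rename(tmp_key, final_key)
    return final_key


def commit_with_tracking(store: ToyObjectStore, name: str, data: bytes,
                         manifest_key: str = "cf/.manifest") -> str:
    """Write directly to the final key and record it in a small manifest.

    The manifest is an assumption made for this sketch only.
    """
    final_key = f"cf/{name}"
    store.put(final_key, data)
    tracked = store.objects.get(manifest_key, b"").decode().split()
    tracked.append(name)
    store.put(manifest_key, "\n".join(tracked).encode())
    return final_key


hfile = b"x" * 1_000_000  # stand-in for a 1 MB flushed store file

renamed = ToyObjectStore()
commit_with_rename(renamed, "hfile-1", hfile)

tracked = ToyObjectStore()
commit_with_tracking(tracked, "hfile-1", hfile)

print(renamed.bytes_written)  # 2000000: data written once, then copied by the rename
print(tracked.bytes_written)  # 1000007: data written once, plus a 7-byte manifest entry
```

In this model the rename-based commit writes every byte twice, while the tracked commit writes the data once plus a few bytes of bookkeeping, which is the behavior that brings S3 closer to HDFS.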