Managing Data Storage

Optimizing data storage
Optimizing performance
Using DistCp to copy files
Using the NFS Gateway for accessing HDFS
Configuring proxy users to access HDFS
- Proxy users for Kerberos-enabled clusters
APIs for accessing HDFS
- Set up WebHDFS on a secure cluster
Using HttpFS to provide access to HDFS
- Add the HttpFS role
- Using a load balancer with HttpFS
HttpFS authentication
- Use curl to access a URL protected by Kerberos HTTP SPNEGO
Data storage metrics
- Using JMX for accessing HDFS metrics
- Configure the G1GC garbage collector
  - Recommended settings for G1GC
  - Switching from CMS to G1GC
HDFS metrics