Scaling Namespaces and Optimizing Data Storage
Also available as:
PDF
loading table of contents...

Introduction

Hadoop Distributed File System (HDFS) is a Java-based file system that provides scalable and reliable data storage. An HDFS cluster contains a NameNode to manage the cluster namespace and DataNodes to store data.