BigData / Hadoop basics
How is data stored in HDFS?
First, each file is divided into blocks, and those blocks are stored on different DataNodes. The NameNode stores the metadata (file information, block locations, etc.).
Two terms are central to HDFS data storage: blocks and replication.
Blocks are the smallest unit of storage in HDFS. The block size is configurable; the default is 64 MB in Hadoop 1.x and 128 MB in Hadoop 2.x and later.
Replication is the copying of blocks that supports Hadoop's fault tolerance. The default replication factor is 3, i.e. each block is stored as 3 copies on different DataNodes.
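The arithmetic behind blocks and replication can be illustrated with a small Python sketch. This is only a toy model, not Hadoop code: the function names, the DataNode names (dn1 to dn4) and the 300 MB file are made up for illustration, and the round-robin placement is a simplifying assumption that stands in for HDFS's real placement policy.

```python
MB = 1024 * 1024


def split_into_blocks(file_size, block_size=128 * MB):
    """Return the sizes (in bytes) of the blocks a file would be split into.

    The last block holds only the leftover bytes; it is not padded.
    """
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)
    return sizes


def place_replicas(num_blocks, datanodes, replication=3):
    """Toy placement: put each block's copies on distinct DataNodes.

    Assumption: simple round-robin, NOT the placement policy HDFS uses.
    """
    if replication > len(datanodes):
        raise ValueError("need at least as many DataNodes as replicas")
    return {
        block: [datanodes[(block + r) % len(datanodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


# Hypothetical example: a 300 MB file, 128 MB blocks, replication factor 3.
blocks = split_into_blocks(300 * MB)
placement = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"])
raw_storage = sum(blocks) * 3  # bytes consumed across the cluster

print([size // MB for size in blocks])  # block sizes in MB
print(placement)                        # block index -> DataNodes holding a copy
print(raw_storage // MB)                # total MB stored, counting all copies
```

In this model the dictionary returned by place_replicas plays the role of the NameNode's metadata (which block lives where), while the DataNodes hold the actual block data.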