BigData / Hadoop basics
What is Replication factor in HDFS?
The replication factor facilitates fault tolerance in a Hadoop cluster.
HDFS stores files as data blocks and distributes these blocks across the entire cluster. Because HDFS was designed to be fault-tolerant and to run on commodity hardware, each block is replicated a number of times to ensure high data availability. The replication factor is a property set in the HDFS configuration file that lets you adjust the global replication factor for the entire cluster. For each block stored in HDFS, there will be n − 1 duplicated blocks distributed across the cluster. For example, with the default replication factor of 3, each block exists as 1 original plus 2 replicas.
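As a sketch, the cluster-wide default is typically set via the `dfs.replication` property in `hdfs-site.xml` (the value 3 below is the standard default):

```xml
<!-- hdfs-site.xml: cluster-wide default replication factor -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```

The replication factor can also be changed per file or directory after the fact with the HDFS shell, for example:

```shell
# Set replication to 2 for an existing file (path is illustrative)
hdfs dfs -setrep -w 2 /user/data/example.txt
```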