BigData / Hadoop basics
Explain the HDFS file read workflow.
The client opens the file it wishes to read by calling open() on the FileSystem object, which for HDFS is an instance of DistributedFileSystem.
DistributedFileSystem makes an RPC call to the name node to determine the locations of the first few blocks in the file.
For each block, the name node returns the addresses of the data nodes that hold a copy of that block, sorted by their proximity to the client.
DistributedFileSystem returns an FSDataInputStream to the client to read data from. FSDataInputStream in turn wraps a DFSInputStream, which manages the data node and name node I/O.
The client calls read() on the stream. DFSInputStream, which has stored the data node addresses for the first few blocks, then connects to the closest data node holding the first block in the file.
Data is streamed from the data node back to the client, which calls read() repeatedly on the stream.
When the end of the block is reached, DFSInputStream closes the connection to the data node and then finds the best data node for the next block.
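The steps above can be sketched as a small toy model. This is not the Hadoop client API: the classes DataNode, NameNode and ToyInputStream, the integer distance field, the batch_size parameter and the /demo.txt path are all hypothetical names invented for illustration. The sketch only assumes what the steps describe (block locations returned closest-first, one data node per block), plus the assumption that the stream asks the name node for the next batch of block locations when it runs out.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Toy data node: stores block contents keyed by block id."""
    name: str
    distance: int  # hypothetical proximity metric; smaller means closer to the client
    blocks: dict = field(default_factory=dict)


class NameNode:
    """Toy name node: knows which data nodes hold each block of a file."""

    def __init__(self):
        self.files = {}  # path -> list of (block_id, [DataNode, ...])

    def get_block_locations(self, path, start, count):
        # Return replicas for a batch of blocks, each list sorted closest-first.
        batch = self.files[path][start:start + count]
        return [(bid, sorted(nodes, key=lambda n: n.distance)) for bid, nodes in batch]


class ToyInputStream:
    """Plays the role described for DFSInputStream in the steps above."""

    def __init__(self, name_node, path, batch_size=2):
        self.name_node = name_node
        self.path = path
        self.batch_size = batch_size  # assumption: how many blocks count as "the first few"
        self.next_block = 0
        self.locations = []  # cached (block_id, replicas) pairs not yet read
        self.visited = []    # names of the data nodes read from, in order

    def read_block(self):
        """Return the next block's bytes, or None at end of file."""
        if not self.locations:
            # Ask the name node for the next batch of block locations.
            self.locations = self.name_node.get_block_locations(
                self.path, self.next_block, self.batch_size)
            if not self.locations:
                return None
        block_id, replicas = self.locations.pop(0)
        closest = replicas[0]  # the list is already sorted by proximity
        self.visited.append(closest.name)
        self.next_block += 1
        return closest.blocks[block_id]


def read_file(name_node, path):
    """Open a stream and read block by block until end of file."""
    stream = ToyInputStream(name_node, path)  # analogous to open()
    chunks = []
    while (chunk := stream.read_block()) is not None:
        chunks.append(chunk)
    return b"".join(chunks), stream.visited


def build_demo():
    """Build a three-block file spread over three toy data nodes."""
    dn1 = DataNode("dn1", distance=0)
    dn2 = DataNode("dn2", distance=2)
    dn3 = DataNode("dn3", distance=4)
    placement = [
        ("blk_0", b"hello ", [dn3, dn2]),
        ("blk_1", b"hdfs ", [dn3, dn1]),
        ("blk_2", b"read", [dn3]),
    ]
    name_node = NameNode()
    name_node.files["/demo.txt"] = []
    for block_id, payload, replicas in placement:
        for node in replicas:
            node.blocks[block_id] = payload
        name_node.files["/demo.txt"].append((block_id, replicas))
    return name_node


if __name__ == "__main__":
    data, visited = read_file(build_demo(), "/demo.txt")
    print(data, visited)
```

In this toy run each block is served by the closest data node that holds a replica, so the three blocks come from dn2, dn1 and dn3 in turn, and the caller sees one continuous byte stream.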
