How can a distributed file system such as HDFS provide opportunities
for the optimisation of a MapReduce operation?
1) Data represented in a distributed file system is already sorted
2) A distributed file system must always be resident in memory, which is much faster than disk
3) Data storage and processing can be co-located on the same node, so that most input data relevant to Map or Reduce will be present on local disks or cache
4) A distributed file system makes random access faster because of the presence of a dedicated node serving file metadata
Answers
Answered by:
Hey there!
Option B is the correct answer.
Answered by:
Answer:
Option 3) is the correct option
Explanation:
- HDFS (the Hadoop Distributed File System) is modelled on the Google File System (GFS) and serves as the storage layer on which Hadoop's MapReduce runtime executes the MapReduce programming model.
- Data storage and processing can be co-located on the same node, so most of the input data relevant to a Map or Reduce task is already present on local disk or in cache. Because HDFS is a distributed file system that exposes block-locality information, the scheduler can place Map (and, where possible, Reduce) tasks on the nodes that store their input data, minimising network transfer (see the sketch below).
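As a rough illustration (not part of the original answer), the sketch below uses the standard HDFS Java API to ask the NameNode which hosts store each block of a file; the input path is a made-up example. This is the kind of locality information the MapReduce scheduler consults when deciding where to place Map tasks.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocality {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();      // picks up core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);
        Path input = new Path("/data/input.txt");      // hypothetical input file

        FileStatus status = fs.getFileStatus(input);
        // Ask the NameNode which DataNodes hold each block of the file.
        BlockLocation[] blocks =
            fs.getFileBlockLocations(status, 0, status.getLen());

        for (BlockLocation block : blocks) {
            // The scheduler prefers to run the Map task for this split on one of
            // these hosts, so the task reads from local disk instead of the network.
            System.out.println("offset " + block.getOffset()
                    + " len " + block.getLength()
                    + " hosts " + String.join(",", block.getHosts()));
        }
        fs.close();
    }
}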