Science, asked by saibhargav974, 8 months ago

7. In MapReduce, one writes a program for Map that processes one input line at a time and outputs a (key, value) pair or nothing,
and one writes a program for Reduce that processes an input of (key, all values for key). The iteration over input lines is
done automatically by the MapReduce framework. You are given an input file containing information from an
asymmetrical social network (e.g., Twitter) about which users follow which other users. If user a follows b, the entry line
is (a, b). In an input line, a and b are different from each other. You need to write a MapReduce program that outputs all
pairs of users (a, b) who follow each other. Someone has written half of a MapReduce program where the shuffle traffic
has key = (a, b), i.e., a pair of users. The Reduce function invoked on this key counts the number of occurrences of this key
and, if this count is 2, outputs the key as its final result.
You're asked to write the Map function that takes as input (a,b) lines. The Map function should do the following: (1 point)
if (a<b) then output <key,value> = <(a,b),2>, otherwise output <key,value> = <(b,a),2>
Output <key,value> = <(a,b),2>
if (a<b) then output <key,value> = <(a,b),1>, otherwise output <key,value> = <(b,a),1>
Output <key,value> = <(b,a),1>

Answers

Answered by sneha112251

Answer:

MapReduce is a processing technique and a programming model for distributed computing; the widely used Hadoop implementation is written in Java. The MapReduce algorithm contains two important tasks, namely Map and Reduce. The Map task takes a set of data and converts it into another set of data, where individual elements are broken down into tuples (key/value pairs). The Reduce task takes the output from a map as its input and combines those data tuples into a smaller set of tuples. As the order of the words in the name MapReduce implies, the reduce task is always performed after the map job.
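To tie this back to the question, here is a minimal sketch in plain Python of how the map and reduce steps could fit together for the "users who follow each other" problem. It is not a real MapReduce framework: `map_follow`, `reduce_pair` and `run` are made-up names, and `run` only imitates the shuffle on a single machine. The sketch assumes the input has no duplicate lines. The idea is that Map puts every pair in one fixed order (smaller user first) with value 1, so the lines (a, b) and (b, a) land on the same key and the Reduce step from the question sees that key twice.

```python
from collections import defaultdict


def map_follow(a, b):
    """Hypothetical Map: emit the pair in a fixed order so that
    (a, b) and (b, a) end up under the same key, with value 1."""
    key = (a, b) if a < b else (b, a)
    return key, 1


def reduce_pair(key, values):
    """Hypothetical Reduce: keep the key only if it occurred exactly twice,
    meaning both follow directions were present in the input."""
    return key if len(values) == 2 else None


def run(lines):
    """Toy stand-in for the framework: map each line, group by key, reduce."""
    groups = defaultdict(list)
    for a, b in lines:
        key, value = map_follow(a, b)
        groups[key].append(value)
    results = []
    for key in sorted(groups):
        out = reduce_pair(key, groups[key])
        if out is not None:
            results.append(out)
    return results


# Made-up sample input: each tuple (a, b) means "a follows b".
follows = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("alice", "carol"),
    ("dave", "carol"),
    ("carol", "dave"),
]
mutual = run(follows)
print(mutual)
```

In this sample only alice/bob and carol/dave follow each other, so only those two keys reach a count of 2. If Map always emitted (a, b) unchanged, the two directions would sit under different keys and no key would ever be counted twice.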

The major advantage of MapReduce is that it is easy to scale data processing over multiple computing nodes. Under the MapReduce model, the data processing primitives are called mappers and reducers. Decomposing a data processing application into mappers and reducers is sometimes nontrivial. But once we write an application in the MapReduce form, scaling the application to run over hundreds, thousands, or even tens of thousands of machines in a cluster is merely a configuration change. This simple scalability is what has attracted many programmers to the MapReduce model.

Explanation:

Hope this helps, and please mark me as Brainliest ❤
