
Understanding the Reducer Class in Hadoop MapReduce

In this video tutorial, we discuss how the Reducer class works in Hadoop MapReduce.

In our previous blog, we discussed the working of the Mapper class and the sort and shuffle phase in the MapReduce programming paradigm. In this blog, we discuss how the reducer class of the word count program works in a Java MapReduce program.

The main task of the reducer class is to apply a user-defined operation to the mapper's key-value pairs, as grouped by the sort and shuffle phase, and to combine them into the final output.

We expect readers to have basic knowledge of Big Data, the MapReduce mapper class, and the sort and shuffle phase. Refer to the links below for the basics of Big Data and the mapper class:

https://acadgild.com/blog/understanding-big-data/

Understanding Mapper Class in hadoop

Reducer Class

As we know, the reducer reads the outputs generated by the different mappers as <Key,Value> pairs. The Reducer class takes four generic type parameters, which define the types of its input and output key-value pairs. The first two define the intermediate key and value types, which must match the mapper's output types; the last two define the final output key and value types.
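To make the role of the four type parameters concrete, here is a minimal plain-Java sketch. The SimpleReducer interface and the WordCountTypes class are hypothetical stand-ins written for this illustration; they are not part of the Hadoop API, and String and Integer are assumed here in place of Text and IntWritable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical stand-in for Hadoop's Reducer, for illustration only.
// KEYIN/VALUEIN: intermediate types produced by the mapper.
// KEYOUT/VALUEOUT: final types written by the reducer.
interface SimpleReducer<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
    void reduce(KEYIN key, Iterable<VALUEIN> values, BiConsumer<KEYOUT, VALUEOUT> output);
}

final class WordCountTypes {
    // Word count: input pairs are <String, Integer>, output pairs are <String, Integer>.
    static final SimpleReducer<String, Integer, String, Integer> SUM =
        (key, values, output) -> {
            int sum = 0;
            for (int v : values) {
                sum += v;
            }
            output.accept(key, sum);
        };

    // Runs the reducer for one key and returns the emitted pairs as "key=value" strings.
    static List<String> run(String key, List<Integer> values) {
        List<String> emitted = new ArrayList<>();
        SUM.reduce(key, values, (k, v) -> emitted.add(k + "=" + v));
        return emitted;
    }
}
```

In this sketch, changing the last two type arguments (for example to emit a Long total) would change only what the reducer writes, while the first two must stay in step with whatever the mapper emits.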

Expected output of Word Count Reducer class

The main goal of the word count reducer class is to find the number of occurrences of each word in the input dataset file.

In the reduce phase, the word count class sums the number of times each word was seen and writes that count, together with the word, as output.

For example, consider an input file that contains two lines of text:

Input dataset:

Hello Good Morning
Hello Good Evening

The map class output will be:

<Hello,1>
<Good,1>
<Morning,1>
<Hello,1>
<Good,1>
<Evening,1>
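The map output above can be reproduced with a small plain-Java sketch. MapStepSketch is a hypothetical name used only for this illustration; it does not use the Hadoop Mapper API, and splitting on whitespace is an assumption about how the mapper tokenizes each line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class MapStepSketch {
    // Emits one <word, 1> pair per whitespace-separated token, in input order.
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {          // skip the empty token produced by a blank line
                    pairs.add(Map.entry(word, 1));
                }
            }
        }
        return pairs;
    }
}
```

Calling MapStepSketch.map with the two sample lines returns six pairs, in the same order as the listing above.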

The sort and shuffle phase groups the values by key and sorts the keys, so its output will be:

<Evening,[1]>
<Good,[1,1]>
<Hello,[1,1]>
<Morning,[1]>
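The grouping done by the sort and shuffle phase can be sketched in plain Java as follows. ShuffleSortSketch is a hypothetical name for this illustration, not a Hadoop class. A TreeMap is assumed here because it keeps keys in sorted order; for plain ASCII words, String ordering agrees with the byte-wise ordering Hadoop applies to Text keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ShuffleSortSketch {
    // Groups the mapper's values by key; the TreeMap keeps the keys sorted.
    static TreeMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> mapOutput) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : mapOutput) {
            // Create the value list on first sight of a key, then append to it.
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }
}
```

For the six sample pairs, the returned map prints as {Evening=[1], Good=[1, 1], Hello=[1, 1], Morning=[1]}.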

The final output of the reducer class will be (with a single reducer, the keys appear in sorted order):

<Evening,1>
<Good,2>
<Hello,2>
<Morning,1>
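The reduce step itself amounts to summing the grouped values for each key. The following plain-Java sketch shows that logic under simplified assumptions: ReduceStepSketch is a hypothetical name, and an in-memory map stands in for the grouped input that Hadoop would stream to the reducer.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ReduceStepSketch {
    // Sums the grouped values of every key, mirroring what reduce() does once per key.
    static TreeMap<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        TreeMap<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int value : entry.getValue()) {
                sum += value;
            }
            counts.put(entry.getKey(), sum);   // corresponds to context.write(key, result)
        }
        return counts;
    }
}
```

Given the grouped sample data, the result prints as {Evening=1, Good=2, Hello=2, Morning=1}.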

Reducer Class Code

PROBLEM STATEMENT

To obtain the sum of the values for each key as <key, result>, where result is the number of times the word occurs in the input dataset, and to write that pair to the output context.

SOURCE CODE

// Nested class: the imports go at the top of the enclosing source file.
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();   // reused for every key
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();   // add each count emitted for this key
        }
        result.set(sum);
        context.write(key, result);   // emit <word, total count>
    }
}

  • The class declaration extends the default Reducer class with KeyIn as Text and ValueIn as IntWritable, which match the output types of the mapper class, and KeyOut as Text and ValueOut as IntWritable, which are the final output types of our MapReduce program.
  • The field result is an IntWritable that holds the number of occurrences of the current word in the input dataset file.
  • The reduce method is overridden; the framework calls it once for every key.
  • Inside reduce, the local variable sum of type int is initialized to 0. It accumulates the total count for the current word.
  • The for-each loop runs once for every value in the Iterable values, which come from the shuffle and sort phase that follows the mapper phase. The loop variable val holds the current value on each iteration.
  • The statement sum += val.get() adds each value for the key to sum.
  • The call result.set(sum) stores the total in the result variable.
  • The call context.write(key, result) forms the key-value pair <key,result> for the word and writes it to the output context.

Thus, the reduce method of the word count class sums the values for each key and writes the result to the context.

We hope this blog helped you understand how the word count Reducer class works.
Keep visiting our site www.acadgild.com for more updates on Big Data and other technologies.
