
MapReduce Program

Explanation of MapReduce Program

The entire MapReduce program can be fundamentally divided into three parts:
Mapper Phase Code

Reducer Phase Code

Driver Code
Mapper Code:
public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
  public void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    String line = value.toString();
    StringTokenizer tokenizer = new StringTokenizer(line);
    while (tokenizer.hasMoreTokens()) {
      value.set(tokenizer.nextToken());
      context.write(value, new IntWritable(1));
    }
  }
}
• We have created a class Map that extends the Mapper class already defined in the MapReduce framework.
• We define the data types of the input and output key/value pairs after the class declaration, using angle brackets.
• Both the input and the output of the Mapper are key/value pairs.
• Input:
  • The key is nothing but the offset of each line in the text file: LongWritable
  • The value is each individual line: Text
• Output:
  • The key is a tokenized word: Text
  • The value is hardcoded to 1 in our case: IntWritable
  • Example: Dear 1, Bear 1, etc.
• In the Java code, we tokenize each line into words and assign each word a hardcoded count of 1.
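To see the tokenization in isolation, here is a minimal sketch in plain Java (no Hadoop required); the sample line and the class name TokenizeDemo are illustrative assumptions, not part of the original program:

import java.util.StringTokenizer;

public class TokenizeDemo {
  public static void main(String[] args) {
    // A hypothetical line, standing in for one line of the input file.
    String line = "Dear Bear River Car";
    StringTokenizer tokenizer = new StringTokenizer(line);
    while (tokenizer.hasMoreTokens()) {
      // The real mapper emits (word, 1) via context.write() at this point.
      System.out.println(tokenizer.nextToken() + "\t1");
    }
  }
}

Running it prints Dear 1, Bear 1, River 1, Car 1, which are exactly the key/value pairs the mapper would emit for that line.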
Reducer Code:
public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
  public void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable x : values) {
      sum += x.get();
    }
    context.write(key, new IntWritable(sum));
  }
}
• We have created a class Reduce which extends the Reducer class, just as we did for the Mapper.
• We define the data types of the input and output key/value pairs after the class declaration, using angle brackets, as done for the Mapper.
• Both the input and the output of the Reducer are key/value pairs.
• Input:
  • The key is nothing but the unique words that have been generated after the sorting and shuffling phase: Text
  • The value is a list of integers corresponding to each key: IntWritable
  • Example: Bear, [1, 1], etc.
• Output:
  • The key is one of the unique words present in the input text file: Text
  • The value is the number of occurrences of that word: IntWritable
  • Example: Bear, 2; Car, 3, etc.
• We aggregate the values in the list corresponding to each key and produce the final answer.
• The reduce() method is called once for each unique key; by default the job runs a single reducer task, but you can specify the number of reducer tasks in mapred-site.xml or in the driver, as shown below.
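A minimal sketch of both options (the value 2 is illustrative, not a recommendation). In the driver, before the job is submitted:

// Request two reduce tasks for this job.
job.setNumReduceTasks(2);

Or site-wide in mapred-site.xml (the property is named mapred.reduce.tasks in Hadoop 1.x and mapreduce.job.reduces in Hadoop 2 and later):

<property>
  <name>mapred.reduce.tasks</name>
  <value>2</value>
</property>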
Driver Code:
Configuration conf = new Configuration();
Job job = new Job(conf, "My Word Count Program");
job.setJarByClass(WordCount.class);
job.setMapperClass(Map.class);
job.setReducerClass(Reduce.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
Path outputPath = new Path(args[1]);
// Configuring the input/output path from the filesystem into the job
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
• In the driver class, we set the configuration of our MapReduce job to run on Hadoop.
• We specify the name of the job and the data types of the input/output of the mapper and reducer.
• We also specify the names of the mapper and reducer classes.
• The paths of the input and output folders are also specified.
• The method setInputFormatClass() specifies how a Mapper will read the input data, i.e., what the unit of work will be. Here we have chosen TextInputFormat, so that the mapper reads a single line at a time from the input text file.
• The main() method is the entry point for the driver. In this method, we instantiate a new Configuration object for the job.
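Note that setOutputKeyClass() and setOutputValueClass() describe the job's final (reducer) output. In this example the mapper emits the same types, so nothing more is needed; if the mapper's output types differed from the reducer's, the driver would also have to declare them, roughly like this:

// Only needed when the map output types differ from the final output types;
// the word-count driver above can omit these calls.
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(IntWritable.class);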
package co.edureka.mapreduce;

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.fs.Path;

public class WordCount {

  public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String line = value.toString();
      StringTokenizer tokenizer = new StringTokenizer(line);
      while (tokenizer.hasMoreTokens()) {
        value.set(tokenizer.nextToken());
        context.write(value, new IntWritable(1));
      }
    }
  }

  public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable x : values) {
        sum += x.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "My Word Count Program");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    Path outputPath = new Path(args[1]);
    // Configuring the input/output path from the filesystem into the job
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, outputPath);
    // Deleting the output path automatically from HDFS so that we don't
    // have to delete it explicitly before each run
    outputPath.getFileSystem(conf).delete(outputPath, true);
    // Exiting with status 0 only if the job completed successfully
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
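Before the job can be run, the program has to be compiled and packaged into a jar. A rough sketch, assuming a local Hadoop client installation (the file and directory names are illustrative):

mkdir -p classes
javac -classpath `hadoop classpath` -d classes WordCount.java
jar -cvf hadoop-mapreduce-example.jar -C classes .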
Run the MapReduce code:

• The command for running a MapReduce code is:

hadoop jar hadoop-mapreduce-example.jar co.edureka.mapreduce.WordCount /sample/input /sample/output

Note that the class name must be fully qualified, because the code declares the package co.edureka.mapreduce.
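Once the job finishes, the word counts can be read back from HDFS; a sketch, assuming the default reducer output file name:

hadoop fs -cat /sample/output/part-r-00000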
References

https://www.edureka.co/blog/mapreduce-tutorial/

https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html

https://www.tutorialspoint.com/hadoop/hadoop_mapreduce.htm

https://www.geeksforgeeks.org/mapreduce-understanding-with-real-life-example/
