Map Reduce is a data processing technique that condenses large volumes of data into aggregated results. MongoDB provides the mapReduce command to accomplish this task.
Let's consider the following example, which demonstrates the usage of the mapReduce command.
Consider the car collection, which contains the following documents:
>db.car.insert(
[
{car_id:"c1",name:"Audi",color:"Black",cno:"H110",mfdcountry:"Germany",speed:72,price:11.25},
{car_id:"c2",name:"Polo",color:"White",cno:"H111",mfdcountry:"Japan",speed:65,price:8.5},
{car_id:"c3",name:"Alto",color:"Silver",cno:"H112",mfdcountry:"India",speed:53,price:4.5},
{car_id:"c4",name:"Santro",color:"Grey",cno:"H113",mfdcountry:"Sweden",speed:89,price:3.5} ,
{car_id:"c5",name:"Zen",color:"Blue",cno:"H114",mfdcountry:"Denmark",speed:94,price:6.5}
]
)
BulkWriteResult({
    "writeErrors" : [ ],
    "writeConcernErrors" : [ ],
    "nInserted" : 5,
    "nUpserted" : 0,
    "nMatched" : 0,
    "nModified" : 0,
    "nRemoved" : 0,
    "upserted" : [ ]
})
>
Now let's write the map and reduce functions for the car collection, categorizing cars with a speed above 70 as overspeed cars.
Define the map function as shown below:
>var speedmap = function (){
    var criteria;
    if ( this.speed > 70 ) {
        criteria="overspeed";
        emit(criteria,this.speed);
    }
};
This function categorizes a car as an overspeed car when its speed is greater than 70, and emits the key “overspeed” with the car's speed as the value. Here “this” refers to the current document being processed by the map function.
Define the reduce function with the arguments key and speed (the array of values emitted for that key) to calculate the average speed of the overspeed cars:
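To make the roles of “this” and emit concrete, here is a minimal plain-JavaScript sketch that runs outside MongoDB. The `emit` function and the `emitted` array are stand-ins written for this illustration, not the server's implementation; only the body of `speedmap` mirrors the shell function above.

```javascript
// Stand-in for the emit() that MongoDB supplies inside a map function
// (an assumption for illustration only).
const emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// Same logic as the shell map function: emit only cars faster than 70.
const speedmap = function () {
  if (this.speed > 70) {
    emit("overspeed", this.speed);
  }
};

// Sample documents from the car collection
// (only the fields the map function reads).
const cars = [
  { name: "Audi", speed: 72 },
  { name: "Polo", speed: 65 },
  { name: "Alto", speed: 53 },
  { name: "Santro", speed: 89 },
  { name: "Zen", speed: 94 }
];

// Bind each document as `this`, the way map is applied to every document.
cars.forEach(function (doc) {
  speedmap.call(doc);
});

// Three key/value pairs are collected, all under the key "overspeed".
console.log(emitted);
```

Polo and Alto produce no pair at all, which is why they play no part in the later average.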
>var avgspeed_reducemap = function(key, speed) {
    var total =0;
    for (var i = 0; i < speed.length; i++) {
        total = total+speed[i];
    }
    return total/speed.length;
};
>
Here the speeds of all the overspeed cars are summed up by iterating over the array, and the average speed is calculated as the sum of the speeds divided by the number of overspeed cars.
Invoke mapReduce on the car collection, passing the map and reduce functions so that they are applied to all the documents in the collection:
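The arithmetic can be checked outside MongoDB with the plain-JavaScript sketch below. It also illustrates a caveat: MongoDB may call the reduce function more than once for the same key, feeding earlier results back in as values. A reduce that returns an average directly is therefore only safe when all values for a key arrive in a single call, as is assumed for this small collection. The names `single`, `partial` and `rereduced` are hypothetical, introduced only for the demonstration.

```javascript
// Same logic as the shell reduce function.
const avgspeedReduce = function (key, speed) {
  let total = 0;
  for (let i = 0; i < speed.length; i++) {
    total = total + speed[i];
  }
  return total / speed.length;
};

// All three overspeed values in one call: (72 + 89 + 94) / 3
const single = avgspeedReduce("overspeed", [72, 89, 94]);
console.log(single); // 85

// Simulated re-reduce: averaging an average drifts from the true mean.
const partial = avgspeedReduce("overspeed", [72, 89]);        // 80.5
const rereduced = avgspeedReduce("overspeed", [partial, 94]); // 87.25
console.log(rereduced);
```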
>var ret = db.car.mapReduce(speedmap, avgspeed_reducemap, {out: "avgspeed"});
The output is written to the collection avgspeed. If this collection does not exist, a new collection is created; otherwise its existing contents are replaced.
To see the documents, invoke db.avgspeed.find().
Output:
{ "_id" : "overspeed", "value" : 85 }
The output states that the average speed of the overspeed cars is 85.
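One common way to keep the average correct even if reduce runs several times is to emit a sum and a count, combine them in reduce, and divide in a separate finalize step (the mapReduce command accepts a `finalize` option for this). The sketch below simulates that flow in plain JavaScript. The `groups` map, the `emit` stand-in and the two-stage reduce are assumptions written for this illustration; they are not MongoDB APIs and do not reproduce how the server schedules its work.

```javascript
// Hypothetical in-memory stand-in for the server's grouping of map output.
const groups = new Map();
function emit(key, value) {
  if (!groups.has(key)) groups.set(key, []);
  groups.get(key).push(value);
}

// Emit a partial aggregate instead of the raw speed.
const map = function () {
  if (this.speed > 70) {
    emit("overspeed", { sum: this.speed, count: 1 });
  }
};

// The output has the same shape as the input values,
// so feeding a reduced result back in is safe.
const reduce = function (key, values) {
  const out = { sum: 0, count: 0 };
  values.forEach(function (v) {
    out.sum += v.sum;
    out.count += v.count;
  });
  return out;
};

// Divide once, after all reducing is done.
const finalize = function (key, reduced) {
  return reduced.sum / reduced.count;
};

// Speeds of the five sample cars.
const speeds = [72, 65, 53, 89, 94];
speeds.forEach(function (s) {
  map.call({ speed: s });
});

const results = [];
groups.forEach(function (values, key) {
  // Reduce in two stages on purpose to mimic a re-reduce.
  const first = reduce(key, values.slice(0, 2));
  const combined = reduce(key, [first].concat(values.slice(2)));
  results.push({ _id: key, value: finalize(key, combined) });
});

console.log(results); // one entry: _id "overspeed", value 85
```

In the shell, the same three functions could be passed as `db.car.mapReduce(map, reduce, {out: "avgspeed", finalize: finalize})`; this variant is a suggestion, not the approach used in the example above.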
MongoDB Map Reduce Java Example
Below is the Java program for the above mongo shell example. Note that it only showcases how the Map Reduce functions work, so make sure the data is present in the collection for it to give the desired result.
package com.journaldev.mongodb;

import java.net.UnknownHostException;

import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MapReduceCommand;
import com.mongodb.MapReduceOutput;
import com.mongodb.MongoClient;

public class MongoDBMapReduce {

    public static void main(String[] args) throws UnknownHostException {

        // create an instance of client and establish the connection
        MongoClient m1 = new MongoClient();

        // get the test db,use your own
        DB db = m1.getDB("journaldev");

        // get the car collection
        DBCollection coll = db.getCollection("car");

        // map function to categorize overspeed cars
        // (inner double quotes must be escaped inside the Java string)
        String carMap = "function (){" + "var criteria;"
                + "if ( this.speed > 70 ) {" + "criteria=\"overspeed\";"
                + "emit(criteria,this.speed);" + "}" + "};";

        // reduce function to add all the speed and calculate the average speed
        String carReduce = "function(key, speed) {" + "var total =0;"
                + "for (var i = 0; i < speed.length; i++) {"
                + "total = total+speed[i];" + "}"
                + "return total/speed.length;" + "};";

        // create the mapreduce command by calling map and reduce functions
        MapReduceCommand mapcmd = new MapReduceCommand(coll, carMap, carReduce,
                null, MapReduceCommand.OutputType.INLINE, null);

        // invoke the mapreduce command
        MapReduceOutput cars = coll.mapReduce(mapcmd);

        // print the average speed of cars
        for (DBObject o : cars.results()) {
            System.out.println(o.toString());
        }
    }
}
The above Java program produces the following output.
{ "_id" : "overspeed" , "value" : 85.0}
That's all for a brief overview of Map Reduce functions in the MongoDB database. We will look at other MongoDB features in coming posts.