I have been meaning to write about this for quite a while. Now I actually have a blog, its the ideal opportunity to simplify the process for those getting to grips, clear up any misconceptions, and explain how and why the process is far from new.
Background
What is now known as Map/Reduce has recently generated a lot of interest in deployments requiring the ability to perform queries on large-scale datasets.
Using a “traditional” RDBMS to perform these queries has resulted in issues as the number of records increases - primarily due to the requirements of a monolithic series of linked tables known as a Relational Database.
These problems have been tackled by a process known as sharding, whereby the records in the tables are split among a series of database nodes. Middleware is often introduced to abstract the queries to these nodes, but (at least on commodity RDBMS) the key ability to perform complex joins (as required data may be on another node) has already been lost.
At its most basic, Map/Reduce can be compared to a sharded RDBMS. In fact, an implementation of Map/Reduce may actually USE a sharded RDBMS by behaving as the middleware.
Map/Reduce simplified
A simple way of explaining a map/reduce function is with a real world example. Ill try and illustrate this as best possible.
Map/reduce is usually demonstrated as a distributed group and count, but it could be any function performed in a distributed manner. Here we will do a find and sort, as its a nice easy introduction.
You have a black bin liner stuffed full of receipts which you need to group and order for your tax return. You need to look through the receipts and discard all those you are not likely to be able to file, and then order them according to date.
This bin liner contains 1000 records, and it will take you 30 seconds to decide whether or not each receipt needs to be discarded.
When you are sorting the receipts into date order, it will gradually become more difficult, as you will need to thumb through the list of already sorted receipts to find the place to insert it. This will take an increasing amount of time to do, as you cannot hold the position of each date in memory - an average of 0.1 second for each receipt in the already sorted pile.
This would take you:
30,000s = 1000 * 30s for discarding
49,950s = 999(1000)/2 * 0.1s for sorting
Thats just over 22 hours.
Now, if you could rope in 4 friends to help you out, and assuming they all work at the same speed, you could act as an overseer - simply sharing the work among them, and collating the receipts when they are finished.
It will take each of your friends:
7,500s = 250 * 30s for discarding
3,112.5s = 249(250)/2 * 0.1s for sorting
Just under 3 hours each.
When it comes to combining all of these sorted piles, it is quite straight forward - as you only need to look through 4 receipts at any one time - so it will take you a total of 0.4s as you simply grab the next value from the top of the relevant pile.
So you can merge the piles in:
400s = 1000 * 4 * 0.1s for sorting from the pre-sorted piles.
Thats only 7 minutes of sorting you need to do after they have finished the job - and you have finished what would take you 22 hours alone in 3 hours!
In terms of total time spent - if you were to value your own time at £50 p/h, you could still pay your friends that rate and save over £500
An important point to note is that simply increasing the number of nodes does not neccessarily mean an increase in overall performance in small datasets. The larger the dataset, the faster the relative speedup will be. To save us the most amount of cash, we would want to employ 19 friends to do the task for us:

Technical summary
As we have demonstrated in our real world example, Map/Reduce is nothing new. Its done by businesses on a daily basis in order to save them time and money in simple staffing costs.
Obviously, certain points I have made may not seem valid in a systems environment. Obviously “seek” times will be considerably less (although they are for illustration!), and in-memory pointers will alleviate the sorting issue - or will it?
Whilst when we are dealing with small amounts of data, the sorting of records could indeed be done in memory. The problem comes when you are dealing with Gigabytes or even Terabytes of the the stuff. The beauty of a secondary (or higher) reduce/merge is that it can be streamed. In our example, it would be the supervisor only needing to have 4 receipts in memory - the top of the sorted stacks. Additionally, if you are accessing ordered data on the mapping nodes, you can stream from there as well!
A map is simply a function that will be run on each node and spit out matching or relevant data (return valid receipts). A reduce is an operation on the mapped data (sort receipts). Secondary and higher reduces are generally called merges (sort of sorted receipt piles).
As you can see, in a sharded RDBMS environment, the map could simply be a “select all from receipts where isvalid=true” with the a built-in reduce of “order by date desc” run on each of the shards, with a merge occurring on the proxy/middleware. In operations like this, both can be as efficient as each other.
Where the RDBMS would fall is with a more complex map, requiring access to data not within the shard. As a standard map is just a function run on a distributed node, it could include logic to perform additional lookups on other nodes. This could be facilitated with a directory that tells it where the required data is located. I’m not aware of how this could be done effectively with a RDBMS such as MySQL, without implementing actual programatic maps.
Misconceptions
1. Schemaless
Your data can be as structured as you want it to be. All you are doing is running a function against records. Thats exactly what a RDBMS does anyway. You can run a map on an RDBMS if it makes you feel happier.
2. No indexes
You can locally index your data in whatever way you want, or, if you are dynamically mapping out the data, you can send the indexes over, or simply have an index server. Its no different to sharding in that respect.
Final Word
I will make amendments to this post over the space of the next day. I plan to insert some diagrams, as well as show how more nodes does not neccessarily mean more performance.
Apologies for my poor writing style. Its the first time I have ever written a blog, and it will take some time getting used to!
A few references you may find interesting:
Google Research Publication
Write your first Map/Reduce function in 20 mins
Misconceptions about MapReduce