Description: a Map/Reduce framework for distributed computing

Rules of Thumb for Map/Reduce programming

This growing page lists common gotchas related to Disco / MapReduce programming.

Know your key-space

nr_reduces should be less than or equal to the number of unique keys produced by the map function. Otherwise you will get empty partitions, that is, there will be reduce tasks which have no input entries.
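
The effect can be illustrated with a small standalone sketch. It does not use the Disco API: the `partition` function below is a hypothetical stand-in that assumes keys are assigned to reduce tasks by hashing the key modulo nr_reduces, and the sample keys are made up for illustration.

```python
import zlib


def partition(key, nr_reduces):
    """Hypothetical partitioner (an assumption, not Disco's own code):
    deterministic hash of the key, modulo the number of reduce tasks."""
    return zlib.crc32(key.encode("utf-8")) % nr_reduces


def empty_partitions(keys, nr_reduces):
    """Return the partition ids that receive no key at all."""
    used = {partition(k, nr_reduces) for k in set(keys)}
    return [p for p in range(nr_reduces) if p not in used]


# Made-up map output keys: five entries, but only three unique keys.
keys = ["apple", "banana", "cherry", "apple", "banana"]
unique = len(set(keys))

# With 8 reduce tasks and 3 unique keys, at least 8 - 3 = 5 partitions
# must be empty, whatever the hash function does.
idle = empty_partitions(keys, 8)
print("unique keys:", unique)
print("idle reduce tasks with nr_reduces=8:", len(idle))

# With a single reduce task, every key lands in partition 0.
print("idle reduce tasks with nr_reduces=1:", len(empty_partitions(keys, 1)))
```

Note that keeping nr_reduces at or below the number of unique keys only removes the guaranteed empty partitions. Depending on how the keys hash, some partitions may still end up empty.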

Last edited by tuulos, Mon Sep 22 12:26:52 -0700 2008