hadoop - How to maintain the ordering of MapWritables in the reducer?
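My mapper implementation is:

import java.io.IOException;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SimpleMapper extends Mapper<Text, Text, Text, MapWritable> {
    @Override
    protected void map(Text key, Text value, Context context) throws IOException, InterruptedException {
        // LinkedMapWritable is not part of Hadoop core; it is assumed to be on the classpath.
        MapWritable writable = new LinkedMapWritable();
        // MapWritable stores Writable keys and values, so the strings are wrapped in Text.
        writable.put(new Text("unique_key"), new Text("one"));
        writable.put(new Text("another_key"), new Text("two"));
        context.write(new Text("key"), writable);
    }
}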
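And the reducer implementation is:

import java.io.IOException;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SimpleReducer extends Reducer<Text, MapWritable, NullWritable, Text> {
    @Override
    protected void reduce(Text key, Iterable<MapWritable> values, Context context) throws IOException, InterruptedException {
        // The MapWritables have to arrive ordered by the value stored under "unique_key".
    }
}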
Do I have to use secondary sort? Is there any other way to do so?
The MapWritables (the values) reach the reducer in an unpredictable order. The order may vary from run to run, and you have no control over it.
But the map/reduce paradigm guarantees that keys are presented to the reducer in sorted order, and that all values belonging to a single key go to a single reducer.
So you can use secondary sort with a custom partitioner for your use case.
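
The idea behind secondary sort can be sketched without any Hadoop classes. The sketch below is a plain-Java simulation, not the job configuration itself: the names SecondarySortSketch, CompositeKey, partition and shuffle are hypothetical and chosen only for illustration. It assumes the field to order by (the "unique_key" value) is copied into a composite map-output key, that the sort compares the natural key first and that field second, and that the partitioner looks only at the natural key. In a real job these roles would be played by a WritableComparable key class, a Partitioner subclass, and comparators registered through Job.setPartitionerClass, Job.setSortComparatorClass and Job.setGroupingComparatorClass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SecondarySortSketch {

    // Composite key: the natural key plus the field the values should be ordered by.
    static final class CompositeKey {
        final String naturalKey;
        final String uniqueKey;

        CompositeKey(String naturalKey, String uniqueKey) {
            this.naturalKey = naturalKey;
            this.uniqueKey = uniqueKey;
        }
    }

    // Sort order: natural key first, then the "unique_key" value.
    static final Comparator<CompositeKey> SORT_ORDER =
            Comparator.comparing((CompositeKey k) -> k.naturalKey)
                      .thenComparing(k -> k.uniqueKey);

    // Partitioning depends only on the natural key, so every record of one
    // natural key ends up in the same partition (that is, the same reducer).
    static int partition(CompositeKey key, int numPartitions) {
        return (key.naturalKey.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Simulates the shuffle: sort by composite key, then group by natural key.
    // Each group lists its "unique_key" values in sorted order.
    static Map<String, List<String>> shuffle(List<CompositeKey> mapOutput) {
        List<CompositeKey> sorted = new ArrayList<>(mapOutput);
        sorted.sort(SORT_ORDER);
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (CompositeKey k : sorted) {
            grouped.computeIfAbsent(k.naturalKey, n -> new ArrayList<>()).add(k.uniqueKey);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<CompositeKey> mapOutput = Arrays.asList(
                new CompositeKey("key", "two"),
                new CompositeKey("key", "one"),
                new CompositeKey("key", "three"));
        System.out.println(shuffle(mapOutput)); // prints {key=[one, three, two]}
    }
}
```

The values come out in lexicographic order of the "unique_key" field because the framework-style sort does the work. The reducer never has to buffer and sort the values itself.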
hadoop mapreduce writable