Friday, 15 May 2015

hadoop - How to maintain the ordering of MapWritables in the reducer?




My mapper implementation is:

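    public class SimpleMapper extends Mapper<Text, Text, Text, MapWritable> {
        @Override
        protected void map(Text key, Text value, Context context)
                throws IOException, InterruptedException {
            // LinkedMapWritable is not part of Hadoop core; presumably an
            // insertion-ordered MapWritable subclass.
            MapWritable writable = new LinkedMapWritable();
            // MapWritable keys and values must be Writables, so wrap the strings in Text.
            writable.put(new Text("unique_key"), new Text("one"));
            writable.put(new Text("another_key"), new Text("two"));
            context.write(new Text("key"), writable);
        }
    }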

And the reducer implementation is:

    public class SimpleReducer extends Reducer<Text, MapWritable, NullWritable, Text> {
        @Override
        protected void reduce(Text key, Iterable<MapWritable> values, Context context)
                throws IOException, InterruptedException {
            // The MapWritables have to be ordered based on the "unique_key" inserted.
        }
    }

Do I have to use a secondary sort? Is there any other way to do so?

The MapWritables (values) arrive in the reducer in an unpredictable order. The order may vary from run to run, and you have no control over it.

But the map/reduce paradigm guarantees that keys are presented to the reducer in sorted order, and that all values belonging to a single key go to a single reducer.

So you can use a secondary sort with a custom partitioner for your use case.
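To make the idea concrete, here is a minimal sketch of the secondary-sort pattern in plain Java, with no Hadoop dependency. It only simulates what the shuffle does. The names `CompositeKey`, `MapOutput`, `partition` and `shuffle` are hypothetical and exist just for illustration. In a real job the composite key would presumably be a `WritableComparable`, and the three roles below would be wired in through `Job.setPartitionerClass`, `Job.setSortComparatorClass` and `Job.setGroupingComparatorClass`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SecondarySortSketch {

    /** Hypothetical composite key: the natural key plus the "unique_key" value to order by. */
    static final class CompositeKey {
        final String naturalKey;
        final String uniqueKey;
        CompositeKey(String naturalKey, String uniqueKey) {
            this.naturalKey = naturalKey;
            this.uniqueKey = uniqueKey;
        }
    }

    /** One map output record; the payload stands in for the MapWritable. */
    static final class MapOutput {
        final CompositeKey key;
        final String payload;
        MapOutput(CompositeKey key, String payload) {
            this.key = key;
            this.payload = payload;
        }
    }

    // Sort comparator role: natural key first, then the secondary field.
    static final Comparator<CompositeKey> SORT =
            Comparator.comparing((CompositeKey k) -> k.naturalKey)
                      .thenComparing(k -> k.uniqueKey);

    // Partitioner role: only the natural key picks the reducer, so every
    // record of one natural key lands on the same reducer.
    static int partition(CompositeKey key, int numReducers) {
        return (key.naturalKey.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Simulated shuffle: sort by the full composite key, then group by the
    // natural key only (the grouping comparator role).
    static Map<String, List<String>> shuffle(List<MapOutput> mapOutput) {
        List<MapOutput> sorted = new ArrayList<>(mapOutput);
        sorted.sort(Comparator.comparing((MapOutput o) -> o.key, SORT));
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (MapOutput o : sorted) {
            groups.computeIfAbsent(o.key.naturalKey, k -> new ArrayList<>()).add(o.payload);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<MapOutput> mapOutput = Arrays.asList(
                new MapOutput(new CompositeKey("key", "2"), "second"),
                new MapOutput(new CompositeKey("key", "1"), "first"),
                new MapOutput(new CompositeKey("other", "1"), "x"));
        // Each entry is what one reduce() call would iterate over, in order.
        System.out.println(shuffle(mapOutput));
    }
}
```

The design point is that the value you want to order by is moved into the key, because the framework sorts keys, not values. Partitioning and grouping then ignore that extra field so the records still reach one `reduce()` call together.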

hadoop mapreduce writable
