Sunday, 15 January 2012

Apache Pig - Performance profiling in Pig scripts




Some of our Pig scripts take a long time to execute because the data on which we run the MapReduce jobs is huge, so we are thinking of ways to speed up the scripts. Can anyone suggest some ideas or thoughts on this? There are a lot of grouping fields involved, since we group the data on a combination of 2 or 3 fields.

One thought I have is to add one more field to the group-by key:

data = GROUP input_data BY ((int)(RANDOM() * 100), field1, field2);  -- input_data is a placeholder relation name; add further fields as needed

Will introducing this extra field in the group-by help involve more reducers? I know the output part files will become smaller, but will it also speed up the overall running time of the Pig scripts?
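To reason about the question above, here is a rough Python simulation (not Pig, and not how Pig is implemented internally) of records being assigned to reducers by hashing the group key. It assumes a simple hash-modulo partitioner; the function name `reducer_loads`, the sample records, and the choice of 10 reducers and 100 salt buckets are all made up for illustration.

```python
import random
import zlib
from collections import Counter


def reducer_loads(records, num_reducers, salt_buckets=None, seed=0):
    """Count how many records each reducer receives.

    Assumption: a record goes to reducer hash(key) % num_reducers.
    If salt_buckets is set, a random integer in [0, salt_buckets) is
    prepended to the key, mimicking (int)(RANDOM() * 100) in the group key.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    loads = Counter()
    for field1, field2 in records:
        key = (field1, field2)
        if salt_buckets is not None:
            key = (rng.randrange(salt_buckets),) + key
        # crc32 of the key's text form gives a hash that is stable across runs
        partition = zlib.crc32(repr(key).encode("utf-8")) % num_reducers
        loads[partition] += 1
    return loads


# Hypothetical input: only 3 distinct (field1, field2) combinations.
records = [("a", "x"), ("a", "y"), ("b", "x")] * 1000

plain = reducer_loads(records, num_reducers=10)
salted = reducer_loads(records, num_reducers=10, salt_buckets=100)

print("without salt: reducers used =", len(plain), "max load =", max(plain.values()))
print("with salt:    reducers used =", len(salted), "max load =", max(salted.values()))
```

Under these assumptions, the extra random field only helps when there are few distinct group keys or the keys are skewed; with many evenly spread keys the reducers are already busy. Note also that each salted group holds only part of the records for a given (field1, field2) pair, so any aggregate computed per group would need a second grouping on (field1, field2) to be combined into the final result.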

apache-pig
