Apache Pig - Performance profiling in Pig scripts
Some of our Pig scripts take a long time to execute because the data we run the MapReduce jobs on is huge. So I am thinking of ways to speed up the scripts. Can you suggest some ideas? I have a few thoughts of my own. There is a lot of grouping involved, where the data is grouped on a combination of 2 or 3 fields.
One thought I have is to add one more field when doing the GROUP BY:
data = GROUP ... BY ((int)(RANDOM() * 100), field1, field2, etc.);  -- first key: random "reducers" field, 0 to 99
The idea is that introducing this extra field into the GROUP BY will involve more reducers. I know the output part files will become smaller, but will this speed up the overall running time of the Pig scripts?
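To make the idea concrete, here is a minimal sketch of what a salted GROUP BY could look like as a full script. The file paths, the `amount` column, the choice of SUM as the aggregate, the 100 buckets, and the PARALLEL value are all assumptions for illustration and are not taken from the real scripts.

```pig
-- Hypothetical input: two grouping fields and one numeric column to aggregate.
raw = LOAD 'input_data' USING PigStorage('\t')
      AS (field1:chararray, field2:chararray, amount:long);

-- Attach a random bucket number (0 to 99) to every row.
salted = FOREACH raw GENERATE field1, field2, amount,
         (int)(RANDOM() * 100) AS salt;

-- Stage 1: group on the real keys plus the salt, so the rows of one
-- (field1, field2) combination are spread over up to 100 groups.
partial = GROUP salted BY (field1, field2, salt) PARALLEL 100;
partial_sums = FOREACH partial GENERATE
               FLATTEN(group) AS (field1, field2, salt),
               SUM(salted.amount) AS partial_sum;

-- Stage 2: drop the salt and merge the partial results per real key.
merged = GROUP partial_sums BY (field1, field2);
totals = FOREACH merged GENERATE
         FLATTEN(group) AS (field1, field2),
         SUM(partial_sums.partial_sum) AS total;

STORE totals INTO 'output_totals';
```

Two things to keep in mind with this sketch. Because the salt splits each real key into several groups, a second GROUP is needed to recombine them, which adds another MapReduce job. Also, the number of reducers is set by PARALLEL (or `default_parallel`, or Pig's input-size estimate), not by the number of distinct keys, so the salt mainly helps when a few key combinations hold most of the rows.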
apache-pig