linux - What is the fastest possible egrep
This question already has an answer here:
Fastest possible grep (8 answers)

I need to egrep a big CSV file of 2,000,000 lines and want to cut the egrep time down to 0.5 sec. Is that possible at all? No, I don't want a database (sqlite3 or MySQL) at this time...
$ time wc foo.csv
 2000000 22805420 334452932 foo.csv

real    0m3.396s
user    0m3.261s
sys     0m0.115s
I've been able to cut the run time down from 40 secs to 1.75 secs:
$ time egrep -i "storm|broadway|parkway center|chief financial" foo.csv | wc -l
108292

real    0m40.707s
user    0m40.137s
sys     0m0.309s

$ time LC_ALL=C egrep -i "storm|broadway|parkway center|chief financial" foo.csv | wc -l
108292

real    0m1.751s
user    0m1.590s
sys     0m0.140s
But I want the egrep real time to be under half a second. Any tricks would be appreciated. The file changes continuously, so I can't use any caching mechanism...
If you are searching for fixed keywords, use fgrep (or grep -F) instead of egrep:

LC_ALL=C grep -F -i -e storm -e broadway -e "parkway center" -e "chief financial"
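A common variation (not part of the original answer) once the keyword list grows: keep the fixed strings in a file, one per line, and pass it with -f, still combined with -F and LC_ALL=C. The file name patterns.txt is just an example:

$ printf '%s\n' storm broadway "parkway center" "chief financial" > patterns.txt
$ time LC_ALL=C grep -F -i -f patterns.txt foo.csv | wc -l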
The next thing to try is factoring out the -i, which is likely the bottleneck. If you are sure that only the first letter might be capitalized, for example, do:

LC_ALL=C grep -F \
    -e{S,s}torm -e{B,b}roadway -e{P,p}"arkway "{C,c}enter -e{C,c}"hief "{F,f}inancial
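To check exactly which fixed strings the shell's brace expansion hands to grep, you can print the expansions first (a quick sanity check, not part of the original answer; the brackets only make the embedded spaces visible):

$ printf '[%s]\n' {S,s}torm {B,b}roadway {P,p}"arkway "{C,c}enter {C,c}"hief "{F,f}inancial
[Storm]
[storm]
[Broadway]
[broadway]
[Parkway Center]
[Parkway center]
[parkway Center]
[parkway center]
[Chief Financial]
[Chief financial]
[chief Financial]
[chief financial]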
Tags: linux, bash, shell, awk, grep