linux - What is the fastest possible egrep
This question already has an answer here:
Fastest possible grep (8 answers)

I need to egrep a big CSV file of 2,000,000 lines and want to cut the egrep time down to 0.5 sec. Is that possible at all? No, I don't want a database (sqlite3 or MySQL) at this time...
$ time wc foo.csv
 2000000 22805420 334452932 foo.csv

real    0m3.396s
user    0m3.261s
sys     0m0.115s
I've been able to cut the run time down from 40 secs to 1.75 secs:
$ time egrep -i "storm|broadway|parkway center|chief financial" foo.csv | wc -l
108292

real    0m40.707s
user    0m40.137s
sys     0m0.309s

$ time LC_ALL=C egrep -i "storm|broadway|parkway center|chief financial" foo.csv | wc -l
108292

real    0m1.751s
user    0m1.590s
sys     0m0.140s
But I want the egrep real time to be under half a second. Any tricks would be appreciated. The file changes continuously, so I can't use any caching mechanism...
If you are searching for fixed keywords, use fgrep (or grep -F) instead of egrep:

LC_ALL=C grep -F -i -e storm -e broadway -e "parkway center" -e "chief financial"
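A common variation (not part of the original answer) once the keyword list grows: keep the fixed strings in a file, one per line, and pass it with -f, still combined with -F and LC_ALL=C. The file name patterns.txt is just an example:

$ printf '%s\n' storm broadway "parkway center" "chief financial" > patterns.txt
$ time LC_ALL=C grep -F -i -f patterns.txt foo.csv | wc -l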
The next thing to try is factoring out the -i, which is likely the bottleneck. If you are sure that only the first letter might be capitalized, for example, do:

LC_ALL=C grep -F \
    -e{S,s}torm -e{B,b}roadway -e{P,p}"arkway "{C,c}enter -e{C,c}"hief "{F,f}inancial
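To check exactly which fixed strings the shell's brace expansion hands to grep, you can print the expansions first (a quick sanity check, not part of the original answer; the brackets only make the embedded spaces visible):

$ printf '[%s]\n' {S,s}torm {B,b}roadway {P,p}"arkway "{C,c}enter {C,c}"hief "{F,f}inancial
[Storm]
[storm]
[Broadway]
[broadway]
[Parkway Center]
[Parkway center]
[parkway Center]
[parkway center]
[Chief Financial]
[Chief financial]
[chief Financial]
[chief financial]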
Tags: linux, bash, shell, awk, grep