Saturday, 15 January 2011

python - Remove words of length less than 4 from string -



This question already has an answer here:

Remove little words using Python (2 answers)

I am trying to remove words of length less than 4 from a string.

I use this regex:

re.sub(' \w{1,3} ', ' ', c)

Although this removes some short words, it fails when 2-3 words of length less than 4 appear together. For example:

I am in a bank.

It gives me:

I in bank.

How can I resolve this?
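
The likely cause is that re.sub only replaces non-overlapping matches, and each match of ' \w{1,3} ' consumes the space on both sides of the word. The next short word then has no leading space left to match. A minimal sketch of that behaviour, assuming the sample input was 'I am in a bank.' (the variable names are illustrative, not from the question):

```python
import re

text = 'I am in a bank.'  # assumed sample input
pattern = r' \w{1,3} '

# Each match includes its surrounding spaces, so ' am ' swallows the
# space that ' in ' would need, and 'in' is never matched.
matches = re.findall(pattern, text)
result = re.sub(pattern, ' ', text)

print(matches)
print(result)
```

Only ' am ' and ' a ' are matched here; 'in' survives because the space before it was already consumed by the previous match.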

Don't include spaces; use \b word boundary anchors instead:

re.sub(r'\b\w{1,3}\b', '', c)

This removes words of up to 3 characters entirely (the spaces around them are left in place):

>>> import re
>>> re.sub(r'\b\w{1,3}\b', '', 'the quick brownish fox jumps on lazy dog')
' quick brownish  jumps  lazy '
>>> re.sub(r'\b\w{1,3}\b', '', 'i in bank.')
'  bank.'
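
Because \b matches without consuming any characters, the spaces around each removed word stay behind. If that leftover whitespace is unwanted, one possible follow-up is to collapse it in a second pass. The sketch below is only an illustration: remove_short_words and its max_len parameter are hypothetical names, not part of the answer above.

```python
import re

def remove_short_words(text, max_len=3):
    # Hypothetical helper: drop words of up to max_len characters,
    # then collapse the runs of whitespace left behind.
    stripped = re.sub(r'\b\w{1,%d}\b' % max_len, '', text)
    return re.sub(r'\s+', ' ', stripped).strip()

print(remove_short_words('the quick brownish fox jumps on lazy dog'))
print(remove_short_words('i in bank.'))
```

Note that \b treats an apostrophe as a boundary, so a word such as "don't" is seen as "don" and "t", and both parts would be removed by this pattern.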

