Tuesday, 15 September 2015

Change formatting on paragraphs, with perl -




I have a number of paragraphs that have returns at the end of each line. I do not want returns at the end of lines; I will let the layout program take care of that. I would like to remove the returns and replace them with spaces.

The issue is that I do want returns in between paragraphs. So, if there is more than one return in a row (2, 3, etc.), I would like to keep two returns.

This will allow there to be paragraphs, with one blank line between them, while all other line formatting is removed. The layout program can then worry about line breaks, instead of having the breaks determined by a set number of characters, as they are now.

I would like to use Perl to accomplish this change, but I am open to other methods.

Example text:

this test. test. test. test.

Would become:

this test. test. test. test.

Can this be done easily?

I came up with this solution, and I also wanted to explain what your regex was matching.

    matt@mattpc ~/perl/testing/8 $ cat input.txt
    test. test. test. test. test. test.
    matt@mattpc ~/perl/testing/8 $ perl -e '$/ = undef; $_ = <>; s/(?<!\n)\n(?!\n)/ /g; s/\n{2,}/\n\n/g; print' input.txt
    test. test. test. test. test. test.
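To try the two substitutions without a file, they can be wrapped in a small subroutine and run on an in-memory string. This is only a sketch: the name reflow_paragraphs and the sample text are made up for illustration and do not come from the original session.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): applies the same
# two substitutions as the one-liner to a string instead of a file.
sub reflow_paragraphs {
    my ($text) = @_;

    # A newline that is neither preceded nor followed by another
    # newline is a line wrap inside a paragraph: turn it into a space.
    $text =~ s/(?<!\n)\n(?!\n)/ /g;

    # Any run of two or more newlines is a paragraph break:
    # normalise it to exactly two newlines (one blank line).
    $text =~ s/\n{2,}/\n\n/g;

    return $text;
}

# Made-up sample: two wrapped paragraphs separated by three newlines.
my $sample = "line one\nline two\n\n\nline three\nline four";
print reflow_paragraphs($sample), "\n";
```

With this sample the result should be two paragraphs, "line one line two" and "line three line four", separated by a single blank line. One thing to be aware of: a single newline at the very end of the input is also neither preceded nor followed by another newline, so the first substitution turns it into a trailing space.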

I wrote a Perl program and mashed it into a one-liner. Written out, it looks like this.

    # The first two lines read in the whole file
    $/ = undef;
    $_ = <>;
    # This regex replaces every `\n` with a space
    # if it is not preceded or followed by a `\n`
    s/(?<!\n)\n(?!\n)/ /g;
    # This replaces every run of two or more \n with two \n
    s/\n{2,}/\n\n/g;
    # Finally, print $_
    print;

As for this one-liner:

    perl -p -i -e 's/(\w+|\s+)[\r\n]/$1 /g' abc.txt

Part of the problem here is what you are matching. (\w+|\s+) matches one or more word characters, the same as [a-zA-Z0-9_] for ASCII text, or one or more whitespace characters, the same as [\t\n\f\r ].

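This wouldn't match your input, since you aren't matching periods, and no line consists of only whitespace or only word characters. With -p the input is processed one line at a time, so even the blank lines would need two whitespace characters to match, since there is the [\r\n] at the end. Plus, neither alternative would match a period.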
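A quick way to check this reasoning is to run the attempted pattern against a few lines, one at a time, the way -p would feed them. The sample lines and the helper name attempt_matches below are invented for this sketch.

```perl
use strict;
use warnings;

# The pattern from the attempted one-liner.
my $attempt = qr/(\w+|\s+)[\r\n]/;

# Hypothetical helper (name invented for this sketch): true if the
# attempted pattern matches anywhere in the given line.
sub attempt_matches {
    my ($line) = @_;
    return $line =~ $attempt ? 1 : 0;
}

# Made-up sample lines, each ending in the newline that -p leaves in place.
my @lines = (
    "This is a test.\n",   # ends in a period, then newline
    "\n",                  # blank line: only one whitespace character
    "no period\n",         # word characters directly before the newline
    "  \n",                # whitespace followed by the newline
);

for my $line (@lines) {
    (my $shown = $line) =~ s/\n/\\n/g;   # make the newline visible
    printf "%-22s %s\n", "\"$shown\"",
        attempt_matches($line) ? "match" : "no match";
}
```

The first two lines should report "no match": the period sits between the last word and the newline, and a blank line has only a single whitespace character to offer. The last two should report "match", which shows that the pattern only fires when word characters or whitespace come immediately before the line ending.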

