Pythonic way to normalize vertical whitespace

python at bdurham.com python at bdurham.com
Fri May 8 11:53:37 EDT 2009


I'm looking for suggestions on technique (not necessarily code)
about the most pythonic way to normalize vertical whitespace in
blocks of text so that there is never more than 1 blank line
between paragraphs. Our source text has newlines normalized to
single newlines (\n vs. combinations of \r and \n), but there may
be leading and trailing whitespace around each newline.
Approaches:
1. split text to list of lines that get stripped then:
a. walk this list building a new list of lines that track and
ignore extra blank lines
-OR-
b. re-join lines and replace '\n\n\n' wth' \n\n' until no more
'\n\n\n' matches exist
2. use regular expressions to match and replace whitespace
pattern of 3 or more adjacent \n's with surrounding whitespace
3. a 3rd party text processing library designed for efficiently
cleaning up text
Thanks!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-list/attachments/20090508/6b5fe3c6/attachment.html>


More information about the Python-list mailing list