Matching horizontal white space
Ben Finney
bignose+hates-spam at benfinney.id.au
Sun Sep 14 18:55:20 EDT 2008
Magnus.Moraberg at gmail.com writes:
> multipleSpaces = re.compile(u'\\h+')
>
> importantTextString = '\n \n \n \t\t '
> importantTextString = multipleSpaces.sub("M", importantTextString)
Please get into the habit of following the Python coding style guide
<URL:http://www.python.org/dev/peps/pep-0008>.
For literal strings that you expect to contain backslashes, it's often
clearer to use the "raw" string syntax:
    multiple_spaces = re.compile(ur'\h+')
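As a quick check that the doubled-backslash and raw spellings denote the same string, here is a minimal sketch. It is written to run on Python 3 as well, where the `ur` prefix is not accepted, so the `u` prefix is dropped; that change and the variable names are assumptions of this sketch, not part of the original code.

```python
# Two spellings of the same three-character string: backslash, 'h', '+'.
escaped = '\\h+'   # backslash doubled so Python does not treat it as an escape
raw = r'\h+'       # raw literal: the backslash is kept as written

same = (escaped == raw)
print(same)        # True
print(len(raw))    # 3
```

The raw form keeps the literal visually identical to the pattern the regular expression engine receives.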
> I would have expected consecutive spaces and tabs to be replaced by
> M
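Why, what leads you to expect that? Your regular expression doesn't
specify spaces or tabs. Python's 're' module has no '\h' escape (that
is a Perl and PCRE shorthand for horizontal white space), so your
pattern specifies "the character 'h', one or more times".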
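One way to see what a given interpreter makes of the pattern is to try it. The sketch below assumes nothing beyond the standard 're' module; the function name `describe_backslash_h` is made up for illustration. Older interpreters accept the pattern and treat it as a plain 'h', while Python 3.6 and later reject unknown letter escapes in a pattern with `re.error`, so the sketch handles both cases.

```python
import re

def describe_backslash_h():
    """Report how this interpreter's 're' module treats the pattern \\h+."""
    try:
        pattern = re.compile(r'\h+')
    except re.error:
        # Newer interpreters refuse unknown escapes such as \h outright.
        return 'rejected'
    # Older interpreters compile it and treat \h as the literal letter 'h'.
    if pattern.match('hhh') and not pattern.match(' \t'):
        return 'matches h'
    return 'other'

print(describe_backslash_h())
```

Either way, the pattern never matches spaces or tabs.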
For "space or tab", specify a character class of space and tab:
>>> multiple_spaces = re.compile(u'[\t ]+')
>>> important_text_string = u'\n \n \n \t\t '
>>> multiple_spaces.sub("M", important_text_string)
u'\nM\nM\nM'
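The same idea wrapped in a small function, as a sketch in Python 3 syntax (so without the `u` prefixes). The names `HORIZONTAL_RUN` and `collapse_horizontal` are made up for illustration. It also shows why the explicit character class is used instead of '\s': the latter includes newlines, so it would swallow the line breaks as well.

```python
import re

# Space or tab, one or more times. Newlines are deliberately not included.
HORIZONTAL_RUN = re.compile(r'[ \t]+')

def collapse_horizontal(text, replacement='M'):
    """Replace each run of spaces and tabs with `replacement`."""
    return HORIZONTAL_RUN.sub(replacement, text)

sample = '\n \n \n \t\t '
print(repr(collapse_horizontal(sample)))   # '\nM\nM\nM'
print(repr(re.sub(r'\s+', 'M', sample)))   # 'M'
```

If other horizontal white space characters matter for your text (a no-break space, say), they would have to be added to the character class explicitly.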
You probably want to read the documentation for the Python 're' module
<URL:http://www.python.org/doc/lib/module-re>. This is standard
practice when using any unfamiliar module from the standard library.
--
\ “If you do not trust the source do not use this program.” |
`\ —Microsoft Vista security dialogue |
_o__) |
Ben Finney