Excess whitespace in my soup
Fredrik Lundh
fredrik at pythonware.com
Sat Jan 19 07:00:57 EST 2008
John Machin wrote:
> I'm happy enough with reassembling the second item. The problem is in
> reliably and correctly collapsing the whitespace in each of the above
> five elements. The standard Python idiom of u' '.join(text.split())
> won't work because the text is Unicode and u'\xa0' is whitespace
> and would be converted to a space.
would this (or some variation of it) work?
>>> import re
>>> re.sub("[ \n\r\t]+", " ", u"foo\n frab\xa0farn")
u'foo frab\xa0farn'
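One possible variation, as a rough sketch only: wrap the same pattern in a small helper that also trims leading and trailing spaces. The name collapse_ascii_whitespace is made up for illustration, and the set of characters treated as collapsible (space, newline, carriage return, tab) is an assumption that may need adjusting for the real data.

```python
import re

# Explicit character class: only ASCII space, newline, carriage return
# and tab are collapsed. u'\xa0' (no-break space) is not in the class,
# so it passes through unchanged.
_ASCII_WS = re.compile(u"[ \n\r\t]+")


def collapse_ascii_whitespace(text):
    """Collapse runs of ASCII whitespace to one space and trim the ends.

    Hypothetical helper, not part of any library.
    """
    collapsed = _ASCII_WS.sub(u" ", text)
    # After substitution every run is a single space, so stripping
    # u" " is enough; a bare strip() could also remove u'\xa0'.
    return collapsed.strip(u" ")


result = collapse_ascii_whitespace(u"  foo\n frab\xa0farn \t")
```

Spelling out the characters, rather than using \s or a bare split(), is what keeps u'\xa0' intact, since Unicode-aware whitespace matching may treat it as whitespace.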
</F>