How to match a hard space

Alex Martelli aleax at aleax.it
Thu Oct 3 08:53:23 EDT 2002


A wrote:

> I need to replace in a string several spaces by one space only for example

Judging from your example, you must also replace newline characters AND
strip leading (and perhaps trailing?) whitespace, in which case the
simplest way to do it may be:

vy = ' '.join(text.split())
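
To see what that one-liner does, here is a small sketch on a made-up
sample string (the sample is an assumption for illustration, not the
text from the original question). Calling split() with no arguments
splits on any run of whitespace and discards the empty pieces at both
ends, so re-joining with a single space collapses and strips in one go:

```python
# Hypothetical sample text: runs of spaces, a tab, newlines, and
# leading/trailing whitespace.
text = "  several   spaces\tand\n\nnewlines  here \n"

# split() with no argument splits on any run of whitespace and drops
# leading/trailing whitespace; join() glues the words back together
# with exactly one space between them.
vy = ' '.join(text.split())

print(repr(vy))
```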


> It works well only if there is a normal space( \x20 value) but if there is
> a hard space( \x20\xA0) it does not work.

I'm not sure what a "hard space" is, but if it just means that sequence
of two bytes, and 0xA0 occurs in no other circumstance, you can simply
change all occurrences of chr(0xA0) in your text to a space before you
process it any further, e.g.:

import string
dehard = string.maketrans(chr(0xA0), ' ')

vy = ' '.join(text.translate(dehard).split())
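
For anyone reading this on Python 3, where string.maketrans no longer
exists, here is a rough equivalent. It assumes the text is a byte
string in an 8-bit encoding such as Latin-1, and the sample data is
invented for the sketch. For a Python 3 str, U+00A0 already counts as
whitespace for split(), so no translation step should be needed there:

```python
# Assumption: text is a Python 3 byte string in an 8-bit encoding such
# as Latin-1, where byte 0xA0 is the "hard" (no-break) space.
text = b"one \xa0 two\xa0\xa0three \n four"

# Translation table mapping byte 0xA0 to an ordinary space.
dehard = bytes.maketrans(b'\xa0', b' ')

# Translate first, then collapse whitespace as before.
vy = b' '.join(text.translate(dehard).split())
print(vy)

# With a str instead of bytes, split() treats U+00A0 as whitespace
# on its own.
utext = "one \xa0 two\xa0\xa0three \n four"
uvy = ' '.join(utext.split())
print(uvy)
```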

But if you insist on using RE's, no problem:

> p = re.compile('\s+')

Just change the RE's pattern to r'[\s\xA0]+' (you _should_ always
use raw-string literals for RE patterns, or your backslashes will
trip you up one of these days...).  If you need to change only the
specific two-char sequence '\x20\xA0', and leave other \xA0's alone,
the .replace method of strings lets you do that:

vy = ' '.join(text.replace('\x20\xA0', ' ').split())
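
A sketch of both variants side by side, again assuming Python 3 byte
strings and made-up sample data. With a bytes pattern, \s covers only
ASCII whitespace, which is why 0xA0 has to be listed explicitly in the
character class:

```python
import re

# Assumption: Python 3 byte strings; for bytes patterns \s matches only
# ASCII whitespace, so byte 0xA0 is added to the class by hand.
p = re.compile(rb'[\s\xA0]+')

text = b"  one \xa0 two\xa0\xa0three \n four "

# sub() collapses each run to one space; strip() then removes the
# single space left at either end.
vy = p.sub(b' ', text).strip()
print(vy)

# Variant: only the exact two-byte sequence 0x20 0xA0 becomes a space;
# an isolated 0xA0 is left alone.
text2 = b"keep\xa0this but \xa0 collapse   that"
vy2 = b' '.join(text2.replace(b'\x20\xa0', b' ').split())
print(vy2)
```

The RE version needs the extra strip() because sub() leaves one space
where the leading and trailing runs were, while the split/join version
drops those automatically.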



Alex




More information about the Python-list mailing list