schizophrenic view of what is white space

Jean-Paul Calderone exarkun at
Thu Dec 4 15:33:37 CET 2008

On Thu, 04 Dec 2008 14:27:49 +0000, Robin Becker <robin at> wrote:
>Is Python of two minds about what is white space? I notice that split and
>strip seem to regard u'\xa0' (NO-BREAK SPACE) as white, but that code is not
>matched by the \s pattern. If this difference is intended, can we rely on it?
> >>> u'a b'.split()
>[u'a', u'b']
> >>> u'a\xa0b'.split()
>[u'a', u'b']
> >>> re.compile(r'\s').search(u'a b')
><_sre.SRE_Match object at 0x00DBB2C0>
> >>> re.compile(r'\s').search(u'a\xa0b')
> >>>

You have to give the re module an additional hint that you care about
Unicode, by passing the re.U (re.UNICODE) flag when compiling the pattern:

  exarkun at charm:~$ python
  Python 2.5.2 (r252:60911, Jul 31 2008, 17:28:52) 
  [GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> import re
  >>> print re.compile(r'\s').search(u'a\xa0b')
  >>> print re.compile(r'\s', re.U).search(u'a\xa0b')
  <_sre.SRE_Match object at 0xb7dbb3a0>
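
As a minimal sketch of the same idea, the snippet below compiles a whitespace pattern with re.UNICODE (the long spelling of re.U) and checks that splitting on it agrees with the split() method for the NO-BREAK SPACE input from the question. The names UNICODE_WS, regex_parts and method_parts are illustrative only. It is written to run on both Python 2 and Python 3; on Python 3, str patterns are already Unicode-aware by default, so the flag is redundant there but harmless.

```python
import re

# With re.UNICODE, \s follows the Unicode definition of white space,
# so NO-BREAK SPACE (U+00A0) is treated as a separator.
UNICODE_WS = re.compile(r'\s+', re.UNICODE)

text = u'a\xa0b'

# Split the same text two ways: with the regex and with the string method.
regex_parts = UNICODE_WS.split(text)
method_parts = text.split()

# For this input the two approaches should agree.
print(regex_parts == method_parts)
```

Note that the two approaches are not interchangeable in general: a regex split keeps empty strings for leading or trailing white space, whereas split() with no arguments discards them.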

