[BangPypers] Reg Expression problem and Unicode...

Wed May 27 04:25:49 CEST 2009

Hello everyone...

I have a small problem: I'm trying to match a pound symbol (£) in a document
using Python 2.5.2.  I also ran this under 2.6 and it seemed to give the same
result, but ultimately I need correct code for 2.5.2.

Here's the code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re

########## This part doesn't work right...but at least it matches

text = "The Price £7"
pattern = u"£\d"

m = re.search(pattern, text, re.UNICODE)
print m.group(0)

########## This part fails to get a match completely

pattern = u" £\d"
m = re.search(pattern, text, re.UNICODE)
print m.group(0)

Now what I would expect to happen is that the first half of the code would
print out:
£7

Instead it prints out:
�7

Which is doubly weird because it can print the pound character in the
pattern...

The second half of the code should do the same thing except include the
space at the beginning.

So I expect:
 £7

but instead get an exception.

I've tried (I believe) every combination of using re.UNICODE or not, as well
as quoting the strings with u or just leaving them as normal.  Nothing
seemed to solve the problem.
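
For anyone reading along, here is a minimal sketch of one possible explanation.
It is an assumption about the cause, not a confirmed diagnosis.  With a UTF-8
source file, a plain "..." literal under Python 2 is a byte string in which the
pound sign takes two bytes (0xC2 0xA3), while u"£" is the single code point
U+00A3.  The sketch decodes the text before matching so that the pattern and
the text are both unicode.  The names raw, text, pattern and matched are
illustrative only, and the input is assumed to arrive as UTF-8 bytes.

```python
# -*- coding: utf-8 -*-
import re

# Assumption: the document arrives as UTF-8 encoded bytes.
# \u00a3 is the pound sign, written as an escape to avoid encoding surprises.
raw = u"The Price \u00a37".encode("utf-8")

# Decode to unicode before matching, so one pound sign is one character.
text = raw.decode("utf-8")

# The encoded form is one byte longer than the decoded text, because the
# pound sign is two bytes in UTF-8; the byte before 0xA3 is 0xC2, not a space.
extra_bytes = len(raw) - len(text)

# Unicode pattern matched against unicode text; \\d is a literal regex \d.
pattern = re.compile(u" \u00a3\\d", re.UNICODE)

m = pattern.search(text)
# Guard against a failed match instead of calling .group() on None.
matched = None
if m is not None:
    matched = m.group(0)
```

When printing the result under Python 2, encoding it explicitly for the
terminal (for example matched.encode("utf-8")) may avoid the garbled
character, assuming the terminal itself expects UTF-8.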

Thank you for the help!