[XML-SIG] Weirdness (bug?) with smart_len (was Re: Issues with Unicode type)
Eric van der Vlist
vdv@dyomedea.com
25 Sep 2002 11:24:19 +0200
On Mon, 2002-09-23 at 23:16, Uche Ogbuji wrote:
> Oh, but then Python is so much simpler:
>
> SP_PAT = re.compile(u"[\uD800-\uDBFF][\uDC00-\uDFFF]")
> def smart_len(u):
>     sp_count = len(SP_PAT.findall(u))
>     return len(u) - sp_count
>
I am trying to use this with a Python build compiled for UCS-2, but I am
seeing strange behavior with this function: it seems that it can't
stand being loaded from a compiled .pyc file!
I have:
test.py:
#!/usr/bin/env python
import Smart_len
print Smart_len.smart_len(u'\U00010800')
and Smart_len.py:
import re
SP_PAT = re.compile(u"[\uD800-\uDBFF][\uDC00-\uDFFF]")
def smart_len(u):
    sp_count = len(SP_PAT.findall(u))
    return len(u) - sp_count
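For readers unfamiliar with why u'\U00010800' is stored as two code units on a UCS-2 build, here is a small sketch of the standard UTF-16 surrogate-pair arithmetic. The helper name to_surrogates is hypothetical and is not part of the code in this thread; it only illustrates which high/low values the two character classes in SP_PAT are meant to catch.

```python
def to_surrogates(code_point):
    """Split a supplementary code point (U+10000..U+10FFFF) into a
    UTF-16 (high, low) surrogate pair. Hypothetical helper for illustration."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits -> range D800..DBFF
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits -> range DC00..DFFF
    return high, low

high, low = to_surrogates(0x10800)
print("%04X %04X" % (high, low))  # prints "D802 DC00"
```

So on a UCS-2 build len() sees two units for this character, and smart_len subtracts one for the matched pair.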
It works the first time (or whenever I remove Smart_len.pyc) but fails
on the second run:
vdv@ibook:~/xmlschemata-cvs/downloads/python/xvif$ rm Smart_len.pyc
vdv@ibook:~/xmlschemata-cvs/downloads/python/xvif$ ./test.py
1
vdv@ibook:~/xmlschemata-cvs/downloads/python/xvif$ ./test.py
Traceback (most recent call last):
  File "./test.py", line 2, in ?
    import Smart_len
UnicodeError: UTF-8 decoding error: unexpected code byte
Weird, isn't it?
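A possible workaround, offered only as a sketch: assuming the failure comes from the lone surrogates in the pattern literal being written into the .pyc file and then failing to decode on reload (an assumption, not something verified on the build described here), the pattern could be built at import time so that no surrogate ever appears in a string constant. The names HIGH and LOW are introduced for illustration only.

```python
import re

try:
    unichr          # Python 2
except NameError:
    unichr = chr    # Python 3 has no unichr; chr covers the same range

# Assumption: building the classes at import time keeps lone surrogates
# out of the constants stored in the compiled .pyc file.
HIGH = u"[%s-%s]" % (unichr(0xD800), unichr(0xDBFF))
LOW = u"[%s-%s]" % (unichr(0xDC00), unichr(0xDFFF))
SP_PAT = re.compile(HIGH + LOW)

def smart_len(u):
    # Each surrogate pair is two code units but a single character.
    sp_count = len(SP_PAT.findall(u))
    return len(u) - sp_count
```

The function body is unchanged from the version above; only the way SP_PAT is constructed differs.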
Thanks
Eric
--
See you in Paris.
http://www.technoforum.fr/integ2002/index.html
------------------------------------------------------------------------
Eric van der Vlist http://xmlfr.org http://dyomedea.com
(W3C) XML Schema ISBN:0-596-00252-1 http://oreilly.com/catalog/xmlschema
------------------------------------------------------------------------