[XML-SIG] Re: Issues with Unicode type
Daniel Veillard
veillard@redhat.com
Mon, 23 Sep 2002 17:32:04 -0400
On Mon, Sep 23, 2002 at 03:16:08PM -0600, Uche Ogbuji wrote:
> Oh, but then Python is so much simpler:
>
>
> SP_PAT = re.compile(u"[\uD800-\uDBFF][\uDC00-\uDFFF]")
> def smart_len(u):
>     sp_count = len(SP_PAT.findall(u))
>     return len(u) - sp_count
>
>
> Problem solved.
Modulo the space and CPU requirements of the operation (okay, you can tell
I'm primarily a C coder :-)
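
To make the space/CPU point concrete, here is a rough sketch of a single-pass
count that builds no intermediate list of matches. It assumes the string
stores each surrogate as a separate item (UTF-16 style storage); the name
smart_len is reused from the quoted snippet, everything else is illustrative
and not taken from any library.

```python
def smart_len(u):
    """Count characters, treating a high+low surrogate pair as one.

    Assumption: each surrogate is a separate item of u.
    """
    count = 0
    expect_low = False  # True right after an unpaired high surrogate
    for ch in u:
        cp = ord(ch)
        if expect_low and 0xDC00 <= cp <= 0xDFFF:
            # Second half of a pair: already counted with the high surrogate.
            expect_low = False
            continue
        count += 1
        expect_low = 0xD800 <= cp <= 0xDBFF
    return count
```

This still touches every item once, so it stays O(n) in time, but it needs
only constant extra space instead of a list holding every matched pair.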
> The great thing about Python is even when it frustrates you one moment, it
> finds a way to quickly make up for it.
I don't think strings are classes, but rather built-in types, and hence one
cannot make a subclass of strings whose instances have all length/walk/extract
operations special-cased to reflect XML Unicode strings. I (and Eric,
I bet) would like to be wrong on this :-)
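
For what it is worth, Python versions that unify types and classes (2.2 and
later) do allow subclassing the built-in string type. The following is only a
minimal sketch of the length case, assuming surrogates are stored as separate
items; XMLString is a hypothetical name, and walking and extraction would
need similar overrides that are not shown here.

```python
class XMLString(str):
    """Hypothetical str subclass whose len() counts surrogate pairs once."""

    def __len__(self):
        units = str.__len__(self)  # raw number of stored items
        pairs = 0
        i = 0
        while i < units - 1:
            high = "\ud800" <= self[i] <= "\udbff"
            low = "\udc00" <= self[i + 1] <= "\udfff"
            if high and low:
                pairs += 1
                i += 2  # skip both halves of the pair
            else:
                i += 1
        return units - pairs
```

Only len() is special-cased in this sketch: indexing, slicing and iteration
still operate on the raw items, and slices come back as plain strings.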
Daniel
--
Daniel Veillard | Red Hat Network https://rhn.redhat.com/
veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/