[XML-SIG] Re: Issues with Unicode type

Daniel Veillard veillard@redhat.com
Thu, 26 Sep 2002 05:18:13 -0400


On Thu, Sep 26, 2002 at 09:08:53AM +0200, Eric van der Vlist wrote:
> Unless we rely on external C extensions such as the ones developed by
> Daniel for libxml, I just see no way to be "natively conform"!

  Hum, independently of that, XML Schema regexp support will be in the
next version of the libxml2 Python bindings:

paphio:~/XML/python -> python
Python 1.5.2 (#1, Apr  3 2002, 18:16:26)  [GCC 2.96 20000731 (Red Hat Linux 7.2 2 on linux-i386
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> import libxml2
>>> re = libxml2.regexpCompile("a(b|c){2,3}d")
>>> re.regexpExec("abcd")
1
>>> re.regexpExec("acccd")
1
>>> re.regexpExec("abd")  
0
>>> re.regexpExec("accccd")
0
>>> re = libxml2.regexpCompile("((a|b|\p{Nd}){1,2}|aaa|bbbb){1,2}")
>>> re.regexpExec("bab")
1
>>> re.regexpExec("aaca")
0
>>> re.regexpExec("aaabbbb")
1
>>> re.regexpExec("a0b")    
1
>>> re.regexpExec("aa0aaa")
0
>>> re.regexpExec("b0aaa")
1
>>> 
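
  For readers without the bindings at hand, the session above can be
roughly cross-checked with Python's standard re module. This is only a
sketch and not libxml2 itself. It assumes Python 3, uses re.fullmatch
because XML Schema regexps must match the whole string, and replaces
\p{Nd} with \d, since the standard module has no \p{...} escapes (for
str patterns, \d matches Unicode decimal digits). The helper name
schema_match is hypothetical.

```python
import re


def schema_match(pattern, text):
    """Return 1 if the whole of text matches pattern, else 0.

    Mirrors the 1/0 results shown in the session; it is a stand-in
    built on the standard library, not the libxml2 binding.
    """
    return 1 if re.fullmatch(pattern, text) else 0


# First expression from the session, unchanged.
FIRST = r"a(b|c){2,3}d"
# Second expression, with \p{Nd} swapped for \d (an assumption of this sketch).
SECOND = r"((a|b|\d){1,2}|aaa|bbbb){1,2}"

first_results = [schema_match(FIRST, s)
                 for s in ("abcd", "acccd", "abd", "accccd")]
second_results = [schema_match(SECOND, s)
                  for s in ("bab", "aaca", "aaabbbb", "a0b", "aa0aaa", "b0aaa")]

print(first_results)
print(second_results)
```

  The two lists should line up with the 1/0 values printed by
regexpExec in the session.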

  Strings consumed are expected to be UTF-8 encoded. There is also
support for the block escape \p{IsX}, based on the April 2002 version
of the Unicode database.
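
  As a rough illustration of those two points, here is a small sketch
in plain Python 3 with no libxml2 involved. The helper names to_utf8
and in_block and the two-entry BLOCKS table are hypothetical. The
ranges are the usual Unicode block boundaries for Basic Latin and
Greek, and the real implementation covers the full block list.

```python
def to_utf8(text):
    """Encode a Python 3 str as UTF-8 bytes, the encoding the matcher expects."""
    return text.encode("utf-8")


# Illustrative subset only: block name -> (first, last) code point.
BLOCKS = {
    "BasicLatin": (0x0000, 0x007F),
    "Greek": (0x0370, 0x03FF),
}


def in_block(char, block):
    """Return True if the single character falls inside the named block.

    This is what an escape such as \\p{IsGreek} tests for one character.
    """
    low, high = BLOCKS[block]
    return low <= ord(char) <= high


print(to_utf8("\u00e9"))             # e with acute accent, two bytes in UTF-8
print(in_block("\u03b1", "Greek"))   # Greek small letter alpha
```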

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/