[XML-SIG] prepare_input_source and relative path

Sylvain Thénault Sylvain.Thenault at logilab.fr
Thu Feb 10 11:02:17 CET 2005

On Wednesday 09 February at 16:06, Mike Brown wrote:
> Sylvain Thénault wrote:
> > thanks a lot. Actually almost all the work is already done right there. 
> > Here is what I've worked on. Once we reach a consensus, I'll add that
> > to pyxml. So I've attached to this mail:
> > 
> > - a light version of 4Suite Uri.py including the following functions:
> >   SplitUriRef, UnsplitUriRef (it was really less annoying to use those
> >   two functions than the equivalent urllib's ones), Absolutize,
> >   MakeUrllibSafe, _RemoveDotSegments, BaseJoin, GetScheme and
> >   IsAbsolute. With the presented solution, the last 3 are not used
> >   and could be removed, but I've kept them in for now. 
> Doc strings will need to be updated to reflect the promotion from
> "rfc2396bis" to RFC 3986. Also there's one place where I have "RFC
> (newline)2396bis" which should also be fixed.

Done. However, do the sections of rfc2396bis match the sections of RFC 3986?

> In MakeUrllibSafe, you should catch the UnicodeError that could result
> from the attempt to force unicode to a byte string:
>     if isinstance(uri, unicode):
>         try:
>             uri = uri.encode('us-ascii')
>         except UnicodeError:
>             raise ValueError("uri %r must consist of ASCII characters." % uri)
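
For readers on Python 3, where the `unicode` type no longer exists, the same
check can be sketched as below. This is only an illustration of the encoding
guard quoted above, not the actual MakeUrllibSafe; the helper name
`make_ascii_uri` is hypothetical.

```python
def make_ascii_uri(uri):
    """Return `uri` as an ASCII byte string, or raise ValueError.

    Hypothetical helper: it only reproduces the encoding check from the
    quoted snippet, not the rest of MakeUrllibSafe.
    """
    if isinstance(uri, str):
        try:
            # 'us-ascii' is an alias of the 'ascii' codec.
            return uri.encode('us-ascii')
        except UnicodeError:
            # Report a plain ValueError carrying the offending URI.
            raise ValueError("uri %r must consist of ASCII characters." % uri)
    # Anything that is not text (e.g. bytes) is passed through unchanged.
    return uri
```

A URI containing a non-ASCII character, such as 'http://example.org/\u00e9',
ends up in the `except` branch and is reported as a ValueError.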

> > All the tests for Absolutize from 4Suite still pass.
> I forgot to point you to my tests. They do not use unittest, so they
> would need to be adapted, but it would be easy since the comparisons
> are string-in to string-out (or exception), and I've labeled them
> pretty clearly:
>   http://cvs.4suite.org/viewcvs/4Suite/test/Lib/test_uri.py?view=markup
> As you will see, they are fairly comprehensive.

I did find them. As I said, I've run the relevant tests against the restricted
version of Uri.py and all of them pass.
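
Adapting string-in/string-out cases of that kind to unittest could look
roughly like the sketch below. It is not taken from test_uri.py: the cases
are a few of the "normal examples" from RFC 3986 section 5.4.1, and
Python 3's `urllib.parse.urljoin` is used as a stand-in for Absolutize (an
assumption made only so that the example runs on its own). The names
`CASES`, `resolve` and `ResolveTC` are hypothetical.

```python
import unittest
from urllib.parse import urljoin

# (base URI, relative reference, expected result), from the
# "normal examples" of RFC 3986 section 5.4.1.
CASES = [
    ("http://a/b/c/d;p?q", "g", "http://a/b/c/g"),
    ("http://a/b/c/d;p?q", "./g", "http://a/b/c/g"),
    ("http://a/b/c/d;p?q", "../g", "http://a/b/g"),
    ("http://a/b/c/d;p?q", "g?y", "http://a/b/c/g?y"),
    ("http://a/b/c/d;p?q", "/g", "http://a/g"),
]


def resolve(base, ref):
    # Stand-in for the function under test (assumption): swap in the
    # real resolver here.
    return urljoin(base, ref)


class ResolveTC(unittest.TestCase):
    def test_normal_examples(self):
        # One table-driven test: each entry is string-in, string-out.
        for base, ref, expected in CASES:
            self.assertEqual(resolve(base, ref), expected)
```

Keeping the cases in a plain table makes it easy to paste in further
entries without touching the test method.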

> > - a modified version of saxutils, expecting the Uri module above to be
> >   in the _xmlplus directory (i.e. importable as xml.Uri). I've refactored
> >   prepare_input_source to ease testing of the URI merging stuff.
> You might want to grep for "emacspymodestink" in your code. :)

Right, I forgot that :) I've also made the following modification to
prepare_input_source since I sent it here:

@@ -510,7 +510,7 @@
         source = xmlreader.InputSource()
         if hasattr(f, "name"):
-            source.setSystemId(f.name)
+            source.setSystemId('file:%s' % f.name)
     if source.getByteStream() is None:
         sysid = absolute_system_id(source.getSystemId(), base)
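
The patch above simply prefixes the file name with "file:". For comparison
only, here is how an absolute file: URI can be built from a possibly
relative file name on Python 3 with pathlib. This is a sketch under that
assumption, not what the patch does, and the helper name `file_system_id`
is hypothetical.

```python
from pathlib import Path


def file_system_id(name):
    """Hypothetical helper: turn a file name into an absolute file: URI."""
    # resolve() makes the path absolute (relative names are taken from
    # the current working directory); as_uri() requires an absolute path
    # and percent-escapes characters that are not allowed in a URI.
    return Path(name).resolve().as_uri()
```

Called with a relative name such as "quotes.xml", it returns a URI that
starts with "file:///" and ends with "/quotes.xml".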

> > - a unittest file, which includes some test cases for the URI merging
> >   function. Please take a look at the existing test cases to check
> >   everything looks fine to you. If you have other cases to add, please let
> >   me know (or maybe I can add this file to the cvs first). Notice that
> >   to run the tests, you should have a "quotes.xml" file in the same
> >   directory as the test file (there is one in the test directory of
> >   pyxml). As a bonus, I've converted the escape function test from
> >   test_utils into a unittest in the same file.

Did you take a look at those tests? Do they sound good to everyone here? Are
there more tests to add?

> > Anyway, having SplitUriRef/UnsplitUriRef replacing 
> > urlparse.urlsplit/urlunsplit and Absolutize or BaseJoin replacing
> > urlparse.urljoin would definitely be the right thing.
> On python-dev in Sep 2004, I was discussing with Martin v. Löwis what 
> principles we think should be embraced by urlparse, urllib and urllib2. He 
> feels that we should simultaneously shoot for both URI and IRI support 
> according to the RFCs (3986 and 3987), with unicode arguments being assumed to 
> be IRIs.
> I would hold off on any stdlib changes until the APIs can be discussed in 
> more detail.

Sylvain Thénault                               LOGILAB, Paris (France).

http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org
