[stdlib-sig] urllib.parse does not support bytes

M.-A. Lemburg mal at egenix.com
Sun Sep 20 22:09:50 CEST 2009


Antoine Pitrou wrote:
> 
>> So you think the broken behavior is the best way to encourage people
>> working with Python 3?  Sounds about right.
> 
> You know, it would be better if you demonstrated that the behaviour is
> broken, rather than asserting it. The fact that a long discussion led to
> the current API is a good hint that it is probably not as broken as you
> make it to be.

Agreed.

The quote/unquote() APIs implement everything you need to
support non-ASCII URLs - they even give you a choice of using
a different encoding for the %-escaped parts of the URL.
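As a minimal sketch of that choice of encoding (the path below is a made-up example, not taken from the thread), quote() and unquote() both accept an encoding argument and default to UTF-8:

```python
from urllib.parse import quote, unquote

# Hypothetical non-ASCII path, used only for illustration.
path = "/caf\u00e9"

# quote() encodes to UTF-8 by default before %-escaping.
utf8_quoted = quote(path)

# A different encoding can be selected for the %-escaped parts.
latin1_quoted = quote(path, encoding="latin-1")

# unquote() needs the same encoding to recover the original text.
utf8_roundtrip = unquote(utf8_quoted)
latin1_roundtrip = unquote(latin1_quoted, encoding="latin-1")

print(utf8_quoted)    # /caf%C3%A9
print(latin1_quoted)  # /caf%E9
```

Both round trips give back the original string, provided the same encoding is used on each side.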

Armin, what else do you think you need?

If you do know the encoding of the URL, then you can easily
convert a bytes URL into a Unicode one.
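A rough illustration of that conversion, assuming the bytes are known to be UTF-8 (the URL itself is invented for the example):

```python
from urllib.parse import urlsplit

# Hypothetical bytes URL; the encoding (UTF-8) is assumed to be known.
raw_url = b"http://example.com/caf\xc3\xa9?q=1"

# Decode once at the boundary, then work with str throughout.
text_url = raw_url.decode("utf-8")
parts = urlsplit(text_url)

print(parts.path)   # /café
print(parts.query)  # q=1
```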

If not, then it's better to stick to the standards and have the
functions raise an exception if needed.

If you don't want to see errors, use the errors="replace"
error handler or just use "latin-1" as encoding for both
unquote() and quote() - while it's not necessarily correct,
it should get you pretty close to the bytes-behavior you
see in Python 2.x.
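To make the options concrete, here is a small sketch (the escaped path is an invented example) comparing strict decoding, the "replace" handler and the "latin-1" approach:

```python
from urllib.parse import quote, unquote

# Hypothetical URL path whose escape is Latin-1, not valid UTF-8.
escaped = "/caf%E9"

# errors="strict" raises if the escaped bytes do not match the encoding.
try:
    unquote(escaped, errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors="replace" never raises; undecodable bytes become U+FFFD.
replaced = unquote(escaped, errors="replace")

# latin-1 maps every byte to one code point, so nothing is lost and
# quoting with the same encoding gives back the original escapes.
as_latin1 = unquote(escaped, encoding="latin-1")
requoted = quote(as_latin1, encoding="latin-1")

print(strict_failed)        # True
print(requoted == escaped)  # True
```

With "replace" the original byte can no longer be recovered, while the "latin-1" route round-trips byte for byte even though the resulting text may not be what the sender meant.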

-- 
Marc-Andre Lemburg
eGenix.com


