[Python-Dev] Finally switch urllib.parse to RFC3986 semantics?

Guido van Rossum guido at python.org
Fri Mar 18 04:38:42 CET 2011


On Thu, Mar 17, 2011 at 8:19 PM, Senthil Kumaran <orsenthil at gmail.com> wrote:
> Nick Coghlan wrote:
>> > The problem is that it is quite a lot of work to get fully general URI
>> > parsing to work correctly, but the overlap with legacy URL parsing is
>> > large enough that many (most?) use cases in practice work just fine
>> > with the older RFC semantics.
>
> Yes. We can have an API that strictly conforms to the latest RFC by
> definition, but the problem is that there is code out there which
> 'expects' the parsing behavior to remain unchanged so that it does
> not break. And keeping the parsing behavior unchanged means conforming
> to the older RFC parsing rules.
>
> The solution seems to be an extra function, or a flag in the urlparse
> method, which will exhibit the newer behavior.
>
> Guido wrote:
>
>> So would having two different API functions, one legacy and one
>> conforming, be a problem? Ideally the conforming API's name would not
>> be something lame like urllib2 but something timeless. :-)
>
> :-) Should blame Jeremy for that name! But urllib2 has long been
> replaced by urllib.parse, urllib.request and urllib.response.
> Considering how you remember urllib2, I think its name has stood the
> test of time.

It stood out like a sore thumb. :-)

> But seriously, I think an additional function or an additional flag in
> the current functions/methods in the parse module is sufficient, rather
> than going for another module.

I vote for a new function, not a flag. (Others can explain my rule of
thumb against flag arguments whose values are nearly always
constants.)
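
To make the two options concrete, here is a minimal sketch of what "a new
function" versus "a flag" could look like. All three function names and the
`rfc3986` keyword are hypothetical and exist nowhere in urllib.parse; the
conforming splitter simply applies the generic regular expression from
RFC 3986, Appendix B, which is an assumption about how such a function
might be built, not a proposal from this thread.

```python
import re
from urllib.parse import urlsplit

# Generic URI-splitting regex taken from RFC 3986, Appendix B.
_URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_legacy(url):
    """Hypothetical name: today's urllib.parse.urlsplit behavior."""
    return tuple(urlsplit(url))


def split_rfc3986(uri):
    """Hypothetical name: split using only the Appendix B regex.

    Components that are absent come back as None.
    """
    m = _URI_RE.match(uri)  # every sub-pattern is optional, so this always matches
    return (m.group(2), m.group(4), m.group(5), m.group(7), m.group(9))


# Flag style: one entry point, behavior chosen by a boolean.
def split_flagged(url, rfc3986=False):
    return split_rfc3986(url) if rfc3986 else split_legacy(url)


url = "http://example.com/a?b#c"
with_flag = split_flagged(url, rfc3986=True)  # flag is a literal constant here
with_function = split_rfc3986(url)            # same intent, stated by the name
```

In the flag style, nearly every call site would spell out `rfc3986=True` or
`rfc3986=False` as a literal, so the flag selects between what are in effect
two different functions. Giving each behavior its own name says the same
thing more directly at the call site.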

-- 
--Guido van Rossum (python.org/~guido)

