urlparse isn't working?

Thu Aug 1 13:59:54 EDT 2002

"Dale Strickland-Clark" <dale at riverhall.NOTHANKS.co.uk> wrote in message
news:51igkuo7u1ab6s9270svf7pbakusvjr2vj at 4ax.com...
> This doesn't seem right to me:
>
> >>> import urlparse
> >>> urlparse.urlparse('www.wibble.com/wibble/wibble.jpg', 'http:')
> ('http:', '', 'www.wibble.com/wibble/wibble.jpg', '', '', '')
>
> According to the help file:
> ===
> Example:
>
> urlparse('http://www.cwi.nl:80/%7Eguido/Python.html')
> yields the tuple
>
> ('http', 'www.cwi.nl:80', '/%7Eguido/Python.html', '', '', '')
> ===
>
> it should split the host from the path.
>
> I've tried different values for the second parameter (called
> 'default_scheme') with the same result.
>
> However, if you add "http://" to the url, it starts to behave:
>
> >>> urlparse.urlparse('http://www.wibble.com/wibble/wibble.jpg')
> ('http', 'www.wibble.com', '/wibble/wibble.jpg', '', '', '')
>
> So what is the point of the 'default_scheme' if it needs to be on the
> url to work properly?
>
> Or have I got confused?

Not this time ;-)

Reading the code, it looks like the urlsplit() function expects the net
location to begin with "//" even when no scheme is present in the URL, so
without it the whole string is treated as the path. The module doesn't look
that clever in the light of modern URL usage.
However, running

    urlparse.test()

might offer some insight into the *intended* operation of the module.
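
As a rough illustration of the "//" point above, here is a minimal sketch
rather than a definitive fix. The helper name parse_with_default is
hypothetical (it is not part of the module), and it assumes that any URL
lacking "://" and a leading "//" is meant to start with a host name. Note
also that the default scheme is passed as "http", without the trailing
colon used in the original question.

```python
try:
    from urllib.parse import urlparse  # Python 3 location
except ImportError:
    from urlparse import urlparse      # Python 2 module, as in the post


def parse_with_default(url, default_scheme='http'):
    """Hypothetical helper: parse a URL that may lack a 'scheme://' prefix.

    Assumption: a URL with no '://' and no leading '//' starts with a host
    name, so '//' is prepended to make the parser see a net location.
    """
    if '://' not in url and not url.startswith('//'):
        url = '//' + url
    return urlparse(url, default_scheme)


# Without the leading '//', everything ends up in the path component.
bare = urlparse('www.wibble.com/wibble/wibble.jpg', 'http')

# With '//' prepended, the host is split from the path and the default
# scheme is filled in.
fixed = parse_with_default('www.wibble.com/wibble/wibble.jpg')

print(tuple(bare))
print(tuple(fixed))
```

The simple prefix check will misfire on URLs such as "mailto:someone", so
treat it as a starting point only.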

regards
-----------------------------------------------------------------------
Steve Holden                                 http://www.holdenweb.com/
Python Web Programming                http://pydish.holdenweb.com/pwp/
-----------------------------------------------------------------------