[Python-bugs-list] [ python-Bugs-620705 ] websucker relative-URL errors

noreply@sourceforge.net noreply@sourceforge.net
Mon, 14 Oct 2002 13:09:11 -0700


Bugs item #620705, was opened at 2002-10-09 06:30
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=620705&group_id=5470

Category: Demos and Tools
Group: Python 2.2.2
>Status: Closed
Resolution: None
Priority: 9
Submitted By: Alex Martelli (aleax)
Assigned to: Guido van Rossum (gvanrossum)
Summary: websucker relative-URL errors

Initial Comment:
reproduce easily with, e.g.:
python websucker.py -v http://www.aleax.it

gives a series of error messages such as:

Check http://www.aleax.it/./py2.htm
Error ('http error', 404, 'Object Not Found')
 HREF  http://www.aleax.it/./py2.htm
  from http://www.aleax.it/./Python/ (///./py2.htm)

Check http://www.aleax.it/p1.htm
Error ('http error', 404, 'Object Not Found')
 HREF  http://www.aleax.it/p1.htm
  from http://www.aleax.it/./TutWin32/index.htm (///p1.htm)

but the relevant snippets of the HTML sources are e.g:
in Python/index.html:
<A href="./py2.htm">
in TutWin32/index.html:
<a href="p1.htm">

i.e. both are relative URLs, so they should resolve to the
URLs of the files that ARE present, Python/py2.htm and
TutWin32/p1.htm respectively.
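
For illustration, here is a minimal sketch of the expected resolution. It assumes Python 3, where the urlparse module used in this report lives on as urllib.parse, and it assumes the two pages are served as Python/index.html and TutWin32/index.htm (names taken from the snippets above). The last call feeds in the mangled "///p1.htm" form seen in the error output to show how it lands on the site root.

```python
from urllib.parse import urljoin

# Page URLs are assumptions built from the file names quoted in the report.
python_page = "http://www.aleax.it/Python/index.html"
tutwin_page = "http://www.aleax.it/TutWin32/index.htm"

# Relative hrefs resolve against the directory of the page that contains them.
resolved_py2 = urljoin(python_page, "./py2.htm")
resolved_p1 = urljoin(tutwin_page, "p1.htm")

# The mangled form has an empty host part followed by an absolute path,
# so the page's own directory is dropped during resolution.
mangled_p1 = urljoin(tutwin_page, "///p1.htm")

print(resolved_py2)  # http://www.aleax.it/Python/py2.htm
print(resolved_p1)   # http://www.aleax.it/TutWin32/p1.htm
print(mangled_p1)    # http://www.aleax.it/p1.htm
```

The third result matches the URL that websucker reported as a 404 for p1.htm.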

And indeed /usr/bin/wget has no problem fetching
the whole small site.

Please let me know if you want me to explore the bug further
and prepare a patch in time for the 2.2.2 release -- otherwise
I think this should at least be documented as a known bug
(it makes websucker close to unusable, alas).


Alex





----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2002-10-14 16:09

Message:
OK, fixed in 2.2.2 and 2.3. Whew!

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-10-14 15:24

Message:
Argh! This looks like a bug in urlparse.py, introduced
somewhere in or after 2.2.

In 2.1, or in 2.2.1:

>>> import urlparse
>>> urlparse.urlunparse(urlparse.urlparse('./Python'))
'./Python'
>>> 

In 2.2.2 or 2.3:

>>> import urlparse
>>> urlparse.urlunparse(urlparse.urlparse('./Python'))
'///./Python'
>>> 

I'
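
The round-trip property that the two sessions above compare can be written as a small check. This is only a sketch, assuming Python 3's urllib.parse as the successor of urlparse; the helper name roundtrips and the sample list are made up for illustration and are not part of websucker or the actual fix.

```python
from urllib.parse import urlparse, urlunparse


def roundtrips(url):
    """Return True if parsing and then unparsing gives back the same string."""
    return urlunparse(urlparse(url)) == url


# Relative references taken from this report.
samples = ["./Python", "./py2.htm", "p1.htm"]
results = {url: roundtrips(url) for url in samples}

for url, ok in results.items():
    print(url, "round-trips" if ok else "is altered")
```

A relative reference that comes back with a leading "///" fails this check, which is the symptom shown in the second session.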

----------------------------------------------------------------------
