Can not get urllib.urlopen to work

Sean Berry sean at buildingonline.com
Wed Oct 27 23:26:08 CEST 2004


>I am trying to implement the recipe listed at
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/211886
>
> However, I can not get to first base. When I try to run
>
> import urllib
> fo=urllib.urlopen("http://www.dictionary.com/")
> page = fo.read()
>
> I get:
>
> Traceback (most recent call last):
>  File "C:/Program Files/Python/Lib/idlelib/testurl", line 2, in -toplevel-
>    fo=urllib.urlopen("http://www.dictionary.com/")
>  File "C:\PROGRA~1\PYTHON\lib\urllib.py", line 76, in urlopen
>    return opener.open(url)
>  File "C:\PROGRA~1\PYTHON\lib\urllib.py", line 181, in open
>    return getattr(self, name)(url)
>  File "C:\PROGRA~1\PYTHON\lib\urllib.py", line 297, in open_http
>    h.endheaders()
>  File "C:\PROGRA~1\PYTHON\lib\httplib.py", line 712, in endheaders
>    self._send_output()
>  File "C:\PROGRA~1\PYTHON\lib\httplib.py", line 597, in _send_output
>    self.send(msg)
>  File "C:\PROGRA~1\PYTHON\lib\httplib.py", line 564, in send
>    self.connect()
>  File "C:\PROGRA~1\PYTHON\lib\httplib.py", line 548, in connect
>    raise socket.error, msg
> IOError: [Errno socket error] (10061, 'Connection refused')
>

Connection refused is the key: the traceback ends in connect(), so the
failure happened before any HTTP request was sent.  Apparently dictionary.com
figured out that people were trying to use their resources without giving
them credit.  I have done some research on how to block this kind of use, so
that people cannot use the cgi-bin programs (and others) that I write.
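
To tell a refused connection apart from an HTTP-level rejection, something
like the following may help.  It is only a sketch and assumes Python 3, where
urllib.urlopen became urllib.request.urlopen and socket failures arrive
wrapped in urllib.error.URLError.  The names explain_failure and fetch are
made up for illustration.

```python
import urllib.error
import urllib.request


def explain_failure(exc):
    """Classify an exception raised by urlopen (hypothetical helper)."""
    if isinstance(exc, urllib.error.HTTPError):
        # The server was reached and answered with an error status.
        return "http error %d" % exc.code
    # URLError keeps the underlying exception in .reason; plain socket
    # errors have no such attribute, so fall back to the exception itself.
    reason = getattr(exc, "reason", exc)
    if isinstance(reason, ConnectionRefusedError):
        # The TCP connection itself was rejected; no HTTP was exchanged.
        return "connection refused"
    return "other failure: %r" % (reason,)


def fetch(url, timeout=10.0):
    """Return the page body as bytes, or print why the fetch failed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except OSError as exc:  # URLError and HTTPError are OSError subclasses
        print(explain_failure(exc))
        return None
```

Calling fetch("http://www.dictionary.com/") would then either return the
page or say which kind of failure occurred.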

A few years back I needed to get weather information based on zip
code.  I used to use weather.com, but they fixed the hole.  They made every
port 80 request forward to another address, which in turn forwarded you back
to the original one.  So if you went to www.weather.com, it would forward
you to www2.weather.com, then back to www.weather.com.  The index page at
www.weather.com would then check the referrer to see if it came from
www2.weather.com.  If the referrer was correct, you got the page
content.  If it was wrong, there was no page.

Similarly, www2.weather.com would check to see that the referrer was
www.weather.com... so there was no way around it.
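
The referrer check described above could look roughly like this on the
server side.  This is a guess at the general idea, not weather.com's actual
code; the function name referrer_allowed is hypothetical, and the sketch
assumes Python 3.

```python
from urllib.parse import urlsplit


def referrer_allowed(referer, expected_host):
    """Return True if the Referer header value names the expected host."""
    if not referer:
        # No Referer header at all: treat as a direct (scripted) request.
        return False
    host = urlsplit(referer).hostname
    return host is not None and host.lower() == expected_host.lower()
```

The index page would serve its content only when
referrer_allowed(header_value, "www2.weather.com") is true.  Since the
Referer header is supplied by the client, a check like this deters casual
scripts more than it stops a determined one.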

I use another method to protect my programs... but it ultimately does the
same thing... it stops people from using my programs.

As has been mentioned, a good starting place is telnet.  I tried
telnetting when I first read this post and got a connection refused, but now
I can get through - weird.  I also tried a source dump from lynx... which 10
minutes ago did not work, but now it does.

# telnet www.dictionary.com 80

# lynx -source -preparse -dump http://www.dictionary.com

Both of these now show the source of the index page.  And the Python recipe
even works now... go figure.
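
The same telnet-style check can be scripted.  A minimal sketch, assuming
Python 3; the function name probe and the three status strings are my own
invention:

```python
import socket


def probe(host, port, timeout=5.0):
    """Try a plain TCP connection and report 'open', 'refused', or 'unreachable'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Same condition as the 'Connection refused' in the traceback.
        return "refused"
    except OSError:
        # Covers timeouts, DNS failures, and unreachable networks.
        return "unreachable"
```

Running probe("www.dictionary.com", 80) a few minutes apart would show
whether the refusal comes and goes, as it seemed to here.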




