[ python-Bugs-1601399 ] urllib2 does not close sockets properly

SourceForge.net noreply at sourceforge.net
Wed Nov 22 22:04:15 CET 2006


Bugs item #1601399, was opened at 2006-11-23 08:04
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1601399&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Brendan Jurd (direvus)
Assigned to: Nobody/Anonymous (nobody)
Summary: urllib2 does not close sockets properly

Initial Comment:
Python 2.5 (release25-maint, Oct 29 2006, 12:44:11)
[GCC 4.1.2 20061026 (prerelease) (Debian 4.1.1-18)] on linux2

I first noticed this when a program of mine (which makes a brief HTTPS connection every 20 seconds) started having some weird crashes.  It turned out that the process had a massive number of file descriptors open.  I did some debugging, and it became clear that the program was opening two file descriptors for every HTTPS connection it made with urllib2, and it wasn't closing them, even though I was reading all data from the response objects and then explicitly calling close() on them.

I found I could easily reproduce the behaviour using the interactive console.  Try this while keeping an eye on the file descriptors held open by the python process:

To begin with, the process will have the usual FDs 0, 1 and 2 open for std(in|out|err), plus one other.
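If you want to keep an eye on the descriptors from inside the interpreter itself, rather than with lsof or by listing /proc/<pid>/fd from another shell, a small helper like the one below will do; this is only a sketch and assumes a Linux /proc filesystem:

>>> import os
>>> def open_fds():
...     # Descriptors currently held by this process (Linux /proc only).
...     # Reading the directory briefly adds one transient entry of its own.
...     return sorted(int(fd) for fd in os.listdir("/proc/self/fd"))
...

Calling open_fds() before and after each step shows exactly which descriptors appear and whether they ever go away.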

>>> import urllib2
>>> f = urllib2.urlopen("http://www.google.com")

At this point the process has opened two more sockets.

>>> f.read()
[... HTML ensues ...]
>>> f.close()

The two extra sockets are still open.

>>> del f

The two extra sockets are STILL open.

>>> f = urllib2.urlopen("http://www.python.org")
>>> f.read()
[...]
>>> f.close()

And now we have a total of four abandoned sockets open.

It's not until you terminate the process entirely, or the OS (eventually) closes the sockets on idle timeout, that they are closed.
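To watch the leak grow without an external tool, you can loop the same sequence and count descriptors on each pass; again, this is only a sketch and is Linux-specific because of the /proc path:

>>> import os, urllib2
>>> def fd_count():
...     # Number of descriptors this process currently holds (Linux /proc).
...     return len(os.listdir("/proc/self/fd"))
...
>>> for i in range(5):
...     f = urllib2.urlopen("http://www.python.org")
...     data = f.read()
...     f.close()
...     print i, fd_count()
...

On the affected build, the printed count climbs by two on every iteration and never comes back down, even though read() and close() are called each time.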

Note that if you do the same thing with httplib, the sockets are properly closed:

>>> import httplib
>>> c = httplib.HTTPConnection("www.google.com", 80)
>>> c.connect()

A socket has been opened.

>>> c.putrequest("GET", "/")
>>> c.endheaders()
>>> r = c.getresponse()
>>> r.read()
[...]
>>> r.close()

And the socket has been closed.
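For comparison, here is the same httplib sequence wrapped in a plain function with the cleanup made explicit: the connection itself is closed in a finally clause, so the socket's lifetime does not depend on the response object alone.  This is just an illustrative sketch using the same host as above:

>>> import httplib
>>> def fetch(host, path="/"):
...     # Open one connection, issue a GET, and make sure both the
...     # response and the connection are closed before returning.
...     c = httplib.HTTPConnection(host, 80)
...     try:
...         c.connect()
...         c.putrequest("GET", path)
...         c.endheaders()
...         r = c.getresponse()
...         data = r.read()
...         r.close()
...         return data
...     finally:
...         c.close()
...
>>> body = fetch("www.google.com")

After fetch() returns, no extra descriptors remain open.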

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1601399&group_id=5470

