[Web-SIG] Threading and client-side support
ianb at colorstudy.com
Mon Oct 27 13:47:49 EST 2003
On Monday, October 27, 2003, at 08:45 AM, John J Lee wrote:
>> urlopen_lock = threading.Lock()
>> def urlopen(url, data=None):
> OK, thanks, that's basically as my vague understanding had it, but I
> had the impression that there were all kinds of flavours of thread-safety,
> guaranteeing various subtly different things? I guess I've got some
> reading to do...
Different parts of the system may be threadsafe, while others are not.
For instance, DB-API has threadsafety "levels", which are just a way of
indicating which parts of the system are threadsafe: level 0 means
nothing is threadsafe, level 1 means connections aren't threadsafe so
you have to use one connection for each thread, and higher levels mean
that objects deeper in the system become threadsafe.
The analog of level 0 is bad, because you have to serialize all
operations for the entire process. Level 1 isn't so bad (it's what
most DB-API drivers have); it just means you have to create a new
handler/connection/whatever object for each thread (but you have to be
very explicit about that requirement). Or if object creation is
expensive you have to do pooling, which is an incentive to make object
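As a rough illustration of the level 1 arrangement described above, here is a minimal sketch that keeps one connection per thread. It assumes Python's standard sqlite3 module as a stand-in DB-API driver and uses threading.local for the per-thread storage; get_connection and worker are hypothetical names, not part of any library.

```python
import sqlite3
import threading

# DB-API modules advertise their level in the module global `threadsafety`.
print("sqlite3.threadsafety =", sqlite3.threadsafety)

# Attributes set on this object are private to the thread that set them.
_local = threading.local()

def get_connection():
    """Return this thread's own connection, creating it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        # Each ":memory:" connection is a separate, private database.
        conn = sqlite3.connect(":memory:")
        _local.conn = conn
    return conn

def worker(results, index):
    conn = get_connection()
    conn.execute("CREATE TABLE t (n INTEGER)")
    conn.execute("INSERT INTO t VALUES (?)", (index,))
    # A second call in the same thread hands back the same connection,
    # so the row inserted above is visible here.
    results[index] = get_connection().execute("SELECT n FROM t").fetchall()
    conn.close()

results = [None] * 3
threads = [threading.Thread(target=worker, args=(results, i)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```

Each thread sees only the row it inserted. If the threads were sharing one connection, the second CREATE TABLE would fail instead.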
> Some thinking out loud in case anybody cares to help clear up my
> Hmm, urllib2 doesn't do what your example does, but I suppose
> OpenerDirectors don't currently have any state that could get lost in a
> race condition in that particular case. That would change with cookie
I'm not sure about urllib2 in particular, but anything you initialize
at the module level doesn't have to be protected. So in ClientCookie,
if you didn't lazily create the opener, it wouldn't be a problem. Or,
if it's no big deal to create the object twice, then it's not a
problem -- unnecessarily creating an extra object because of a very
specific race condition does no harm. But if one of the objects
created got replaced while someone still held a reference to it (so it
wasn't *completely* lost), then that would be a problem (and probably a
very hard-to-debug problem if you
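A minimal sketch of guarding the lazy creation discussed above, so that no thread can end up holding a second, orphaned object. Opener, get_opener, and the lock name are hypothetical stand-ins here, not ClientCookie's or urllib2's actual code.

```python
import threading

class Opener:
    """Hypothetical stand-in for an object that carries state (e.g. cookies)."""
    def __init__(self):
        self.state = []

_opener = None
_opener_lock = threading.Lock()

def get_opener():
    """Lazily create the shared opener; the lock ensures only one is ever made."""
    global _opener
    with _opener_lock:
        if _opener is None:
            _opener = Opener()
        return _opener

# Start several threads at once and record which object each one receives.
barrier = threading.Barrier(8)
seen = []
seen_lock = threading.Lock()

def worker():
    barrier.wait()  # line the threads up to make a race more likely
    opener = get_opener()
    with seen_lock:
        seen.append(opener)

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(all(o is seen[0] for o in seen))
```

Creating the object eagerly at module level instead would avoid the need for the lock altogether, at the cost of always paying for the creation.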
> Am I going to have a hard time spotting all the places where I need
> I can't see any other place where I'd need locks other than in
> I suppose I need to lock all access to all CookieJar methods, so that
> neither reading nor writing state can happen whenever CookieJar state is
> changing? I suppose I'd also need to just label the .cookies attribute
> as non-threadsafe (or get rid of it, or add a __getattr__ to allow
> locking it -- yuck). Can I justify saying that some of this is the
> application's problem? For example, perhaps the .filename attribute of
> CookieJar could mess things up if altered by one thread while another
> thread was reading it in order to open a file? Is it the application's
> own stupid fault if it fails to lock access to that attribute in cases
> where that might happen, or is it CookieJar's problem?
You can't be sure of what concurrency expectations the application has.
But in general reads don't have to be protected, unless someone is
reading multiple things and expecting consistency between those reads.
If it's a problem that you read value A, then someone changes the
related value B in another thread, then you read B and it doesn't fit
with A, then there's a threading issue for a read. Andrew pointed out
a possible example of this with cookies and expiration.
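To make the A/B case above concrete, here is a minimal sketch in which two related values are written and read under the same lock, so a reader never sees a new A paired with an old B. The Pair class and the invariant (b is always twice a) are invented purely for illustration.

```python
import threading

class Pair:
    """Two related values that must stay consistent with each other."""
    def __init__(self):
        self._lock = threading.Lock()
        self._a = 0
        self._b = 0  # invariant for this example: b == 2 * a

    def update(self, a):
        # Both writes happen under the lock, so no reader sees a half-update.
        with self._lock:
            self._a = a
            self._b = 2 * a

    def snapshot(self):
        # Both reads happen under the same lock and are returned together.
        with self._lock:
            return self._a, self._b

pair = Pair()
mismatches = 0

def writer():
    for i in range(10000):
        pair.update(i)

def reader():
    global mismatches
    for _ in range(10000):
        a, b = pair.snapshot()
        if b != 2 * a:
            mismatches += 1

threads = [threading.Thread(target=writer), threading.Thread(target=reader)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("mismatches:", mismatches)
```

A single read of just A or just B would not need the lock; it is only the expectation of consistency between the two reads that does.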
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org