threading issues with statcache
Tim Peters
tim.one at home.com
Sat Jan 27 21:05:29 EST 2001
[posted & mailed]
[Randall Kern]
> Looking at the code for the statcache module from py 1.5.2, it
> looks like it isn't thread safe.
It is if you serialize all calls to it yourself <wink>.
> While writing my own substitute, I realized I am unclear
> on when names are looked up.
Every time they're (dynamically) referenced. No exceptions (e.g., global,
local, builtin are all the same in this respect; call "len(x)" inside a
loop, and "len" is looked up anew on each iteration -- and so is "x", for
that matter).
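A minimal sketch of the per-reference lookup described above. The names greet, rebind and run_twice are hypothetical and have nothing to do with statcache; the point is only that a global rebound between two iterations is seen by the second one.

```python
def greet():
    return "hello"

def rebind():
    # Rebinds the module-level name "greet" to a different function.
    global greet
    greet = lambda: "goodbye"

def run_twice():
    results = []
    for i in range(2):
        # "greet" is looked up in the globals on every iteration,
        # so the rebinding done by rebind() is visible the second time.
        results.append(greet())
        rebind()
    return results

results = run_twice()
```

If names were resolved once per function, both calls would reach the original greet; because they are resolved on each reference, the second call reaches the replacement.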
> In particular, given two functions like these (copied from statcache):
>
> cache = {}
> def stat(path):
>     if cache.has_key(path):
>         return cache[path]
>
>     cache[path] = ret = os.stat(path)
>     return ret
>
> def reset():
>     global cache
>     cache = {}
>
>
> If the symbol 'cache' is looked up _once_ per function,
It is not.
> then these two functions may be used across multiple threads. If
> it is looked up for every reference, then it would be possible
> to call reset() between a TRUE has_key() and the return in stat(),
> which would result in a KeyError.
Yup! Good eye. Looks like statcache is full of insecurities like that.
Some of them are easy to fix; e.g.,
    def stat(path):
        ret = cache.get(path, None)
        if ret is None:
            cache[path] = ret = os.stat(path)
        return ret
I'll try to make time to fix this stuff for 2.1a2 (btw, 1.5.2 is ancient --
move up to 2.0! it's good practice for upgrading to 2.1, in which statcache
will be thread-safe <wink>).
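The window between the membership test and the indexing can be reproduced without real threads. The sketch below is an illustration only, not statcache's code: InterleavingDict is a hypothetical dict subclass that calls reset() right after a membership test, standing in for another thread that runs at the worst possible moment. It uses "in" rather than has_key so that it runs on current Pythons.

```python
import os

class InterleavingDict(dict):
    """Hypothetical dict that simulates another thread calling reset()
    immediately after a membership test has been answered."""
    def __contains__(self, key):
        found = dict.__contains__(self, key)
        reset()  # the "other thread" sneaks in here
        return found

cache = InterleavingDict()

def reset():
    global cache
    cache = InterleavingDict()

def stat_check_then_act(path):
    if path in cache:        # first lookup of the name "cache"
        return cache[path]   # second lookup: now the new, empty dict
    cache[path] = ret = os.stat(path)
    return ret

def stat_single_lookup(path):
    # One access to the cache contents; nothing is assumed to still
    # be there by a later statement.
    ret = cache.get(path, None)
    if ret is None:
        cache[path] = ret = os.stat(path)
    return ret

cache["."] = os.stat(".")
try:
    stat_check_then_act(".")
    raised = False
except KeyError:
    raised = True

safe_result = stat_single_lookup(".")
```

Here raised ends up true for the check-then-act version, while the get-based version simply falls through to os.stat when the entry has vanished.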
> When are global variables bound?
Sorry, don't think I understand this question. Any vrbl, whether global or
local, is bound when and only when a binding stmt is executed in which the
vrbl appears as a binding target. If it would make your life easier,
consider changing the body of reset to:
    def reset():
        cache.clear()
Then nothing in statcache will ever rebind the name "cache" -- but you'd
still be vulnerable to all the same race conditions (i.e., the rebinding is
not the cause of the problems, it's that *content* may disappear between the
time one stmt thinks it exists and a later stmt *acts* on that belief).
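One way to serialize the calls yourself is to guard every access with a single lock. This is a minimal sketch under that assumption and not necessarily how statcache itself was or would be fixed; the name _lock is made up for the example.

```python
import os
import threading

_lock = threading.Lock()  # hypothetical module-level lock
cache = {}

def stat(path):
    # The whole check-and-fill sequence runs under the lock, so no
    # other thread can clear the cache between the steps.
    with _lock:
        ret = cache.get(path)
        if ret is None:
            cache[path] = ret = os.stat(path)
        return ret

def reset():
    with _lock:
        cache.clear()

# Exercise both functions from several threads at once.
errors = []

def worker():
    try:
        for _ in range(200):
            stat(".")
            reset()
    except Exception as exc:
        errors.append(exc)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Holding the lock across os.stat keeps the code simple but makes other threads wait during the system call; a finer-grained scheme is possible at the cost of more care.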
unraveling-the-thread-ly y'rs - tim