thread-safe generator in standard library
In comp.lang.python, Andrae Muys started a discussion about how to use generators in a thread-safe manner.

Newsgroup subjects:
    Suggested generator to add to threading module
    Accessing a shared generator from multiple threads

There were several suggestions (some buggy) about how to implement it, and some questions about whether it was worth doing.

Aahz asked why you would want to do it.

The most obvious use case is to generate unique keys (as lisp gensym). The standard library currently has several ways to create temporary filenames. Unfortunately, they aren't all thread-safe, they often enforce the "temporary" aspect, they eventually run into hashing collisions, there is no good way to include ordering information, etc.

The fact that these are in the standard library suggests that it is a common use case. The fact that there are several different versions, each with its own drawbacks, suggests that the problem is hard enough to justify putting a good solution in the library.

At that point, Aahz suggested moving the discussion here, so ... I am doing so.

Specific questions:

(1) Which library?

The tempfile libraries are an immediate use, but they are not the only use. The problem only comes up with threads, but it isn't an integral part of threading. itertools might be appropriate, but I'm not sure people will look to "tools" when dealing with "threads". It is also a bit different from the other tools. It could be its own module, but that adds more overhead. It could be part of the queue module, since a queue interface provides one of the solutions. (Override _get() to use iter.next() while already inside the threadlock. But what should be done about "put" and "qsize"? Should the user be able to check "empty()" without consuming a value or getting a StopIteration?)

(2) Are there requirements on the implementation beyond "it works"?

Does it need to be fast? Use or avoid certain extensions?

(3) What API?

Should it provide a default iterator, or require that one be passed in? If one is passed in, can/should this module assume that it is the only direct consumer, or does it need to worry about someone else calling .next() on the original generator?

-jJ
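For concreteness, the "call next() while holding a lock" idea mentioned under question (1) might look roughly like the sketch below. It is not any of the implementations posted to the newsgroup: the class name LockedIterator and the helpers squares, drain and demo are hypothetical, and the code uses the modern Python 3 spelling (next(it) and __next__) rather than the .next() method used in this thread. It also assumes the answer to question (3) is that the wrapper is the only direct consumer of the underlying generator.

```python
import threading


class LockedIterator:
    """Hypothetical wrapper: serialise next() calls on a shared iterator."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread at a time may advance the underlying iterator.
        # StopIteration propagates normally and the lock is still released.
        with self._lock:
            return next(self._it)


def squares():
    """An endless generator used only for the demonstration."""
    n = 0
    while True:
        yield n * n
        n += 1


def drain(shared, count, out):
    """Pull `count` values from the shared iterator into a per-thread list."""
    for _ in range(count):
        out.append(next(shared))


def demo(workers=4, per_worker=1000):
    """Run several consumers against one generator; return all values, sorted."""
    shared = LockedIterator(squares())
    results = [[] for _ in range(workers)]
    threads = [
        threading.Thread(target=drain, args=(shared, per_worker, results[i]))
        for i in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(value for chunk in results for value in chunk)
```

The wrapper gives no protection if some other code keeps a reference to the original generator and advances it directly, which is exactly the concern raised in question (3).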
[Jewett, Jim J]
... The most obvious use case is to generate unique keys (as lisp gensym).
Just noting a practical hack that's often sufficient:

    import sys
    genunique = iter(xrange(sys.maxint)).next

Then each call to genunique() delivers "the next" (short) integer, and it inherits thread safety from the global interpreter lock. A similar effect can be gotten via

    import itertools
    genunique = itertools.count().next

and that also inherits thread safety from the GIL. A difference is that the xrange spelling stops when it reaches sys.maxint, but the .count spelling silently wraps around to -sys.maxint-1 then (undetected overflow at the C level).
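For readers on Python 3, where xrange, sys.maxint and the .next() method no longer exist, a rough equivalent of the itertools spelling is sketched below. The helper names make_genunique and collect_keys are hypothetical and exist only for the demonstration. The sketch relies on the same assumption as the hack above, namely that CPython's global interpreter lock makes each call to the C-implemented counter atomic; it does not assume that guarantee on other implementations or on builds without the GIL. In Python 3 the overflow difference described above does not arise, because integers have arbitrary precision.

```python
import itertools
import threading


def make_genunique():
    """Return a callable that yields 0, 1, 2, ... on successive calls."""
    # Bound method of a C-level counter; assumed atomic under CPython's GIL.
    return itertools.count().__next__


def collect_keys(genunique, workers=4, per_worker=500):
    """Call genunique() from several threads and return every key obtained."""
    keys = []

    def worker():
        for _ in range(per_worker):
            keys.append(genunique())

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return keys
```

If every call really does hand out a distinct integer, the keys collected from all threads contain no duplicates and no gaps.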
participants (2)
- Jewett, Jim J
- Tim Peters