[Python-Dev] A few lessons from the tempfile.py rewrite

Zack Weinberg zack@codesourcery.com
Fri, 16 Aug 2002 15:30:40 -0700


While doing the tempfile.py rewrite I discovered some places where
improvements could be made to the rest of the standard library.  I'd
like to discuss these here.

1) Dummy threads module.

Currently, a library module that wishes to be thread-safe but still
work on platforms where threads are not implemented has to jump
through hoops.  In tempfile.py we have

| try:
|     import thread as _thread
|     _allocate_lock = _thread.allocate_lock
| except (ImportError, AttributeError):
|     class _allocate_lock:
|         def acquire(self):
|             pass
|         release = acquire

It would be nice if the thread and threading modules existed on all
platforms, providing these sorts of dummy locks on the platforms that
don't actually implement threads.  I notice that Queue.py uses 'import
thread' unconditionally -- perhaps this is already the case?  I can't
find any evidence of it.
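
As a rough illustration of what such a dummy module might provide, here
is a minimal sketch.  The names DummyLock and its _locked attribute are
hypothetical; only allocate_lock, acquire, release and locked mirror the
real thread module's lock interface.  This is not an existing library
API, just one way the idea could look.

```python
class DummyLock:
    """Hypothetical stand-in for a lock on a platform without threads."""

    def __init__(self):
        self._locked = False

    def acquire(self, waitflag=1):
        # With only one thread there is never any contention, so the
        # acquire always succeeds immediately.  (A real lock would block
        # if it were already held; this sketch does not model that.)
        self._locked = True
        return True

    def release(self):
        self._locked = False

    def locked(self):
        return self._locked


def allocate_lock():
    """Mirror the name thread.allocate_lock, but return a dummy lock."""
    return DummyLock()
```

A caller could then write "lock = allocate_lock()" and use
acquire/release as usual, without a try/except around the import.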

2) pthread_once equivalent.

pthread_once is a handy function in the C pthreads library which
can be used to guarantee that some data object is initialized exactly
once, and no thread sees it in a partially initialized state.  I had
to implement a fake version in tempfile.py.

| _once_lock = _allocate_lock()
| 
| def _once(var, initializer):
|     """Wrapper to execute an initialization operation just once,
|     even if multiple threads reach the same point at the same time.
| 
|     var is the name (as a string) of the variable to be entered into
|     the current global namespace.
| 
|     initializer is a callable which will return the appropriate initial
|     value for variable.  It will be called only if variable is not
|     present in the global namespace, or its current value is None.
| 
|     Do not call _once from inside an initializer routine; it will deadlock.
|     """
| 
|     vars = globals()
|     # Check first outside the lock.
|     if vars.get(var) is not None:
|         return
|     _once_lock.acquire()
|     try:
|         # Check again inside the lock.
|         if vars.get(var) is not None:
|             return
|         vars[var] = initializer()
|     finally:
|         _once_lock.release()

I call it fake for three reasons.  First, it should be using
threading.RLock so that recursive calls do not deadlock.  That's a
trivial fix (this sort of high level API probably belongs in
threading.py anyway).  Second, it uses globals(), which means that all
symbols it initializes live in the namespace of its own module, when
what's really wanted is the caller's module.  Third, and most
important, I'm certain that this interface is Not The Python Way To Do It.
Unfortunately, I've not been able to figure out what the Python Way To
Do It is, for this problem.
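
For the sake of discussion, here is one possible shape that addresses
the first two complaints.  It is only a sketch: the class name Once and
its get() method are hypothetical, and I make no claim that this is the
Python Way.  It keeps the value on an object instead of in globals(),
and shares a single threading.RLock so that an initializer may use
another Once object without deadlocking.

```python
import threading


class Once:
    """Hypothetical helper: run an initializer at most once."""

    _UNSET = object()           # sentinel, so None is a legal value
    _lock = threading.RLock()   # shared and re-entrant, as suggested above

    def __init__(self, initializer):
        self._initializer = initializer
        self._value = Once._UNSET

    def get(self):
        # Check first outside the lock.
        if self._value is not Once._UNSET:
            return self._value
        with Once._lock:
            # Check again inside the lock.
            if self._value is Once._UNSET:
                self._value = self._initializer()
            return self._value
```

Usage would be along the lines of "tmp = Once(find_dir)" at module
level (find_dir being a placeholder for some initializer), then
"tmp.get()" wherever the value is needed.  An initializer that calls
get() on its own Once object is still an error; it would recurse rather
than deadlock.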

3) test_support.TestSkipped and unittest.py

Simple - you can't use TestSkipped in a unittest.py-based test suite.
This is a missing feature of unittest, which has no notion of skipping
a given test.  Any exception raised from inside one of its test
routines is taken to indicate a failure.

I think the right fix here is to add a skip() method to
unittest.TestCase which works with both a bare unittest.py-based test
framework and Python's own test_support.py.
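
A sketch of how the proposed method might look from the test writer's
side.  SkippableTestCase, ExampleTest and feature_available are
hypothetical names.  The sketch leans on a unittest.SkipTest exception,
which later versions of unittest provide but the unittest.py discussed
here does not; cooperation with test_support.TestSkipped is left out.

```python
import unittest


class SkippableTestCase(unittest.TestCase):
    """Hypothetical base class adding the proposed skip() method."""

    def skip(self, reason):
        # Assumption: unittest.SkipTest exists and the framework records
        # it as a skip rather than as a failure or an error.
        raise unittest.SkipTest(reason)


class ExampleTest(SkippableTestCase):
    def test_needs_feature(self):
        feature_available = False  # stand-in for a real platform check
        if not feature_available:
            self.skip("feature not available on this platform")
        self.fail("not reached when the test is skipped")
```

The point of the design is that the test routine itself decides to skip,
and the framework reports that outcome separately from a failure.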

Thoughts?

zw