[Python-Dev] Using itertools in modules that are part of the build chain (Re: [Python-checkins] r76264 - python/branches/py3k/Lib/tokenize.py)

Nick Coghlan ncoghlan at gmail.com
Sun Nov 15 04:06:48 CET 2009


benjamin.peterson wrote:
> Modified: python/branches/py3k/Lib/tokenize.py
> ==============================================================================
> --- python/branches/py3k/Lib/tokenize.py	(original)
> +++ python/branches/py3k/Lib/tokenize.py	Sat Nov 14 17:27:26 2009
> @@ -377,17 +377,12 @@
>      The first token sequence will always be an ENCODING token
>      which tells you which encoding was used to decode the bytes stream.
>      """
> +    # This import is here to avoid problems when the itertools module is not
> +    # built yet and tokenize is imported.
> +    from itertools import chain

This is probably a bad idea - calling tokenize.tokenize() from a thread
started as a side effect of importing a module will now deadlock on the
import lock if the module import waits for that thread to finish.

We tell people not to do that (starting threads and then waiting on them
as part of module import) for exactly this reason. It is also why we
avoid embedding import statements inside functions in the standard
library, and why we encourage third party developers to avoid them as
well.

This does constrain where we can use itertools - if we want carte
blanche to use it anywhere in the standard library, even those parts
that are imported as part of the build chain, we'll need to bite the
bullet and make it a builtin module rather than a separately built
extension module.
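
Short of making itertools builtin, one option is to keep the import at
module level and fall back to a pure Python stand-in when the extension
has not been built yet. The following is only a rough sketch of that
idea, not what tokenize.py actually does: the fallback generator and the
prepend_consumed() helper are hypothetical names made up for
illustration.

```python
# Module-level import, so no thread ever needs the import lock at call time.
try:
    from itertools import chain
except ImportError:
    # Assumption: itertools is not built yet, so use a minimal stand-in.
    def chain(*iterables):
        for iterable in iterables:
            for item in iterable:
                yield item


def prepend_consumed(consumed, readline):
    """Return a readline-like callable that yields the already consumed
    lines first and then continues with readline until it returns b""."""
    rl_iter = iter(readline, b"")
    return chain(consumed, rl_iter).__next__
```

The callable returned by prepend_consumed() raises StopIteration once
both sources are exhausted, so callers have to be prepared for that.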

Cheers,
Nick.

P.S. The problem is easy to demonstrate on the current Py3k branch:

1. Put this in a module file in your py3k directory (e.g. "deadlock.py"):
-----------
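import threading
import tokenize

# tokenize.tokenize() expects a readline callable that returns bytes
f = open(__file__, 'rb')

def _deadlocks():
    # The function-level "from itertools import chain" inside tokenize()
    # blocks here, waiting for the import lock held by the main thread
    tokenize.tokenize(f.readline)

t = threading.Thread(target=_deadlocks)
t.start()
# The main thread still holds the import lock while it waits: deadlock
t.join()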
-----------

2. Then run: ./python -c "import deadlock"

It will, as advertised, deadlock and you'll need to use Ctrl-Brk or kill
-9 to get rid of it. (Note that preventing this kind of thing is one of
the major reasons why direct execution and even the -m switch *don't*
hold the import lock while running the __main__ module.)
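
For contrast, here is a sketch of the pattern we recommend instead: the
module only defines things at import time, and the thread is started and
joined from a function that callers invoke after the import has
finished. The count_tokens() helper is a hypothetical name used purely
for illustration.

```python
import io
import threading
import tokenize


def count_tokens(source):
    """Tokenize the bytes in source on a worker thread and return the
    number of tokens produced."""
    result = []

    def worker():
        readline = io.BytesIO(source).readline
        result.append(sum(1 for _ in tokenize.tokenize(readline)))

    t = threading.Thread(target=worker)
    t.start()
    # Safe to wait here: this only runs when a caller invokes the
    # function, not as a side effect of importing the module.
    t.join()
    return result[0]


if __name__ == "__main__":
    print(count_tokens(b"x = 1\n"))
```

Importing a module written this way never waits on a thread, so whatever
the worker imports cannot end up blocked behind the importing thread.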

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------

