[New-bugs-announce] [issue39430] tarfile.open(mode="r") race condition when importing lzma

Maciej Gol report at bugs.python.org
Thu Jan 23 06:15:38 EST 2020

New submission from Maciej Gol <maciej.gol at curo.ch>:

Hey guys,

We have a component that archives and unarchives multiple files in separate threads that started to
misbehave recently.

We have noticed a bunch of `AttributeError: module 'lzma' has no attribute 'LZMAFile'` errors, which are
unexpected because our python is not compiled with LZMA support.

What is unfortunate, is that given the traceback:

    Traceback (most recent call last):
      File "test.py", line 18, in <module>
        list(pool.map(test_lzma, range(100)))
      File "/opt/lang/python37/lib/python3.7/concurrent/futures/_base.py", line 598, in result_iterator
        yield fs.pop().result()
      File "/opt/lang/python37/lib/python3.7/concurrent/futures/_base.py", line 428, in result
        return self.__get_result()
      File "/opt/lang/python37/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
        raise self._exception
      File "/opt/lang/python37/lib/python3.7/concurrent/futures/thread.py", line 57, in run
        result = self.fn(*self.args, **self.kwargs)
      File "test.py", line 14, in test_lzma
        tarfile.open(fileobj=buf, mode="r")
      File "/opt/lang/python37/lib/python3.7/tarfile.py", line 1573, in open
        return func(name, "r", fileobj, **kwargs)
      File "/opt/lang/python37/lib/python3.7/tarfile.py", line 1699, in xzopen
        fileobj = lzma.LZMAFile(fileobj or name, mode, preset=preset)
    AttributeError: module 'lzma' has no attribute 'LZMAFile'

the last line of the traceback is right AFTER this block (tarfile.py:1694):

        import lzma
    except ImportError:
        raise CompressionError("lzma module is not available")

Importing lzma in ipython fails properly:

    In [2]: import lzma                                                                                                                               
    ModuleNotFoundError                       Traceback (most recent call last)
    <ipython-input-2-c0255b54beb9> in <module>
    ----> 1 import lzma

    /opt/lang/python37/lib/python3.7/lzma.py in <module>
        25 import io
        26 import os
    ---> 27 from _lzma import *
        28 from _lzma import _encode_filter_properties, _decode_filter_properties
        29 import _compression

    ModuleNotFoundError: No module named '_lzma'

When trying to debug the problem, we have noticed it's not deterministic. In order to reproduce it,
we have created a test python that repeatedly writes an archive to BytesIO and then reads from it.
Using it with 5 threads and 100 calls, gives very good chances of reproducing the issue. For us it
was almost every time.

Race condition occurs both on Python 3.7.3 and 3.7.6.
Test script used to reproduce it attached.

I know that the test script writes uncompressed archives and during opening tries to guess the compression.
But I guess this is a legitimate scenario and should not matter in this case.

files: test.py
messages: 360551
nosy: Maciej Gol
priority: normal
severity: normal
status: open
title: tarfile.open(mode="r") race condition when importing lzma
type: crash
versions: Python 3.7
Added file: https://bugs.python.org/file48860/test.py

Python tracker <report at bugs.python.org>

More information about the New-bugs-announce mailing list