ctypes, memory mapped files and context manager
Hans-Peter Jansen
hpj at urpla.net
Tue Dec 27 08:37:45 EST 2016
Hi,
I'm using $subject's combination successfully in a project for
creating/iterating over huge binary files (> 5GB) with impressive performance,
while resource usage stays pretty low, all with plain Python3 code. Nice!
Environment: (Python 3.4.5, Linux 4.8.14, openSUSE/x86_64, NFS4 and XFS
filesystems)
The idea is: map a ctypes structure onto the file at a certain offset, act on
the structure, and release the mapping. The latter is necessary for keeping
the mmap file properly resizable and closable (due to the nature of mmaps and
Python's posix implementation thereof). Hence, a context manager serves us
well (in theory).
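To make the idea concrete, here is a minimal sketch of the map/act/release cycle without any context manager. It is not the project code: the Header structure, its field names, the offset and the helper names are made up for illustration, and a temporary file stands in for the real data file.

```python
import ctypes
import mmap
import tempfile


class Header(ctypes.Structure):
    # Hypothetical two-field structure, just for illustration.
    _fields_ = [("identifier", ctypes.c_uint32),
                ("length", ctypes.c_uint32)]


def write_header(mm, offset, identifier, length):
    """Map Header onto mm at offset, fill it in, and drop the mapping."""
    hdr = Header.from_buffer(mm, offset)  # shares memory with the mmap
    hdr.identifier = identifier
    hdr.length = length
    del hdr                               # release the exported buffer


def demo():
    with tempfile.TemporaryFile() as f:
        f.truncate(mmap.PAGESIZE)         # mmap needs a non-empty file
        mm = mmap.mmap(f.fileno(), 0)
        try:
            write_header(mm, 16, 0xCAFE, 42)
            # Read back a copy; from_buffer_copy() does not hold on to mm.
            raw = mm[16:16 + ctypes.sizeof(Header)]
            return Header.from_buffer_copy(raw)
        finally:
            mm.close()
```

Calling demo() returns a detached copy of the header that was written through the mapping, so the mmap can be closed without complaints.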
Here's a code excerpt:
class cstructmap:
    def __init__(self, cstruct, mm, offset = 0):
        self._cstruct = cstruct
        self._mm = mm
        self._offset = offset
        self._csinst = None

    def __enter__(self):
        # resize the mmap (and backing file), if structure exceeds mmap size
        # mmap size must be aligned to mmap.PAGESIZE
        cssize = ctypes.sizeof(self._cstruct)
        if self._offset + cssize > self._mm.size():
            newsize = align(self._offset + cssize, mmap.PAGESIZE)
            self._mm.resize(newsize)
        self._csinst = self._cstruct.from_buffer(self._mm, self._offset)
        return self._csinst

    def __exit__(self, exc_type, exc_value, exc_traceback):
        # free all references into mmap
        del self._csinst
        self._csinst = None

def work():
    with cstructmap(ItemHeader, self._mm, self._offset) as ih:
        ih.identifier = ItemHeader.Identifier
        ih.length = ItemHeaderSize + datasize
    blktype = ctypes.c_char * datasize
    with cstructmap(blktype, self._mm, self._offset) as blk:
        blk.raw = data

In practice, this results in:
Traceback (most recent call last):
  File "ctypes_mmap_ctx.py", line 146, in <module>
    mf.add_data(data)
  File "ctypes_mmap_ctx.py", line 113, in add_data
    with cstructmap(blktype, self._mm, self._offset) as blk:
  File "ctypes_mmap_ctx.py", line 42, in __enter__
    self._mm.resize(newsize)
BufferError: mmap can't resize with extant buffers exported.

The issue: when creating a mapping via the context manager, we bind it to a
local variable (with .. as var), which keeps existing in the local scope even
after the with block has been left. This keeps a reference to the ctypes
mapped area alive, no matter what we do to destroy it in __exit__. We have to
del the with variable manually.
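The effect can be reproduced in a few lines. This is only a sketch: the mapped class is a stripped-down stand-in for cstructmap (no resizing in __enter__), can_resize is a made-up helper, and it assumes a platform where mmap.resize() is available, such as Linux.

```python
import ctypes
import mmap
import tempfile


class mapped:
    """Stripped-down stand-in for cstructmap (hypothetical, no resizing)."""

    def __init__(self, ctype, mm, offset=0):
        self._ctype, self._mm, self._offset = ctype, mm, offset
        self._inst = None

    def __enter__(self):
        self._inst = self._ctype.from_buffer(self._mm, self._offset)
        return self._inst

    def __exit__(self, exc_type, exc_value, exc_traceback):
        self._inst = None  # drops only the manager's own reference


def can_resize(mm):
    """Return False if resize() is refused because of exported buffers."""
    try:
        mm.resize(mm.size() + mmap.PAGESIZE)
        return True
    except BufferError:
        return False


def demo():
    with tempfile.TemporaryFile() as f:
        f.truncate(mmap.PAGESIZE)
        mm = mmap.mmap(f.fileno(), 0)
        with mapped(ctypes.c_uint32, mm) as value:
            value.value = 1
        # 'value' is still bound here, so the buffer export is still alive.
        before = can_resize(mm)
        del value
        after = can_resize(mm)
        mm.close()
        return before, after
```

On CPython, demo() should report that the resize is refused while the with target is still bound, and allowed once it has been deleted. The release relies on reference counting freeing the ctypes object immediately.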
Now, I want to get rid of the ugly and error-prone del statements.
What is needed is a ctypes operation that actively removes the mapping and
that could be added to the __exit__ part of the context manager.
Full working code example:
https://gist.github.com/frispete/97c27e24a0aae1bcaf1375e2e463d239
The script creates a memory mapped file named "mapfile" in the current
directory. When started without arguments, it copies itself into this file
until 10 * mmap.PAGESIZE growth is reached (or it errors out before that).
If you change NOPROB to True, it will actively destroy the context manager
variables, and should work as advertised.
Any ideas are much appreciated.
Thanks in advance,
Pete