ctypes, memory mapped files and context manager
Peter Otten
__peter__ at web.de
Tue Dec 27 15:39:51 EST 2016
Hans-Peter Jansen wrote:
> Hi,
>
> I'm using $subjects combination successfully in a project for
> creating/iterating over huge binary files (> 5GB) with impressive
> performance, while resource usage keeps pretty low, all with plain Python3
> code. Nice!
>
> Environment: (Python 3.4.5, Linux 4.8.14, openSUSE/x86_64, NFS4 and XFS
> filesystems)
>
> The idea is: map a ctypes structure onto the file at a certain offset, act
> on the structure, and release the mapping. The latter is necessary for
> keeping the mmap file properly resizable and closable (due to the nature
> of mmaps and Python's posix implementation thereof). Hence, a context
> manager serves us well (in theory).
>
> Here's some code excerpt:
>
> class cstructmap:
>     def __init__(self, cstruct, mm, offset = 0):
>         self._cstruct = cstruct
>         self._mm = mm
>         self._offset = offset
>         self._csinst = None
>
>     def __enter__(self):
>         # resize the mmap (and backing file), if structure exceeds mmap size
>         # mmap size must be aligned to mmap.PAGESIZE
>         cssize = ctypes.sizeof(self._cstruct)
>         if self._offset + cssize > self._mm.size():
>             newsize = align(self._offset + cssize, mmap.PAGESIZE)
>             self._mm.resize(newsize)
>         self._csinst = self._cstruct.from_buffer(self._mm, self._offset)
>         return self._csinst

Here you give away a reference to the ctypes.BigEndianStructure. That means
you no longer control the lifetime of self._csinst, which in turn holds a
reference to the underlying mmap or whatever it's called.

There might be a way to release the mmap reference while the wrapper
structure is still alive, but the cleaner way is probably not to give it
away in the first place, and to return a proxy instead:

    return weakref.proxy(self._csinst)

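A minimal sketch of how the proxy idea could look in a complete context manager. The names StructMap and Header are made up for illustration, the resize logic is left out, and an anonymous mmap stands in for the real file. It relies on CPython's reference counting to free the structure as soon as the last strong reference is dropped.

```python
import ctypes
import mmap
import sys
import weakref


class Header(ctypes.Structure):  # hypothetical structure, for illustration only
    _fields_ = [("foo", ctypes.c_uint32)]


class StructMap:
    """Map a ctypes structure onto a buffer and hand out only a weak proxy."""

    def __init__(self, cstruct, mm, offset=0):
        self._cstruct = cstruct
        self._mm = mm
        self._offset = offset
        self._csinst = None

    def __enter__(self):
        self._csinst = self._cstruct.from_buffer(self._mm, self._offset)
        # The caller gets a proxy; the only strong reference stays here.
        return weakref.proxy(self._csinst)

    def __exit__(self, exc_type, exc_value, exc_traceback):
        # Dropping the last strong reference releases the buffer export.
        self._csinst = None


mm = mmap.mmap(-1, mmap.PAGESIZE)  # anonymous mapping keeps the sketch file-free
with StructMap(Header, mm) as hdr:
    hdr.foo = 42

stored = int.from_bytes(mm[:4], sys.byteorder)  # value written through the proxy

try:
    hdr.foo  # the proxy is dead once the block has exited
    proxy_alive = True
except ReferenceError:
    proxy_alive = False

mm.close()  # would raise BufferError if an export were still alive
```

The name hdr stays bound after the block, but it only refers to a dead proxy, so it no longer pins the mmap. Using it after the block raises ReferenceError instead of silently touching the mapping.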
>
>     def __exit__(self, exc_type, exc_value, exc_traceback):
>         # free all references into mmap
>         del self._csinst

The line above is redundant. It removes the attribute from the instance
__dict__ and implicitly decreases the refcount of the referenced object; it
does not physically delete that object. If you remove the del statement, the
line below will still decrease the refcount.

Make sure you understand this to avoid littering your code with cargo cult
del-s ;)

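To illustrate the point with a small sketch (Payload and Holder are hypothetical stand-ins, and the immediate cleanup shown relies on CPython's reference counting): rebinding the attribute drops the reference with or without a preceding del, and neither helps while another name still refers to the object.

```python
import weakref


class Payload:
    """Stand-in for the mapped ctypes instance (hypothetical)."""


class Holder:
    def __init__(self):
        self.item = Payload()


h1, h2 = Holder(), Holder()
r1, r2 = weakref.ref(h1.item), weakref.ref(h2.item)

del h1.item      # unbinds the attribute ...
h1.item = None   # ... and rebinds it
h2.item = None   # rebinding alone drops the reference just as well

both_gone = r1() is None and r2() is None

h3 = Holder()
extra = h3.item          # a second reference, like the `as` target of a with block
r3 = weakref.ref(extra)
del h3.item              # removes one name, not the object
still_alive = r3() is extra
```

Here both_gone and still_alive end up true: the object goes away when its last reference does, regardless of which statement removed it.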
>         self._csinst = None
>
>
> def work():
>     with cstructmap(ItemHeader, self._mm, self._offset) as ih:
>         ih.identifier = ItemHeader.Identifier
>         ih.length = ItemHeaderSize + datasize
>
>     blktype = ctypes.c_char * datasize
>     with cstructmap(blktype, self._mm, self._offset) as blk:
>         blk.raw = data
>
>
> In practice, this results in:
>
> Traceback (most recent call last):
>   File "ctypes_mmap_ctx.py", line 146, in <module>
>     mf.add_data(data)
>   File "ctypes_mmap_ctx.py", line 113, in add_data
>     with cstructmap(blktype, self._mm, self._offset) as blk:
>   File "ctypes_mmap_ctx.py", line 42, in __enter__
>     self._mm.resize(newsize)
> BufferError: mmap can't resize with extant buffers exported.
>
> The issue: when creating a mapping via context manager, we assign a local
> variable (with ..), that keeps existing in the local context, even when the
> manager context was left. This keeps a reference on the ctypes mapped area
> alive, even if we try everything to destroy it in __exit__. We have to del
> the with var manually.
>
> Now, I want to get rid of the ugly and error-prone del statements.
>
> What is needed, is a ctypes operation, that removes the mapping actively,
> and that could be added to the __exit__ part of the context manager.
>
> Full working code example:
> https://gist.github.com/frispete/97c27e24a0aae1bcaf1375e2e463d239
>
> The script creates a memory mapped file in the current directory named
> "mapfile". When started without arguments, it copies itself into this
> file, until 10 * mmap.PAGESIZE growth is reached (or it errored out
> before..).
>
> If you change NOPROB to True, it will actively destruct the context
> manager vars, and should work as advertised.
>
> Any ideas are much appreciated.
You might put some more effort into composing example scripts. Something
like the script below would have saved me some time...

import ctypes
import mmap
from contextlib import contextmanager


class T(ctypes.Structure):
    # ctypes reads the field list from _fields_ (with trailing underscore)
    _fields_ = [("foo", ctypes.c_uint32)]


@contextmanager
def map_struct(m, n):
    m.resize(n * mmap.PAGESIZE)
    yield T.from_buffer(m)


SIZE = mmap.PAGESIZE * 2
f = open("tmp.dat", "w+b")
f.write(b"\0" * SIZE)
f.seek(0)
m = mmap.mmap(f.fileno(), mmap.PAGESIZE)

with map_struct(m, 1) as a:
    a.foo = 1
# a still references the mapped buffer here, so the resize below
# raises BufferError
with map_struct(m, 2) as b:
    b.foo = 2

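
As a rough sketch of what such a script shows, the variant below catches the failure and then applies the manual del workaround described in the question. It assumes a platform where mmap.resize() is available (Linux, as in the environment above) and uses a temporary file instead of tmp.dat; the names resize_failed and final_len are only there to record the outcome.

```python
import ctypes
import mmap
import tempfile
from contextlib import contextmanager


class T(ctypes.Structure):
    _fields_ = [("foo", ctypes.c_uint32)]


@contextmanager
def map_struct(m, n):
    m.resize(n * mmap.PAGESIZE)
    yield T.from_buffer(m)


with tempfile.TemporaryFile() as f:
    f.write(b"\0" * mmap.PAGESIZE)
    f.flush()
    m = mmap.mmap(f.fileno(), mmap.PAGESIZE)

    with map_struct(m, 1) as a:
        a.foo = 1

    # `a` is still bound after the block and keeps the buffer exported.
    try:
        with map_struct(m, 2) as b:
            b.foo = 2
        resize_failed = False
    except BufferError:
        resize_failed = True

    del a  # manual workaround: drop the last reference to the structure
    with map_struct(m, 2) as b:
        b.foo = 2
    del b  # needed again before the mmap can be closed

    final_len = len(m)
    m.close()
```

The first attempt to grow the mapping fails while a is alive; after del a the same call goes through, which is exactly the bookkeeping the question wants to avoid.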
>
> Thanks in advance,
> Pete