ctypes, memory mapped files and context manager

Hans-Peter Jansen hpj at urpla.net
Thu Dec 29 07:18:09 EST 2016


On Thursday, 29 December 2016 09:33:59 Peter Otten wrote:
> Hans-Peter Jansen wrote:
> > On Wednesday, 28 December 2016 16:53:53 Hans-Peter Jansen wrote:
> 
> The minimal example is
> 
> >>> import weakref, ctypes
> >>> T = ctypes.c_ubyte * 3
> >>> t = T()
> >>> bytes(t) == b"\0" * 3
> True
> >>> bytes(weakref.proxy(t)) == b"\0" * 3
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AttributeError: 'c_ubyte_Array_3' object has no attribute '__bytes__'
> 
> That looks like a leaky abstraction. While I found a workaround
> 
> >>> bytes(weakref.proxy(t)[:]) == b"\0" * 3
> True

I have already found a couple of other rough edges while working with the
ctypes module. Evidently, this module is lacking some love.

> to me your whole approach is beginning to look very questionable. You know,
> 
> "If the implementation is hard to explain, it's a bad idea."
> 
> What do you gain from using the mmap/ctypes combo instead of regular file
> operations and the struct module? Your sample code seems to touch every
> single byte of the file once so that there are little to no gains from
> caching. And then your offset is basically a file position managed manually
> instead of implicitly with read, write, and seek.
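
For comparison, the approach suggested in the quoted text (regular file
operations plus the struct module) might look roughly like the sketch below.
The record layout (a 64-bit id followed by a double) and the names RECORD,
bump_values and demo are made up for illustration and have nothing to do with
the real project.

```python
import os
import struct
import tempfile

# Hypothetical fixed-size record: unsigned 64-bit id, 64-bit float value.
RECORD = struct.Struct("<Qd")


def bump_values(path, delta):
    """Add delta to the value field of every record via seek/read/write."""
    count = 0
    with open(path, "r+b") as f:
        while True:
            pos = f.tell()
            chunk = f.read(RECORD.size)
            if len(chunk) < RECORD.size:
                break
            rec_id, value = RECORD.unpack(chunk)
            f.seek(pos)  # step back and overwrite the record just read
            f.write(RECORD.pack(rec_id, value + delta))
            count += 1
    return count


def demo():
    """Write four records to a temporary file, update them, read them back."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            for i in range(4):
                f.write(RECORD.pack(i, float(i)))
        n = bump_values(path, 0.5)
        with open(path, "rb") as f:
            data = f.read()
        return n, [RECORD.unpack_from(data, i * RECORD.size) for i in range(n)]
    finally:
        os.remove(path)
```

Every record here is copied out of the file, unpacked, repacked and written
back, and the file position is handled through read, write and seek.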

Of course, the real code is a bit more complex. The code presented here is
for demonstration purposes only. I'm not allowed to reveal the project's code,
but I can state that this combination allows crawling through huge files
(5-25 GB) and updating parts of them with remarkable performance, without any
further optimization. With the whole I/O management delegated to the kernel,
Python runs at full speed, managing the data just by reference and assignment
operations, all (mostly) in place. Resource usage is impressively low at the
same time. Since the code is meant to run as many parallel instances on a
single machine, this is an important design criterion.
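
To illustrate the general idea (not the real code), here is a minimal sketch
of mapping a file and overlaying a ctypes array on it, so that fields are
updated in place by plain assignment. The Record layout and the names
bump_values_mapped and demo_mapped are hypothetical; the sketch also assumes
CPython, where dropping the last reference releases the buffer immediately.

```python
import ctypes
import mmap
import os
import tempfile


class Record(ctypes.Structure):
    # Hypothetical layout: 64-bit id followed by a double (no padding needed).
    _fields_ = [("rec_id", ctypes.c_uint64), ("value", ctypes.c_double)]


def bump_values_mapped(path, delta):
    """Add delta to every record's value directly in the mapped file."""
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as mm:
        count = len(mm) // ctypes.sizeof(Record)
        # The array shares memory with the mapping; nothing is copied.
        records = (Record * count).from_buffer(mm)
        try:
            for i in range(count):
                records[i].value += delta  # writes straight into the mapping
        finally:
            # The mapping cannot be closed while the array still exports it.
            del records
    return count


def demo_mapped():
    """Write four records to a temporary file, update them, read them back."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            for i in range(4):
                f.write(bytes(Record(i, float(i))))
        n = bump_values_mapped(path, 0.5)
        with open(path, "rb") as f:
            data = f.read()
        copies = (Record * n).from_buffer_copy(data)
        return n, [(r.rec_id, r.value) for r in copies]
    finally:
        os.remove(path)
```

Paging the data in and out is left entirely to the kernel; the Python side
only indexes the array and assigns to fields.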

While I would love to get rid of these dreaded and unpythonic del statements, 
I can accept them for now, until a better approach is found. 
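
The reason for the del statements can be reproduced with a small sketch: as
long as a ctypes object created with from_buffer() is alive, closing the
mapping fails. The function name close_with_live_view is made up, and the
behaviour shown relies on CPython releasing the buffer as soon as the last
reference goes away.

```python
import ctypes
import mmap


def close_with_live_view():
    """Try to close a mapping while a ctypes view on it still exists."""
    mm = mmap.mmap(-1, 16)  # anonymous 16-byte mapping, no file needed
    view = (ctypes.c_ubyte * 16).from_buffer(mm)
    try:
        mm.close()  # refused: the view still exports the mapping's buffer
    except BufferError as exc:
        error = str(exc)
    else:
        error = None
    del view    # drop the only reference to the ctypes object
    mm.close()  # now the mapping can be closed
    return error, mm.closed
```

Calling close_with_live_view() returns the BufferError message from the first
close attempt together with True, since the second close succeeds once the
view is gone.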

I will dig through the ctypes module again when I find time.

Thanks again for your valuable time, Peter. Much appreciated.

I wish you a Happy New Year!

Cheers,
Pete


