[Numpy-discussion] read not byte aligned records

aymeric.rateau at gmail.com
Tue May 5 07:07:42 EDT 2015

To answer Jerome (I hope): the data is sometimes spread over bytes that are shared with other data within the record. 10 bits was only an example; fields can be 24, 2, 8, 7 bits, etc., all combined, with some padding between them. I am not sure I have understood...
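
For reference, the pure 10-bit case Jerome describes (read 5 bytes, get 4 entries) could be sketched roughly as below. This is only an illustration: the function name unpack_10bit is made up, and it assumes the values are packed MSB-first with no padding, which may not match a real file.

```python
import numpy as np

def unpack_10bit(raw):
    """Unpack MSB-first packed 10-bit values (assumed layout): 5 bytes -> 4 values."""
    # One row per group of 5 bytes; widen to uint16 so the shifts do not overflow.
    b = np.frombuffer(raw, dtype=np.uint8).reshape(-1, 5).astype(np.uint16)
    out = np.empty((b.shape[0], 4), dtype=np.uint16)
    out[:, 0] = (b[:, 0] << 2) | (b[:, 1] >> 6)           # 8 bits + top 2 bits
    out[:, 1] = ((b[:, 1] & 0x3F) << 4) | (b[:, 2] >> 4)  # low 6 bits + top 4 bits
    out[:, 2] = ((b[:, 2] & 0x0F) << 6) | (b[:, 3] >> 2)  # low 4 bits + top 6 bits
    out[:, 3] = ((b[:, 3] & 0x03) << 8) | b[:, 4]         # low 2 bits + 8 bits
    return out.ravel()
```

This only works when the whole stream is made of equal 10-bit fields; it does not cover records that mix several field widths.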

To Nathaniel: yes indeed, I could read the records in large multi-byte words and apply the right_shift and bitwise_and functions to extract each channel. I am a bit worried about performance, though.
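
A minimal sketch of that shift-and-mask approach, under assumptions that are purely illustrative: each record is taken to be exactly 8 bytes, read as one little-endian unsigned 64-bit integer, and the field names, bit offsets and widths in FIELDS are hypothetical rather than taken from any real file format.

```python
import numpy as np

# Hypothetical record layout: name -> (bit offset from the LSB, width in bits).
# The bits above offset 43 are treated as padding.
FIELDS = {"a": (0, 10), "b": (10, 24), "c": (34, 2), "d": (36, 7)}

def extract_channels(raw, fields=FIELDS):
    """Extract each bit field from 8-byte records as a separate uint64 array."""
    rec = np.frombuffer(raw, dtype="<u8")  # one 64-bit word per record
    channels = {}
    for name, (offset, width) in fields.items():
        mask = np.uint64((1 << width) - 1)
        shifted = np.right_shift(rec, np.uint64(offset))
        channels[name] = np.bitwise_and(shifted, mask)
    return channels
```

Each channel costs one vectorised shift and one mask over the whole array, so the work scales with the number of fields rather than with a per-bit Python loop. Records wider than 8 bytes would need to be split across several words.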

I am currently using the bitstring module, which does exactly this kind of bit handling. It is implemented in both pure Python and Cython.
I tried the pure Python version, and the performance penalty compared to byte-aligned data is around 2-3x for similar file sizes.
--> I will try bitstring's Cython implementation.
--> I will also try the approach using right_shift and bitwise_and.
The best will win, but at least, from your answers, I am sure that I am not missing any trick or optimisation and that I am heading in the right direction.
Thanks!

On 5 May 2015 08:15, "Nathaniel Smith" <njs at pobox.com> wrote:
> On Mon, May 4, 2015 at 10:21 PM, Jerome Kieffer <Jerome.Kieffer at esrf.fr> wrote:
>> Hi,
>> If you want to play with 10 bits data-blocks, read 5 bytes and work with 4 entries at a time...
> NumPy arrays don't have any support for sub-byte alignment. So if you
> want to handle such data, you either need to write some manual
> packing/unpacking code (using bitshift operators, or perhaps
> np.unpackbits, or whatever), or use another library designed for doing
> this. You may find Cython useful to write the core packing/unpacking,
> since bit-by-bit processing in a for loop is not something that
> CPython is super well suited to.
> Good luck,
> -n
> --
> Nathaniel J. Smith -- http://vorpus.org
