[Numpy-discussion] read not byte aligned records
Gmail
aymeric.rateau at gmail.com
Sun May 10 15:11:29 EDT 2015
For the archive: I tried bitarray instead of bitstring, and parsing the
same file went from 180 ms to 60 ms. The code ended up shorter and
simpler, although bitarray was harder to get started with (documentation).
Performance is still far from fromstring or fromfile, which take about
5 ms for a file of similar size when the records are byte aligned.
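
To make the byte-aligned comparison concrete, here is a minimal sketch of that fast path. It is not taken from my parser: the record layout and the field names ("speed", "gear") are made up for illustration, and it uses np.frombuffer, the current counterpart of binary-mode fromstring.

```python
import numpy as np

# Hypothetical byte-aligned record layout (assumption for illustration):
# one little-endian uint16 followed by one uint8, packed without padding.
record_dtype = np.dtype([("speed", "<u2"), ("gear", "u1")])

# Three records stored back to back, 3 bytes each.
raw = bytes([0x10, 0x00, 0x01,
             0x20, 0x00, 0x02,
             0x30, 0x00, 0x03])

# frombuffer reinterprets the whole buffer in one call; there is no
# per-record Python loop, which is where the speed comes from.
rec = np.frombuffer(raw, dtype=record_dtype).view(np.recarray)

print(rec.speed)  # the uint16 field of every record
print(rec.gear)   # the uint8 field of every record
```

This only works because every field starts and ends on a byte boundary; as soon as a field starts in the middle of a byte, a structured dtype can no longer describe it.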
Aymeric
my code is below:
def readBitarray(self, bita, channelList=None):
    """ reads stream of record bytes using bitarray module, needed
    for not byte aligned data

    Parameters
    ------------
    bita : stream
        stream of bytes
    channelList : list of str, optional
        names of the channels to read, all channels if None

    Returns
    --------
    rec : numpy recarray
        contains a matrix of raw data in a recarray (attributes
        corresponding to channel name)
    """
    from bitarray import bitarray
    from numpy import asarray, recarray
    B = bitarray(endian="little")  # little endian by default
    B.frombytes(bytes(bita))
    # initialise data structure
    if channelList is None:
        channelList = self.channelNames
    formats = []
    for channel in self:
        if channel.name in channelList:
            formats.append(channel.RecordFormat)
    buf = recarray(self.numberOfRecords, formats)
    # read data
    record_bit_size = self.CGrecordLength * 8
    for chan in range(len(self)):
        if self[chan].name in channelList:
            # one bit slice per record for this channel
            temp = [B[self[chan].posBitBeg + record_bit_size * i:
                      self[chan].posBitEnd + record_bit_size * i]
                    for i in range(self.numberOfRecords)]
            nbytes = len(temp[0].tobytes())
            if nbytes != self[chan].nBytes and \
                    self[chan].signalDataType not in (6, 7, 8, 9,
                                                      10, 11, 12):
                # not Ctype byte length: pad the data with zero bits
                # to match numpy requirement
                pad = 8 * (self[chan].nBytes - nbytes) * bitarray([False])
                for i in range(self.numberOfRecords):
                    # pad is a bitarray, so extend (append adds one bit)
                    temp[i].extend(pad)
            temp = [self[chan].CFormat.unpack(temp[i].tobytes())[0]
                    for i in range(self.numberOfRecords)]
            buf[self[chan].name] = asarray(temp)
    return buf
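
The bit-offset arithmetic in the loop above (posBitBeg + record_bit_size * i) can be tried without bitarray. The following is only a sketch under stated assumptions, not the method used in my parser: it swaps bitarray slicing for plain integer shifts and masks, handles unsigned fields only, and the function name and arguments are hypothetical. Bits are numbered from the least significant bit of the first byte of each record, which matches bitarray(endian="little").

```python
def extract_bit_field(data, record_bytes, bit_begin, bit_end, n_records):
    """Return the unsigned value stored in bits [bit_begin, bit_end)
    of each fixed-size record in data.

    Hypothetical helper for illustration; bit 0 is the least
    significant bit of the first byte of a record.
    """
    width = bit_end - bit_begin
    mask = (1 << width) - 1
    values = []
    for i in range(n_records):
        # bytes of record i
        record = data[i * record_bytes:(i + 1) * record_bytes]
        # view the whole record as one little-endian integer
        as_int = int.from_bytes(record, "little")
        # drop the bits below the field, then keep only its width
        values.append((as_int >> bit_begin) & mask)
    return values


# Two 2-byte records, each holding a 10-bit field in bits 4..13.
data = bytes([0x5F, 0x15, 0xF0, 0x3F])
print(extract_bit_field(data, 2, 4, 14, 2))  # [341, 1023]
```

The per-record Python loop is still there, so this is about checking the indexing by hand rather than about speed.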
On 05/05/15 15:39, Benjamin Root wrote:
> I have been very happy with the bitarray package. I don't know if it
> is faster than bitstring, but it is worth a mention. Just watch out
> for any hashing operations on its objects, it doesn't seem to do them
> right (set(), dict(), etc...), but comparison operations work just fine.
>
> Ben Root