Re: [Numpy-discussion] converting list of int16 values to bitmask and back to bitmask and back to list of int32\float values

Hi again,

I realize that my question was not clear enough, so I've refined it into one runnable function (attached below).
My question is basically: is there a way to perform the same operation, but faster, using NumPy (or even just by using Python better)?

Thanks again and sorry for the unclearness.
Nissim.

    import struct

    def Convert():
        Endian = '<I' # Big endian
        ParameterFormat = 'f' # float32
        RawDataList = [17252, 26334, 16141, 58057,17252, 15478, 16144, 43257] # list of int32 registers
        NumOfParametersInRawData = int(len(RawDataList)/2)
        Result = []
        for i in range(NumOfParametersInRawData):
            # pack every 2 registers, take only the first 2 bytes from each one,
            # change their endianness, then unpack them back to the Parameter format
            Result.append((struct.unpack(ParameterFormat,(struct.pack(Endian,RawDataList[(i*2)+1])[0:2] + struct.pack('<I',RawDataList[i*2])[0:2])))[0])

On 08/10/17 09:12, Nissim Derdiger wrote:
> Hi again,
> I realize that my question was not clear enough, so I've refined it into one runnable function (attached below).
> My question is basically: is there a way to perform the same operation, but faster, using NumPy (or even just by using Python better)?
> Thanks again and sorry for the unclearness.
> Nissim.
>
> import struct
>
> def Convert():
>     Endian = '<I' # Big endian

'<' is little endian. Make sure you're getting out the right values!

>     ParameterFormat = 'f' # float32
>     RawDataList = [17252, 26334, 16141, 58057,17252, 15478, 16144, 43257] # list of int32 registers
>     NumOfParametersInRawData = int(len(RawDataList)/2)
>     Result = []
>     for i in range(NumOfParametersInRawData):

Iterating over indices is not very Pythonic, and there's usually a better way. In this case:

    for int1, int2 in zip(RawDataList[::2], RawDataList[1::2]):

>         # pack every 2 registers, take only the first 2 bytes from each one,
>         # change their endianness, then unpack them back to the Parameter format
>         Result.append((struct.unpack(ParameterFormat,(struct.pack(Endian,RawDataList[(i*2)+1])[0:2] + struct.pack('<I',RawDataList[i*2])[0:2])))[0])

You can do this a little more elegantly (and probably faster) with struct by putting it in a list comprehension:

    [struct.unpack('f', struct.pack('<HH', i1 & 0xffff, i2 & 0xffff))[0]
     for i1, i2 in zip(raw_data[::2], raw_data[1::2])]

Numpy can also do it. You can get your array of little-endian shorts with

    le_shorts = np.array(raw_data, dtype='<u2')

and then reinterpret the bytes backing it as float32 with np.frombuffer:

    np.frombuffer(le_shorts.data, dtype='f4')

For small lists like the one in your example, the two approaches are about equally fast. For long ones, numpy is much faster:

In [82]: raw_data
Out[82]: [17252, 26334, 16141, 58057, 17252, 15478, 16144, 43257]

In [83]: raw_data2 = np.random.randint(0, 2**32, size=10**6, dtype='u4') # 1 million random integers

In [84]: %timeit np.frombuffer(np.array(raw_data, dtype='<u2').data, dtype='f4')
6.45 µs ± 60.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [85]: %timeit np.frombuffer(np.array(raw_data2, dtype='<u2').data, dtype='f4')
854 µs ± 37.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [86]: %timeit [struct.unpack('f', struct.pack('<HH', i1 & 0xffff, i2 & 0xffff))[0] for i1, i2 in zip(raw_data[::2], raw_data[1::2])]
4.87 µs ± 17.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [87]: %timeit [struct.unpack('f', struct.pack('<HH', i1 & 0xffff, i2 & 0xffff))[0] for i1, i2 in zip(raw_data2[::2], raw_data2[1::2])]
3.6 s ± 9.78 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

-- Thomas
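A note on register order, following up on "make sure you're getting out the right values": Convert() in the question concatenates the low two bytes of the odd-indexed register first and the even-indexed register second, while the list comprehension packs the even-indexed register first. On the sample data those two orders appear to give different floats (the original order yields values around 228.4 and 0.55). The sketch below is only an illustration under that reading; the function names are hypothetical, and '<f' is used instead of the native 'f' purely to make the little-endian assumption explicit.

```python
import struct

raw_data = [17252, 26334, 16141, 58057, 17252, 15478, 16144, 43257]


def convert_original(registers):
    """Loop from the question: odd-indexed register's bytes come first."""
    result = []
    for i in range(len(registers) // 2):
        payload = (struct.pack('<I', registers[2 * i + 1])[0:2]
                   + struct.pack('<I', registers[2 * i])[0:2])
        result.append(struct.unpack('<f', payload)[0])
    return result


def convert_comprehension(registers):
    """List-comprehension form with the pair swapped so it matches the loop."""
    return [struct.unpack('<f', struct.pack('<HH', lo & 0xffff, hi & 0xffff))[0]
            for hi, lo in zip(registers[::2], registers[1::2])]


def convert_as_posted(registers):
    """Comprehension with the even-indexed register packed first, for comparison."""
    return [struct.unpack('<f', struct.pack('<HH', i1 & 0xffff, i2 & 0xffff))[0]
            for i1, i2 in zip(registers[::2], registers[1::2])]


print(convert_original(raw_data))
print(convert_comprehension(raw_data))
print(convert_as_posted(raw_data))
```

Which order is correct depends on how the device lays out its registers, so it is worth comparing against a known reading before settling on one.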

On 08/10/17 22:50, Thomas Jollans wrote:
*sigh* let's try that again:

In [82]: raw_data
Out[82]: [17252, 26334, 16141, 58057, 17252, 15478, 16144, 43257]

In [83]: raw_data2 = np.random.randint(0, 2**32, size=10**6, dtype='u4')

In [84]: %timeit np.frombuffer(np.array(raw_data, dtype='<u2').data, dtype='f4')
6.45 µs ± 60.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [85]: %timeit np.frombuffer(np.array(raw_data2, dtype='<u2').data, dtype='f4')
854 µs ± 37.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [86]: %timeit [struct.unpack('f', struct.pack('<HH', i1 & 0xffff, i2 & 0xffff))[0] for i1, i2 in zip(raw_data[::2], raw_data[1::2])]
4.87 µs ± 17.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [87]: %timeit [struct.unpack('f', struct.pack('<HH', i1 & 0xffff, i2 & 0xffff))[0] for i1, i2 in zip(raw_data2[::2], raw_data2[1::2])]
3.6 s ± 9.78 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
-- Thomas
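For completeness, here is a possible NumPy variant that keeps the register order used by Convert() in the question (even-indexed register as the high 16 bits, odd-indexed as the low 16 bits). It is a minimal sketch, not the method from the messages above: the function name is hypothetical, it assumes an even number of registers is wanted (a trailing odd register is dropped, as in the original loop), and it combines the words arithmetically so the result should not depend on the machine's byte order.

```python
import struct

import numpy as np

raw_data = [17252, 26334, 16141, 58057, 17252, 15478, 16144, 43257]


def registers_to_float32(registers):
    """Combine register pairs (high word, low word) into float32 values."""
    regs = np.asarray(registers, dtype=np.uint32) & 0xFFFF  # keep the low 16 bits
    n = (len(regs) // 2) * 2          # ignore a trailing unpaired register
    hi = regs[0:n:2]                  # even-indexed registers: upper 16 bits
    lo = regs[1:n:2]                  # odd-indexed registers: lower 16 bits
    combined = (hi << 16) | lo        # one uint32 per register pair
    return combined.view(np.float32)  # reinterpret the bits, no value conversion


values = registers_to_float32(raw_data)
print(values)
```

The shift-and-or step allocates a new array, so it will likely be somewhat slower than a pure np.frombuffer reinterpretation, but it stays vectorised and makes the word order explicit.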
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion