[Numpy-discussion] Question about unaligned access

Mon Jul 6 14:11:47 EDT 2015

On 06.07.2015 18:21, Francesc Alted wrote:
> 2015-07-06 18:04 GMT+02:00 Jaime Fernández del Río <jaime.frio at gmail.com>:
> 
>     On Mon, Jul 6, 2015 at 10:18 AM, Francesc Alted <faltet at gmail.com> wrote:
> 
>         Hi,
> 
>         I have stumbled into this:
> 
>         In [62]: sa = np.fromiter(((i,i) for i in range(1000*1000)),
>         dtype=[('f0', np.int64), ('f1', np.int32)])
> 
>         In [63]: %timeit sa['f0'].sum()
>         100 loops, best of 3: 4.52 ms per loop
> 
>         In [64]: sa = np.fromiter(((i,i) for i in range(1000*1000)),
>         dtype=[('f0', np.int64), ('f1', np.int64)])
> 
>         In [65]: %timeit sa['f0'].sum()
>         1000 loops, best of 3: 896 µs per loop
> 
>         The first structured array is made of 12-byte records, while the
>         second is made of 16-byte records, yet the latter runs about 5x
>         faster.  Also, a structured array made of 8-byte records is the
>         fastest, as expected:
> 
>         In [66]: sa = np.fromiter(((i,) for i in range(1000*1000)),
>         dtype=[('f0', np.int64)])
> 
>         In [67]: %timeit sa['f0'].sum()
>         1000 loops, best of 3: 567 µs per loop
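
A minimal sketch for inspecting the three record layouts timed above. It only reports what NumPy itself says about each `f0` view (record size, stride, and the `aligned` flag); the helper name `describe` is made up for illustration, and the flag value you see may depend on your platform and NumPy version.

```python
import numpy as np

# The three record layouts from the timings above.
packed12 = np.dtype([('f0', np.int64), ('f1', np.int32)])  # 8 + 4 bytes, no padding
packed16 = np.dtype([('f0', np.int64), ('f1', np.int64)])  # 8 + 8 bytes
packed8 = np.dtype([('f0', np.int64)])                     # 8 bytes


def describe(dtype, n=1000):
    """Return (itemsize, stride of the 'f0' view, aligned flag of that view)."""
    sa = np.zeros(n, dtype=dtype)
    f0 = sa['f0']  # a strided view into the records, not a copy
    return dtype.itemsize, f0.strides[0], bool(f0.flags.aligned)


for name, dt in [('12-byte', packed12), ('16-byte', packed16), ('8-byte', packed8)]:
    itemsize, stride, aligned = describe(dt)
    print(f"{name}: itemsize={itemsize} stride={stride} aligned={aligned}")
```

With a 12-byte stride, consecutive int64 values cannot all start on an 8-byte boundary, which is the property the timings seem to be sensitive to.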
> 
>         Now, my laptop has an Ivy Bridge processor (i5-3380M) that should
>         perform quite well on unaligned data:
> 
>         http://lemire.me/blog/archives/2012/05/31/data-alignment-for-speed-myth-or-reality/
> 
>         So, if 4-year-old Intel architectures have no penalty for
>         unaligned access, why am I seeing one in NumPy?  That strikes
>         me as quite strange.
> 
> 
>     I believe that the way numpy is set up, it never does unaligned
>     access, regardless of the platform, in case it gets run on one that
>     would go up in flames if you tried to. So my guess would be that you
>     are seeing chunked copies into a buffer, as opposed to bulk copying
>     or no copying at all, and that would explain your timing
>     differences. But Julian or Sebastian can probably give you a more
>     informed answer.
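
One rough way to probe the buffering guess above is to compare summing the strided field view directly against copying it into a contiguous array first and summing that. This is only a sketch: the helper names are made up, the array is smaller than in the original report, and which variant wins (and by how much) will depend on the NumPy version and the hardware.

```python
import timeit

import numpy as np


def make_records(n):
    """Build an array of 12-byte records like the one in the original report."""
    sa = np.zeros(n, dtype=[('f0', np.int64), ('f1', np.int32)])
    sa['f0'] = np.arange(n)
    sa['f1'] = np.arange(n)
    return sa


def sum_strided(sa):
    # Sum straight from the strided (12-byte step) field view.
    return sa['f0'].sum()


def sum_via_copy(sa):
    # Copy the field into a freshly allocated contiguous array, then sum.
    return np.ascontiguousarray(sa['f0']).sum()


sa = make_records(100_000)
for fn in (sum_strided, sum_via_copy):
    best = min(timeit.repeat(lambda: fn(sa), number=20, repeat=3))
    print(f"{fn.__name__}: {best / 20 * 1e6:.1f} us per call")
```

If the explicit copy costs about the same as the direct sum, that would be consistent with NumPy already copying chunks into a buffer internally, though it does not prove it.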
> 
> 
> Yes, my guess is that you are right.  I suppose that it is possible to
> improve the numpy codebase to accelerate this particular access pattern
> on Intel platforms, but given that structured arrays are not that
> widely used (pandas is probably leading this use case by far, and as far
> as I know, it does not use structured arrays internally in DataFrames),
> maybe it is not worth worrying about this too much.
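
Until something changes in numpy itself, a possible user-side workaround is to ask for a padded record layout with `align=True`, which pads fields the way a C compiler would lay out a struct. The sketch below only shows that the two layouts hold the same data with different record sizes; whether the padded layout recovers the speed of the 16-byte case is an assumption that would need to be timed on your own machine.

```python
import numpy as np

fields = [('f0', np.int64), ('f1', np.int32)]
packed = np.dtype(fields)              # fields stored back to back
padded = np.dtype(fields, align=True)  # padding added, C-struct style


def field_sum(dtype, n=1000):
    """Fill 'f0' with 0..n-1 in an array of the given dtype and sum it."""
    sa = np.zeros(n, dtype=dtype)
    sa['f0'] = np.arange(n)
    return sa['f0'].sum()


print("packed itemsize:", packed.itemsize)
print("padded itemsize:", padded.itemsize, "alignment:", padded.alignment)
```

The trade-off is memory: the padded records are larger, so this buys alignment at the cost of extra bytes per record.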
> 
> Thanks anyway,
> Francesc
>  
> 
> 
>     Jaime
>      
> 
> 
>         Thanks,
>         Francesc
> 
>         -- 
>         Francesc Alted
> 
>         _______________________________________________
>         NumPy-Discussion mailing list
>         NumPy-Discussion at scipy.org
>         http://mail.scipy.org/mailman/listinfo/numpy-discussion
> 
> 
> 
> 
>     -- 
>     (\__/)
>     ( O.o)
>     ( > <) This is Bunny. Copy Bunny into your signature and help him with his
>     plans for world domination.

