<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Nov 21, 2015 at 8:54 PM, G Jones <span dir="ltr">&lt;<a href="mailto:glenn.caltech@gmail.com" target="_blank">glenn.caltech@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div><div><div>Hi,<br></div>Using the latest numpy from anaconda (1.10.1) on Python 2.7, I found that the following code works OK if npackets = 2, but appears to hang if npackets is large (e.g. 2**12):<br><br>-----------<br><br>import numpy as np<br><br>npackets = 2**12<br>dlen = 2048<br>PacketType = np.dtype([('timestamp','float64'),<br>                           ('pkts',np.dtype(('int8',(npackets,dlen)))),<br>                           ('data',np.dtype(('int8',(npackets*dlen,)))),<br>                           ])<br><br>b = np.zeros((1,),dtype=PacketType)<br><br>b['timestamp']  # Should return array([0.0])<br><br>----------------<br><br></div>Specifically, if npackets is large, i.e. 2**12 or 2**16, trying to access b['timestamp'] results in 100% CPU usage while memory consumption grows by hundreds of MB per second. When I interrupt it, the traceback points to _get_all_field_offsets in numpy/core/_internal.pyc.<br></div><div>Since it works for small values of npackets, I suspect that, given enough memory and time, the access to b['timestamp'] would eventually return, so I think the issue is that the algorithm doesn't scale well with record dtypes that span many bytes.<br></div>Looking on GitHub, I can see this code has been in flux recently, but I can't quite tell whether the issue I'm seeing is addressed by the issues being discussed and tackled there.<br></div></div></div></blockquote><div><br></div><div>This should be fixed in 1.10.2. 1.10.2rc1 is up on SourceForge if you want to test it.<br><br></div><div>Chuck<br></div></div></div></div>