[Numpy-discussion] fromrecords yields "ValueError: invalid itemsize in generic type tuple"

Friedrich Romstedt friedrichromstedt at gmail.com
Wed Dec 29 06:28:56 EST 2010

2010/12/7 Rajat Banerjee <rbanerj at fas.harvard.edu>:
> Hi All,
> I have been using Numpy for a while with great success. I left my
> little project for a little while
> (http://web.mit.edu/stardev/cluster/) and now some of my code is
> broken.
> I have some Numpy code to create graphs of activity on a cluster with
> matplotlib. It ran just fine in July / August 2010, but has since
> stopped working. I have updated numpy on my machine, I think.
> In [2]: np.version.version
> Out[2]: '1.5.1'
> My call to np.rec.fromrecords() is throwing this exception:
>  File "/home/rajat/Envs/StarCluster/lib/python2.6/site-packages/numpy/core/records.py",
> line 607, in fromrecords
>    descr = sb.dtype((record, dtype))
> ValueError: invalid itemsize in generic type tuple
> Here is the code with some irrelevant stuff stripped:
>        for line in file:
>            a = [datetime.strptime(parts[0], '%Y-%m-%d %H:%M:%S.%f'),
>                 int(parts[1]), int(parts[2]), int(parts[3]), int(parts[4]),
>                 int(parts[5]), int(parts[6]), float(parts[7])]
>            list.append(a)
>        file.close()
>        names = ['dt', 'hosts', 'running_jobs', 'queued_jobs',\
>                 'slots', 'avg_duration', 'avg_wait', 'avg_load']
>        descriptor = {'names':
> ('dt,hosts,running_jobs,queued_jobs,slots,avg_duration,avg_wait,avg_load'),\
>                      'formats' : ('S20','u','u','u','u','u','u','f')}
>        self.records = np.rec.fromrecords(list,','.join(names)) #used to work
>        #self.records = np.rec.fromrecords(list, dtype=descriptor) #new attempt
> Here is one "line" from the array "list":
>>>> parts (8) = ['2010-12-07 03:09:46.855712', '2', '2', '177', '2', '86', '370', '1.05'].
> Neither of those np.rec.fromrecords() calls works. I've tried both
> separately. They both throw the exact same exception, ValueError:
> invalid itemsize in generic type tuple

Hi Rajat,

It seems to be good that I read all email on the list, but bad that
the queue is so long.

Consider the script attached.  Remarks:

*  Use tuples as rows in the numpy.rec array "raw" argument.  It works
for the first conversion with [] too, but I think more by accident
than by design.  For the second case, which you will need, it does not
work with lists.
*  Always use keyword args to fromrecords().  I believe a) this is
less error-prone and b) there is no specification for positional
arguments, so their order might change (as seems to have happened).
With positional "names", it ceases working.  I don't know what it
thinks you are requesting, but for sure not "names". :-)
*  Don't use the *dtype* in the way you did.  I'm not authoritative
with the *dtype* arg, but at least it doesn't work this way.  Use the
names= and formats= kwargs instead.
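
A minimal sketch of the three remarks above.  This is not the scrubbed
rec.py attachment, only an illustration under a few assumptions: the
sample rows are made up from the one "parts" line quoted above, and the
formats ('U26', 'u4', 'f8') are my own choice rather than the original
'S20','u',...,'f' tuple.  As far as I know, the second positional
parameter of fromrecords() is dtype, which would explain why a
positional names string is rejected.

```python
import numpy as np

# Rows are tuples, not lists (first remark). The values are illustrative,
# modelled on the single "parts" line quoted in the original message.
rows = [
    ('2010-12-07 03:09:46.855712', 2, 2, 177, 2, 86, 370, 1.05),
    ('2010-12-07 03:10:46.855712', 3, 2, 150, 2, 90, 365, 0.98),
]

names = ('dt,hosts,running_jobs,queued_jobs,'
         'slots,avg_duration,avg_wait,avg_load')

# Assumed formats: a 26-character text field for the timestamp,
# unsigned 32-bit integers for the counters, a 64-bit float for the load.
formats = 'U26,u4,u4,u4,u4,u4,u4,f8'

# Keyword arguments only (second remark), and names=/formats= instead of
# a hand-built dtype dictionary (third remark).
records = np.rec.fromrecords(rows, names=names, formats=formats)

print(records.dtype.names)
print(records.hosts)
print(records.avg_load)
```

Fields are then reachable as attributes (records.hosts) or by key
(records['hosts']).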

I just tinkered around a bit with your code without deep knowledge of
the numpy.rec package.  I used fromrecords() some time ago in the same
way I use it here.


P.S.: Please reply; if you don't, I'll resend the email to you off-list
on the assumption that you desperately, disappointedly unsubscribed.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rec.py
Type: application/octet-stream
Size: 405 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20101229/fadc4fa7/attachment.obj>
