On Thu, Apr 21, 2011 at 10:06 PM, Gökhan Sever <gokhansever@gmail.com> wrote:
Hello,

Given this piece of code (I can provide the meg file off-list for those who want to reproduce the error):
Can you instead construct a test that is as simple as possible? It sounds like you need only a two-line string to reproduce this. The bug sounds similar to http://projects.scipy.org/numpy/ticket/1689.

Ralf
    import numpy as np

    f = open("a08A0122.341071.meg", "rb")
    dt = np.dtype([('line1', '|S80'), ('line2', np.object_),
                   ('line3', '|S80'), ('line4', '|S80'),
                   ('line5', '|S80'), ('line6', '|S2'),
                   ('line7', np.int32, 2000), ('line8', '|S2'),
                   ('line9', np.int32, 2000), ('line10', '|S2')])
    k = np.fromstring(f.read(dt.itemsize), dt)[0]

Accessing k causes a "Segmentation fault (core dumped)" and kills my python and IPython sessions immediately.

I actually know that the culprit is "np.object_" in this case. The original field was ('line2', '|S81'), but those meg files (a mix of text and binary content) have a funny habit of switching between 80 and 81 characters per line (including the "\r\n" characters). I was testing whether I could create a variable-length string dtype, which does not seem to be possible.

A little more info: line2 holds time stamps, one of which has the form 22:34:59.999. I have seen in the file that 22:34:59.999 was originally written as 22:34:59.1000, which produces the extra character. (Interestingly, the millisecond field should cycle through 0-999 and roll over after 999 rather than reach 1000, which to me indicates a slight bug.)

Because of this, I can't read the whole content of those meg files: somewhere in the middle, fromstring starts reading shifted (erroneous) content. Should I go and fix that millisecond overflow first, or is there an alternative way to approach this problem?
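One possible alternative, sketched here under assumptions about the file layout rather than tested against a real meg file: read the five text lines with readline(), which does not care whether a line is 80 or 81 bytes long, and read only the fixed-size binary blocks through NumPy. The sketch assumes that each text line ends in "\r\n", that the three '|S2' fields are "\r\n" separators around the two int32 blocks, and that the integers are stored in the machine's native byte order. The name read_record is hypothetical.

```python
import numpy as np

N_TEXT_LINES = 5   # line1..line5 in the dtype above
N_VALUES = 2000    # length of each int32 block (line7 and line9)


def read_record(f):
    """Read one record from a binary file object.

    Returns (text_lines, blocks), or None at a clean end of file.
    Assumed layout: 5 text lines, then sep, int32 block, sep,
    int32 block, sep, where each sep is 2 bytes.
    """
    text_lines = []
    for _ in range(N_TEXT_LINES):
        line = f.readline()  # reads through b"\n", whatever the line length
        if not line:
            return None
        text_lines.append(line.rstrip(b"\r\n"))

    blocks = []
    for _ in range(2):
        f.read(2)  # assumed 2-byte separator (line6, then line8)
        raw = f.read(4 * N_VALUES)
        if len(raw) != 4 * N_VALUES:
            raise ValueError("truncated record")
        blocks.append(np.frombuffer(raw, dtype=np.int32))
    f.read(2)  # assumed trailing separator (line10)
    return text_lines, blocks
```

Calling read_record(f) in a loop until it returns None would walk the whole file. Because the binary blocks are read by byte count and never by readline(), a stray newline byte inside the int32 data should not shift the record boundaries.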