[Numpy-discussion] Problems creating numpy.array with a dtype

Sat May 15 00:24:03 EDT 2010

Hello, I am really liking Numpy a lot. It is wonderful to be able to do 
the things that it does in a language as friendly as Python, and with 
the performance Numpy delivers over standard Python. Thanks.

I am having a problem creating Numpy arrays with my generated 
dtypes. I am creating a dataset of a week's worth of financial 
instrument data, in order to explore and test relationships with 
various Technical Analysis functions.

My basic data is simply a list (or tuple) of lists (or tuples).
((startdate, bidopen, bidhigh, bidlow, bidclose, askopen, askhigh, 
asklow, askclose), ...)

Nothing unusual. However I am creating arrays which have many, many more 
columns to allow storing the data generated from applying the functions 
to the original data.

I have created two functions. One to dynamically create the dtype based 
on data I want to create for the exploration. And another to create the 
array and populate it with the initial data from a database.

Code slightly modified, not tested.

#examples
import numpy as np
# names are strings so they can be joined into field names below
taFunctions = ('smva', 'wmva')
inputColumns = ('bidclose', 'ohlcavg')

def createDType():
     """Will create a dtype based on the pattern for naming and the
        parameters of those items being stored in the array.
     """
     dttypes = [('startdate','object'),
         ('bidopen','f8'), ('bidhigh','f8'),
         ('bidlow','f8'), ('bidclose','f8'),
         ('askopen','f8'), ('askhigh','f8'),
         ('asklow','f8'), ('askclose','f8'),
         ('ocavg','f8'), ('hlavg','f8'), ('ohlavg','f8'),
         ('ohlcavg','f8'), ('direction','i1'), ('volatility', 'f8'),
         ('spread', 'f8'), ('pivot', 'S4')]
     for f in taFunctions:
         for i in inputColumns:
             dttypes.append((f+"-"+i,'f8'))
     dtminute = np.dtype(dttypes)
     return dtminute, dttypes

def getArray(instrument, weekString=None):
     ...
     cur.execute(sql)
     weekData = cur.fetchall()
     wdata = []
     lst = []
     dtminute, dttypes = createDType()
     for i in dttypes:
         if i[1] == 'f8': lst.append(0.0)
         elif i[1] == 'i1': lst.append(0)
         else: lst.append('')
     for m in weekData:
         data = list(m)+lst[9:]
         wdata.append(data)
     return np.array(wdata,dtype=dtminute)

The createDType() function works fine. The getArray() function fails with:
ValueError: Setting void-array with object members using buffer.

However changing the getArray() function to this works just fine.

def getArray(instrument, weekString=None):
     ...
     cur.execute(sql)
     weekData = cur.fetchall()
     arrayLength = len(weekData)
     lst = []
     dtminute, dttypes = createDType()
     for i in dttypes:
         if i[1] == 'f8': lst.append(0.0)
         elif i[1] == 'i1': lst.append(0)
         else: lst.append('')
     listLength = len(lst)
     weekArray = np.zeros(arrayLength, dtype=dtminute)
     for i in range(arrayLength):
         for j in range(listLength):
             if j < 9: weekArray[i][j] = weekData[i][j]
             else: weekArray[i][j] = lst[j]
     return weekArray
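
The working approach above (allocate with np.zeros, then fill) can also be sketched column by column instead of element by element. This is only a sketch under assumptions: week_data is a hypothetical stand-in for the rows returned by cur.fetchall(), and the dtype is shortened to the nine database columns plus two of the derived ones.

```python
import datetime
import numpy as np

# Hypothetical stand-in for the rows returned by cur.fetchall().
week_data = [
    (datetime.datetime(2010, 5, 10, 0, 0), 1.0, 2.0, 0.5, 1.5, 1.1, 2.1, 0.6, 1.6),
    (datetime.datetime(2010, 5, 10, 0, 1), 1.5, 2.5, 1.0, 2.0, 1.6, 2.6, 1.1, 2.1),
]

# Shortened dtype: the nine database columns plus two derived columns.
dt = np.dtype([('startdate', 'object'),
               ('bidopen', 'f8'), ('bidhigh', 'f8'),
               ('bidlow', 'f8'), ('bidclose', 'f8'),
               ('askopen', 'f8'), ('askhigh', 'f8'),
               ('asklow', 'f8'), ('askclose', 'f8'),
               ('ohlcavg', 'f8'), ('pivot', 'S4')])

# Allocate the whole array first; numeric fields start at zero.
week_array = np.zeros(len(week_data), dtype=dt)

# Fill each of the first nine fields by name, one column at a time.
for col, name in enumerate(dt.names[:9]):
    week_array[name] = [row[col] for row in week_data]
```

Filling by field name avoids the inner loop over j, and the derived columns simply keep the defaults that np.zeros gives them.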

After I finally worked out getArray number two I am back in business 
writing the rest of my app. But I banged my head on version number one 
for quite some time trying to figure out what I was doing wrong. I still 
don't know.

I find no errors in my data length or types. I would think that either 
would cause version two to fail also.
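
One difference between the two versions that is not about length or types: version one hands np.array a list of lists, while version two assigns field by field. The NumPy documentation for structured arrays builds records from tuples, so the sketch below converts each padded row to a tuple before calling np.array. It is an assumption, not something confirmed in this post, that the list rows are what triggers the ValueError. The rows, the padding list, and the shortened dtype are all hypothetical.

```python
import datetime
import numpy as np

# Shortened dtype with one field of each kind used in createDType().
dt = np.dtype([('startdate', 'object'), ('bidopen', 'f8'),
               ('direction', 'i1'), ('pivot', 'S4')])

# Hypothetical database rows covering only the first two fields.
rows = [[datetime.datetime(2010, 5, 10, 0, 0), 1.0],
        [datetime.datetime(2010, 5, 10, 0, 1), 1.5]]

# Default values for the remaining fields (direction, pivot).
padding = [0, '']

# Each record is a tuple, not a list, before it reaches np.array.
wdata = [tuple(list(r) + padding) for r in rows]
arr = np.array(wdata, dtype=dt)
```

With tuple rows the result is a one-dimensional array with one record per database row.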

Any help in understanding is greatly appreciated.

This is using Numpy 1.4.1.

Thanks.

Jimmie