Memory Problem

Christoph Scheit cscheit at lstm.uni-erlangen.de
Tue Sep 18 09:58:42 EDT 2007


On Tuesday 18 September 2007 15:10, Marc 'BlackJack' Rintsch wrote:
> On Tue, 18 Sep 2007 14:06:22 +0200, Christoph Scheit wrote:
> > Then the data is added to a table, which I use for the actual
> > Post-Processing. The table is actually a Class with several "Columns",
> > each column internally being represented by array.
>
> Array or list?

array

More details:
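from array import array

class DBTable:
    def __init__(self):  # (constructor arguments omitted here)
        # the class DBTable has a list, each list entry referencing a
        # DBColumn object
        self.cols = []

        # the dictionary is used to look up if an entry already exists
        # (it maps a key to its row number)
        self.dict = {}

class DBColumn:
    def __init__(self):  # (constructor arguments omitted here)
        # has a name (string) and a datatype (e.g. int, float) as
        # attributes, plus
        self.data = array('f')  # an array of type float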

I have to deal with several million data entries. At the moment I'm trying
an example with 360 grid points and 10000 time steps, i.e. 3 600 000 entries
(each row consists of 4 ints and one float).
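
As a rough back-of-the-envelope check, the raw column data for this example
can be estimated as sketched below. This is only an estimate under stated
assumptions: the float column is stored as array('f') as in DBColumn above,
while storing the four int columns as array('i') is an assumption, and item
sizes are platform dependent.

```python
from array import array

n_grid_points = 360
n_time_steps = 10000
n_rows = n_grid_points * n_time_steps  # 3 600 000 entries

# Assumption: int columns use typecode 'i', the float column uses 'f'.
int_size = array('i').itemsize    # typically 4 bytes
float_size = array('f').itemsize  # typically 4 bytes

# Each row consists of 4 ints and one float.
bytes_per_row = 4 * int_size + 1 * float_size
column_bytes = n_rows * bytes_per_row

print("rows: %d" % n_rows)
print("raw column data: %.1f MB" % (column_bytes / 1e6))
```

With 4-byte items this comes to about 72 MB for the arrays alone, so the
key dictionary and the per-row objects may well account for a larger share
of the memory than the columns themselves.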

Of course, the more keys there are, the bigger the dictionary is, but is
there a way to evaluate the actual size of the dictionary?
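
One possible way to get a rough number, sketched under assumptions: newer
Python versions (2.6 and later) provide sys.getsizeof(), which reports the
size of a single object without following references. The helper
approx_dict_size and the key layout (a tuple of 4 ints mapped to a row
number) are hypothetical, chosen only for illustration; the real keys may
look different.

```python
import sys

def approx_dict_size(d):
    """Rough estimate in bytes: the hash table plus its keys and values.

    Objects referenced more than once are counted each time, so this is
    only an approximation, not an exact measurement.
    """
    total = sys.getsizeof(d)
    for key, value in d.items():
        total += sys.getsizeof(key)
        if isinstance(key, tuple):
            # getsizeof() does not follow references, so add the parts.
            total += sum(sys.getsizeof(part) for part in key)
        total += sys.getsizeof(value)
    return total

# Hypothetical keys: tuples of 4 ints mapping to a row number.
sample = {}
for row in range(100000):
    sample[(row, row + 1, row + 2, row + 3)] = row

per_entry = approx_dict_size(sample) / float(len(sample))
print("approx. bytes per entry: %.0f" % per_entry)
print("extrapolated to 3 600 000 entries: %.0f MB"
      % (per_entry * 3600000 / 1e6))
```

Measuring a sample and extrapolating keeps the test cheap; the result
depends on the Python version and platform, so it should be read as an
order-of-magnitude figure only.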

Greets and Thanks,

Chris
>
> > # create reader
> > breader = BDBReader("<var>", "<type>", "#")
> >
> > # read data
> > bData = breader.readDB(dbFileList[0])
> >
> > # create table
> > dTab = DBTable(breader.headings, breader.converters, [1,2])
> > addRows(bData, dTab)
> >
> > Before I add a new entry to the table, I check if there is already an
> > entry like this. To do so, I store keys for all the entries with
> > row-number in a dictionary. What about the memory consumption of the
> > dictionary?
>
> The more items you put into the dictionary the more memory it uses.  ;-)
>
> > Here the code for adding a new row to the table:
> >
> > # check if data already exists
> > if (self.keyDict.has_key(key)):
> >     rowIdx = self.keyDict[key]
> >     for i in self.mutableCols:
> >         self.cols[i][rowIdx] += rowData[i]
> >     return
> >
> > # key is still available - insert row to table
> > self.keyDict[key] = self.nRows
> >
> > # insert data to the columns
> > for i in range(0, self.nCols):
> >     self.cols[i].add(rowData[i])
> >
> > # add row i and increment number of rows
> > self.rows.append(DBRow(self, self.nRows))
> > self.nRows += 1
> >
> > Maybe somebody can help me. If you need, I can give more implementation
> > details.
>
> IMHO That's not enough code and/or description of the data structure(s).
> And you also left out some information like the number of rows/columns and
> the size of the data.
>
> Have you already thought about using a database?
>
> Ciao,
> 	Marc 'BlackJack' Rintsch
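
For reference, the database idea could look roughly like the following
sketch, using the sqlite3 module from the standard library. The schema
(four int columns forming the key plus one float value that is accumulated)
and the names data, k1..k4, value and add_row are assumptions for
illustration only, not the actual table layout.

```python
import sqlite3

# Hypothetical schema: 4 int columns forming the key, one float value.
conn = sqlite3.connect(":memory:")  # a file path would keep data on disk
conn.execute(
    "CREATE TABLE data ("
    " k1 INTEGER, k2 INTEGER, k3 INTEGER, k4 INTEGER,"
    " value REAL,"
    " PRIMARY KEY (k1, k2, k3, k4))"
)

def add_row(conn, key, value):
    """Accumulate value into an existing row, or insert a new one."""
    cur = conn.execute(
        "UPDATE data SET value = value + ?"
        " WHERE k1 = ? AND k2 = ? AND k3 = ? AND k4 = ?",
        (value,) + tuple(key),
    )
    if cur.rowcount == 0:
        # No row with this key yet: insert it.
        conn.execute(
            "INSERT INTO data (k1, k2, k3, k4, value)"
            " VALUES (?, ?, ?, ?, ?)",
            tuple(key) + (value,),
        )

add_row(conn, (1, 2, 3, 4), 0.5)
add_row(conn, (1, 2, 3, 4), 0.25)  # same key: value is accumulated
add_row(conn, (1, 2, 3, 5), 1.0)   # new key: new row
conn.commit()

rows = conn.execute(
    "SELECT k1, k2, k3, k4, value FROM data ORDER BY k4"
).fetchall()
print(rows)
```

Here the primary-key index takes over the role of the lookup dictionary,
so the existence check no longer has to live in Python memory.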

-- 

============================
M.Sc. Christoph Scheit
Institute of Fluid Mechanics
FAU Erlangen-Nuremberg
Cauerstrasse 4
D-91058 Erlangen
Phone: +49 9131 85 29508
============================


