bsddb.error

Fri Apr 20 12:10:40 EDT 2001

Hi,

I wonder if anyone can provide some insight into an error I am getting. I
am unable to figure out what is causing it. I am developing some software
for a bioinformatics research lab, specifically for the analysis of DNA
sequences.

Firstly, I am using Python 1.5.2 on a machine running Red Hat Linux 7.0.
The machine is running at 700 MHz with 380 MB RAM and about 1 GB free space
on root.

In the broadest terms the program is generating a list of "motif"
objects, and when the length of this list exceeds one million I
shelve the objects. While shelving the objects I sometimes get the
following bsddb.error exception:

90% Written
100% Written
Traceback (innermost last):
  File "runMotif.py", line 34, in ?
    filename = genMotifs(posList, S, filename)
  File "motif.py", line 161, in genMotifs
    motifs = writeMotifs(motiflist, filename, S)
  File "motif.py", line 189, in writeMotifs
    motifDict[hstring] = motif(x, S)
  File "/usr/lib/python1.5/shelve.py", line 71, in __setitem__
    self.dict[key] = f.getvalue()
bsddb.error: (0, 'Error')

Ok, now I'll be a bit more specific about what I am doing at the
shelving step! Each of the "motif" objects in the list contains 3
attributes which contain strings of length 5 to 10. I use these strings
to generate a unique key for each motif in the shelved data file. Then
for each object in the list I check whether that key has been generated
previously. If the key is not found in the shelved file then I add the
object, else I simply update the shelved object accordingly.
The code is as follows:

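import shelve

# motif and checkNoMotifOverlap are defined elsewhere in motif.py

def writeMotifs(motiflist, filename, S):
    motifDict = shelve.open(filename)
    print "writing to %s" % filename
    count = 0
    for x in motiflist:
        # build the key for this motif
        hstring = x[0][0] + '_' + x[1][0] + '_' + x[2][0]
        # progress report every 100,000 objects
        if count % 100000 == 0:
            print "%s%% Written" % (count/10000)
        if motifDict.has_key(hstring):
            # key already shelved: update the stored object
            temp = motifDict[hstring]
            if checkNoMotifOverlap(temp, x) != 1:
                temp.setCount()
                temp.setPos(x)
                motifDict[hstring] = temp
        else:
            # new key: shelve a fresh motif object
            motifDict[hstring] = motif(x, S)
        count = count + 1
    length = len(motifDict)
    motifDict.close()   # close must be called, not just referenced
    return length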
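
To make the pattern easier to try without my motif class, here is a
minimal sketch of the same key-and-update logic. It is not the code that
produced the traceback above. It is written for a current Python rather
than 1.5.2, it stores plain dictionaries instead of motif objects, it
leaves out the checkNoMotifOverlap test, and the names make_key and
write_motifs, as well as the shape of the input tuples, are assumptions
made up for illustration.

```python
import os
import shelve
import tempfile


def make_key(x):
    # Same key scheme as writeMotifs: x[0][0], x[1][0] and x[2][0]
    # joined with underscores. Assumes each x[i][0] is a string.
    return x[0][0] + '_' + x[1][0] + '_' + x[2][0]


def write_motifs(motif_list, filename):
    """Shelve one record per unique key and return the number of keys."""
    shelf = shelve.open(filename)
    try:
        for x in motif_list:
            key = make_key(x)
            if key in shelf:
                # Key already shelved: read, update, and write back.
                record = shelf[key]
                record['count'] = record['count'] + 1
                record['positions'].append(x)
                shelf[key] = record
            else:
                # New key: shelve a fresh record (stand-in for motif(x, S)).
                shelf[key] = {'count': 1, 'positions': [x]}
        return len(shelf)
    finally:
        # Always close the shelf, even if a write raises an exception.
        shelf.close()


if __name__ == '__main__':
    # Hypothetical input: three (string, position) pairs per motif.
    demo = [
        (('ACGTA', 1), ('GGCTA', 9), ('TTACG', 17)),
        (('ACGTA', 40), ('GGCTA', 48), ('TTACG', 56)),
        (('CCCCC', 2), ('GGGGG', 9), ('AAAAA', 17)),
    ]
    path = os.path.join(tempfile.mkdtemp(), 'motifs')
    print("unique keys shelved:", write_motifs(demo, path))
```

The try/finally is only there so that the shelf is closed whichever way
the loop ends; it does not change the read-update-write logic.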

This program is taking a huge combinatorial problem and is therefore
producing absolutely piles of data. I have checked to ensure that I am
not running out of RAM during execution, and although the shelved data
file is getting pretty big I have so far only run the analysis on
short sequences, producing output in the region of 90-160 MB. The
fact that it falls over at different sizes suggests to me that I am not
hitting some Linux-defined constant for file sizes or the like, but
I can't think what it could be. I have checked the number of objects in
the shelved file and it has ranged between 90,000 and 143,000 before it
keels over and dies.
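
For anyone wanting to make the same check, the object count and file size
can be read back roughly as in the sketch below. Again this is written
for a current Python rather than 1.5.2, and shelf_stats is a made-up
helper name, not something from my program. The size is summed over every
file whose name starts with the shelf filename, on the assumption that
some dbm backends append their own suffixes.

```python
import glob
import os
import shelve


def shelf_stats(filename):
    """Return (number of entries, total bytes on disk) for a shelf."""
    # Open read-only so that checking the shelf cannot modify it.
    shelf = shelve.open(filename, flag='r')
    try:
        entries = len(shelf)
    finally:
        shelf.close()
    # Some backends store the data in filename itself, others in
    # filename plus a suffix, so add up everything that matches.
    pattern = glob.escape(filename) + '*'
    size = sum(os.path.getsize(p) for p in glob.glob(pattern))
    return entries, size
```

Calling shelf_stats on the data file after a failed run gives the two
numbers quoted above: how many objects made it into the shelf and how
large the file had grown.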

Can anyone suggest any reason this problem might be occurring?

cheers

Blobby