rkmr.em at
Sun Mar 18 17:38:08 CET 2007

On 3/18/07, Daniel Nogradi <nogradi at> wrote:
> > I need to process a really huge text file (4GB) and this is what i
> > "list comprehension" can fast up things. Can you point out how to do
> > f = open('file.txt','r')
> > for line in f:
> >         db[line.split(' ')[0]] = line.split(' ')[-1]
> >         db.sync()
> What is db here? Looks like a dictionary but that doesn't have a sync method.
db is a handle for a Berkeley DB database that I open with the bsddb module:

import bsddb

> If the file is 4GB are you sure you want to store the whole thing into
> memory?
I don't want to load it into memory. Once I call the sync() function the
data gets synced to disk, so it is never loaded completely.

> use list comprehension like this:
> db = [ line.split(' ')[-1] for line in open('file.txt','r') ]
> or
> db = [ ( line.split(' ')[0], line.split(' ')[-1] ) for line in
> open('file.txt','r') ]
> depending on what exactly you want to store.

line.split(' ')[0] is the key and line.split(' ')[-1] is the value.
That is what I want to store.
Will the second list comprehension work in this case?
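
For comparison, the second list comprehension quoted above builds a list of
(key, value) tuples in memory and never writes to db, so it probably does not
fit the goal of keeping the 4GB file out of memory. A minimal sketch of the
plain loop, splitting each line only once and syncing every so often rather
than on every line, might look like this. The function name load_pairs and
the sync_every parameter are made up for illustration, and db is assumed to
be any dict-like object with a sync() method (such as the bsddb handle
described above). It also uses split() instead of split(' '), which drops
the trailing newline from the value; that is a change from the original loop.

```python
def load_pairs(lines, db, sync_every=100000):
    """Store first field -> last field of each line in db.

    db is assumed to be dict-like and to provide sync().
    sync_every is a hypothetical tuning knob, not a bsddb setting.
    """
    count = 0
    for line in lines:
        fields = line.split()      # split once; also strips the newline
        if not fields:             # skip blank lines
            continue
        db[fields[0]] = fields[-1]
        count += 1
        if count % sync_every == 0:
            db.sync()              # flush periodically, not on every line
    db.sync()                      # final flush for the remaining entries
    return count
```

It could be called as load_pairs(open('file.txt', 'r'), db). Iterating over
the file object reads one line at a time, so the whole file is not held in
memory at once.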

