list comprehension help
Alex Martelli
aleax at mac.com
Mon Mar 19 01:11:44 EDT 2007
rkmr.em at gmail.com <rkmr.em at gmail.com> wrote:
...
> > > files (you see "huge" is really relative ;-)) on 2-4GB RAM boxes and
> > > setting a big buffer (1GB or more) reduces the wall time by 30 to 50%
> > > compared to the default value. BerkeleyDB should have a buffering
> > Out of curiosity, what OS and FS are you using? On a well-tuned FS and
>
> Fedora Core 4 and ext3. Is there something I should do to the FS?
In theory, nothing. In practice, this is strange.
> Which should I do? How much buffer should I allocate? I have a box
> with 2GB memory.
I'd be curious to see a read-only loop on the file, opened with (say)
1MB of buffer vs 30MB vs 1GB -- just loop on the lines, do a .split() on
each, and do nothing with the results. What elapsed times do you
measure with each buffer size...?
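A minimal sketch of the benchmark I have in mind -- the filename
'huge.txt' is just a placeholder for your real data file:

```python
import time

def bench_read(path, bufsize):
    """One read-only pass over the file: iterate the lines,
    .split() each, and discard the results."""
    start = time.time()
    f = open(path, 'r', bufsize)
    try:
        for line in f:
            line.split()
    finally:
        f.close()
    return time.time() - start

# Point this at the actual file and compare elapsed times:
# for mb in (1, 30, 1024):
#     secs = bench_read('huge.txt', mb * 1024 * 1024)
#     print('%4d MB buffer: %.2f s' % (mb, secs))
```

The third argument to open() is the buffer size in bytes, so this lets
you vary just that one knob while keeping the rest of the loop identical.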
If the huge buffers confirm their worth, it's time to take a nice
critical look at what other processes you're running and what all are
they doing to your disk -- maybe some daemon (or frequently-run cron
entry, etc) is out of control...? You could try running the benchmark
again in single-user mode (with essentially nothing else running) and
see how the elapsed-time measurements change...
Alex