speeding up reading files (possibly with cython)

Carl Banks pavlovevidence at gmail.com
Sun Mar 8 07:16:17 EDT 2009


On Mar 7, 3:06 pm, per <perfr... at gmail.com> wrote:
> hi all,
>
> i have a program that essentially loops through a text file that's
> about 800 MB in size containing tab separated data... my program
> parses this file and stores its fields in a dictionary of lists.

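When building a very large structure like yours, the cyclic garbage
collector can be a bottleneck.  In CPython a collection is triggered
by the number of container objects allocated, so creating millions of
lists keeps re-running collections that find nothing to free.  Try
disabling the cyclic garbage collector before building the large
dictionary, and re-enabling it afterwards.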

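import gc

gc.disable()  # pause the cyclic collector while the big structure is built
try:
    # filename is a placeholder for the path to your data file
    with open(filename) as f:
        for line in f:
            # rstrip('\n') rather than strip(), so empty first/last
            # fields are not silently dropped
            split_values = line.rstrip('\n').split('\t')
            # do stuff with split_values
finally:
    gc.enable()  # always restore the collector, even on error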
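
If you do this in more than one place, the same idea can be wrapped up
so the collector is always put back the way it was.  What follows is
only a sketch: gc_paused and load_columns are names I made up, and
keying the dictionary on the first field is an assumption, since I
don't know what your dictionary of lists actually looks like.

```python
import gc
from collections import defaultdict
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Disable the cyclic GC inside the block (hypothetical helper)."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        # Restore the previous state; leave it off if it was already off.
        if was_enabled:
            gc.enable()


def load_columns(lines, key_index=0):
    """Group tab-separated rows into a dict of lists.

    Assumption: rows are keyed by the field at key_index.
    """
    table = defaultdict(list)
    with gc_paused():
        for line in lines:
            fields = line.rstrip('\n').split('\t')
            table[fields[key_index]].append(fields)
    return dict(table)
```

You would call it as load_columns(f) with f an open file object.  Time
it with and without the gc_paused() block on your own data before
relying on it; how much it helps depends on the Python version and on
how many container objects you create.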



Carl Banks


