[Tutor] Reading/dealing/matching with truly huge (ascii) files

Alan Gauld alan.gauld at btinternet.com
Thu Feb 23 11:07:57 CET 2012

On 23/02/12 01:55, Elaina Ann Hyde wrote:

> line 158, in get_lines
>      lines = table.splitlines()
> MemoryError
> ----------------------
> So this means I don't have enough memory to run through the large file?

Probably, or the code you are using is doing something extremely 

> Even if I just read in with asciitable I get this problem, I looked
> again and the large file is 1.5GB of text lines, so very large.

How much RAM do you have? Probably only 1-2GB? If so, I'd suggest trying
another approach.

Peter has suggested a couple of ideas.

The other way is to simply load both files into database tables and use 
a SQL SELECT to pull out the combined lines. This will probably be 
faster than trying to do line by line stitch ups in Python.
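
As a rough sketch of the database idea, assuming Python's built-in
sqlite3 module and assuming each file is whitespace-delimited with a
shared key in its first column (the table names "big" and "small", the
column names and both helper functions are made up for illustration;
your real files may need a different layout and join condition):

```python
import sqlite3


def load_table(conn, table, path):
    """Stream a whitespace-delimited file into a two-column table.

    Assumption: the first field on each line is the matching key and
    everything after it is kept as one text column.
    """
    conn.execute(f"CREATE TABLE {table} (key TEXT, rest TEXT)")
    with open(path) as f:
        # Iterating over the file object reads one line at a time,
        # so the whole file is never held in memory at once.
        split_lines = (line.split(None, 1) for line in f if line.strip())
        conn.executemany(
            f"INSERT INTO {table} VALUES (?, ?)",
            (
                (parts[0], parts[1].strip() if len(parts) > 1 else "")
                for parts in split_lines
            ),
        )
    # An index on the key lets the join avoid scanning the whole table.
    conn.execute(f"CREATE INDEX idx_{table}_key ON {table} (key)")
    conn.commit()


def combined_lines(conn):
    """Yield one combined line for every key present in both tables."""
    query = """
        SELECT a.key, a.rest, b.rest
        FROM big AS a
        JOIN small AS b ON a.key = b.key
        ORDER BY a.key
    """
    for key, rest_a, rest_b in conn.execute(query):
        yield f"{key} {rest_a} {rest_b}"
```

To try it, open a connection with sqlite3.connect("match.db") (a file
on disk rather than ":memory:", since the data is too big for RAM),
call load_table(conn, "big", ...) and load_table(conn, "small", ...)
with your two file names, then loop over combined_lines(conn) and
write each line out.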

You can also use the SQL interactive prompt to experiment with the query 
till you are sure it's right!

Do you know any SQL? If not it is very easy to learn.
(See the database topic in my tutorial (v2 only).)

Alan G
Author of the Learn to Program web site
