[Tutor] large file

Alan Gauld alan.gauld at btinternet.com
Mon Jun 14 02:19:21 CEST 2010


"Hs Hs" <ilhs_hs at yahoo.com> wrote

> I have a very large file 15Gb.

> Every two lines are part of one readgroup.
> I want to add two variables to every line.

> HWUSI-EAS1211_0001:1:1:977:20764#0   RG:Z:2301
> HWUSI-EAS1211_0001:1:1:977:20764#0    RG:Z:2302
> ...
> Since I cannot read the entire file, I wanted to cat the file

What makes you think you cannot read the entire file?

> something like this:
>
> cat myfile  | python myscript.py > myfile.sam

How does that help over Python reading the file line by line?
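
To illustrate the point, here is a minimal sketch (not taken from the original
script) of a loop that iterates over a stream one line at a time. The helper
name count_lines is made up for this example. Only the current line is held in
memory, so the file size does not matter, and the same loop works on an opened
file or on sys.stdin.

```python
import io

def count_lines(stream):
    """Count lines by iterating; only one line is in memory at a time."""
    total = 0
    for _ in stream:
        total += 1
    return total

# The same function accepts any line-iterable stream, for example:
#   with open("myfile") as f: print(count_lines(f))
#   import sys; print(count_lines(sys.stdin))
demo = io.StringIO("line one\nline two\nline three\n")
print(count_lines(demo))  # 3
```

Since the loop is identical either way, piping the file through cat gives no
memory advantage over opening it directly in Python.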

> I do not know how to execute my logic after I read the line,
> although I tried:

> while True:
>        second = raw_input()
>        x =  second.split('\t')

Why are you splitting the line? You only need to append
data to the end of the line...

> Could someone help me here either what I want to do.

In pseudo code:

open input and output files
read the first 14 lines from input
oddLine = True
while True:
    read line from input
    if oddLine:
        append oddData
    else:
        append evenData
    write line to output file
    oddLine = not oddLine

You probably want a try/except in there to catch the end of file.
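
The pseudo code above could be sketched in Python roughly as follows. This is
one possible reading, not a definitive implementation. The function name
tag_lines, the tab separator, and the tag values in the demo are assumptions
made for illustration. It also assumes the first 14 lines are copied through
unchanged, which the pseudo code does not spell out.

```python
import io
from itertools import islice

def tag_lines(src, dst, odd_tag, even_tag, header_lines=14):
    """Copy src to dst, appending odd_tag/even_tag to alternating lines."""
    src = iter(src)  # make sure both loops share one iterator
    # Assumption: the leading lines are passed through untouched.
    for line in islice(src, header_lines):
        dst.write(line)
    odd_line = True
    for line in src:  # the loop simply stops at end of file
        tag = odd_tag if odd_line else even_tag
        # Assumption: the new field is tab-separated, like the rest.
        dst.write(line.rstrip("\n") + "\t" + tag + "\n")
        odd_line = not odd_line

# Small in-memory demo with one header line and hypothetical tags.
src = io.StringIO("header\nread1\tRG:Z:2301\nread1\tRG:Z:2302\n")
dst = io.StringIO()
tag_lines(src, dst, "XA:Z:odd", "XA:Z:even", header_lines=1)
print(dst.getvalue(), end="")
```

With real files the call would look something like
"with open('myfile') as src, open('myfile.sam', 'w') as dst:". Note that a for
loop over a file ends on its own at end of file; the try/except is needed in
the raw_input() style of loop, where end of input raises EOFError.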


This is not very different from the menu example in the file
handling topic of my tutorial...

HTH

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/



