[Tutor] large file

bob gailer bgailer at gmail.com
Mon Jun 14 00:42:01 CEST 2010


On 6/13/2010 5:45 PM, Hs Hs wrote:
> hi:
>
> I have a very large file 15Gb. Starting from 15th line this file shows 
> the following lines:
>
> HWUSI-EAS1211_0001:1:1:977:20764#0
>
> HWUSI-EAS1211_0001:1:1:977:20764#0
>
> HWUSI-EAS1521_0001:1:1:978:13435#0
>
> HWUSI-EAS1521_0001:1:1:978:13435#0
>
>
> Every two lines are part of one readgroup. I want to add one of two
> variables to every line. The first variable should be appended to all
> odd-numbered lines and the second to all even-numbered lines, like the
> following:
>
> HWUSI-EAS1211_0001:1:1:977:20764#0   RG:Z:2301
>
> HWUSI-EAS1211_0001:1:1:977:20764#0    RG:Z:2302
>
> HWUSI-EAS1521_0001:1:1:978:13435#0  RG:Z:2301
>
> HWUSI-EAS1521_0001:1:1:978:13435#0  RG:Z:2302
>
> Since I cannot read the entire file, I wanted to cat the file
>
> something like this:
>
> cat myfile  | python myscript.py > myfile.sam
>
>
> I do not know how to execute my logic after I read the line, although
> I tried:
>
> myscript.py:
>
> while True:
>         second = raw_input()
>         x =  second.split('\t')
>
>

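# do something with first 14 lines?
# (Python 2) raw_input() raises EOFError at end of input rather than
# returning an empty string, so catch that to stop cleanly.
try:
    while True:
        line = raw_input().rstrip()
        print line + "   RG:Z:2301"
        line = raw_input().rstrip()
        print line + "   RG:Z:2302"
except EOFError:
    pass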
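
For anyone on Python 3, here is a rough sketch of the same idea that also covers the "first 14 lines" question. It is only one way to do it and rests on a few assumptions of mine, not anything stated in the original post: that the first 14 lines are header lines to copy through unchanged, that blank lines should be passed through without using up a tag, and that a tab is the right separator before the tag (the sample output above uses spaces). The names tag_lines, HEADER_LINES and TAGS are made up for illustration.

```python
import sys

# Assumption: the first 14 lines are header lines to copy through unchanged.
HEADER_LINES = 14
# The two tags from the question, applied to alternating lines.
TAGS = ("RG:Z:2301", "RG:Z:2302")


def tag_lines(lines, header_lines=HEADER_LINES, tags=TAGS):
    """Yield header lines as-is, then each later line with an alternating tag."""
    count = 0  # number of tagged (non-header, non-blank) lines so far
    for number, line in enumerate(lines):
        text = line.rstrip("\r\n")
        if number < header_lines or not text:
            # Header or blank line: pass through untouched (an assumption).
            yield text
            continue
        # Assumption: a tab separates the original line from the tag.
        yield text + "\t" + tags[count % len(tags)]
        count += 1


if __name__ == "__main__":
    # Iterating over sys.stdin reads one line at a time, so memory use stays
    # flat no matter how large the input file is.
    for out in tag_lines(sys.stdin):
        print(out)
```

It should work with the pipeline from the question (cat myfile | python3 myscript.py > myfile.sam), or with plain redirection (python3 myscript.py < myfile > myfile.sam), which avoids the extra cat process.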

-- 
Bob Gailer
919-636-4239
Chapel Hill NC
