[Tutor] Re: Python question (when you have a free moment) (fwd)
Fri, 20 Aug 1999 15:00:11 -0700
How much RAM does he have?
import string

input = open('~/yeast/chromosome04', 'r')  # note: open() does not expand '~'
data = string.join(string.split(input.read(), '\012'), '')  # or split on '\015\012' for CR/LF files
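Here is the same whole-file idea as a small self-contained sketch (the temp file below is a stand-in for ~/yeast/chromosome04, with 50-character lines as described in the thread; on a newer Python, str.replace does the split/join in one pass):

```python
import os
import tempfile

# Hypothetical stand-in for ~/yeast/chromosome04: two 50-character lines.
tmp = tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False)
tmp.write('ACGT' * 12 + 'AC\n')  # 50 chars + newline
tmp.write('TGCA' * 12 + 'TG\n')  # 50 chars + newline
tmp.close()

# Read the whole file in one go, then strip every newline in a single
# linear pass over the data.
with open(tmp.name) as f:
    sequence = f.read().replace('\r', '').replace('\n', '')

os.unlink(tmp.name)
```

The point is that the file is touched once and the newline removal is a single O(n) scan, with no per-line Python overhead.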
On Fri, 20 Aug 1999, Deirdre Saoirse wrote:
> A friend of mine, a molecular biologist, has a problem. He has a file,
> approximately 1.5 MB, that has a bunch of returns in it (which interfere
> with his ability to perform raw searches).
> I suggested that, instead of using regex.gsub after thwapping the file
> into memory that he try:
> input = open('~/yeast/chromosome04', 'r')
> lines = 0
> S = ''
> while 1:
>     s = input.readline()
>     if s:
>         S = S + s[:-1]
>         lines = lines + 1
>     else:
>         break
> (Which eliminates the need for a regex)
> ...but that, per him, didn't speed up significantly over his first
> approach. Any ideas?
> _Deirdre * http://www.linuxcabal.net * http://www.deirdre.net
> "I must say that I was really happy to see _Linux for Dummies_ -- that's
> when you know you've arrived." -- Linus Torvalds
> ---------- Forwarded message ----------
> Date: Fri, 20 Aug 1999 14:17:04 -0700
> From: Bernard <email@example.com>
> To: Deirdre Saoirse <firstname.lastname@example.org>
> Subject: Re: Python question (when you have a free moment)
> 20Aug1999 12:18AM (-0700) From [email@example.com] deirdre [Deirdre]
> > Actually, with your permission, I'd like to pose the question to the Tutor
> > list -- they can come up with some amazing speed optimizations. Ok?
> Sure! I'd love to hear of anything that will speed things up.
> As I said, I'm reluctant to dump Python if I think there's a
> chance that it could be better.
> Just a quick reminder:
> 1.5 Mb of plain text
> 50 characters per line (so ca. 30,000 lines)
> I want it all in one continuous string
> In Perl: If I read one line at a time, chomp and concatenate it takes
> a total of about 6 minutes. If I read the entire file in one go and
> then perform a global substitution it all takes ca. 0.1 s
> In Python: If I read one line at a time, discard the last character
> and concatenate it takes a total of 4 minutes. If I read the entire
> file it takes only about 0.1 s but the regex.gsub takes 4 minutes.
> Timings are on a PII/350 on a vanilla Red Hat 5.2 box (Perl 5,
> Python 1.5 - I can give you the minor versions if you need them)
> So, I'm assuming the speed-up would be at the level of the regex.gsub
> Thanks for any tips,
> Bernard P. Murray, PhD
> firstname.lastname@example.org (Department of Desserts and Toppings, San Francisco, USA)
> Tutor maillist - Tutor@python.org
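For the archives: the 4-minute loop above is slow because `S = S + s[:-1]` copies the whole accumulated string on every iteration, which is quadratic over ~30,000 lines. The usual linear fix is to collect the stripped lines in a list and join once at the end. A minimal sketch, using an io.StringIO stand-in for the real chromosome file:

```python
import io

# Simulated file contents; the real file is ~30,000 lines of 50 chars.
fake_file = io.StringIO('GATTACA\n' * 5)

# Quadratic: S = S + line recopies the growing string every time.
# Linear: append each stripped line to a list, then join once.
parts = []
for line in fake_file:
    parts.append(line.rstrip('\n'))
sequence = ''.join(parts)
```

Each append is O(1) amortized and the single join is O(n), so total work is linear in the file size instead of quadratic.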