[Tutor] Re: Python question (when you have a free moment) (fwd)
Fri, 20 Aug 1999 15:00:11 -0700
How much RAM does he have?
import string

input = open('~/yeast/chromosome04', 'r')  # note: open() does not expand '~'
data = string.join(string.split(input.read(), '\012'), '')  # or split on '\015\012' for CR/LF files
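Here is the same whole-file idea as a small self-contained sketch (the temp file below is a stand-in for ~/yeast/chromosome04, with 50-character lines as described in the thread; on a newer Python, str.replace does the split/join in one pass):

```python
import os
import tempfile

# Hypothetical stand-in for ~/yeast/chromosome04: two 50-character lines.
tmp = tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False)
tmp.write('ACGT' * 12 + 'AC\n')  # 50 chars + newline
tmp.write('TGCA' * 12 + 'TG\n')  # 50 chars + newline
tmp.close()

# Read the whole file in one go, then strip every newline in a single
# linear pass over the data.
with open(tmp.name) as f:
    sequence = f.read().replace('\r', '').replace('\n', '')

os.unlink(tmp.name)
```

The point is that the file is touched once and the newline removal is a single O(n) scan, with no per-line Python overhead.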
On Fri, 20 Aug 1999, Deirdre Saoirse wrote:
> A friend of mine, a molecular biologist, has a problem. He has a file,
> approximately 1.5 MB, that has a bunch of returns in it (which interfere
> with his ability to perform raw searches).
> I suggested that, instead of using regex.gsub after thwapping the file
> into memory that he try:
> input = open('~/yeast/chromosome04', 'r')
> lines = 0
> S = ''
> while 1:
>     s = input.readline()
>     if s:
>         S = S + s[:-1]
>         lines = lines + 1
>     else:
>         break
> (Which eliminates the need for a regex)
> ...but that, per him, didn't speed up significantly over his first
> approach. Any ideas?
> _Deirdre * http://www.linuxcabal.net * http://www.deirdre.net
> "I must say that I was really happy to see _Linux for Dummies_ -- that's
> when you know you've arrived." -- Linus Torvalds
> ---------- Forwarded message ----------
> Date: Fri, 20 Aug 1999 14:17:04 -0700
> From: Bernard <email@example.com>
> To: Deirdre Saoirse <firstname.lastname@example.org>
> Subject: Re: Python question (when you have a free moment)
> 20Aug1999 12:18AM (-0700) From [email@example.com] deirdre [Deirdre]
> > Actually, with your permission, I'd like to pose the question to the Tutor
> > list -- they can come up with some amazing speed optimizations. Ok?
> Sure! I'd love to hear of anything that will speed things up.
> As I said, I'm reluctant to dump Python if I think there's a
> chance that it could be better.
> Just a quick reminder:
> 1.5 Mb of plain text
> 50 characters per line (so ca. 30,000 lines)
> I want it all in one continuous string
> In Perl: If I read one line at a time, chomp and concatenate it takes
> a total of about 6 minutes. If I read the entire file in one go and
> then perform a global substitution it all takes ca. 0.1 s
> In Python: If I read one line at a time, discard the last character
> and concatenate it takes a total of 4 minutes. If I read the entire
> file it takes only about 0.1 s but the regex.gsub takes 4 minutes.
> Timings are on a PII/350 on a vanilla Red Hat 5.2 box (Perl 5,
> Python 1.5 - I can give you the minor versions if you need them)
> So, I'm assuming the speed-up would be at the level of the regex.gsub
> Thanks for any tips,
> Bernard P. Murray, PhD
> firstname.lastname@example.org (Department of Desserts and Toppings, San Francisco, USA)
> Tutor maillist - Tutor@python.org
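For the archives: the 4-minute loop above is slow because `S = S + s[:-1]` copies the whole accumulated string on every iteration, which is quadratic over ~30,000 lines. The usual linear fix is to collect the stripped lines in a list and join once at the end. A minimal sketch, using an io.StringIO stand-in for the real chromosome file:

```python
import io

# Simulated file contents; the real file is ~30,000 lines of 50 chars.
fake_file = io.StringIO('GATTACA\n' * 5)

# Quadratic: S = S + line recopies the growing string every time.
# Linear: append each stripped line to a list, then join once.
parts = []
for line in fake_file:
    parts.append(line.rstrip('\n'))
sequence = ''.join(parts)
```

Each append is O(1) amortized and the single join is O(n), so total work is linear in the file size instead of quadratic.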