[Tutor] Re: Python question (when you have a free moment) (fwd)

Patrick Phalen python-tutor@teleo.net
Fri, 20 Aug 1999 15:19:45 -0700


duh. missed a step ... make that

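import os, string

# open() does not expand '~'; os.path.expanduser() does
f = open(os.path.expanduser('~/yeast/chromosome04'), 'r')
data = f.read()   # whole file in one go
f.close()

# string.join() defaults to a ' ' separator, so pass '' to get
# one continuous string
data = string.join(string.split(data, '\012'), '')

# or, instead, to leave a space where each line break was
data = string.join(string.split(data, '\n'), ' ')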
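
For anyone trying this on a current interpreter: the string module
functions used above are gone in Python 3, so here is a minimal sketch
of the same read-then-join idea using string methods. The function name
join_lines and its sep parameter are made up for illustration; nothing
in the thread defines them.

```python
import os


def join_lines(path, sep=""):
    """Read the whole file at `path` and return its lines joined by `sep`.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # open() does not expand "~", so expand it explicitly
    with open(os.path.expanduser(path), "r") as f:
        data = f.read()
    # splitlines() drops the line endings (\n, \r\n, \r among others)
    return sep.join(data.splitlines())
```

Calling join_lines('~/yeast/chromosome04') should give the one
continuous string; join_lines('~/yeast/chromosome04', ' ') keeps a
space where each line break was.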


On Fri, 20 Aug 1999, I wrote:
> How much RAM does he have?
> 
> import string
> input = open('~/yeast/chromosome04', 'r')
> input = string.join(string.split(input, '\012'))  # or \012\015 or whatever
> 
> On Fri, 20 Aug 1999, Deirdre Saoirse wrote:
> > A friend of mine, a molecular biologist, has a problem. He has a file,
> > approximately 1.5 MB, that has a bunch of returns in it (which interfere
> > with his ability to perform raw searches).
> > 
> > I suggested that, instead of using regex.gsub after thwapping the file
> > into memory, he try:
> > 
> > #!/usr/local/bin/python
> > 
> > input = open('~/yeast/chromosome04', 'r')
> > lines = 0
> > S = ''
> > while 1:
> >  s = input.readline()
> >  if s:
> >   S = S + s[:-1]
> >   lines = lines + 1
> >  else:
> >   break
> > 
> > (Which eliminates the need for a regex)
> > 
> > ...but that, per him, didn't speed up significantly over his first
> > approach. Any ideas?
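
A likely reason the loop is no faster: every S = S + s[:-1] builds a
new string and copies everything accumulated so far, so the work grows
roughly with the square of the file size. The usual workaround is to
collect the pieces in a list and join once at the end. A rough sketch
in present-day Python, assuming a hypothetical helper name
(read_unbroken) and rstrip in place of s[:-1]:

```python
def read_unbroken(path):
    """Return the file's contents with the line endings removed.

    Hypothetical helper: the name is illustrative only.
    """
    pieces = []
    with open(path, "r") as f:
        for line in f:
            # rstrip("\r\n") removes only line-ending characters, so a
            # final line without a newline keeps its last character
            # (s[:-1] would chop it off).
            pieces.append(line.rstrip("\r\n"))
    # One join at the end instead of one concatenation per line
    return "".join(pieces)
```
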
> > 
> > -- 
> > _Deirdre   *   http://www.linuxcabal.net   *   http://www.deirdre.net
> > "I must say that I was really happy to see _Linux for Dummies_ -- that's 
> > when you know you've arrived." -- Linus Torvalds
> > 
> > ---------- Forwarded message ----------
> > Date: Fri, 20 Aug 1999 14:17:04 -0700
> > From: Bernard <nutella@zork.net>
> > To: Deirdre Saoirse <deirdre@deirdre.net>
> > Subject: Re: Python question (when you have a free moment)
> > 
> > 20Aug1999 12:18AM (-0700) From [deirdre@deirdre.net] deirdre [Deirdre]
> > > Actually, with your permission, I'd like to pose the question to the Tutor
> > > list -- they can come up with some amazing speed optimizations. Ok?
> > 
> > Sure!  I'd love to hear of anything that will speed things up.
> > As I said, I'm reluctant to dump Python if I think there's a
> > chance that it could be better.
> > 	Just a quick reminder:
> > 
> > 1.5 MB of plain text
> > 50 characters per line (so ca. 30,000 lines)
> > I want it all in one continuous string
> > 
> > In Perl: If I read one line at a time, chomp and concatenate, it takes
> > a total of about 6 minutes.  If I read the entire file in one go and
> > then perform a global substitution, it all takes ca. 0.1 s.
> > 
> > In Python: If I read one line at a time, discard the last character
> > and concatenate, it takes a total of 4 minutes.  If I read the entire
> > file, the read takes only about 0.1 s but the regex.gsub takes 4 minutes.
> > 
> > Timings are on a PII/350 on a vanilla Red Hat 5.2 box (Perl 5,
> > Python 1.5 - I can give you the minor versions if you need them)
> > 
> > So, I'm assuming the speed-up would be at the level of the regex.gsub
> > 
> > 	Thanks for any tips,
> > 		Bernard
> > 
> > Bernard P. Murray, PhD
> > nutella@zork.net (Department of Desserts and Toppings, San Francisco, USA)
> > 
> > 
> > _______________________________________________
> > Tutor maillist  -  Tutor@python.org
> > http://www.python.org/mailman/listinfo/tutor
> 