[Tutor] performance considerations

Andrei Kulakov ak@silmarill.org
Mon, 03 Dec 2001 13:10:41 -0500


On Mon, Dec 03, 2001 at 12:51:39PM -0500, dman wrote:
> 
> I'm working on a script/program to do some processing (munging) on
> text files in a directory tree.  In addition, for convenience, all
> non-interesting files are copied without munging.  My script, as it
> stands, is quite CPU intensive so I would like to optimize it some.
> 
> In the case of copying non-interesting files, what is (generally) the
> most efficient way?  It seems there is no system-level "copy" command.

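Use shutil.copy() from the standard library. It copies the data in
blocks rather than byte by byte, so you don't have to pick a read size
yourself.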
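
For the copy itself, here is a minimal sketch, assuming a current
Python 3. The function names copy_plain and copy_chunked are made up
for illustration, and the 64 KiB buffer is an arbitrary starting point,
not a measured optimum:

```python
import shutil


def copy_plain(src, dst):
    """Copy file contents and permission bits; shutil chooses the block size."""
    shutil.copy(src, dst)


def copy_chunked(src, dst, bufsize=64 * 1024):
    """Copy file contents only, reading bufsize bytes at a time.

    bufsize is an assumption to tune by measurement, not a recommendation.
    """
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, bufsize)
```

If you want to experiment with read sizes, copy_chunked gives you the
knob; otherwise copy_plain is probably all you need.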

> Is there a better way than read each byte of the file and write it out
> to the new file?  I suppose I could just create a hard link, if I am
> willing to tie it down to Unix-only.  Does anyone have recommendations
> on how many bytes I should read at a time?
> 
> A portion of the script generates strings by starting with 'a' and
> "adding" to it.  I.e., "a", "b", ..., "z", "aa", "ab", ..., "zz", "aaa".
> Would it be better to use a list of one-char-strings than to modify a
> single string?  Here's the code I have now (BTW, that funny-looking

I believe so, but profile it to be sure!

> "isinstance" stuff requires 2.2) (also I am certain that this is not
> where most of the time is spent anyway) :
> 
>     def increment( s ) :
>         """
>         Increment the string.  Recursively "carries" if needed.
>         """
> 
>         assert isinstance( s , str ) , "'s' must be a string"
> 
>         # a special case, for terminating recursion
>         if s == "" :
>             return "a"
> 
>         # the ordinal of the next character in succession
>         next_ord = ord( s[-1] ) + 1
> 
>         # check for overflow
>         if ord( 'a' ) <= next_ord <= ord( 'z' ) :
>             s = s[:-1] + chr( next_ord )
>         else :
>             s = increment( s[:-1] ) + "a"
>         return s
>     # end increment()
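
A list-based version of the same idea might look like the sketch below.
It assumes the input contains only the characters 'a' to 'z', and the
name increment_list is hypothetical. It changes the list in place and
carries with a loop instead of recursion:

```python
def increment_list(chars):
    """Increment a list of one-character strings in place.

    Assumes every element is in 'a'..'z'. Returns the same list.
    """
    i = len(chars) - 1
    while i >= 0:
        if chars[i] != "z":
            # No overflow at this position: bump it and stop.
            chars[i] = chr(ord(chars[i]) + 1)
            return chars
        # Overflow: wrap this position to 'a' and carry to the left.
        chars[i] = "a"
        i -= 1
    # Carried past the first position (or the list was empty): grow by one.
    chars.insert(0, "a")
    return chars
```

Use "".join(chars) whenever you need the string form. Whether this beats
the string version for your data is something only timing will tell.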
> 
> 
> One last question for now :
> I traverse the interesting files line-by-line and check them for a
> regex match, then modify the line if it matches properly.  Would it be
> better (faster) to read in the whole file and treat it as a single
> string?  Memory is not a problem.

Yeah, probably, but again: profile it!
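
One way to measure it is sketched below, assuming a current Python 3.
The pattern, the replacement text, and the sample data are placeholders
I made up, so substitute your real regex and a real file. The point is
to run both approaches on the same input and time them:

```python
import re
import timeit

# Hypothetical pattern: lines that start with the word "foo".
# re.MULTILINE lets ^ match at the start of every line in a big string.
PATTERN = re.compile(r"^foo\b", re.MULTILINE)


def munge_by_line(text):
    """Apply the substitution one line at a time."""
    lines = text.splitlines(keepends=True)
    return "".join(PATTERN.sub("bar", line) for line in lines)


def munge_whole(text):
    """Apply the substitution to the whole text in a single call."""
    return PATTERN.sub("bar", text)


# Made-up sample input standing in for the contents of a file.
sample = "foo one\nno match here\nfoo two\n" * 1000

# Both approaches must agree before the timings mean anything.
assert munge_by_line(sample) == munge_whole(sample)

for fn in (munge_by_line, munge_whole):
    seconds = timeit.timeit(lambda: fn(sample), number=20)
    print(fn.__name__, seconds)
```

For the script as a whole, running it under "python -m cProfile" shows
where the time actually goes before you optimize anything.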

> 
> TIA,
> -D
> 
> -- 
> 
> It took the computational power of three Commodore 64s to fly to the moon.
> It takes at least a 486 to run Windows 95.
> Something is wrong here.

-- 
Cymbaline: intelligent learning mp3 player - python, linux, console.
get it at: cy.silmarill.org