Program inefficiency?

thebjorn BjornSteinarFjeldPettersen at gmail.com
Sat Sep 29 14:13:26 EDT 2007


On Sep 29, 7:55 pm, Pablo Ziliani <pa... at decode.com.ar> wrote:
> thebjorn wrote:
> > On Sep 29, 5:22 pm, hall.j... at gmail.com wrote:
>
> >> I wrote the following simple program to loop through our help files
> >> and fix some errors (in case you can't see the subtle RE search that's
> >> happening, we're replacing spaces in bookmarks with _'s)
> >> (...)
>
> > Ugh, that was entirely too many regexps for my taste :-)
>
> > How about something like:
>
> > def attr_ndx_iter(txt, attribute):
> >     (...)
> > def substr_map(txt, indices, fn):
> >     (...)
> > def transform(s):
> >     (...)
> > def zap_spaces(txt, *attributes):
> >     (...)
> > def mass_replace():
> >     (...)
>
> Oh yeah, now it's clear as mud.

I'm anxiously awaiting your beacon of clarity ;-)

> I do think that the whole program shouldn't take more than 10 lines of
> code

Well, my mass_replace above is 10 lines, and the actual replacement
code is a one-liner. Perhaps you'd care to illustrate how you'd
shorten that while still keeping it "clear"?

> using one sensible regex

I have no doubt that it would be possible to do with a single regex.
Whether it would be sensible or not is another matter entirely...
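
For illustration, here is roughly what a single-regex version might look
like. This is only a sketch, not code posted in this thread: the pattern
ATTR_RE and the function underscore_attr_spaces are hypothetical names, and
it assumes attribute values are always quoted and never contain their own
quote character.

```python
import re

# Hypothetical pattern (an assumption, not from the thread): matches
# href=... or name=... in any case, with a single- or double-quoted value.
#   group 1: attribute name plus '=' (optional whitespace around '=')
#   group 2: the quote character, reused as \2 to find the closing quote
#   group 3: the attribute value
ATTR_RE = re.compile(
    r"""\b((?:href|name)\s*=\s*)(["'])(.*?)\2""",
    re.IGNORECASE | re.DOTALL,
)


def underscore_attr_spaces(txt):
    """Replace spaces with underscores inside href/name attribute values."""
    def fix(match):
        prefix, quote, value = match.groups()
        return prefix + quote + value.replace(" ", "_") + quote

    # Everything outside the matched attributes is left untouched.
    return ATTR_RE.sub(fix, txt)


if __name__ == "__main__":
    print(underscore_attr_spaces('<a HREF="my page.htm#book mark">some text</a>'))
```

The replacement callback keeps the pattern itself short, but the sketch
still touches any attribute whose name ends in a word-boundary "name" (for
example a hyphenated one), which is the kind of thing that makes "sensible"
debatable.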

> (impossible to define without knowing the real input and output formats).

Of course, but I don't think you can guess too terribly wrong. My
version handles upper and lower case attributes, quoting with single
(') and double (") quotes, and any number of spaces in attribute
values. It maintains all other text as-is, and converts spaces to
underscores in href and name attributes. Did I get anything majorly
wrong?
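
To make those requirements concrete, here is a minimal regex-free sketch of
the behaviour listed above. It is not the elided code from earlier in the
thread; the function name underscore_attr_values is hypothetical, and it
assumes each space becomes exactly one underscore, that there is no
whitespace around '=', and that values are quoted.

```python
def underscore_attr_values(txt, attributes=("href", "name")):
    """Replace spaces with underscores in quoted values of the given attributes.

    Illustrative sketch only. Known limitation: it also matches longer
    attribute names that merely end in one of `attributes` (e.g. filename=).
    """
    lowered = txt.lower()  # case-insensitive search; assumes lower() keeps length (true for ASCII)
    out, pos = [], 0
    while pos < len(txt):
        # Nearest "attr=" at or after pos, over all requested attributes.
        hits = [i for i in (lowered.find(a + "=", pos) for a in attributes) if i != -1]
        if not hits:
            break
        start = min(hits)
        q = lowered.index("=", start) + 1  # position just after '='
        quote = txt[q:q + 1]
        end = txt.find(quote, q + 1) if quote in ('"', "'") else -1
        if end == -1:
            # Unquoted or unterminated value: copy through '=' unchanged, keep scanning.
            out.append(txt[pos:q])
            pos = q
            continue
        out.append(txt[pos:q + 1])                    # text up to and including the opening quote
        out.append(txt[q + 1:end].replace(" ", "_"))  # the value, spaces -> underscores
        pos = end                                     # closing quote is copied with the later text
    out.append(txt[pos:])
    return "".join(out)


if __name__ == "__main__":
    print(underscore_attr_values("<A Name='top of page'>keep this text</A>"))
```

Text outside the matched attribute values, including link text with spaces,
passes through unchanged.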

> And (sorry to tell) I'm convinced this is a problem for regexes, in
> spite of anybody's personal taste.

Well, let's see it then :-)

smack-smack'ly y'rs
-- bjorn




More information about the Python-list mailing list