re documentation error

Carlos Gaston Alvarez cgaston at moonqzie.com
Mon Sep 17 09:43:42 EDT 2001


It seems that I didn't understand.
I am reading the doc at http://py-howto.sourceforge.net/regex/node19.html
and the example is in section 5.2 (search and replace).
I ran the example (at last) and the results were

>>> re.sub('x*', '-', 'abxc')
'-a-b--c-'
>>> re.sub('x', '-', 'abxc')
'ab-c'

There is a small difference: the doc says that the first one should return

'-a-b-d-'

but it returned

'-a-b--c-'


So there was a small mistake. :_) .  Anyway, now I understand why it is
putting the - in front of every letter. I was reading x* as if it were x+.
It seems that matching starts not at the first letter but just before it
(with an empty string, so it does match).
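
For anyone else who trips over this, here is a minimal sketch of the
difference between x+ and x*. The variable names are only for illustration.
The exact x* output depends on the Python version: 3.7 and later also replace
an empty match that directly follows a non-empty one, so no expected output
is shown for it.

```python
import re

text = "abxc"

# x+ needs at least one 'x', so only the real 'x' is replaced.
plus_result = re.sub("x+", "-", text)

# x* also matches the empty string, so it can match at positions where
# there is no 'x' at all, including before 'a' and at the end of the string.
star_spans = [m.span() for m in re.finditer("x*", text)]
star_result = re.sub("x*", "-", text)

print(plus_result)   # ab-c
print(star_spans)    # (start, end) pairs; start == end means an empty match
print(star_result)
```

The spans with start == end are the empty matches, and each one gets its own
dash in the substituted string.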

Thanks,

Gaston

ps: stupid of me. Instead of trying it and asking myself why it was
'misbehaving', I blamed the documentation.

ps2: 'intelligence in the world remains constant but population keeps
growing'



----- Original Message -----
From: "Chris Gonnerman" <chris.gonnerman at newcenturycomputers.net>
To: "Carlos Gaston Alvarez" <cgaston at moonqzie.com>
Cc: <python-list at python.org>
Sent: Monday, September 17, 2001 2:18 PM
Subject: Re: re documentation error


> ----- Original Message -----
> From: "Carlos Gaston Alvarez" <cgaston at moonqzie.com>
>
>
> > Empty matches are replaced only when they're not adjacent to a previous
> > match.
>
> I don't understand how this ^ has anything to do with this v
>
> > >>> p = re.compile('x*')
> > >>> p.sub('-', 'abxd')
> > '-a-b-d-'
>
> You have an expression matching zero or more x's, and you are substituting
> a dash.  This result is exactly right.
>
> > I would expect the result to be.
> > 'abd'
>
> Why?  You are putting a dash into the string.  If you had said you expected
> the result to be 'ab-d' I would know you didn't understand the expression,
> but evidently you do.  Do you think that a dash in the *substitution* string
> means something special?  With the exception of backslash-escapes, there is
> *nothing* special about that string.
>
> > If the '-' is representing no char, an empty string (as the text says) then
>
> The text (2.1 is what I am looking at) says nothing of the sort where the
> example you show is described.  It says, to wit:
>
>     Empty matches for the pattern are replaced only when not adjacent to a
>     previous match, so "sub('x*', '-', 'abc')" returns '-a-b-c-'.
>
> Are you mixing this up with the example a few paragraphs prior?  That example
> is using a *function* for the replacement value and has nothing to do with
> the rule you are complaining of.
>
> > I would like it to say
> > >>> p = re.compile('x*')
> > >>> p.sub('', 'abxd')
> > 'abd'
> >
> > Which is an example that teaches nothing new.
> >
> > Is - a special char of re for representing nothing?
> > Don't think so.
>
> In fact, it's not.  The example doesn't say it is.
>
> > Chau,
> >
> > Gaston
> >
> >
> >
> > --
> > http://mail.python.org/mailman/listinfo/python-list
> >
> >
>
>
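
Chris's point in the quoted reply, that nothing in the replacement string is
special apart from backslash escapes, can be checked with a short sketch. The
sample strings and variable names are illustrative assumptions, not taken
from the HOWTO.

```python
import re

# A dash in the replacement is just a literal dash.
dash = re.sub("x", "-", "abxc")

# Backslash escapes are the special case: \1 and \2 refer to captured groups.
swapped = re.sub(r"(a)(b)", r"\2\1", "abxc")

# The replacement may also be a function that receives the match object.
upper = re.sub("x", lambda m: m.group().upper(), "abxc")

# An empty replacement simply deletes whatever matched.
removed = re.sub("x*", "", "abxd")

print(dash)     # ab-c
print(swapped)  # baxc
print(upper)    # abXc
print(removed)  # abd
```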




