How to write replace string for object which will be substituted? [regexp]

Jon Clements joncle at googlemail.com
Wed Aug 5 07:40:05 EDT 2009


On 5 Aug, 07:53, ryniek <rynie... at gmail.com> wrote:
> On 5 Aug, 00:55, MRAB <pyt... at mrabarnett.plus.com> wrote:
>
>
>
> > ryniek90 wrote:
> > > Hi.
> > > I started learning regexps, and some things go well, but most of them
> > > still don't.
>
> > > I've got a problem with one regexp. I'd better post the code here:
>
> > > "
> > >  >>> import re
> > >  >>> mail = '\nn... at mail.com\nname1 [at] mail [dot] com\nname2 [$at$]
> > > mail [$dot$] com\n'
> > >  >>> mail
> > > '\nn... at mail.com\nname1 [at] mail [dot] com\nname2 [$at$] mail [$dot$]
> > > com\n'
> > >  >>> print mail
>
> > > n... at mail.com
> > > name1 [at] mail [dot] com
> > > name2 [$at$] mail [$dot$] com
>
> > >  >>> maail = re.sub('^\n|$\n', '', mail)
> > >  >>> print maail
> > > n... at mail.com
> > > name1 [at] mail [dot] com
> > > name2 [$at$] mail [$dot$] com
> > >  >>> maail = re.sub(' ', '', maail)
> > >  >>> print maail
> > > n... at mail.com
> > > name1[at]mail[dot]com
> > > name2[$at$]mail[$dot$]com
> > >  >>> maail = re.sub('\[at\]|\[\$at\$\]', '@', maail)
> > >  >>> print maail
> > > n... at mail.com
> > > name1 at mail[dot]com
> > > name2 at mail[$dot$]com
> > >  >>> maail = re.sub('\[dot\]|\[\$dot\$\]', '.', maail)
> > >  >>> print maail
> > > n... at mail.com
> > > na... at mail.com
> > > na... at mail.com
> > >  >>> # How must I write the replacement string to do all of these
> > >  >>> # substitutions with just ONE command on the string 'mail'?
> > >  >>> maail = re.sub('^\n|$\n| |\[at\]|\[\$at\$\]|\[dot\]|\[\$dot\$\]',
> > > *?*, mail)
> > > "
>
> > > How must I write that replacement argument (see the question mark) to make
> > > that substitution work? I didn't see anything helpful while reading the re
> > > docs and the HOWTO (from the Python docs). I tried 'MatchObject.group()', but
> > > something went wrong - I didn't write it correctly.
> > > Is there a more user-friendly HOWTO for Python's re than this one?
>
> > > I'm new to programming and regexps, sorry for the inconvenience.
>
> > I don't think you can do it in one regex, nor would I want to. Just use
> > the string's replace() method.
>
> >  >>> mail = '\nn... at mail.com\nname1 [at] mail [dot] com\nname2 [$at$]
> > mail [$dot$] com\n'
> >  >>> mail
> > '\nn... at mail.com\nname1 [at] mail [dot] com\nname2 [$at$] mail [$dot$]
> > com\n'
> >  >>> print mail
>
> > n... at mail.com
> > name1 [at] mail [dot] com
> > name2 [$at$] mail [$dot$] com
>
> >  >>> maail = mail.strip()
> >  >>> print maail
> > n... at mail.com
> > name1 [at] mail [dot] com
> > name2 [$at$] mail [$dot$] com
>
> >  >>> maail = maail.replace(' ', '')
> >  >>> print maail
> > n... at mail.com
> > name1[at]mail[dot]com
> > name2[$at$]mail[$dot$]com
> >  >>> maail = maail.replace('[at]', '@').replace('[$at$]', '@')
> >  >>> print maail
> > n... at mail.com
> > name1 at mail[dot]com
> > name2 at mail[$dot$]com
> >  >>> maail = maail.replace('[dot]', '.').replace('[$dot$]', '.')
> >  >>> print maail
> > n... at mail.com
> > na... at mail.com
> > na... at mail.com
>
> Too bad, I thought that the almighty re module could do anything, but
> it failed with this (or maybe re can do what I want, but only a few
> people know how to force it to do that?  :P).
> But with MRAB's help, I chose the 3rd point of the Zen of Python -
> "Simple is better than complex."
>
> "
>
> >>> mail = '\nn... at mail.com\nname1 [at] mail [dot] com\nname2 [$at$] mail [$dot$] com\n'
> >>> mail
>
> '\nn... at mail.com\nname1 [at] mail [dot] com\nname2 [$at$] mail [$dot$]
> com\n'
>
> >>> print mail
>
> n... at mail.com
> name1 [at] mail [dot] com
> name2 [$at$] mail [$dot$] com
>
> >>> maail = mail.strip().replace(' ', '').replace('[dot]', '.').replace('[$dot$]', '.').replace('[at]', '@').replace('[$at$]', '@')
> >>> print maail
>
> n... at mail.com
> na... at mail.com
> na... at mail.com
>
> >>> #Did it  :)
>
> "
>
> Thanks again   :)

Short of writing a dedicated function, I might be tempted to write this
as:

EMAIL_REPLACEMENTS = (
    ('[at]', '@'),
    ('[dot]', '.'),
    ...
)

for src, dest in EMAIL_REPLACEMENTS:
    mail = mail.replace(src, dest)

Apart from taste reasons, it keeps the replacements more obvious (and
accessible via a variable rather than embedded in the code), and makes
swapping the order or adding/removing entries easier.
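
For what it's worth, here is a minimal sketch of how that table could be
wrapped up, and how the same table could also drive a single re.sub()
call, which is what the original question was after. The extra pairs are
copied from the chained replace() earlier in the thread; the function
names deobfuscate() and deobfuscate_re() and the sample string are made
up for illustration. The one-regex version relies on re.sub() accepting
a function as the replacement, so each match can be looked up in a dict.

```python
import re

# Pairs taken from the chained replace() calls quoted in the thread.
# The first entry removes spaces, as the original code does.
EMAIL_REPLACEMENTS = (
    (' ', ''),
    ('[at]', '@'),
    ('[$at$]', '@'),
    ('[dot]', '.'),
    ('[$dot$]', '.'),
)


def deobfuscate(text):
    """Hypothetical helper: apply each (src, dest) pair in order."""
    text = text.strip()
    for src, dest in EMAIL_REPLACEMENTS:
        text = text.replace(src, dest)
    return text


# Build one alternation from the same table. re.escape() takes care of
# the "[", "]" and "$" characters, which are special in a regexp.
_LOOKUP = dict(EMAIL_REPLACEMENTS)
_PATTERN = re.compile('|'.join(re.escape(src) for src, _ in EMAIL_REPLACEMENTS))


def deobfuscate_re(text):
    """Hypothetical helper: same job with a single re.sub() call.

    The replacement is a function that receives the match object and
    returns the string to substitute for that particular match.
    """
    return _PATTERN.sub(lambda m: _LOOKUP[m.group(0)], text.strip())


# Illustrative input only, modelled on the strings in the thread.
sample = '\nname1 [at] mail [dot] com\nname2 [$at$] mail [$dot$] com\n'
print(deobfuscate(sample))
print(deobfuscate_re(sample))
```

Both versions should give the same result for this input. I would still
reach for the plain loop first; the regexp version only starts to pay
off when a single pass over the text matters.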

Jon



