How to write replace string for object which will be substituted? [regexp]
Jon Clements
joncle at googlemail.com
Wed Aug 5 07:40:05 EDT 2009
On 5 Aug, 07:53, ryniek <rynie... at gmail.com> wrote:
> On 5 Sie, 00:55, MRAB <pyt... at mrabarnett.plus.com> wrote:
>
>
>
> > ryniek90 wrote:
> > > Hi.
> > > I started learning regexps, and some things go well, but most of them
> > > still do not.
>
> > > I've got a problem with one regexp. I'd better post the code here:
>
> > > "
> > > >>> import re
> > > >>> mail = '\nn... at mail.com\nname1 [at] mail [dot] com\nname2 [$at$]
> > > mail [$dot$] com\n'
> > > >>> mail
> > > '\nn... at mail.com\nname1 [at] mail [dot] com\nname2 [$at$] mail [$dot$]
> > > com\n'
> > > >>> print mail
>
> > > n... at mail.com
> > > name1 [at] mail [dot] com
> > > name2 [$at$] mail [$dot$] com
>
> > > >>> maail = re.sub('^\n|$\n', '', mail)
> > > >>> print maail
> > > n... at mail.com
> > > name1 [at] mail [dot] com
> > > name2 [$at$] mail [$dot$] com
> > > >>> maail = re.sub(' ', '', maail)
> > > >>> print maail
> > > n... at mail.com
> > > name1[at]mail[dot]com
> > > name2[$at$]mail[$dot$]com
> > > >>> maail = re.sub('\[at\]|\[\$at\$\]', '@', maail)
> > > >>> print maail
> > > n... at mail.com
> > > name1@mail[dot]com
> > > name2@mail[$dot$]com
> > > >>> maail = re.sub('\[dot\]|\[\$dot\$\]', '.', maail)
> > > >>> print maail
> > > n... at mail.com
> > > na... at mail.com
> > > na... at mail.com
> > > >>> # How must I write the replacement string to apply all these regexps
> > > with just ONE command, on the string 'mail'?
> > > >>> maail = re.sub('^\n|$\n| |\[at\]|\[\$at\$\]|\[dot\]|\[\$dot\$\]',
> > > *?*, mail)
> > > "
>
> > > How must I write that replacement (see the question mark) to make
> > > that substitution work? I didn't see anything helpful while reading the re
> > > docs and HOWTO (from the Python docs). I tried 'MatchObject.group()', but
> > > something went wrong; I didn't write it correctly.
> > > Is there a more user-friendly HOWTO for Python's re than this one?
>
> > > I'm new to programming and regexps; sorry for the inconvenience.
>
> > I don't think you can do it in one regex, nor would I want to. Just use
> > the string's replace() method.
>
> > >>> mail = '\nn... at mail.com\nname1 [at] mail [dot] com\nname2 [$at$]
> > mail [$dot$] com\n'
> > >>> mail
> > '\nn... at mail.com\nname1 [at] mail [dot] com\nname2 [$at$] mail [$dot$]
> > com\n'
> > >>> print mail
>
> > n... at mail.com
> > name1 [at] mail [dot] com
> > name2 [$at$] mail [$dot$] com
>
> > >>> maail = mail.strip()
> > >>> print maail
> > n... at mail.com
> > name1 [at] mail [dot] com
> > name2 [$at$] mail [$dot$] com
>
> > >>> maail = maail.replace(' ', '')
> > >>> print maail
> > n... at mail.com
> > name1[at]mail[dot]com
> > name2[$at$]mail[$dot$]com
> > >>> maail = maail.replace('[at]', '@').replace('[$at$]', '@')
> > >>> print maail
> > n... at mail.com
> > name1@mail[dot]com
> > name2@mail[$dot$]com
> > >>> maail = maail.replace('[dot]', '.').replace('[$dot$]', '.')
> > >>> print maail
> > n... at mail.com
> > na... at mail.com
> > na... at mail.com
>
> Too bad. I thought that the almighty re module could do anything, but
> it failed with this (or maybe re can do what I want, but only a few
> people know how to force it to do that? :P).
> But with MRAB's help, I chose the third point of the Zen of Python:
> "Simple is better than complex."
>
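For what it's worth, re can do this in one call: re.sub accepts a function as the replacement argument, and calls it with each match object. The following is only a minimal sketch under that assumption; the names TOKENS, PATTERN and deobfuscate are made up for illustration, and the sample string leaves out the first (elided) address.

```python
import re

# Hypothetical mapping from each obfuscated token to its replacement.
TOKENS = {
    '[at]': '@',
    '[$at$]': '@',
    '[dot]': '.',
    '[$dot$]': '.',
    ' ': '',
}

# Build one alternation from the escaped tokens, longest first, so a longer
# token can never be shadowed by a shorter alternative.
PATTERN = re.compile(
    '|'.join(re.escape(t) for t in sorted(TOKENS, key=len, reverse=True))
)


def deobfuscate(text):
    # re.sub calls the lambda once per match; the matched text is looked up
    # in TOKENS and the result is used as the replacement.
    return PATTERN.sub(lambda m: TOKENS[m.group(0)], text.strip())


sample = '\nname1 [at] mail [dot] com\nname2 [$at$] mail [$dot$] com\n'
print(deobfuscate(sample))
```

Whether this is clearer than chained replace() calls is a matter of taste; it mainly pays off when the list of tokens grows or the text should be scanned only once.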
> "
>
> >>> mail = '\nn... at mail.com\nname1 [at] mail [dot] com\nname2 [$at$] mail [$dot$] com\n'
> >>> mail
>
> '\nn... at mail.com\nname1 [at] mail [dot] com\nname2 [$at$] mail [$dot$]
> com\n'
>
> >>> print mail
>
> n... at mail.com
> name1 [at] mail [dot] com
> name2 [$at$] mail [$dot$] com
>
> >>> maail = mail.lstrip().rstrip().replace(' ', '').replace('[dot]', '.').replace('[$dot$]', '.').replace('[at]', '@').replace('[$at$]', '@')
> >>> print maail
>
> n... at mail.com
> na... at mail.com
> na... at mail.com
>
> >>> #Did it :)
>
> "
>
> Thanks again :)
Short of writing a dedicated function I might be tempted to write this
as:
    EMAIL_REPLACEMENTS = (
        ('[at]', '@'),
        ('[dot]', '.'),
        ...
    )
    for src, dest in EMAIL_REPLACEMENTS:
        mail = mail.replace(src, dest)
Apart from taste reasons, it keeps the replacements more obvious (and
accessible via a variable rather than embedded in the code), and makes
swapping the order or adding/removing entries easier.
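
A dedicated function built on the same table might look roughly like the sketch below. The extra table entries are taken from the tokens used earlier in the thread and are an assumption about what the full table would hold; the name clean_addresses is hypothetical, and the sample string omits the first (elided) address.

```python
# Hypothetical complete table; order matters, since pairs are applied in turn.
EMAIL_REPLACEMENTS = (
    (' ', ''),
    ('[at]', '@'),
    ('[$at$]', '@'),
    ('[dot]', '.'),
    ('[$dot$]', '.'),
)


def clean_addresses(text, replacements=EMAIL_REPLACEMENTS):
    """Strip surrounding whitespace, then apply each (src, dest) pair in order."""
    text = text.strip()
    for src, dest in replacements:
        text = text.replace(src, dest)
    return text


mail = '\nname1 [at] mail [dot] com\nname2 [$at$] mail [$dot$] com\n'
print(clean_addresses(mail))
```

Passing the table as a parameter means a caller can supply a different set of replacements without touching the function.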
Jon