find and replace with regular expressions

Mensanator mensanator at aol.com
Thu Jul 31 16:56:04 EDT 2008


On Jul 31, 3:07 pm, chrispoliq... at gmail.com wrote:
> I am using regular expressions to search a string (always full
> sentences, maybe more than one sentence) for common abbreviations and
> remove the periods.  I need to break the string into different
> sentences but split('.') doesn't solve the whole problem because of
> possible periods in the middle of a sentence.
>
> So I have...
>
> ----------------
>
> import re
>
> middle_abbr = re.compile(r'[A-Za-z0-9]\.[A-Za-z0-9]\.')
>
> # this will find abbreviations like e.g. or i.e. in the middle of a
> # sentence.
> # then I want to remove the periods.
>
> ----------------
>
> I want to keep the ie or eg but just take out the periods.  Any
> ideas?  Of course newString = middle_abbr.sub('',txt) where txt is the
> string will take out the entire abbreviation with the alphanumeric
> characters included.

>>> import re
>>> middle_abbr = re.compile(r'[A-Za-z0-9]\.[A-Za-z0-9]\.')
>>> s = 'A test, i.e., an example.'
>>> a = middle_abbr.search(s)      # find the first abbreviation
>>> b = re.compile(r'\.')          # period pattern
>>> c = b.sub('', a.group(0))      # remove periods from that abbreviation
>>> d = middle_abbr.sub(c, s)      # replace every match with that string
>>> d
'A test, ie, an example.'
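
Note that the session above reuses the first abbreviation it finds as the
replacement for every match, so a string containing both "e.g." and "i.e."
would end up with the same text in both places. One way around that, as a
rough sketch only, is to pass a function to sub() so each match is cleaned
individually. The names strip_periods and remove_abbr_periods and the second
sample sentence are made up for illustration; they are not from the original
question.

```python
import re

# Same pattern as in the question, written as a raw string.
middle_abbr = re.compile(r'[A-Za-z0-9]\.[A-Za-z0-9]\.')

def strip_periods(match):
    """Return the text of one match with its periods removed."""
    return match.group(0).replace('.', '')

def remove_abbr_periods(txt):
    """Remove the periods from every two-letter abbreviation in txt."""
    # sub() calls strip_periods once per match, so each abbreviation
    # keeps its own letters.
    return middle_abbr.sub(strip_periods, txt)

s = 'Bring fruit, e.g., apples, i.e., something fresh.'
print(remove_abbr_periods(s))
# prints: Bring fruit, eg, apples, ie, something fresh.

# An equivalent approach with capture groups and backreferences:
grouped = re.compile(r'([A-Za-z0-9])\.([A-Za-z0-9])\.')
print(grouped.sub(r'\1\2', s))
# prints: Bring fruit, eg, apples, ie, something fresh.
```

Either form leaves the sentence-ending period alone, because the pattern
needs two letter-plus-period pairs in a row before it matches.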


