find and replace with regular expressions

dusans dusan.smitran at gmail.com
Fri Aug 1 06:55:36 EDT 2008


On Aug 1, 12:53 pm, dusans <dusan.smit... at gmail.com> wrote:
> On Jul 31, 10:07 pm, chrispoliq... at gmail.com wrote:
>
> > I am using regular expressions to search a string (always full
> > sentences, maybe more than one sentence) for common abbreviations and
> > remove the periods.  I need to break the string into different
> > sentences but split('.') doesn't solve the whole problem because of
> > possible periods in the middle of a sentence.
>
> > So I have...
>
> > ----------------
>
> > import re
>
> > middle_abbr = re.compile(r'[A-Za-z0-9]\.[A-Za-z0-9]\.')
>
> > # this will find abbreviations like e.g. or i.e. in the middle of a
> > # sentence.
> > # then I want to remove the periods.
>
> > ----------------
>
> > I want to keep the ie or eg but just take out the periods.  Any
> > ideas?  Of course newString = middle_abbr.sub('',txt) where txt is the
> > string will take out the entire abbreviation with the alphanumeric
> > characters included.
>
> It's impossible with regex. You could try it with a statistical analysis;
> and even this would give you a good split.

"and even this won't* give you a good split." :P
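
For the narrower question of keeping the letters and dropping only the periods, a capture-group substitution might be enough. This is a minimal sketch, assuming the pattern from the original post with a group added around each character; the helper name strip_abbr_periods and the sample sentence are made up for illustration. It does not solve sentence splitting in general: abbreviations such as "etc." or "Dr." are not matched, and dotted numbers like "3.1.4" would be altered too.

```python
import re

# Pattern from the original post, with each alphanumeric character captured
# so the replacement can keep it and drop only the periods.
middle_abbr = re.compile(r'([A-Za-z0-9])\.([A-Za-z0-9])\.')

def strip_abbr_periods(txt):
    """Collapse two-character dotted abbreviations, e.g. 'e.g.' -> 'eg'."""
    # \1 and \2 refer back to the two captured characters.
    return middle_abbr.sub(r'\1\2', txt)

# Hypothetical sample text, not taken from the thread.
sample = "Bring fruit, e.g. apples, i.e. something sweet. Then leave."
cleaned = strip_abbr_periods(sample)

# Naive split on the remaining periods; empty pieces are discarded.
sentences = [s.strip() for s in cleaned.split('.') if s.strip()]
print(cleaned)
print(sentences)
```

With the periods inside "e.g." and "i.e." gone, split('.') only sees the sentence-ending periods in this sample.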



More information about the Python-list mailing list