[Tutor] removing whole numbers from text

Tue Aug 5 13:08:57 CEST 2008

[root at serv ~]# python
Python 2.5.1 (r251:54863, Nov 23 2007, 16:16:53)
[GCC 4.1.1 20070105 (Red Hat 4.1.1-51)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import re
>>>
>>> a = 'this is a simple 3xampl3 of 2 or 3 numb3rs' # some text
>>> b = re.sub('\b[0-9]\b', 'xx', a)  # you need the r (raw) prefix or else it doesn't work
>>> print b
this is a simple 3xampl3 of 2 or 3 numb3rs
>>> b = re.sub(r'\b[0-9]\b', 'xx', a) #replace with xx
>>> print b
this is a simple 3xampl3 of xx or xx numb3rs
>>> b = re.sub(r'\b[0-9]\b', '', a)  #replace with nothing
>>> print b
this is a simple 3xampl3 of  or  numb3rs
>>>
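Note that the pattern above only handles single digits, and \b also treats a hyphen as a boundary, so it would touch the 20 in "jumped-20". The following is a minimal sketch of one way to handle the original question (multi-digit numbers with a space on each side), assuming lookarounds are acceptable. The names WHOLE_NUMBER and remove_whole_numbers are made up for illustration and are not from the thread.

```python
import re

text = ("the cow jumped-20 feet high30er than the lazy 20 timing fox "
        "who couldn't keep up the 865 meter race.")

# (?<!\S) and (?!\S) require whitespace or the start/end of the string
# on either side, so "jumped-20" and "high30er" are left alone.
WHOLE_NUMBER = re.compile(r'(?<!\S)\d+(?!\S)')

def remove_whole_numbers(s):
    # Hypothetical helper: delete standalone numbers, then collapse the
    # runs of spaces that the deletion leaves behind.
    stripped = WHOLE_NUMBER.sub('', s)
    return re.sub(r' {2,}', ' ', stripped).strip()

result = remove_whole_numbers(text)
print(result)
```

Unlike the split/join approach quoted below, this only collapses runs of spaces, so tabs and newlines in the input are left as they are.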


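As for why the raw prefix matters in the session above: in an ordinary string literal, Python itself turns \b into a backspace character before re ever sees it. A small sketch to make that visible (the variable names are only for illustration):

```python
import re

plain = '\b[0-9]\b'   # Python converts each \b to a backspace (ASCII 8)
raw = r'\b[0-9]\b'    # backslash and 'b' are kept; re reads \b as a word boundary

a = 'this is a simple 3xampl3 of 2 or 3 numb3rs'

plain_result = re.sub(plain, 'xx', a)  # looks for backspace, digit, backspace
raw_result = re.sub(raw, 'xx', a)      # looks for a digit standing alone

print(repr(plain))
print(repr(raw))
print(plain_result)
print(raw_result)
```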
2008/8/4 Ricardo Aráoz <ricaraoz at gmail.com>

> Dinesh B Vadhia wrote:
>
>> I want to remove whole numbers from text but retain numbers attached to
>> words.  All whole numbers to be removed have a leading and trailing space.
>>  For example, in "the cow jumped-20 feet high30er than the lazy 20 timing
>> fox who couldn't keep up the 865 meter race." remove the whole numbers 20
>> and 865 but keep the 20 in jumped-20 and the 30 in high30er.
>>  What is the best way to do this using re?
>>  Dinesh
>>
>
> >>> text = "the cow jumped-20 feet high30er than the lazy 20 timing fox who couldn't keep up the 865 meter race."
>
> >>> ' '.join(i for i in text.split() if not i.isdigit())
> "the cow jumped-20 feet high30er than the lazy timing fox who couldn't keep up the meter race."
>
>
> HTH