[Tutor] String punctuation

Roeland Rengelink r.b.rigilink@chello.nl
Fri, 26 Oct 2001 08:44:51 +0200


Mike Yuen wrote:
> 
> Hi, i'm trying to figure how the string function punctuation works.  I'm
> hoping that it will remove comma's, periods and other stuff from my strings.
> 
> For some reason, I keep getting an attribute error.  Can anyone  tell me how
> to use this?  Speak to me like i'm a 10 year old. I'm very new at this
> stuff.
> 
> Thanks,
> Mike
> 

Hi Mike,

Unfortunately, punctuation is not a function of the string module, but
just an attribute (variable) of that module that holds a string
containing all punctuation characters.

Here's an interactive session showing some of the things you could do
with it.

>>> import string
>>> print string.punctuation
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
>>> '.' in string.punctuation
1
>>> 'a' in string.punctuation
0
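
If you're reading this on a current Python 3 interpreter (that's an
assumption on my part; the session above comes from an older Python), a
rough equivalent might look like the sketch below. There, print is a
function and the 'in' test gives True/False rather than 1/0. The name
'punct' is just something I made up for the example.

```python
import string

# string.punctuation is a plain str, not a function, so there are no ()
punct = string.punctuation
print(punct)

# Membership tests give booleans in Python 3
print('.' in punct)   # True
print('a' in punct)   # False
```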

Let's build a function that checks whether a character is a
punctuation character:

>>> def is_punct_char(char):
...     '''check if char is punctuation char'''
...     if char in string.punctuation:
...         return 1
...     else:
...         return 0
... 
>>> is_punct_char('.')
1
>>> is_punct_char('a')
0


This function can be made a little shorter, since the 'in' test already
produces the 1 or 0 we want to return:

>>> def is_punct_char(char):     #shorter version
...     '''check if char is punctuation char'''
...     return char in string.punctuation
... 
>>> is_punct_char('.')
1
>>> is_punct_char('a')
0
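
One thing to watch out for, at least in current Python 3 (an assumption
about which version you run): 'in' on strings is a substring test, so a
longer string such as ',-' or even the empty string also counts as being
"in" string.punctuation. If the function should only ever say yes to a
single character, a small guard could be added. This is only a sketch of
one way to do it.

```python
import string

def is_punct_char(char):
    '''check if char is a single punctuation character'''
    # Guard: 'in' on strings is a substring test, so ',-' or '' would match too
    return len(char) == 1 and char in string.punctuation

print(is_punct_char('.'))    # True
print(is_punct_char('a'))    # False
print(is_punct_char(',-'))   # False, although ',-' in string.punctuation is True
```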


Let's use this function to remove punctuation characters from a string:

>>> my_string = 'a string, not too long, containing "#$%&" characters.'
>>> my_string
'a string, not too long, containing "#$%&" characters.'
>>> new_string = ''
>>> for char in my_string:
...     if not is_punct_char(char):
...         new_string = new_string+char
... 
>>> new_string
'a string not too long containing  characters'
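
The loop above could also be wrapped in a function so you can reuse it
on other strings. The name 'remove_punctuation' is made up for this
sketch, and it's written for Python 3 (an assumption). Collecting the
characters in a list and joining them once at the end is a common way to
avoid building a new string on every pass through the loop.

```python
import string

def remove_punctuation(text):
    '''return a copy of text without punctuation characters (hypothetical helper)'''
    kept = []
    for char in text:
        if char not in string.punctuation:
            kept.append(char)
    # Glue the kept characters back together into one string
    return ''.join(kept)

my_string = 'a string, not too long, containing "#$%&" characters.'
new_string = remove_punctuation(my_string)
print(new_string)
```

Note that the two spaces around the removed "#$%&" stay behind, just as
in the session above.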


Here's a slightly modified version, using a helper function that does
the opposite test:

>>> def is_not_punct_char(char):
...     '''check if char is not punctuation char'''
...     return not is_punct_char(char)
... 
>>> is_not_punct_char('.')
0
>>> is_not_punct_char('a')
1
>>> new_string = ''
>>> for char in my_string:
...     if is_not_punct_char(char):
...         new_string = new_string+char
... 
>>> new_string
'a string not too long containing  characters'

This is probably what you wanted to get. But, to whet your appetite for
more, here's some magic with the built-in function 'filter'. This
function does the same thing as the loop we've just written. Have a look
at the Python documentation to see if you can figure out why this
works.

>>> new_string = filter(is_not_punct_char, my_string)
>>> new_string
'a string not too long containing  characters'
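
A word of caution if you try this on Python 3 (an assumption about your
setup): there, filter returns an iterator rather than a string, so the
pieces have to be joined back together. The sketch below shows that,
plus an alternative using a translation table that deletes every
punctuation character. The names 'filtered', 'table' and 'translated'
are just picked for the example.

```python
import string

def is_not_punct_char(char):
    '''check if char is not punctuation char'''
    return char not in string.punctuation

my_string = 'a string, not too long, containing "#$%&" characters.'

# In Python 3, filter() returns an iterator, so join the pieces into a str
filtered = ''.join(filter(is_not_punct_char, my_string))

# Alternative: a translation table whose third argument lists chars to delete
table = str.maketrans('', '', string.punctuation)
translated = my_string.translate(table)

print(filtered)
print(translated)
```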

Hope this helps,

Roeland

-- 
r.b.rigilink@chello.nl

"Half of what I say is nonsense. Unfortunately I don't know which half"