[Tutor] capitalize() but only first letter
Magnus Lycka
magnus@thinkware.se
Thu Feb 6 16:26:36 2003
Erik Price <eprice@ptc.com> wrote:
> I'm using the capitalize()
>method of the Python String type, and although this does capitalize the
>first letter of the word, it lowercases all the rest of the letters. I
>need to capitalize just the first letter and leave all the other letters
>in the word alone.
Use regular expressions (RE).
>>> import re
>>> def upper(matchobj):
...     return matchobj.group(0).upper()
...
>>> re.sub(r'\b\w', upper, 'thIs is a stRiNG wiTh miXed cAsEs')
'ThIs Is A StRiNG WiTh MiXed CAsEs'
What's this?
First of all, the RE pattern itself: r'\b\w'
r'' means raw string, i.e. don't interpret the \ as starting an
escape sequence, just store the string as I typed it. So, while
'\t' means "a tab character", r'\t' means "a backslash followed
by a t".
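A small sketch of that difference, in case it helps. The variable names
are only for illustration, and the print() calls assume Python 3:

```python
# Illustrative names only; nothing here is from the original session.
plain = '\t'   # escape sequence is interpreted: one tab character
raw = r'\t'    # raw string: a backslash followed by the letter t

print(len(plain), len(raw))   # lengths differ: one character vs. two
print(raw == '\\t')           # the raw form equals the doubled-backslash form
```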
So, r'\b\w' is the same as '\\b\\w', but more convenient. \b
means word boundary, and \w means a word character (a letter, a
digit, or an underscore). A word boundary followed by a word
character is the start of a word, right?
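To see what the pattern picks out before substituting anything, one
option is re.findall. This is only a sketch; the sample strings and
variable names are my own choice:

```python
import re

# Sample text borrowed from the session above (shortened).
sample = 'thIs is a stRiNG'

# Each match is the single character that follows a word boundary.
word_starts = re.findall(r'\b\w', sample)
print(word_starts)

# Note that \w also accepts an underscore, not only letters and digits.
underscore_starts = re.findall(r'\b\w', '_private name')
print(underscore_starts)
```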
re.sub(pattern, whatToSubstituteWith, aString)
re.sub stands for substitute, so re.sub(r'\b\w', 'X', 'hi there')
would return 'Xi Xhere'. But instead of a string to substitute
with, we can supply a function.
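As a quick check of the plain string form, here is a minimal sketch.
The count argument is an optional parameter of re.sub that limits how
many matches get replaced; the variable names are illustrative:

```python
import re

text = 'hi there'

# Replace the first character of every word with a fixed string.
every_word = re.sub(r'\b\w', 'X', text)
print(every_word)

# count=1 stops after the first replacement.
first_only = re.sub(r'\b\w', 'X', text, count=1)
print(first_only)
```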
This function is called with an RE match object for each RE
match in the string. In this case, each match is the first
character of a word. I wrote
def upper(matchobj):
    return matchobj.group(0).upper()
Some RE patterns are quite complex and can extract several different
parts of a string at once. Each part ends up in its own "group". In
this case we only need the whole match, which is group(0), so we
extract that from the match object, apply the .upper() method to it,
and return it.
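For comparison, here is a sketch of a pattern that does use more than
one group. It is a variation for illustration, not what I used above:
the pattern r'\b(\w)(\w*)' and the name upper_first are made up for
this example.

```python
import re

def upper_first(matchobj):
    # group(1) is the first character of the word,
    # group(2) is the rest of the word, left untouched.
    return matchobj.group(1).upper() + matchobj.group(2)

# Inspect the groups of a single match.
m = re.search(r'\b(\w)(\w*)', 'thIs is')
print(m.group(0), m.group(1), m.group(2))

# Same substitution as before, built from two groups.
result = re.sub(r'\b(\w)(\w*)', upper_first,
                'thIs is a stRiNG wiTh miXed cAsEs')
print(result)
```

Here group(0) is still the whole match (the entire word), while
group(1) and group(2) are the parenthesized parts.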
The matching engine behind the re module is written in C. It is
perhaps not quite as fast as Perl's, but it is quite reasonable for
large-volume string manipulation.
--
Magnus Lycka, Thinkware AB
Alvans vag 99, SE-907 50 UMEA, SWEDEN
phone: int+46 70 582 80 65, fax: int+46 70 612 80 65
http://www.thinkware.se/ mailto:magnus@thinkware.se