Simple (newbie) regular expression question

John Machin sjmachin at
Fri Jan 21 15:40:47 EST 2005

André Roberge wrote:
> Sorry for the simple question, but I find regular
> expressions rather intimidating.  And I've never
> needed them before ...
> How would I go about to 'define' a regular expression that
> would identify strings like
> __alphanumerical__  as in __init__
> (Just to spell things out, as I have seen underscores disappear
> from messages before, that's  2 underscores immediately
> followed by an alphanumerical string immediately followed
> by 2 underscore; in other words, a python 'private' method).
> Simple one-liner would be good.
> One-liner with explanation would be better.
> One-liner with explanation, and pointer to 'great tutorial'
> (for future reference) would probably be ideal.
> (I know, google is my friend for that last part. :-)
> Andre

Firstly, some corrections: (1) google is your friend for _all_ parts of
your question (2) Python has an initial P and doesn't have private
methods.

Read this:

>>> pat1 = r'__[A-Za-z0-9_]*__'
>>> pat2 = r'__\w*__'
>>> import re
>>> tests = ['x', '__', '____', '_____', '__!__', '__a__', '__Z__',
...          '__8__', '__xyzzy__', '__plugh']
>>> [x for x in tests if re.search(pat1, x)]
['____', '_____', '__a__', '__Z__', '__8__', '__xyzzy__']
>>> [x for x in tests if re.search(pat2, x)]
['____', '_____', '__a__', '__Z__', '__8__', '__xyzzy__']

I've interpreted your question as meaning "valid Python identifier that
starts and ends with two [implicitly, or more] underscores".
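
One caveat on that interpretation: a search-style test reports a hit when
the pattern occurs anywhere in the string, so a name like 'foo__init__bar'
would also pass. If the whole string has to have that shape, one option is
re.fullmatch, which exists in Python 3.4 and later (so not in the Python
current when this was posted). The following is only a sketch under that
assumption; the helper name is_dunder_name and the sample strings are made
up for illustration.

```python
import re

# Same pattern as pat2 above, compiled once for reuse.
DUNDER = re.compile(r'__\w*__')

def is_dunder_name(name):
    """Hypothetical helper: True only if the *entire* string matches."""
    return DUNDER.fullmatch(name) is not None

samples = ['__init__', 'foo__init__bar', '__plugh', '____']

# search() accepts a match anywhere inside the string ...
print([s for s in samples if DUNDER.search(s)])
# ... while fullmatch() requires the pattern to cover the whole string.
print([s for s in samples if is_dunder_name(s)])
```

The first list should include 'foo__init__bar' and the second should not;
both still accept '____', in line with the "zero or more" reading.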

In the two alternative patterns, the part in the middle says "zero or
more instances of a character that can appear in the middle of a Python
identifier". The first pattern spells this out as "capital letters,
small letters, digits, and underscore". The second pattern uses the \w
shorthand to give the same effect.
You should be able to follow that from the Python documentation.
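
A small footnote on "the same effect": that holds for plain ASCII names,
but in Python 3 the \w shorthand also matches non-ASCII letters in str
patterns unless the re.ASCII flag is given, whereas the spelled-out
character class never does. The sketch below illustrates the difference;
the test string is a made-up example, not something from the original
question.

```python
import re

pat1 = r'__[A-Za-z0-9_]*__'   # explicit ASCII-only character class
pat2 = r'__\w*__'             # \w shorthand

# Hypothetical test string with a non-ASCII letter in the middle.
name = '__caf\u00e9__'

explicit = bool(re.search(pat1, name))              # class excludes the accented letter
shorthand = bool(re.search(pat2, name))             # Unicode-aware \w accepts it
ascii_only = bool(re.search(pat2, name, re.ASCII))  # re.ASCII limits \w to [a-zA-Z0-9_]

print(explicit, shorthand, ascii_only)
```

On Python 3 I would expect the explicit class and the re.ASCII variant to
reject that string and the bare \w version to accept it. For ordinary
ASCII identifiers such as __init__ the two patterns behave alike.
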
Now, read this:


