[Tutor] Question on regular expressions

Wed May 24 19:18:31 CEST 2006

> perl -ple "s/([^\w\s])/sprintf(q#%%%2X#, ord $1)/ge"  somefile.txt

Hi Andrew,

Give me a second.  I'm trying to understand the command line switches:

(Looking in 'perl --help'...)

   -p              assume loop like -n but print line also, like sed
   -l[octal]       enable line ending processing, specifies line terminator
   -e program      one line of program (several -e's allowed, omit programfile)

and the regular expression modifiers there --- 'g' and 'e' --- mean ... 
(reading 'perldoc perlop'...)

                    g   Match globally, i.e., find all occurrences.
                    e   Evaluate the right side as an expression.

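Ok, I have a better idea of what's going on here now.  This takes a file,
and replaces every character that is neither a word character nor
whitespace (punctuation, mostly) with a percent sign followed by its
character code in hexadecimal.  That's a dense one-liner.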
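
To make the switches concrete, here is a rough Python sketch of the loop that -p and -l set up around the program text.  The names process_lines and transform are made up for illustration, and this only approximates Perl's behaviour (it assumes "\n" line endings):

```python
def process_lines(lines, transform):
    """Rough stand-in for the loop that perl -p -l wraps around the -e code."""
    for line in lines:
        # -l strips the line terminator from each input line...
        stripped = line.rstrip("\n")
        # ...-p prints the (possibly modified) line, and -l puts the
        # terminator back on output.
        yield transform(stripped) + "\n"
```

Something like sys.stdout.writelines(process_lines(open("somefile.txt"), transform)) would then play the role of the whole command line, with transform standing in for the s///ge part.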

> How would you convert this to a python equivalent using the re or 
> similar module?

The right-hand side of the substitution in the Perl code is evaluated
as an expression rather than substituted literally.  To get the same effect
in Python, we pass a function as the replacement argument to re.sub().

For example, we can translate every word-like character by shifting it
one place ('a' -> 'b', 'b' -> 'c', etc.):

###############################################
>>> import re
>>> def rot1(ch):
...     return chr((ord(ch) + 1) % 256)
...
>>> def rot1_on_match(match):
...     return rot1(match.group(0))
...
>>> re.sub(r'\w', rot1_on_match, "hello world")
'ifmmp xpsme'
###############################################

> I've begun reading about using re expressions at
> http://www.amk.ca/python/howto/regex/ but I am still hazy on implementation.

The part in:

http://www.amk.ca/python/howto/regex/regex.html#SECTION000620000000000000000

that talks about a "replacement function" is relevant to what you're 
asking.  We need to provide a replacement function to simulate the 
right-hand-side "evaluation" that's happening in the Perl code.

Good luck!