Conversion of perl based regex to python method

Andrew Robert andrew.arobert at gmail.com
Wed May 24 14:24:55 EDT 2006


Andrew Robert wrote:
> I have two Perl expressions
> 
> 
> On Windows:
> 
> perl -ple "s/([^\w\s])/sprintf(q#%%%2X#, ord $1)/ge"  somefile.txt
> 
> On POSIX:
> 
> perl -ple 's/([^\w\s])/sprintf("%%%2X", ord $1)/ge'  somefile.txt
> 
> 
> 
> The [^\w\s] is a negated character class: it matches any character that
> is not a word character (a-zA-Z0-9_) or whitespace, so those are left alone.
> 
> The () captures whatever matches and puts it into $1 for
> processing by the sprintf.
> 
> In this case the format is %%%2X: a literal % followed by the character
> code as a two-digit hex value, three characters in total.
> 
> How would you convert this to a python equivalent using the re or
> similar module?
> 
> I've begun reading about using re expressions at
> http://www.amk.ca/python/howto/regex/ but I am still hazy on implementation.
> 
> Any help you can provide would be greatly appreciated.
> 
> Thanks,
> Andy
Okay, I got part of it.

The code and results below seem to handle the first part of the expression.

I believe the next part is to iterate over each character, evaluate it,
and replace it with its hex value as needed.


# Import the module
import re

# Open test file
file=open(r'm:\mq\mq\scripts\testme.txt','r')

# Read in a sample line
line=file.readline()

# Compile an expression matching any character that is not a word
# character (a-zA-Z0-9_) or whitespace
pattern=re.compile(r'[^\w\s]')

# Look to see if I can find a non-standard character
# from test line  #! C:\Python24\Python

var=pattern.match('!')

# gotcha!
print var
<_sre.SRE_Match object at 0x009DA8E0>

# I got
print var.group()

!

# See if pattern will come back with something it shouldn't
var =pattern.match('C')
print var

#I got
None
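
One caveat about the test above: match() only looks at the start of the
string, so it cannot by itself find non-standard characters further along a
line. A minimal sketch of the difference, assuming the sample line quoted in
the comment above and using search() and findall() from the standard re
module:

```python
import re

# Same character class as above, written as a raw string
pattern = re.compile(r'[^\w\s]')

# Sample line taken from the comment in the test above
line = r'#! C:\Python24\Python'

# match() only tests the very start of the string
first = pattern.match(line)
print(first.group())

# A string that starts with a letter gives None from match(),
# even though non-word characters appear later in it
print(pattern.match(r'C:\Python24'))

# search() scans the whole string and returns the first hit
print(pattern.search(r'C:\Python24').group())

# findall() returns every character the class matches, in order
found = pattern.findall(line)
print(found)
```

For the replacement step itself, none of these is strictly needed, since
re.sub() already visits every match on the line.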



Instead of being so linear, I was thinking that something like this might
be closer. The hex line was the open question; reusing the Perl format
string with Python's % operator seems to cover it.


import re
import sys

# Format the captured character as a percent sign plus uppercase hex,
# e.g. '!' becomes '%21' (same format string as the Perl sprintf)
def ret_hex(ch):
    return '%%%2X' % ord(ch)

# Convert whatever was matched
def eval_match(match):
    return ret_hex(match.group(0))

# Open file
file = open(r'm:\mq\mq\scripts\testme.txt','r')

# Read each line, replace any matches on the line, and write out the result
for line in file.readlines():
    sys.stdout.write(re.sub(r'[^\w\s]', eval_match, line))

file.close()
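
The substitution idea can also be tried on an in-memory string, which makes
the result easy to check without the test file. This is only a sketch: the
names NON_WORD, to_hex and encode_line are made up for illustration, and it
assumes the Perl format string %%%2X carries over unchanged to Python's %
operator.

```python
import re

# Characters that are neither word characters nor whitespace
NON_WORD = re.compile(r'[^\w\s]')

def to_hex(match):
    # '%%' is a literal percent sign; '%2X' is uppercase hex,
    # space-padded to width 2, as in the Perl sprintf
    return '%%%2X' % ord(match.group(0))

def encode_line(line):
    # Hypothetical helper: replace every match on the line
    return NON_WORD.sub(to_hex, line)

sample = 'C:\\Python24 #!'
encoded = encode_line(sample)
print(encoded)  # C%3A%5CPython24 %23%21

# '%2X' pads with a space, '%02X' pads with a zero; this only
# matters for character codes below 0x10
space_padded = '%%%2X' % 1
zero_padded = '%%%02X' % 1
print(repr(space_padded), repr(zero_padded))
```

Printable punctuation all has codes of 0x21 or above, so the padding
difference does not show up for ordinary text; %02X would be the safer
choice if control characters can appear in the input.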


