Conversion of perl based regex to python method
Andrew Robert
andrew.arobert at gmail.com
Wed May 24 14:24:55 EDT 2006
Andrew Robert wrote:
> I have two Perl expressions
>
>
> If windows:
>
> perl -ple "s/([^\w\s])/sprintf(q#%%%2X#, ord $1)/ge" somefile.txt
>
> If posix
>
> perl -ple 's/([^\w\s])/sprintf("%%%2X", ord $1)/ge' somefile.txt
>
>
>
> The [^\w\s] is a negated character class, so any character that is
> a-zA-Z0-9_ or whitespace (space, tab, newline) is left alone.
>
> The () captures whatever matches and throws it into the $1 for
> processing by the sprintf
>
> In this case the format is %%%2X: a literal percent sign followed by
> the character code in uppercase hex, with a minimum width of two.
>
> How would you convert this to a python equivalent using the re or
> similar module?
>
> I've begun reading about using re expressions at
> http://www.amk.ca/python/howto/regex/ but I am still hazy on implementation.
>
> Any help you can provide would be greatly appreciated.
>
> Thanks,
> Andy
Okay.. I got part of it..
The code/results below seem to do the first part of the expression.
I believe the next part is iterating across each of the characters,
evaluating the results, and replacing with hex as needed.
# Import the module
import re
# Open test file
file=open(r'm:\mq\mq\scripts\testme.txt','r')
# Read in a sample line
line=file.readline()
# Compile expression matching anything that is not a word character
# or whitespace (raw string so the backslashes reach the re module intact)
pattern=re.compile(r'[^\w\s]')
# Look to see if I can find a non-standard character
# from test line #! C:\Python24\Python
var=pattern.match('!')
# gotcha!
print var
<_sre.SRE_Match object at 0x009DA8E0>
# I got
print var.group()
!
# See if pattern will come back with something it shouldn't
var =pattern.match('C')
print var
#I got
None
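A note on the session above: match() only tests the pattern at the very start of the string, which is why single characters work as test input. Here is a minimal sketch of how match(), search() and findall() differ on the whole test line. The sample string is the test line quoted in the comment above, the variable names are only for illustration, and print is written with parentheses so the snippet also runs on newer Pythons.

```python
import re

# Same pattern as in the session: not a word character, not whitespace
pattern = re.compile(r'[^\w\s]')
sample = r'#! C:\Python24\Python'  # test line quoted in the session

# match() anchors at position 0, so it only ever sees the leading '#'
first = pattern.match(sample)
# search() scans forward and finds the first match anywhere in the string
anywhere = pattern.search('C:')
# findall() collects every non-word, non-whitespace character
specials = pattern.findall(sample)

print(first.group())
print(anywhere.group())
print(specials)
```

Note that pattern.match('C:') would still return None, even though search() finds the colon.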
Instead of being so linear, I was thinking that this might be closer.
The hex line mirrors the sprintf format from the Perl version; once that
checks out we are golden.
# Format the captured character as '%' followed by its hex code
def ret_hex(ch):
    return '%%%2X' % ord(ch)
# Return the replacement text for whatever was matched
def eval_match(match):
    return ret_hex(match.group(0))
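# Needed to write each line back out without adding an extra newline
import sys
# open file
file = open(r'm:\mq\mq\scripts\testme.txt','r')
# Read each line, replace any matches on it, and write out the result
for line in file:
    sys.stdout.write(re.sub(r'[^\w\s]', eval_match, line))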
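For reference, here is a self-contained sketch of the whole substitution that works on a string rather than the file, so it can be tried without testme.txt. The function name percent_encode and the sample input are made up for illustration; the '%%%2X' format is copied from the Perl one-liner.

```python
import re

# Same character class as the Perl expression: anything that is
# neither a word character nor whitespace
SPECIAL = re.compile(r'[^\w\s]')

def percent_encode(text):
    """Replace each special character with '%' plus its hex code."""
    # The lambda plays the role of Perl's /e flag: it is called once per
    # match, and its return value becomes the replacement text
    return SPECIAL.sub(lambda m: '%%%2X' % ord(m.group(0)), text)

# Hypothetical sample input, modelled on the test line used earlier
encoded = percent_encode('#! C:\\Python24\\Python\n')
print(encoded)
```

As in the Perl version, %2X pads with a space rather than a zero. That only matters for character codes below 0x10; '%%%02X' would zero-pad instead if that is what is wanted.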