Regular Expression problem
Steven D'Aprano
steven at REMOVE.THIS.cybersource.com.au
Tue Sep 8 06:11:51 EDT 2009
On Tue, 08 Sep 2009 09:21:35 +0000, the original poster wrote:
> I have the following source code
>
> ------------------------
> import re
> d = 'RTCB\r\nsignature:\xf1\x11\xde\x10\xfe\x0f\x9c\x10\xf6\xc9_\x10\xf3\xeb<
> \x10\xf2Zt\x10\xef\xd2\x91\x10\xe6\xe7\xfb\x10\xe5p\x99\x10\xe2\x1e\xdf\x10
> \xdb\x0e\x9f\x10\xd8p\x06\x10\xce\xb3_\x10\xcc\x8d\xe2\x10\xc8\x00\xa4\x10
> \xc5\x994\x10\xc2={\x10\xc0\xdf\xda\x10\xbb\x03\xa3\x10\xb6E\n\x10\xacM\x12
> \x10\xa5`\xaa\x10\xa0\xaa\x1b\x10\x9bwy\x10\x9a\xc4w\x10\x95\xb6\xde\x10\x93o
> \x10\x89N\xd3\x10\x86\xda=\x00\x00\x00\x00\x00\x00\x00\x00
> \r\ncef-ip:127.0.0.1\r\nsender-ip:152.100.123.77\r\n\r\n'
> m = re.search('signature:(.*?)\r\n',d)
>
> ---------------------------
>
> As you can see, "signature:..." appears near the start of d,
>
> but re.search cannot find a match; it returns None...
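That's because you're trying to match over multiple lines. The signature
data contains a newline (the '\n' right after the 'E'), and by default '.'
matches any character except a newline, so '.*?' can never reach the
terminating '\r\n'. You need to specify the re.DOTALL flag, which makes '.'
match newlines as well.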
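Here is a minimal sketch of the same effect on a much shorter string. The
name `sample` and its contents are made up for illustration; it only keeps
the relevant feature of the real data, namely a bare '\n' inside the
signature before the closing '\r\n'.

```python
import re

# Hypothetical miniature of the message: the "signature" payload
# contains a bare '\n' before the terminating '\r\n'.
sample = 'RTCB\r\nsignature:\x10\xb6E\n\x10\xac\r\ncef-ip:127.0.0.1\r\n'
pattern = 'signature:(.*?)\r\n'

# Without DOTALL, '.' refuses to match the embedded '\n', so the
# non-greedy '.*?' gets stuck before it can reach '\r\n'.
no_flag = re.search(pattern, sample)

# With DOTALL, '.' matches '\n' too, and the search succeeds.
with_flag = re.search(pattern, sample, re.DOTALL)

print(no_flag)
print(repr(with_flag.group(1)))
```

The first search returns None for the same reason as the original one; the
second captures everything between 'signature:' and the first '\r\n'.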
I've re-formatted the string constant to take advantage of Python's
implicit concatenation of adjacent string literals, so it is easier to copy
and paste into the interactive interpreter:
d = ('RTCB\r\nsignature:'
'\xf1\x11\xde\x10\xfe\x0f\x9c\x10\xf6\xc9'
'_\x10\xf3\xeb'
'<\x10\xf2'
'Zt\x10\xef\xd2\x91\x10\xe6\xe7\xfb\x10\xe5'
'p\x99\x10\xe2\x1e\xdf\x10\xdb\x0e\x9f\x10\xd8'
'p\x06\x10\xce\xb3'
'_\x10\xcc\x8d\xe2\x10\xc8\x00\xa4\x10\xc5\x99'
'4\x10\xc2'
'={\x10\xc0\xdf\xda\x10\xbb\x03\xa3\x10\xb6'
'E\n\x10\xac'
'M\x12\x10\xa5'
'`\xaa\x10\xa0\xaa\x1b\x10\x9b'
'wy\x10\x9a\xc4'
'w\x10\x95\xb6\xde\x10\x93'
'o\x10\x89'
'N\xd3\x10\x86\xda'
'=\x00\x00\x00\x00\x00\x00\x00\x00'
'\r\ncef-ip:127.0.0.1\r\nsender-ip:152.100.123.77\r\n\r\n'
)
assert len(d) == 182
>>> import re
>>> re.search('signature:(.*?)\r\n', d)  # returns None, so nothing is printed
>>> m = re.search('signature:(.*?)\r\n', d, re.DOTALL)
>>> m
<_sre.SRE_Match object at 0xb7e98138>
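Once the search succeeds, the signature itself is available as group 1 of
the match. The sketch below also shows two other spellings that should
behave the same as passing re.DOTALL: the inline flag (?s) and the
character class [\s\S]. The string `data` is a made-up stand-in for the
full 182-character constant, not the original message.

```python
import re

# Hypothetical short payload standing in for the full string above.
data = 'RTCB\r\nsignature:\x10\xb6E\n\x10\xac\x00\x00\r\ncef-ip:127.0.0.1\r\n'

# Three ways to let the wildcard cross a newline.
by_flag = re.search(r'signature:(.*?)\r\n', data, re.DOTALL)
by_inline = re.search(r'(?s)signature:(.*?)\r\n', data)   # inline DOTALL flag
by_class = re.search(r'signature:([\s\S]*?)\r\n', data)   # class matches anything

# group(1) is the text between 'signature:' and the first '\r\n'.
signature = by_flag.group(1)
print(repr(signature))
print(len(signature))
```

Because the quantifier is non-greedy, the capture stops at the first '\r\n'
after 'signature:', so the embedded '\n' and the trailing NUL characters stay
inside the captured group.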
--
Steven