[Tutor] regex
Kent Johnson
kent37 at tds.net
Tue Dec 27 13:28:10 CET 2005
Danny Yoo wrote:
>>Dec 18 10:04:45 dragon logger: TCPWRAP: SERVICE=sshd@::ffff:192.168.0.1
>>,TYPE=ALL_DENY,HOST_ADDRESS=::ffff:195.145.94.75,HOST_INFO=::ffff:
>>195.145.94.75,HOST_NAME=unknown,USER_NAME=unknown,OTHERINFO=
>
>
> Hi Will,
>
> Observation: the output above looks comma delimited, at least the stuff
> after the 'TCPWRAP:' part.
>
>
>>self.twist_fail_re =
>>rc('SERVICE=\S*\sHOST_ADDRESS=\S*\sHOST_INFO=\S*\sHOST_NAME=\S*\sUSER_NAME=\S*\s')
>
>
> The line given as example doesn't appear to have whitespace in the places
> that the regular expression expects. It does contain commas as delimiters
> between the key/value pairs encoded in the line.
Expanding on Danny's comment...
\S*\s matches any amount of non-whitespace followed by one whitespace
character. This doesn't match your sample, where the fields are separated
by commas rather than whitespace. It looks like you want to match
non-commas followed by a comma. For example, this will match the first field:
SERVICE=[^,]*,
Presumably you will want to pull out the value of the field, so enclose
it in parentheses to make a group:
SERVICE=([^,]*),
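Here is a rough sketch of how the two patterns behave on your sample. It is written for a current Python 3, and the names line, old_re and service_re are made up for illustration. I have also rejoined the sample onto one line, on the assumption that the line breaks in the quoted message came from mail wrapping.

```python
import re

# Sample log line from the original post, rejoined into a single line
# (assumption: the breaks in the quoted mail are wrapping artifacts).
line = ("Dec 18 10:04:45 dragon logger: TCPWRAP: "
        "SERVICE=sshd@::ffff:192.168.0.1,TYPE=ALL_DENY,"
        "HOST_ADDRESS=::ffff:195.145.94.75,HOST_INFO=::ffff:195.145.94.75,"
        "HOST_NAME=unknown,USER_NAME=unknown,OTHERINFO=")

# The original pattern expects whitespace between fields, so it finds nothing.
old_re = re.compile(r'SERVICE=\S*\sHOST_ADDRESS=\S*\sHOST_INFO=\S*\s'
                    r'HOST_NAME=\S*\sUSER_NAME=\S*\s')
old_match = old_re.search(line)

# "Anything but a comma, then a comma", with a group around the value.
service_re = re.compile(r'SERVICE=([^,]*),')
match = service_re.search(line)
service = match.group(1) if match else None

print(old_match)  # None
print(service)    # sshd@::ffff:192.168.0.1
```

The group only captures the text between 'SERVICE=' and the next comma, so the comma itself is not part of the value.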
Another thing I notice about your regex is that it doesn't include all the
fields in the sample, for example TYPE. If the fields are always the
same, you can just include them all in your regex. If they vary, you can try
to make the regex skip them, use a different regex for each field, or
try Danny's approach of using str.split() to break apart the data.
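For the split approach, something along these lines might do. It is only a sketch under a few assumptions: the sample is rejoined onto one line, everything after 'TCPWRAP: ' is comma-delimited key=value pairs, and no value contains a comma. It uses str.partition(), which exists in current Pythons; the names payload and fields are invented for the example.

```python
# Sample log line from the original post, rejoined into a single line
# (assumption: the breaks in the quoted mail are wrapping artifacts).
line = ("Dec 18 10:04:45 dragon logger: TCPWRAP: "
        "SERVICE=sshd@::ffff:192.168.0.1,TYPE=ALL_DENY,"
        "HOST_ADDRESS=::ffff:195.145.94.75,HOST_INFO=::ffff:195.145.94.75,"
        "HOST_NAME=unknown,USER_NAME=unknown,OTHERINFO=")

# Keep only the part after the 'TCPWRAP: ' marker.
_, _, payload = line.partition('TCPWRAP: ')

# Split on commas, then split each pair at its first '='.
fields = {}
for pair in payload.split(','):
    key, _, value = pair.partition('=')
    fields[key] = value

print(fields['TYPE'])          # ALL_DENY
print(fields['HOST_ADDRESS'])  # ::ffff:195.145.94.75
```

This picks up every field, including ones like TYPE that the regex left out, and it doesn't care what order they come in. An empty field such as OTHERINFO simply ends up as an empty string.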
The Regex Demo program that comes with Python is handy for creating and
testing regexes. Look in C:\Python24\Tools\Scripts\redemo.py or the
equivalent.
Kent