Trouble with quotes
Martin P. Hellwig
martin.hellwig at dcuktec.org
Mon Mar 8 18:40:07 CET 2010
On 03/08/10 17:06, Stephen Nelson-Smith wrote:
> I've written some (primitive) code to parse some apache logfiles and
> establish if apache has appended a session cookie to the end. We're
> finding that some browsers don't and apache doesn't just append a "-"
> - it just omits it.
> It's working fine, but for an edge case:
> Couldn't match 192.168.1.107 - - [24/Feb/2010:20:30:44 +0100] "GET
> http://sekrit.com/node/175523 HTTP/1.1" 200 -
> "http://sekrit.com/search/results/"3%2B2%20course"" "Mozilla/4.0
> (compatible; MSIE 6.0; Windows NT 5.1; SV1; GTB6.4)"
I didn't try to mentally parse the regex pattern (I like to keep
reasonably sane). However, from the sound of it, the script barfs when
the referrer URL (the second URL on the line) itself contains quoted
text. So how about stripping those embedded quotes with something like
a simple string.replace('/"', '/') and string.replace('"" ', '" ')
before doing your re foo?
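
A rough sketch of that pre-cleaning step, in case it helps. It is a
variation on the idea rather than tested advice: instead of two
str.replace calls it uses one regex that drops any double quote with a
non-space character on both sides, on the assumption that such a quote
cannot be a field delimiter. The helper name strip_embedded_quotes and
the pattern are made up for illustration; the sample line is the one
from your mail, joined back onto a single line.

```python
import re

# Hypothetical helper: the name and the pattern are assumptions for this
# sketch, not something taken from the original script.
# A double quote with a non-space character on both sides is treated as
# embedded in a field rather than as a field delimiter.
EMBEDDED_QUOTE = re.compile(r'(?<=\S)"(?=\S)')


def strip_embedded_quotes(line):
    """Drop quotes embedded inside a field, keeping the delimiting ones."""
    return EMBEDDED_QUOTE.sub('', line)


sample = (
    '192.168.1.107 - - [24/Feb/2010:20:30:44 +0100] '
    '"GET http://sekrit.com/node/175523 HTTP/1.1" 200 - '
    '"http://sekrit.com/search/results/"3%2B2%20course"" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; GTB6.4)"'
)

cleaned = strip_embedded_quotes(sample)
print(cleaned)
```

This only catches quotes that touch non-space text on both sides. A
quoted phrase with spaces around it inside the referrer would still
slip through, so treat it as a starting point and run the cleaned line
through your existing regex afterwards.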