Re for Apache log file format
Neil Cerutti
neilc at norwich.edu
Tue Oct 8 08:50:22 EDT 2013
On 2013-10-08, Sam Giraffe <sam at giraffetech.biz> wrote:
>
> Hi,
>
> I am trying to split up the re pattern for Apache log file format and seem
> to be having some trouble in getting Python to understand multi-line
> pattern:
>
> #!/usr/bin/python
>
> import re
>
> #this is a single line
> string = '192.168.122.3 - - [29/Sep/2013:03:52:33 -0700] "GET / HTTP/1.0"
> 302 276 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"'
>
> #trying to break up the pattern match for easy to read code
> pattern = re.compile(r'(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+'
> r'(?P<ident>\-)\s+'
> r'(?P<username>\-)\s+'
> r'(?P<TZ>\[(.*?)\])\s+'
> r'(?P<url>\"(.*?)\")\s+'
> r'(?P<httpcode>\d{3})\s+'
> r'(?P<size>\d+)\s+'
> r'(?P<referrer>\"\")\s+'
> r'(?P<agent>\((.*?)\))')
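I recommend using the re.VERBOSE flag when spelling out a regular
expression across several lines. Whitespace in the pattern is then
ignored and comments are allowed, which will make your life
incrementally easier.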
pattern = re.compile(
    r"""(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+
        (?P<ident>\-)\s+
        (?P<username>\-)\s+
        (?P<TZ>\[(.*?)\])\s+ # You can even insert comments.
        (?P<url>\"(.*?)\")\s+
        (?P<httpcode>\d{3})\s+
        (?P<size>\d+)\s+
        (?P<referrer>\"\")\s+
        (?P<agent>\((.*?)\))""", re.VERBOSE)
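
Note that the pattern above is only your original reformatted; as
written, its last two groups do not appear to match your sample line,
where the referrer is "-" and the agent is a quoted string rather than
a bare parenthesized one. Here is a rough sketch of one way to loosen
them. The [^\"]* pieces in referrer and agent are my assumption about
what you want to capture, not something taken from your post, so
adjust to taste.

```python
import re

# Sample line from the original post, joined back into a single line.
line = ('192.168.122.3 - - [29/Sep/2013:03:52:33 -0700] "GET / HTTP/1.0" '
        '302 276 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"')

pattern = re.compile(
    r"""(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+
        (?P<ident>\-)\s+
        (?P<username>\-)\s+
        (?P<TZ>\[(.*?)\])\s+
        (?P<url>\"(.*?)\")\s+
        (?P<httpcode>\d{3})\s+
        (?P<size>\d+)\s+
        (?P<referrer>\"[^\"]*\")\s+   # assumption: accept "-" or any quoted text
        (?P<agent>\"[^\"]*\")         # assumption: the agent is a quoted field
    """, re.VERBOSE)

m = pattern.match(line)
if m:
    # Print each named group on its own line.
    for name, value in m.groupdict().items():
        print(name, value)
```

One thing to watch for with re.VERBOSE: a literal space or # in the
pattern has to be escaped or put in a character class, otherwise it is
treated as layout or as the start of a comment.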
--
Neil Cerutti