Re for Apache log file format
Cameron Simpson
cs at zip.com.au
Tue Oct 8 18:17:44 EDT 2013
On 08Oct2013 10:59, Skip Montanaro <skip at pobox.com> wrote:
| > Aiui apache log format uses space as delimiter, encapsulates strings in
| > '"' characters, and uses '-' as an empty field.
|
| Specifying the field delimiter as a space, you might be able to use
| the csv module to read these. I haven't done any Apache log file work
| since long before the csv module was available, but it just might
| work.
You can definitely do this. I pull things out of apache log files
using awk in exactly this fashion. It does rely on each of the
"real" fields having a fixed number of "words" in it. You just stick
the fields back together again.
And also in Python.
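For illustration, here is a minimal sketch of the csv-module idea in Python 3. The sample line and the `split_log_line` helper are made up for this example and are not taken from any real log or script. It assumes the common/combined layout, where the bracketed timestamp is the only unquoted field containing a space.

```python
import csv

# Hypothetical sample line in Apache "combined" format (invented for this sketch).
SAMPLE = (
    '127.0.0.1 - frank [08/Oct/2013:18:17:44 +1000] '
    '"GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (X11; Linux)"\n'
)

def split_log_line(line):
    """Split one log line into fields, rejoining the two-word timestamp."""
    # csv.reader accepts any iterable of strings, so wrap the line in a list.
    row = next(csv.reader([line], delimiter=' ', quotechar='"'))
    # "[08/Oct/2013:18:17:44 +1000]" contains a space, so it arrives as
    # two fields; stick them back together.
    if len(row) > 4 and row[3].startswith('[') and row[4].endswith(']'):
        row[3:5] = [row[3] + ' ' + row[4]]
    return row

fields = split_log_line(SAMPLE)
```

Here `fields[3]` holds the whole bracketed timestamp and `fields[4]` the request string with its quotes removed. Lines whose quoted fields contain escaped quote characters would probably need extra csv dialect settings, which this sketch does not attempt.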
I've got a merge-apache-logs script to read multiple logs, presumed
in time order, and produce a single output stream for passing to
log analysis tools:
https://bitbucket.org/cameron_simpson/css/src/tip/bin/merge-apache-logs
It is a bit of a hack, but useful.
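The merging idea can be sketched with `heapq.merge`, which lazily combines inputs that are already sorted. This is not how the script above does it, just one possible approach for Python 3.5 or later. The `line_time` and `merge_logs` names and the sample lines are hypothetical, and the sketch assumes every log uses the same timezone so the offset can be ignored when ordering.

```python
import heapq
from datetime import datetime

def line_time(line):
    """Sort key: the timestamp of a log line, ignoring the zone offset."""
    # Assumes the fourth whitespace-separated field is "[DD/Mon/YYYY:HH:MM:SS".
    return datetime.strptime(line.split()[3], "[%d/%b/%Y:%H:%M:%S")

def merge_logs(*streams):
    """Lazily merge iterables of log lines, each already in time order."""
    return heapq.merge(*streams, key=line_time)

# Invented sample data: two small logs, each in time order.
log_a = [
    'a - - [08/Oct/2013:10:00:00 +1000] "GET /1 HTTP/1.1" 200 1',
    'a - - [08/Oct/2013:10:00:05 +1000] "GET /3 HTTP/1.1" 200 1',
]
log_b = [
    'b - - [08/Oct/2013:10:00:02 +1000] "GET /2 HTTP/1.1" 200 1',
]
merged = list(merge_logs(log_a, log_b))
```

Because `heapq.merge` returns an iterator, open file objects could be passed in place of the lists without reading whole logs into memory.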
It has an "aptime" function to pull and parse the time field from
a log line. The function starts like this:
    from datetime import datetime

    def aptime(logline, zones, defaultZone):
      ''' Compute a datetime object from the supplied Apache log line.
          `defaultZone` is the timezone to use if it cannot be deduced.
      '''
      fields = logline.split()
      if len(fields) < 5:
        ##warning("bad log line: %s", logline)
        return None
      dt = None
      tzinfo = None
      # try for desired "[DD/Mon/YYYY:HH:MM:SS +hhmm]" format
      # (split() breaks the timestamp into two words: time and zone offset)
      humantime, tzinfo = fields[3], fields[4]
      if len(humantime) == 21 \
      and humantime.startswith('[') \
      and tzinfo.endswith(']'):
        try:
          dt = datetime.strptime(humantime, "[%d/%b/%Y:%H:%M:%S")
        except ValueError:
          dt = None
      if dt is None:
        tzinfo = None
      else:
        # strip the trailing ']' from the zone offset
        tzinfo = tzinfo[:-1]
and proceeds otherwise (we have a few different log formats in play, alas).
So regexps are not your only choice here, and possibly not even the best choice.
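As a self-contained variant of the same idea, the sketch below rejoins the two timestamp words and lets `strptime` parse the zone offset with `%z`. That is a Python 3 feature and an assumption of this example, not something the script above relies on. The `parse_apache_time` name and the sample line are hypothetical, and `%b` assumes English month names in the current locale.

```python
from datetime import datetime, timedelta

def parse_apache_time(logline):
    """Return a timezone-aware datetime for the log line, or None."""
    fields = logline.split()
    if len(fields) < 5:
        return None
    # Rejoin the two words that split() made of "[DD/Mon/YYYY:HH:MM:SS +hhmm]".
    stamp = fields[3] + ' ' + fields[4]
    try:
        return datetime.strptime(stamp, "[%d/%b/%Y:%H:%M:%S %z]")
    except ValueError:
        # Not the expected format; the caller can try another one.
        return None

# Invented sample line for demonstration.
line = '127.0.0.1 - - [08/Oct/2013:18:17:44 +1000] "GET / HTTP/1.1" 200 512'
dt = parse_apache_time(line)
```

Returning None on failure leaves room for trying other log formats in turn.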
Cheers,
--
Cameron Simpson <cs at zip.com.au>
This is not a bug. It's just the way it works, and makes perfect sense.
- Tom Christiansen <tchrist at jhereg.perl.com>
I like that line. I hope my boss falls for it.
- Chaim Frenkel <chaimf at cris.com>