parsing a file for analysis
Rita
rmorgan466 at gmail.com
Sat Feb 26 00:45:20 EST 2011
I have a large text file (4GB) which I am parsing.
I am reading the file to collect stats on certain items.
My approach has been simple:
for row in open(file):          # iterate over the file one line at a time
    if "INFO" in row:           # only parse lines that mention "INFO"
        line = row.split()      # whitespace-separated fields
        user = line[0]
        host = line[1]
        __time = line[2]
        ...
I was wondering if there is a framework or a better algorithm to read such
a large file and collect its stats according to content. Also, are there any
libraries, data structures or functions which could be helpful? I was told
about the 'collections' module. Here are some stats I am trying to get:
*Number of unique users
*Break down each user's visits according to time, from t0 to t1
*Which user came from which host
*Which time had the most users?
(There are about 15 different things I want to query)
I understand most of these are redundant, but it would be nice to have a
framework or even an object-oriented way of doing this instead of loading
it all into a database.
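
To make the question concrete, here is a minimal sketch of the kind of
one-pass approach I have in mind, using set and collections.defaultdict.
The field positions (user, host, time) are assumed from the snippet above,
and "access.log" is just a placeholder path:

from collections import defaultdict

path = "access.log"                 # placeholder; the real file is ~4GB

unique_users = set()                # number of unique users
visits_by_user = defaultdict(list)  # each user's visit times, t0 to t1
hosts_by_user = defaultdict(set)    # which host(s) each user came from
users_by_time = defaultdict(set)    # which users were seen at each time

for row in open(path):
    if "INFO" not in row:
        continue
    fields = row.split()
    user, host, when = fields[0], fields[1], fields[2]
    unique_users.add(user)
    visits_by_user[user].append(when)
    hosts_by_user[user].add(host)
    users_by_time[when].add(user)

print(len(unique_users))
busiest = max(users_by_time, key=lambda t: len(users_by_time[t]))
print(busiest)                      # the time with the most distinct users

This streams the file line by line, so memory grows with the number of
distinct users and times rather than with the file size; whether that
scales to all 15 queries, or whether a database is cleaner after all, is
exactly what I am unsure about.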
Any thoughts or ideas?
-- Get your facts first, then you can distort them as you please.