[Tutor] Re: manipulating a file

Reed L. O'Brien reed at intersiege.com
Wed Feb 9 06:13:37 CET 2005


Shitiz Bansal wrote:
> Hi,
> I do see a problem.
> The script is fine, the problem lies else where.
> 
> Your script is trying to write log.bak to log; it
> should be the other way round.
> 
> i.e....
> srcfile = open('/var/log/httpd-access.log', 'r')
> dstfile = open('/var/log/httpd-access.log.bak', 'w')
> 
> hope that fixes it.
> 
> About the efficiency, why do you need Python at all?
> How about a simple shell command:
>        cat httpd-access.log >> log.bak
> 
> Shitiz
> 
> --- Danny Yoo <dyoo at hkn.eecs.berkeley.edu> wrote:
> 
> 
>>
>>On Mon, 7 Feb 2005, Reed L. O'Brien wrote:
>>
>>
>>>I want to read the httpd-access.log and remove any oversized log records
>>>I quickly tossed this script together.  I manually mv-ed log to log.bak
>>>and touched a new logfile.
>>>
>>>running the following with print i uncommented does print each line to
>>>stdout.  but it doesn't write to the appropriate file...
>>
>>
>>Hello!
>>
>>Let's take a look at the program again:
>>
>>###
>>import os
>>srcfile = open('/var/log/httpd-access.log.bak', 'r')
>>dstfile = open('/var/log/httpd-access.log', 'w')
>>while 1:
>>     lines = srcfile.readlines()
>>     if not lines: break
>>     for i in lines:
>>         if len(i) < 2086:
>>             dstfile.write(i)
>>srcfile.close()
>>dstfile.close()
>>###
>>
>>
>>>a) what am I missing?
>>>b) is there a less expensive way to do it?
>>
>>Hmmm... I don't see anything offhand that prevents httpd-access.log from
>>containing the lines you expect.  Do you get any error messages, like
>>permission problems, when you run the program?
>>
>>Can you show us how you are running the program, and how you are checking
>>that the resulting file is empty?
>>
>>
>>Addressing the question on efficiency and expense: yes.  The program at
>>the moment tries to read all lines into memory at once, and this is
>>expensive if the file is large.  Let's fix this.
>>
>>
>>In recent versions of Python, we can modify file-handling code from:
>>
>>###
>>lines = somefile.readlines()
>>for line in lines:
>>    ...
>>###
>>
>>to this:
>>
>>###
>>for line in somefile:
>>    ...
>>###
>>
>>That is, we don't need to extract a list of 'lines' out of a file.
>>Python allows us to loop directly across a file object.  We can find more
>>details about this in the documentation on "Iterators" (PEP 234):
>>
>>    http://www.python.org/peps/pep-0234.html
>>
>>Iterators are a good thing to know, since Python's iterators are deeply
>>rooted in the language design.  (Even if they were retroactively
>>embedded.  *grin*)
>>
>>
>>A few more comments: the while loop appears unnecessary, since on the
>>second run through the loop, we'll have already read all the lines out of
>>the file.  (I am assuming that nothing is writing to the backup file at
>>the time.)  If the body of a while loop just runs once, we don't need a
>>loop.
>>
>>This simplifies the code down to:
>>
>>###
>>srcfile = open('/var/log/httpd-access.log.bak', 'r')
>>dstfile = open('/var/log/httpd-access.log', 'w')
>>for line in srcfile:
>>    if len(line) < 2086:
>>        dstfile.write(line)
>>srcfile.close()
>>dstfile.close()
>>###
>>
>>
>>I don't see anything else here that causes the file writing to fail.  If
>>you can tell us more information on how you're checking the program's
>>effectiveness, that may give us some more clues.
>>
>>Best of wishes to you!
>>
>>_______________________________________________
>>Tutor maillist  -  Tutor at python.org
>>http://mail.python.org/mailman/listinfo/tutor
>>
> 
I am actually mv-ing log to bak and writing the non-offending entries back
to log.  I cannot be working on log itself, as it needs to be able to accept
new entries as the webserver is accessed.  Plus I am learning Python; I
could have sh-scripted it easily, but I am broadening my horizons...
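
For reference, a minimal sketch of that workflow, using the direct file
iteration suggested above.  The names filter_log and MAX_LEN are made up
for illustration, and opening the destination in append mode is an
assumption (so that entries already written to the fresh log are not
truncated); the demo runs on throwaway temp files, not the real /var/log
paths.

```python
import os
import tempfile

MAX_LEN = 2086  # length threshold taken from the script quoted above


def filter_log(src_path, dst_path, max_len=MAX_LEN):
    """Append every line of src_path shorter than max_len to dst_path.

    Returns a (kept, dropped) tuple of line counts.
    """
    kept = dropped = 0
    # Append mode is an assumption: it avoids wiping entries that may
    # already have been written to the fresh log file.
    with open(src_path, 'r') as src, open(dst_path, 'a') as dst:
        for line in src:  # reads one line at a time, not the whole file
            if len(line) < max_len:
                dst.write(line)
                kept += 1
            else:
                dropped += 1
    return kept, dropped


if __name__ == '__main__':
    # Hypothetical demo files in a temp directory.
    workdir = tempfile.mkdtemp()
    bak = os.path.join(workdir, 'httpd-access.log.bak')
    log = os.path.join(workdir, 'httpd-access.log')
    with open(bak, 'w') as f:
        f.write('short entry\n')
        f.write('x' * 3000 + '\n')
    print(filter_log(bak, log))
```

Returning the counts makes it easy to check whether anything was actually
written, which helps when the destination file looks empty afterwards.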

reed


