[Tutor] Text matching

Gardner, Dean Dean.Gardner at barco.com
Tue May 15 09:42:31 CEST 2007


So I took Kent's advice and refactored, but I have uncovered another
problem that I hope people can help with. I realized that the string I
was using to identify the end of a record is not always present (e.g.
if a commit wasn't reviewed). When it is missing, the following record
is returned as well.

Can anyone suggest how I can get round this?

Text file example (in this case, looking for commit 1 would also return
the details of commit 2):

---- SomeSpec 0000-0001 ----

> some commit 1

Reviewed By: someone
---- SomeSpec 0000-0002 ----
> some commit 2

---- SomeSpec 0000-0003 ----
>some commit 1
Reviewed By: Someone


Code:

    def searchChangeLog(self, filename):
        uid = self.item.Uid()
        record=[]
        logList=[]
        displayList=[]
        f = open(filename)
        logTextFile="temp.txt"
        """ searches through the changelog 'breaking' it up into
            individual entries"""

        for line in f:
            if ("Reviewed: 000" in line):
                logList.append(record)
                record = []
            else:
                record.append(line)

        """ searches to determine if we can find entries for
            a particular item"""
        for record in logList:
            for item in record:
                if uid in item:
                    displayList.append(record)
                    break  # record matched; avoid appending it twice
        """ creates a temporary file to write our find results to """
        removeFile = os.path.normpath(os.path.join(os.getcwd(), logTextFile))

        # if the file exists, get rid of it before writing our new findings
        if Shared.config.Exists(removeFile):
            os.remove(removeFile)
        recordLog = open(logTextFile,"a")

        for record in displayList:
            recordLog.writelines(record)
        recordLog.close()
        #display our results
        commandline = "start cmd /C " + logTextFile
        os.system(commandline)
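
A possible way round this, sketched under the assumption that every
record begins with a header line starting with "----" (as in the example
above): start a new record at each header instead of ending one at the
review line. A record with no review line is then still closed by the
next header, and the last record is flushed at end of file. The name
split_records and its header_prefix parameter are made up for
illustration and are not part of the original code.

```python
def split_records(lines, header_prefix="----"):
    """Split changelog lines into records, one per header line.

    Assumption: each record starts with a line beginning with
    header_prefix. Any text before the first header becomes its own
    record.
    """
    records = []
    record = []
    for line in lines:
        # A header closes the record in progress (if any) and opens a new one.
        if line.startswith(header_prefix) and record:
            records.append(record)
            record = []
        record.append(line)
    # Flush the final record, which has no following header to close it.
    if record:
        records.append(record)
    return records


sample = """---- SomeSpec 0000-0001 ----

> some commit 1

Reviewed By: someone
---- SomeSpec 0000-0002 ----
> some commit 2

---- SomeSpec 0000-0003 ----
>some commit 1
Reviewed By: Someone
""".splitlines(True)

records = split_records(sample)
print(len(records))
```

With the sample text this yields three records, and the second one holds
only the lines for commit 2 even though it has no review line.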

Dean Gardner
Test Engineer 
Barco
Bonnington Bond, 2 Anderson Place, Edinburgh EH6 5NP, UK
Tel + 44 (0) 131 472 5731 Fax + 44 (0) 131 472 4799
www.barco.com 
dean.gardner at barco.com 


-----Original Message-----
From: Kent Johnson [mailto:kent37 at tds.net] 
Sent: 04 May 2007 11:26
To: Gardner, Dean
Cc: tutor at python.org
Subject: Re: [Tutor] Text matching

Gardner, Dean wrote:
>  
> So here it is....it might not be pretty (it seems a bit un-python like
> to me) but it works exactly as required. If anyone can give any tips 
> for possible optimisation or refactor I would be delighted to hear 
> from them.
> 
> Thanks
> 
>         uid = self.item.Uid()
>         record=[]
>         logList=[]
>         displayList=[]
>         f = open(filename)
>         logTextFile="temp.txt"
>         """ searched through the changelog 'breaking' it up into
>             individual entries"""
>         try:
>             while 1:
>                 endofRecord=0
>                 l = f.next()
>                 if l.startswith("----"):
>                     record.append(l)
>                 l=f.next()
>                 while endofRecord==0:
>                     if "Reviewed: 000" not in l:
>                         record.append(l)
>                         l=f.next()
>                     else:
>                         logList.append(record)
>                         record=[]
>                         endofRecord=1
>         except StopIteration:
>             pass

I don't think you need endofRecord and the nested loops here; a plain
for loop should do. AFAICT all you are doing is accumulating records,
with no special handling for anything except the end-of-record line.
What about this:
record = []
for line in f:
    if "Reviewed: 000" in line:
        logList.append(record)
        record = []
    else:
        record.append(line)

>         """ searches to determine if we can find entries for
>             a particular item"""
>         for record in logList:
>             currRec = record
>             for item in currRec:
>                 if uid in item:
>                     displayList.append(currRec)

The currRec variable is not needed; just use record directly.
If uid can only be in a specific line of the record, you can test that
line directly, e.g.
for record in logList:
    if uid in record[1]:
        displayList.append(record)
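
If the uid may instead appear on any line of a record, the nested loop
quoted above appends the record once per matching line. A minimal sketch
of an alternative using the built-in any(), which keeps each matching
record once; find_records and the sample uid strings are hypothetical,
chosen only for illustration:

```python
def find_records(records, uid):
    """Return every record that mentions uid on at least one line."""
    return [record for record in records
            if any(uid in line for line in record)]


log_list = [
    ["---- SomeSpec 0000-0001 ----\n", "> some commit 0000-0001\n"],
    ["---- SomeSpec 0000-0002 ----\n", "> some commit 2\n"],
]

# The first record mentions the uid on two lines but is returned once.
matches = find_records(log_list, "0000-0001")
print(len(matches))
```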

>         """ creates a temporary file to write our find results to """
>         removeFile = os.path.normpath(os.path.join(os.getcwd(), logTextFile))
> 
>         # if the file exists, get rid of it before writing our new findings
>         if Shared.config.Exists(removeFile):
>             os.remove(removeFile)
>         recordLog = open(logTextFile,"a")
> 
>         for record in range(len(displayList)):
>             for item in displayList[record]:
>                 recordLog.write(item)

for record in displayList:
    recordLog.writelines(record)
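
As a further sketch: opening the file in "w" mode truncates any existing
file, so the separate exists/remove step should not be needed, and a
with block closes the file even if a write fails. The helper name
write_records and the use of the system temp directory are assumptions
made for this example only.

```python
import os
import tempfile


def write_records(path, records):
    """Write all records to path, replacing any previous contents."""
    # "w" truncates an existing file, so no explicit removal is required.
    with open(path, "w") as record_log:
        for record in records:
            record_log.writelines(record)


path = os.path.join(tempfile.gettempdir(), "temp.txt")
write_records(path, [["a\n", "b\n"], ["c\n"]])
# A second call overwrites the first result instead of appending to it.
write_records(path, [["d\n"]])
```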

>         recordLog.close()
>         #display our results
>         commandline = "start cmd /C " + logTextFile
>         os.system(commandline)
> 

Kent

