extract text from log file using re

Tim Williams tdwdotnet at gmail.com
Thu Sep 13 22:52:16 CEST 2007


On 13/09/2007, Fabian Braennstroem <f.braennstroem at gmx.de> wrote:
> me again... I should describe it better:
> the result should be an array with just:
>
> 498 1.0086e-03 2.4608e-04 9.8589e-05 1.4908e-04 8.3956e-04
> 3.8560e-03 4.8384e-02 11:40:01  499
> 499 1.0086e-03 2.4608e-04 9.8589e-05 1.4908e-04  8.3956e-04
> 3.8560e-03 4.8384e-02 11:40:01  499
> 500 1.0049e-03 2.4630e-04 9.8395e-05 1.4865e-04 8.3913e-04
> 3.8545e-03 1.3315e-01 11:14:10  500
> 501 1.0086e-03 2.4608e-04 9.8589e-05 1.4908e-04 8.3956e-04
> 3.8560e-03 4.8384e-02 11:40:01  499
>
> as field values.
>
> Fabian Braennstroem schrieb am 09/13/2007 09:09 PM:
> > Hi,
> >
> > I would like to delete a region on a log file which has this
> > kind of structure:
> >
> >
> > #------flutest------------------------------------------------------------
> >    498 1.0086e-03 2.4608e-04 9.8589e-05 1.4908e-04
> > 8.3956e-04 3.8560e-03 4.8384e-02 11:40:01  499
> >    499 1.0086e-03 2.4608e-04 9.8589e-05 1.4908e-04
> > 8.3956e-04 3.8560e-03 4.8384e-02 11:40:01  499
> > reversed flow in 1 faces on pressure-outlet 35.
> >
> > Writing
> > "/home/gcae504/SCR1/Solververgleich/Klimakruemmer_AK/CAD/Daimler/fluent-0500.cas"...
> >  5429199 mixed cells, zone 29, binary.
> > 11187656 mixed interior faces, zone 30, binary.
> >    20004 triangular wall faces, zone 31, binary.
> >     1104 mixed velocity-inlet faces, zone 32, binary.
> >   133638 triangular wall faces, zone 33, binary.
> >    14529 triangular wall faces, zone 34, binary.
> >     1350 mixed pressure-outlet faces, zone 35, binary.
> >    11714 mixed wall faces, zone 36, binary.
> >  1232141 nodes, binary.
> >  1232141 node flags, binary.
> > Done.
> >
> >
> > Writing
> > "/home/gcae504/SCR1/Solververgleich/Klimakruemmer_AK/CAD/Daimler/fluent-0500.dat"...
> > Done.
> >
> >    500 1.0049e-03 2.4630e-04 9.8395e-05 1.4865e-04
> > 8.3913e-04 3.8545e-03 1.3315e-01 11:14:10  500
> >
> >  reversed flow in 2 faces on pressure-outlet 35.
> >    501 1.0086e-03 2.4608e-04 9.8589e-05 1.4908e-04
> > 8.3956e-04 3.8560e-03 4.8384e-02 11:40:01  499
> >
> > #------------------------------------------------------------------
> >
> > I have a small script which removes lines starting with
> > '(re)versed', '(i)teration' and '(t)urbulent' and puts the
> > rest into an array:
> >
> > # -- plot residuals ----------------------------------------
> > import re
> > filename="flutest"
> > reversed_flow=re.compile('^\ re')
> > turbulent_viscosity_ratio=re.compile('^\ tu')
> > iteration=re.compile('^\ \ i')
> >
> > begin_of_res=re.compile('>\ \ \ i')
> > end_of_res=re.compile('^\ ad')
> >
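> > # '\W' is the regex class for a non-word character, so '^\Writing'
> > # never matched a line starting with "Writing"; the backslashes are dropped.
> > begin_of_writing=re.compile('^Writing')
> > end_of_writing=re.compile('^Done')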
> >
> > end_number=0
> > begin_number=0
> >
> >
> > n = 0
> > for line in open(filename).readlines():
> >     n = n + 1
> >     if begin_of_res.match(line):
> >         begin_number=n+1
> >         print "Line Number (begin): " + str(n)
> >
> >     if end_of_res.match(line):
> >         end_number=n
> >         print "Line Number (end): " + str(n)
> >
> >     if begin_of_writing.match(line):
> >         begin_w=n+1
> >         print "BeginWriting: " + str(n)
> >         print "HALLO"
> >
> >     if end_of_writing.match(line):
> >         end_w=n+1
> >         print "EndWriting: " +str(n)
> >
> > if n > end_number:
> >     end_number=n
> >     print "Line Number (end): " + str(end_number)
> >
> >
> >
> >
> >
> > n = 0
> > array = []
> > array_dummy = []
> > array_mapped = []
> >
> > mapped = []
> > mappe = []
> >
> > n = 0
> > for line in open(filename).readlines():
> >     n = n + 1
> >     if (begin_number <= n) and (end_number > n):
> > #        if (begin_w <= n) and (end_w > n):
> >             if (not reversed_flow.match(line)
> >                     and not iteration.match(line)
> >                     and not turbulent_viscosity_ratio.match(line)):
> >                 m=(line.strip().split())
> >                 print m
> >                 if len(m) > 0:
> > #                    print len(m)
> >                     laenge_liste=len(m)
> > #                    print len(m)
> >                     mappe.append(m)
> >
> >
> > #--end plot residuals-------------------------------------------------
> >
> > This works fine, except for the region with the writing
> > information:
> >
> > #-----writing information -----------------------------------------
> > Writing "/home/fb/fluent-0500.cas"...
> >  5429199 mixed cells, zone 29, binary.
> > 11187656 mixed interior faces, zone 30, binary.
> >    20004 triangular wall faces, zone 31, binary.
> >     1104 mixed velocity-inlet faces, zone 32, binary.
> >   133638 triangular wall faces, zone 33, binary.
> >    14529 triangular wall faces, zone 34, binary.
> >     1350 mixed pressure-outlet faces, zone 35, binary.
> >    11714 mixed wall faces, zone 36, binary.
> >  1232141 nodes, binary.
> >  1232141 node flags, binary.
> > Done.
> > # -------end writing information -------------------------------
> >
> > Does anyone know how I can remove this 'writing' stuff too? It
> > occurs a lot :-(
> >
>
> the result should be an array with just:
>
>
> 498 1.0086e-03 2.4608e-04 9.8589e-05 1.4908e-04 8.3956e-04
> 3.8560e-03 4.8384e-02 11:40:01  499
> 499 1.0086e-03 2.4608e-04 9.8589e-05 1.4908e-04  8.3956e-04
> 3.8560e-03 4.8384e-02 11:40:01  499
> 500 1.0049e-03 2.4630e-04 9.8395e-05 1.4865e-04 8.3913e-04
> 3.8545e-03 1.3315e-01 11:14:10  500
> 501 1.0086e-03 2.4608e-04 9.8589e-05 1.4908e-04 8.3956e-04
> 3.8560e-03 4.8384e-02 11:40:01  499


Sometimes Python is so simple there is a tendency to overthink the
problem <wink>

Based solely on the input and output in your example, and
notwithstanding errors from the email itself word-wrapping your text
and mine:

>>> print '\r\n'.join([x.strip() for x in open('c:/flutest.txt') if 'e-0' in x])
498 1.0086e-03 2.4608e-04 9.8589e-05 1.4908e-04
8.3956e-04 3.8560e-03 4.8384e-02 11:40:01  499
499 1.0086e-03 2.4608e-04 9.8589e-05 1.4908e-04
8.3956e-04 3.8560e-03 4.8384e-02 11:40:01  499
500 1.0049e-03 2.4630e-04 9.8395e-05 1.4865e-04
8.3913e-04 3.8545e-03 1.3315e-01 11:14:10  500
501 1.0086e-03 2.4608e-04 9.8589e-05 1.4908e-04
8.3956e-04 3.8560e-03 4.8384e-02 11:40:01  499
>>>
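
Since the stated goal was an array of field values, here is a rough
Python 3 sketch of the same substring filter that also splits each kept
line into fields. The function name extract_residuals and the marker
parameter are made up for illustration, and the sample assumes each
residual record sits on one line in the real log (unlike the
word-wrapped email text above), so treat it as a starting point rather
than a tested solution for your file:

```python
def extract_residuals(lines, marker="e-0"):
    """Keep lines containing `marker` and split each into fields.

    Hypothetical helper: the 'e-0' test is the same heuristic as the
    one-liner above and relies on residual lines being the only ones
    that contain small numbers in exponent notation.
    """
    return [line.split() for line in lines if marker in line]


# Sample built from the log excerpt in this thread, one record per line
# (an assumption about the unwrapped file).
sample = """\
   498 1.0086e-03 2.4608e-04 9.8589e-05 1.4908e-04 8.3956e-04 3.8560e-03 4.8384e-02 11:40:01  499
 reversed flow in 1 faces on pressure-outlet 35.
Writing "/home/fb/fluent-0500.cas"...
 5429199 mixed cells, zone 29, binary.
Done.
   500 1.0049e-03 2.4630e-04 9.8395e-05 1.4865e-04 8.3913e-04 3.8545e-03 1.3315e-01 11:14:10  500
""".splitlines()

rows = extract_residuals(sample)
for row in rows:
    print(" ".join(row))
```

For a real file you could pass the open file object instead of the
sample list, e.g. extract_residuals(open('flutest')). If some other
line in your log happens to contain 'e-0', a stricter test would be
needed.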

HTH :)


