regular expression to extract text

Fredrik Lundh fredrik at pythonware.com
Thu Nov 20 10:48:28 EST 2003


Mark Light wrote:

> Hi, I have a file read in as a string that looks like the one below. What I
> want to do is pull out the bits of information to eventually put in an HTML
> table. For the 1st example the 3 bits are:
> 1. QEXZUO
> 2. C26 H31 N1 O3
> 3. 6.164   15.892   22.551    90.00    90.00    90.00
>
> Any ideas on the best way to do this? I was trying regular expressions but
> not getting very far.

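here's one way to do it, without regular expressions: scan for the dashed
separator lines and read the two lines that follow each one: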

data = """
Using unit cell orientation matrix from collect.rmat
NOTICE: Performing automatic cell standardization
The following database entries have similar unit cells:
Refcode     Sumformula
      <Conventional cell parameters>
------------------------------------------
QEXZUO     C26 H31 N1 O3
         6.164   15.892   22.551    90.00    90.00    90.00
------------------------------------------
ARQTYD     C19 H23 N1 O5
         6.001   15.227   22.558    90.00    90.00    90.00
------------------------------------------
NHDIIS     C45 H40 Cl2
         6.532   15.147   22.453    90.00    90.00    90.00 """

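from io import StringIO

# wrap the string so it can be read line by line, like a file
stream = StringIO(data)

for line in stream:
    # every dashed separator is followed by a two-line record
    if line.startswith("---"):
        # first line: refcode, then the sum formula (split on the first run of whitespace)
        part1, part2 = stream.readline().strip().split(None, 1)
        # second line: the six cell parameters
        part3 = stream.readline().strip()
        print("1.", part1)
        print("2.", part2)
        print("3.", part3)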
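
Since the question asked about regular expressions, the same extraction can
also be sketched with the re module. This is only an illustration, written for
Python 3: the pattern, the RECORD name and the extract_records helper are
assumptions made here, not part of the reply above, and the pattern assumes
that every record is a dashed line followed by exactly two lines.

```python
import re

# Shortened sample in the same layout as the listing above
# (an assumption about what the real file looks like).
data = """\
Refcode     Sumformula
      <Conventional cell parameters>
------------------------------------------
QEXZUO     C26 H31 N1 O3
         6.164   15.892   22.551    90.00    90.00    90.00
------------------------------------------
ARQTYD     C19 H23 N1 O5
         6.001   15.227   22.558    90.00    90.00    90.00
"""

# Hypothetical pattern: separator line, "refcode formula" line, cell line.
RECORD = re.compile(
    r"^-{3,}[ \t]*\n"            # a line of three or more dashes
    r"(\S+)[ \t]+(.+?)[ \t]*\n"  # refcode, then the sum formula
    r"[ \t]*(.+?)[ \t]*$",       # cell parameters on the next line
    re.MULTILINE,
)


def extract_records(text):
    """Return a list of (refcode, formula, cell) string tuples."""
    return RECORD.findall(text)


for refcode, formula, cell in extract_records(data):
    print("1.", refcode)
    print("2.", formula)
    print("3.", cell)
```

Calling cell.split() on the third field gives the six cell parameters as
separate strings, which may be more convenient when filling table columns. The
line-by-line loop is arguably easier to read and to adapt if the layout changes.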

</F>

More information about the Python-list mailing list