advice needed on processing rtf files

Tres Seaver tseaver at starbase.neosoft.com
Fri Mar 24 00:24:56 CET 2000


In article <tony.mcdonald-99EBA8.06591923032000 at news.ncl.ac.uk>,
Tony McDonald  <tony.mcdonald at ncl.ac.uk> wrote:
>Hi,
>I have some RTF files (smallest about 100k, largest 7Mbytes) that I need 
>to do some processing with, essentially replacing the contents of known 
>paragraph styles with the output from some python programs I have.
>
>
>For those who don't know, RTF has some preamble at the beginning of the 
>file that describes the paragraph styles, e.g.
>{\s61\sa120\widctlpar\adjustright \cf1\lang2057\cgrid
>\sbasedon15 \snext61 library;}
>
>describes the style 'library' and assigns it to the internal code \s61. 
>Then, further in the document you have
>
>\s61\sa120\widctlpar\adjustright \cf1\lang2057\cgrid {EMED102\par }
>
>which says that the text EMED102 is marked up with the style 's61', 
>which is library. I'm not certain that the text in the middle 
>(\sa120\widctlpar\adjustright \cf1\lang2057\cgrid) is always going to be 
>the same either.
>
>The EMED102 is the code I want to replace with my python programs.
>
>The problem is that the 's61' is not consistent, so I need to use the 
>style name 'library' to find 's61' and then do my processing.
>
>Should I use string.find and pals, or do some regex work on this, or is 
>there some custom module that handles RTF in clever ways?

There is a set of RTF-parsing tools available at:

  ftp://ftp.primate.wisc.edu/pub/RTF/RTF-1.10.tar.Z
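
If regex work turns out to be enough, the lookup described above (style
name to \sN code, then replacing the text of paragraphs tagged with that
code) might be sketched roughly as follows. This is only a sketch in
current Python using the standard re module, not a real RTF parser. The
function names are made up for illustration, and it assumes the
stylesheet entry and the paragraphs look like the two samples in the
question: no nested groups, and the text sits in one {...\par } group.

```python
import re


def find_style_number(rtf, style_name):
    """Return the N of the \\sN code that the stylesheet gives to style_name.

    Assumes an entry shaped like the sample: '{\\s61 <control words> library;}'.
    """
    pattern = (r'\{\\s(\d+)(?:\\[a-z]+-?\d*\s*)*'
               + re.escape(style_name) + r';\}')
    match = re.search(pattern, rtf)
    if match is None:
        raise LookupError('style %r not found in stylesheet' % style_name)
    return match.group(1)


def replace_style_text(rtf, style_name, transform):
    """Replace the text of every paragraph tagged with style_name.

    transform is a hypothetical callback: old text in, new text out.
    The new text is assumed to contain no backslashes or braces.
    """
    number = find_style_number(rtf, style_name)
    # \sN, any formatting control words, then '{text\par }'.
    # (?!\d) keeps \s61 from also matching \s610.
    pattern = re.compile(
        r'(\\s' + number + r'(?!\d)[^{}]*\{)([^{}\\]*)(\\par\s*\})')
    return pattern.sub(
        lambda m: m.group(1) + transform(m.group(2)) + m.group(3), rtf)


# Tiny document built from the two fragments quoted in the question.
SAMPLE = (
    r'{\rtf1{\stylesheet'
    r'{\s61\sa120\widctlpar\adjustright \cf1\lang2057\cgrid' '\n'
    r'\sbasedon15 \snext61 library;}}' '\n'
    r'\s61\sa120\widctlpar\adjustright \cf1\lang2057\cgrid {EMED102\par }}'
)

if __name__ == '__main__':
    print(replace_style_text(SAMPLE, 'library', lambda code: code.lower()))
```

Because the paragraph pattern skips whatever control words sit between
\sN and the opening brace, it does not depend on that middle text being
identical from one paragraph to the next. Real Word output may nest
groups or split a paragraph into several runs, in which case a proper
tokenizer such as the tools above would be the safer route.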

No-Python-bindings-yet,-but-I'm-sure-you'll-correct-that-lapse'ly

Tres.
-- 
---------------------------------------------------------------
Tres Seaver           tseaver at palladion.com       713-523-6582
Palladion Software    http://www.palladion.com


