How to Read/Write RTF and Word Files?

David LeBlanc whisper at oz.nospamnet
Mon May 21 12:22:06 EDT 2001


In article <9eajj507n7 at enews1.newsguy.com>, aleaxit at yahoo.com says...
> "Greg Jorgensen" <greg at pdxperts.com> wrote in message
> news:Pine.LNX.4.33.0105201926390.6410-100000 at C800000-A.potlnd1.or.home.com...
> > On Sat, 19 May 2001, Dry Ice wrote:
> >
> > > How to Read/Write RTF and Word Files?
>     ...
> > Microsoft Word uses a proprietary and undocumented format for .doc files.
> > The Word file format has changed significantly across versions of Word.
> > Whether by reverse-engineering or licensing the spec from Microsoft, quite
> > a few companies have implemented at least some Word import/export
> > capabilities.
> 
> Yep.  And there are open-source programs that attempt the same
> feat (presumably after reverse-engineering, in this case).
> 
> http://www.fe.msk.ru/~vitus/catdoc/ offers such tools that manage
> to extract some text from MS Word (and Excel) files most of the
> time, and also points to other such tools.  These tools aren't
> in Python, but you might use them from Python, or maybe recode in
> Python the algorithms & heuristics they embody.
> 
> 
> Alex

WxWare, which I tersely mentioned previously in this thread, claims to do a
pretty good job of round-tripping various Word versions up through Word
2000. It handles both RTF and .doc formats.
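
On Alex's point about using catdoc from Python: here is a minimal sketch,
assuming catdoc is installed, is on your PATH, and writes the extracted text
to stdout when given a file name. The function name extract_text and its
command parameter are made up for illustration, and decoding the output as
UTF-8 is a guess that may not match your locale.

```python
import shutil
import subprocess
import sys


def extract_text(path, command=("catdoc",)):
    """Run an external text extractor on `path` and return its stdout.

    `command` is a hypothetical hook: the program name plus any leading
    arguments. The file path is appended as the last argument.
    """
    exe = shutil.which(command[0])
    if exe is None:
        raise FileNotFoundError(f"{command[0]} not found on PATH")
    result = subprocess.run(
        [exe, *command[1:], path],
        capture_output=True,  # collect stdout/stderr instead of printing
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    # Assumption: the tool emits UTF-8; undecodable bytes are replaced.
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__" and len(sys.argv) > 1:
    print(extract_text(sys.argv[1]))
```

Since the extractor is a parameter, another command-line converter could be
swapped in without touching the rest of the code, provided it also takes the
file name as its last argument and prints text to stdout.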

Dave LeBlanc


