Newbie Word & Text file access?

Gerhard Haering gerhard.haering at gmx.de
Fri Jul 19 10:26:57 EDT 2002


In article <3D381EB2.1030506 at mcw.net>, David Wilson wrote:
> Newbie question after checking about 500 messages on comp.lang.python : 
> Using W2K on Intel, Python 2.2.1, and IDLE 0.8
> 
> Unsuccessful open() of MS Word (.doc) , Abiword (.abw) and text (.txt) 
> files after successfully opening and reading HTML (.html) and LOG files.
> 
> Sample IDLE output:
> 
> >>> open("C:\DownLoad\test1.txt","r")
> Traceback (most recent call last):
>    File "<pyshell#0>", line 1, in ?
>      open("C:\DownLoad\test1.txt","r")
> IOError: [Errno 2] No such file or directory: 'C:\\DownLoad\test1.txt'
                                                   ^^        ^^
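In the error message, \\ is how Python displays a single backslash, while the
\t was read as a tab character, so the name Python looked for is not the one
you meant.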

> 1) What else is necessary to open these types of files?

Be careful when using backslashes in strings. In normal Python strings, a
literal backslash has to be escaped with a second backslash, like \\, because a
backslash introduces special sequences such as \t (tab), \n (newline), and
others. Alternatively, use a raw string (r"...") or forward slashes in the
path. Here's the full specification:
http://www.python.org/doc/current/ref/strings.html
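
To make the escaping rules concrete, here is a small sketch (written for a
current Python 3 interpreter, not the 2.2.1 from the question) comparing three
ways to spell the path from the traceback with the broken spelling. The
variable names are just for illustration.

```python
# Three spellings that all produce the same Windows path string.
escaped = "C:\\DownLoad\\test1.txt"   # every backslash doubled
raw = r"C:\DownLoad\test1.txt"        # raw string: backslashes kept literally
forward = "C:/DownLoad/test1.txt"     # forward slashes, also accepted on Windows

# The spelling from the traceback: the \t before "est1" is a single tab.
broken = "C:\\DownLoad\test1.txt"

print(escaped == raw)            # True
print("\t" in broken)            # True: the path contains a tab character
print(len(raw) - len(broken))    # 1: backslash + "t" collapsed into one tab
```

The raw-string form is usually the least error-prone for Windows paths typed
by hand, with the caveat that a raw string cannot end in a single backslash.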

> 2) Please advise where I can find documentation on IDLE and/or Python 
> errors such as "[Errno 2]" above?

The interesting part is the IOError string "No such file or directory." It says
all there is to know ;-) Errno 2 is AFAIK just the error code from the
underlying operating system (its symbolic name is ENOENT), and IMO normally
irrelevant.
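
If you do want to look up what an errno number means, the standard errno and
os modules can translate it. This is a minimal sketch for a current Python 3
interpreter (where IOError is an alias of OSError); the path is a made-up one
that is assumed not to exist.

```python
import errno
import os

# Hypothetical path, assumed not to exist on the machine running this.
missing = os.path.join("no_such_dir_for_demo", "test1.txt")

try:
    open(missing, "r")
except OSError as exc:                 # IOError is the same class in Python 3
    code = exc.errno                   # numeric code from the operating system
    name = errno.errorcode[code]       # symbolic name for that code
    message = os.strerror(code)        # the OS's human-readable text
    print(code, name, message)
```

Comparing exc.errno against constants such as errno.ENOENT is more readable
than comparing against the bare number.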

> 3) Is there a parser utility which can massage these formatted document files
> for faster/easier subsequent manipulation?

For MS Word .doc files, you could try automating Word itself through the win32
COM extensions. Abiword uses an XML-based format, right? So you can process
those files with XML libraries. And .txt files can be handled with the standard
Python libraries.
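
As a rough sketch of the XML route, the following pulls paragraph text out of
an XML document with the standard xml.etree.ElementTree module (current
Python 3). The sample markup is a simplified, hypothetical stand-in, not a
real .abw file; actual Abiword files may use an XML namespace and other
element names, so the tag passed to iter() would need adjusting.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified document; NOT the real Abiword schema.
sample = """<abiword>
  <section>
    <p>First paragraph.</p>
    <p>Second <c>styled</c> paragraph.</p>
  </section>
</abiword>"""

root = ET.fromstring(sample)

# Join all text inside each <p>, including text nested in child elements.
paragraphs = ["".join(p.itertext()) for p in root.iter("p")]

print(paragraphs)   # ['First paragraph.', 'Second styled paragraph.']
```

For a file on disk, ET.parse(path).getroot() would replace ET.fromstring().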

Gerhard
-- 
mail:   gerhard <at> bigfoot <dot> de       registered Linux user #64239
web:    http://www.cs.fhm.edu/~ifw00065/    OpenPGP public key id 86AB43C0
public key fingerprint: DEC1 1D02 5743 1159 CD20  A4B6 7B22 6575 86AB 43C0
reduce(lambda x,y:x+y,map(lambda x:chr(ord(x)^42),tuple('zS^BED\nX_FOY\x0b')))


