MS Word parser
Tim Golden
mail at timgolden.me.uk
Wed Jun 13 04:28:40 EDT 2007
kenicheema at gmail.com wrote:
> Hi all,
> I'm currently using antiword to extract content from MS Word files.
> Is there another way to do this without relying on any command prompt
> application?
Well, you haven't given your environment, but if you're
on Windows with Word installed, is there anything to stop
you from controlling Word itself via COM (using the pywin32
extensions)? I'm no Word expert, but looking around, this
seems to work:
<code>
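import win32com.client
word = win32com.client.Dispatch("Word.Application")
try:
    doc = word.Documents.Open("c:/temp/temp.doc")
    text = doc.Range().Text  # whole document body as plain text
    doc.Close(False)  # False: do not save changes
finally:
    word.Quit()  # don't leave a hidden Word process running
with open("c:/temp/temp.txt", "wb") as f:  # binary mode for encoded bytes
    f.write(text.encode("UTF-8"))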
</code>
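One thing you may notice in the result: as far as I can tell, the
text Word hands back uses its own control characters rather than
ordinary newlines (carriage returns for paragraph marks, chr (7) for
table cell marks, chr (11) for manual line breaks, chr (12) for page
breaks). Here is a rough sketch of a clean-up step; the function name
normalize_word_text is my own invention, not part of pywin32 or Word,
and turning cell marks into tabs is just one possible choice:

```python
def normalize_word_text(text):
    """Map Word's control characters to ordinary newlines and tabs.

    Hypothetical helper, not part of pywin32 or the Word object model.
    """
    # Table cell marks usually arrive as "\r\x07"; handle that pair
    # before the bare "\r" so the cell mark is not split in two.
    text = text.replace("\r\x07", "\t")
    text = text.replace("\x07", "\t")
    # Paragraph marks, manual line breaks and page breaks all become "\n".
    for mark in ("\r", "\x0b", "\x0c"):
        text = text.replace(mark, "\n")
    return text


if __name__ == "__main__":
    sample = "First paragraph\rSecond line\x0bsame paragraph\r"
    print(normalize_word_text(sample))
```

You would call it on the extracted text before encoding and writing
it out. It needs nothing from COM, so you can try it on any string.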
TJG