[python-win32] Translating MS-Word documents

Paul Weimer Paul.Weimer at harlandfs.com
Tue Oct 11 18:44:00 CEST 2005


Could always try the Google language tools... 

-----Original Message-----
From: python-win32-bounces at python.org
[mailto:python-win32-bounces at python.org] On Behalf Of Richard Kerry
Sent: Tuesday, October 11, 2005 9:31 AM
To: Tim Roberts; python-win32 at python.org
Subject: Re: [python-win32] Translating MS-Word documents


If I understand correctly, this process is to be used to translate
languages.
I trust that all concerned appreciate that the differences between
languages' grammars mean that the results are very unlikely to be
particularly legible.  You're into Machine Translation issues really and
probably need to pass whole sentences into a proper language translator
that will also deal with the grammar.  Even then the results are likely
to be somewhat stilted.


Linguistically,
Richard.
 

-----Original Message-----
From: python-win32-bounces at python.org
[mailto:python-win32-bounces at python.org] On Behalf Of Tim Roberts
Sent: Tuesday, October 11, 2005 5:14 PM
To: python-win32 at python.org
Subject: Re: [python-win32] Translating MS-Word documents

On Tue, 11 Oct 2005 11:32:53 +0200 (CEST), Øyvind
<python at kapitalisten.no> wrote:

>I need to translate several Word documents. I have a list of
>approximately 5000 words and their translations, and would like to read
>through a Word document, look for the words in the list and replace them.
>However, I need to keep the current formatting of the Word documents.
>(Using Word 2003 and XP.)
>
>What is the best way of doing this as quickly and efficiently as possible?
>
>1) Search and replace for each word directly in Word
>
>2) Extract the text, run it through a regex and thereafter do a search and
>replace in Word.
>
>3) Some other way?
>
>(The only language I know is Python, so writing some C++ stuff that can
>do it a lot faster is not an option).

This is a hard problem.

If you can let this run for a number of hours, the simplest answer is to
use the Word object model to open each file in turn and use the Find
object (for example, Document.Content.Find) to search and replace.  It'll
take a while, but the computer won't complain.  Here's an MSDN article
that shows how to use Find and Replace within a selection; the same
syntax should work with a Document's Content range:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dv_wrcore/html/wrtskhowtoreplacetext.asp
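
A rough sketch of that approach in Python, assuming pywin32 is installed and Word is available on the machine. It is not taken from the MSDN article: the function names replace_terms and translate_file and the glossary dict are made up for illustration, and the two constants are the numeric values of Word's wdFindContinue and wdReplaceAll. Note that Document.Content should cover only the main body text, so headers, footers and text boxes would likely need separate handling.

```python
WD_FIND_CONTINUE = 1   # Word's wdFindContinue: keep searching past the range end
WD_REPLACE_ALL = 2     # Word's wdReplaceAll: replace every match in one call


def replace_terms(doc, glossary):
    """Run one replace-all pass over doc.Content for each glossary entry."""
    for source, target in glossary.items():
        find = doc.Content.Find
        # Search on text only, so no leftover formatting criteria apply.
        find.ClearFormatting()
        find.Replacement.ClearFormatting()
        # Positional arguments follow the documented order of Find.Execute:
        # FindText, MatchCase, MatchWholeWord, MatchWildcards,
        # MatchSoundsLike, MatchAllWordForms, Forward, Wrap, Format,
        # ReplaceWith, Replace.
        find.Execute(source, False, True, False, False, False,
                     True, WD_FIND_CONTINUE, False, target, WD_REPLACE_ALL)


def translate_file(path, glossary):
    """Open one document in Word, apply the glossary, save and close it."""
    # Imported here so replace_terms can be exercised without Word installed.
    import win32com.client

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        doc = word.Documents.Open(path)  # path should be absolute
        replace_terms(doc, glossary)
        doc.Save()
        doc.Close()
    finally:
        word.Quit()
```

A call would look something like translate_file(r"C:\docs\report.doc", {"hus": "house"}), with a hypothetical path and glossary. With 5000 entries this makes 5000 passes per document, which is why it may take hours.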

However, in many cases, it is easier to use the Word macro recorder to
record what you want to do ONCE, and then use the generated VBA to
create your script.

If your document formatting will survive a change to RTF and back, you
could convert to RTF (which is easily machine readable) and do the
replacements in plain text.  However, few documents survive that change
completely intact.
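
If the RTF route is workable, the plain-text replacement step could be done in a single pass with one compiled regular expression, rather than one pass per word. The following is only a sketch under some assumptions: build_translator and the glossary dict are hypothetical names, matching is case-sensitive, and the lookbehind is a crude guard that skips words directly preceded by a backslash so that RTF control words such as \par are left alone. It does not deal with RTF's escaping of non-ASCII characters.

```python
import re


def build_translator(glossary):
    """Return a function that replaces every glossary word in one pass."""
    # Longest entries first, so a multi-word entry wins over a shorter
    # entry that is a prefix of it.
    words = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\\)\b(" + "|".join(re.escape(w) for w in words) + r")\b"
    )

    def translate(text):
        # Look up each matched word in the glossary.
        return pattern.sub(lambda m: glossary[m.group(1)], text)

    return translate


translate = build_translator({"hus": "house", "bil": "car"})
print(translate("et hus og en bil"))
```

Because the pattern uses whole-word boundaries, a word such as "husbil" is left untouched even though it contains both glossary entries.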

--
Tim Roberts, timr at probo.com
Providenza & Boekelheide, Inc.

_______________________________________________
Python-win32 mailing list
Python-win32 at python.org
http://mail.python.org/mailman/listinfo/python-win32

