MS Word -- finding text
Prema
prema at prema.co.nz
Sun Jun 16 22:42:00 EDT 2002
Hi Mike !
Excellent -- thanks very much for your time and sharing.
Would have taken a while to work through that -- I can't wait to try it out!
I'll post to this thread to let you know how I got on.
Kind regards
Thanks again
Mike
Mike Brenner <mikeb at mitre.org> wrote in message news:<mailman.1024247060.12127.python-list at python.org>...
> The COM objects (like Project, Word, Excel, etc.) sometimes return stuff in Unicode format. When they do, the python str() function dies when converting non-ASCII unicode characters.
>
> To avoid this problem, I use the following conversion routine. After making the necessary check for None, it first attempts a quick conversion with str(). Only when that fails does it slowly go through each character, handling the exceptions that are raised.
>
> The default is a prime because that is the most common character that hits me in Word and Excel documents. Instead of coding it as an ASCII single-quote character, these applications code it as a more "beautiful" character, so it kills the Python str() function.
>
> You may or may not wish to change the return to eliminate the string.strip there, depending on your needs.
>
> You could make a separate function that has just the TRY and the EXCEPT in it, in order to use the MAP function instead of the for loop.
>
> Mike Brenner
>
>
> import string
>
> def phrase_unicode2string(message):
>     """
>     phrase_unicode2string works around the built-in function str(message)
>     which aborts when non-ASCII unicode characters are given to it.
>     """
>     if message is None:
>         return ""
>     try:
>         st = str(message)
>     except:  # untranslatable unicode character
>         chars = []
>         # Note: because str() raises an exception instead of returning
>         # a default character, we cannot use map() here.
>         for uc in message:
>             try:
>                 c = str(uc)
>             except:
>                 c = "`"
>             chars.append(c)
>         st = string.join(chars, "")
>     return string.strip(st)
>
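
The quoted message suggests moving just the try and the except into a separate function so that map() can replace the for loop. A minimal sketch of that idea follows. It is not Mike Brenner's code: the names char_or_default and phrase_to_ascii are hypothetical, the input is assumed to already be a text string, and it assumes a current Python 3 interpreter, where str() no longer fails on non-ASCII text, so the per-character check uses encode("ascii") instead.

```python
def char_or_default(uc, default="'"):
    """Return uc unchanged if it is plain ASCII, otherwise the default.

    Hypothetical helper: it holds only the try/except, so it can be
    handed to map().
    """
    try:
        uc.encode("ascii")   # raises UnicodeError for non-ASCII characters
        return uc
    except UnicodeError:
        return default


def phrase_to_ascii(message, default="'"):
    """Map every character of message through char_or_default.

    Assumes message is None or a text string (not a raw COM object).
    """
    if message is None:
        return ""
    converted = map(lambda uc: char_or_default(uc, default), message)
    return "".join(converted).strip()
```

Calling phrase_to_ascii on text containing a typographic apostrophe (U+2019) would give the same text with a plain ASCII single quote in its place, with surrounding whitespace stripped.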
> ------------------------
>
> Mike Prema wrote:
>
> #######
> from win32com.client import Dispatch
> W=Dispatch('Word.Application')
> D=W.Documents.Open('c:\\windows\\Desktop\\TOR.doc') ## Test Doc
> FindRange=D.Content
> F=FindRange.Find.Execute('Conman','True')
> print FindRange.Text
> #######
> str() doesn't seem to work in this case
> I tried using the codecs library but I think I am missing something
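
On the codecs question quoted above: one possible alternative to a character-by-character loop is to pass an error handler to encode(). The sketch below is only an illustration under assumptions: the function names are hypothetical, the input is assumed to be a text string such as the value of FindRange.Text, and it is written for Python 3. The "replace" and "ignore" handlers are standard codec error handlers.

```python
def ascii_with_placeholders(text):
    """Replace every non-ASCII character with '?' (the 'replace' handler)."""
    return text.encode("ascii", "replace").decode("ascii")


def ascii_dropping_unknown(text):
    """Silently drop every non-ASCII character (the 'ignore' handler)."""
    return text.encode("ascii", "ignore").decode("ascii")


sample = "Conman\u2019s"          # contains a typographic apostrophe
print(ascii_with_placeholders(sample))
print(ascii_dropping_unknown(sample))
```

Unlike the routine in the quoted message, this does not let you choose the substitute character: "replace" always emits a question mark.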