[python-win32] UnicodeEncodingError when print a doc file
Tim Roberts
timr at probo.com
Wed Jun 15 03:02:06 CEST 2011
cool_go_blue wrote:
> Thanks. It works. Actually, what I want to do is to parse the whole
> document. How can I retrieve the list of words in the
> document? I use the following code:
>
> for word in doc.Content.Text.encode("cp1252", "replace"):
>     print word
>
> It seems that each word is a single character.
>
No, what you are getting back is a Python string. When you enumerate
through a string, you get characters. This is basic Python.

If your words are all separated by whitespace, you can use split():

    for word in doc.Content.Text.encode("cp1252", "replace").split():
        print word

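To make the difference concrete, here is a minimal sketch that uses a literal string in place of doc.Content.Text, so it runs without Word or pywin32. The sample text is made up, and the syntax is chosen to work on Python 3; on Python 2 the results have the same shape.

```python
# Hypothetical stand-in for doc.Content.Text. Word marks paragraph
# ends with "\r", which split() treats as whitespace.
text = u"The quick  brown\rfox"

# Iterating over a string yields one character at a time.
chars = [ch for ch in text]

# split() with no arguments breaks on any run of whitespace.
words = text.split()

print(chars[:3])
print(words)
```

Punctuation stays attached to the neighbouring word with this approach, so a document with commas and periods would need extra cleanup.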
Note, however, that you don't need to convert it to an 8-bit character
set until you want to print it. If you are going to process these
words, then you might as well leave them in Unicode.
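As a rough sketch of that idea, the following does its processing on the Unicode text and encodes only at the point of output. The sample string and the variable names are hypothetical stand-ins for doc.Content.Text, and the code assumes Python 3 syntax.

```python
# Hypothetical stand-in for doc.Content.Text. It includes U+2192
# (right arrow), which cp1252 cannot represent.
text = u"caf\u00e9 \u2192 na\u00efve"

# Process the words while they are still Unicode.
words = text.split()
longest = max(words, key=len)

# Encode only at the output boundary. The "replace" error handler
# substitutes "?" for characters missing from the target code page.
encoded = [w.encode("cp1252", "replace") for w in words]

print(encoded)
```

Encoding early would have turned the arrow into "?" before any processing, and that information cannot be recovered afterwards.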
--
Tim Roberts, timr at probo.com
Providenza & Boekelheide, Inc.