can't open word document after string replacements

Antoine De Groote antoine at vo.lu
Tue Oct 24 12:15:40 CEST 2006


Bruno Desthuilliers wrote:
> Antoine De Groote wrote:
>> Hi there,
>>
>> I have a Word document containing pictures and text. This document
>> holds several 'ABCDEF' strings which serve as placeholders for names.
>> Now I want to replace these occurrences with names in a list (members).
> 
> Do you know that MS Word already provides this kind of feature?


No, I don't. Sounds interesting... What is this feature called?

> 
>> I open both input and output file in binary mode and do the
>> transformation. However, I can't open the resulting file; Word just
>> tells me that there was an error. Does anybody know what I am doing wrong?
> 
> Hand-editing an undocumented binary format may lead to undesirable
> results...
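
To make the point about hand-editing concrete, here is a minimal sketch in present-day Python 3 (not the Python 2 of the original post). It uses a made-up, length-prefixed record format, NOT the real .doc layout; pack_record and unpack_records are hypothetical helpers invented for this illustration. The idea is only that a blind replace changes the size of data whose length is recorded somewhere else, so a reader that trusts the stored length goes off the rails.

```python
import struct

def pack_record(payload):
    """Toy record (hypothetical format): 2-byte little-endian length, then payload."""
    return struct.pack("<H", len(payload)) + payload

def unpack_records(blob):
    """Read records back by trusting each stored length."""
    records = []
    pos = 0
    while pos + 2 <= len(blob):
        (length,) = struct.unpack_from("<H", blob, pos)
        pos += 2
        records.append(blob[pos:pos + length])
        pos += length
    return records

blob = pack_record(b"Dear ABCDEF,") + pack_record(b"second record")

# Blind replacement: the payload grows, but the stored length still says 12.
patched = blob.replace(b"ABCDEF", b"Bartholomew")

print(unpack_records(blob))     # the two records as written
print(unpack_records(patched))  # first record truncated, the rest misread
```

A name of exactly six bytes would happen to survive in this toy format, but that is luck rather than a fix.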
> 
>> Oh, and is this approach pythonic anyway? 
> 
> The pythonic approach is usually to start looking for existing
> solutions... In this case, using Word's builtin features and Python/COM
> integration would be a better choice IMHO.
> 
>> (I have a strong Java
>> background.)
> 
> Nobody's perfect !-)
> 
>> Regards,
>> antoine
>>
>>
>> import os
>>
>> members = somelist
>>
>> os.chdir(somefolder)
>>
>> doc = file('ttt.doc', 'rb')
>> docout = file('ttt1.doc', 'wb')
>>
>> counter = 0
>>
>> for line in doc:
> 
> Since you opened the file as binary, you should use file.read() instead.
> Ever wondered what your 'lines' look like ?-)
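
A small sketch of what those 'lines' look like, written for current Python 3 and using io.BytesIO as a stand-in for the opened file. The byte string below is made up for the example and is not real .doc content. Iterating a binary file splits it wherever a 0x0A byte happens to occur, which has nothing to do with lines of text; read() returns the data in one piece.

```python
import io

# Made-up binary data; 0x0A bytes appear here as ordinary data, not line ends.
data = b"\xd0\xcf\x11\xe0\x0a\x00ABCDEF\x00\x0a\x0a\xff"

# Iterating a binary file object yields chunks ending at each 0x0A byte.
chunks = list(io.BytesIO(data))

# read() returns everything at once.
whole = io.BytesIO(data).read()

for chunk in chunks:
    print(repr(chunk))

print(len(chunks))
print(b"".join(chunks) == whole)
```

Nothing is lost by iterating, since the chunks join back to the same bytes, but the chunk boundaries carry no meaning for a binary format.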
> 
>>     while line.find('ABCDEF') > -1:
> 
> .doc is a binary format. You may find such a byte sequence in its
> content in places that are *not* text content.
> 
>>         try:
>>             line = line.replace('ABCDEF', members[counter], 1)
>>             docout.write(line)
> 
> You're writing back the whole chunk on each iteration. No surprise the
> resulting document is corrupted.
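
The duplicated writes can be reproduced on harmless data. The sketch below is a Python 3 port of the loop from the post, run against io.BytesIO instead of a real file; the sample text and names are invented, and the try/except is left out on the assumption that there are enough names. It is followed by one possible rewrite, replace_once_each (a hypothetical helper), that emits every byte exactly once. Note that this only fixes the loop logic; it does not make byte-level editing of a .doc file safe.

```python
import io

PLACEHOLDER = b"ABCDEF"

def original_loop(data, members):
    """Port of the posted loop (without try/except; assumes enough names)."""
    out = io.BytesIO()
    counter = 0
    for line in io.BytesIO(data):
        while line.find(PLACEHOLDER) > -1:
            line = line.replace(PLACEHOLDER, members[counter], 1)
            out.write(line)   # written once per replacement...
            counter += 1
        else:
            out.write(line)   # ...and once more when the while loop ends
    return out.getvalue()

def replace_once_each(data, members):
    """Give each placeholder the next name; use b'' when names run out."""
    parts = data.split(PLACEHOLDER)
    out = [parts[0]]
    for index, part in enumerate(parts[1:]):
        out.append(members[index] if index < len(members) else b"")
        out.append(part)
    return b"".join(out)

data = b"Dear ABCDEF,\nfrom ABCDEF and ABCDEF\nbye\n"
members = [b"Ann", b"Bob", b"Cy"]

broken = original_loop(data, members)
fixed = replace_once_each(data, members)

print(broken.decode())
print(fixed.decode())
```

The while/else is the subtle part: the else branch runs whenever the loop ends without a break, so every chunk that contained a placeholder is written at least twice.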
> 
>>             counter += 1
> 
> seq = list("abcd")
> for indice, item in enumerate(seq):
>   print "%02d : %s" % (indice, item)
> 
> 
>>         except:
>>             docout.write(line.replace('ABCDEF', '', 1))
>>     else:
>>         docout.write(line)
>>
>> doc.close()
>> docout.close()
>>
> 
> 
> 


