file.write() of non-ASCII characters differs in Interpreted Python than in script run

RAH rene.heymans at
Tue Aug 25 23:19:53 CEST 2015

Dear All,

I am seeing behavior I cannot explain (I have already spent many hours on it): `file.write('string')` raises an error when the code is run (as part of a WSGI application) but not when the same statements are typed at the interactive console. The error only occurs when the string contains non-ASCII characters. If it is all ASCII, there is no error.

The following example shows what I see. I must be overlooking something, because I cannot believe that Python behaves differently in interactive and run modes, and yet ... Could someone please look into this?

Thank you in advance.

Code extract from WSGI application

    request_body = environ['wsgi.input'].read(request_body_size)    # bytes
    rb = request_body.decode()                                      # string
    d = parse_qs(rb)                                                # dict

    f = open('logbytes', 'ab')
    g = open('logstr', 'a')
    h = open('logdict', 'a')

    g.write(str(type(request_body)) + '\t' + str(type(rb)) + '\t' + str(type(d)) + '\n')
    h.write(str(d) + '\n')      # <--- line 28 of the application


Tail of Apache2 error.log

[Tue Aug 25 20:24:04.657933 2015] [wsgi:error] [pid 3677:tid 3029764928] [remote]   File "", line 28, in application
[Tue Aug 25 20:24:04.658001 2015] [wsgi:error] [pid 3677:tid 3029764928] [remote]     h.write(str(d) + '\\n')
[Tue Aug 25 20:24:04.658201 2015] [wsgi:error] [pid 3677:tid 3029764928] [remote] UnicodeEncodeError: 'ascii' codec can't encode character '\\xc7' in position 15: ordinal not in range(128)
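
The 'ascii' codec named in the traceback suggests a possible cause, offered here as an assumption rather than something the log confirms: in text mode, `open()` without an `encoding` argument uses the locale's preferred encoding, and a process started by Apache may run under a different locale than a login shell. A minimal sketch that forces each encoding explicitly (the helper `try_write` and the temporary file are hypothetical, for illustration only):

```python
import locale
import os
import tempfile

# Same text that the application tries to write to 'logdict'.
record = str({'userName': ['Ça va !']}) + '\n'


def try_write(encoding):
    """Append `record` to a temporary text file opened with `encoding`.

    Returns None on success, or the error message on UnicodeEncodeError.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, 'a', encoding=encoding) as handle:
            handle.write(record)
        return None
    except UnicodeEncodeError as exc:
        return str(exc)
    finally:
        os.remove(path)


# What open() would use by default in the current environment.
print('default encoding here:', locale.getpreferredencoding(False))
print('ascii:', try_write('ascii'))
print('utf-8:', try_write('utf-8'))
```

If this is indeed the cause, passing `encoding='utf-8'` to `open()` in the application would be one way to make the result independent of the environment.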

Checking what has been logged

rse at Alibaba:~/test$ cat logbytes
userName=Ça va !               <--- this was indeed the input (notice the
                                    French C with cedilla)
                                    Unicode U+00C7    ALT-0199    UTF-8 C3 87
                                    Reading the logbytes file, one can verify
                                    that Ç is indeed represented by the 2 bytes
                                    \xC3 and \x87
rse at Alibaba:~/test$ cat logstr
<class 'bytes'>    <class 'str'>    <class 'dict'>
rse at Alibaba:~/test$ cat logdict
rse at Alibaba:~/test$             <--- Empty, because of the error
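
The byte values noted beside the `logbytes` output can be checked directly in Python. A small sketch (the variable names are illustrative only):

```python
c_cedilla = '\u00c7'                      # 'Ç', code point U+00C7

# Code point as a decimal number, matching the ALT-0199 keypad code.
code_point = ord(c_cedilla)
print(code_point)                         # 199

# UTF-8 needs two bytes for this character.
utf8_bytes = c_cedilla.encode('utf-8')
print(utf8_bytes)                         # b'\xc3\x87'

# The whole input value: 7 characters, 8 bytes once encoded as UTF-8.
value = 'Ça va !'
print(len(value), len(value.encode('utf-8')))
```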

Trying similar code within the Python interpreter

rse at Alibaba:~/test$ python
Python 3.4.0 (default, Jun 19 2015, 14:18:46)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> di = {'userName': ['Ça va !']}    <--- A dictionary
>>> str(di)
"{'userName': ['Ça va !']}"           <--- and its string representation
>>> type(str(di))
<class 'str'>                   <--- Is a string indeed
>>> fi = open('essai', 'a')
>>> fi.write(str(di) + '\n')
26                              <--- It works well
>>> fi.close()

Checking what has been written

rse at Alibaba:~/test$ cat essai
{'userName': ['Ça va !']}       <--- The result is correct
rse at Alibaba:~/test$

No error if all ASCII

If the input is `userName=Rene`, for instance, then there is no error and
`logdict` does indeed contain the text of the dictionary
`{'userName': ['Rene']}`
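
That all-ASCII input never fails is consistent with an encoding problem: ASCII-only text produces the same bytes under the ASCII and UTF-8 codecs, so the choice of codec only matters once a character such as Ç appears. A short sketch (the helper `encodes_as_ascii` is hypothetical):

```python
def encodes_as_ascii(text):
    """Return True if every character of `text` fits in the ASCII codec."""
    try:
        text.encode('ascii')
        return True
    except UnicodeEncodeError:
        return False


plain = str({'userName': ['Rene']}) + '\n'
accented = str({'userName': ['Ça va !']}) + '\n'

print(encodes_as_ascii(plain))      # True
print(encodes_as_ascii(accented))   # False

# For ASCII-only text, both codecs yield identical bytes.
print(plain.encode('ascii') == plain.encode('utf-8'))   # True
```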
