file.write() of non-ASCII characters differs in Interpreted Python than in script run
Chris Kaynor
ckaynor at zindagigames.com
Tue Aug 25 17:28:36 EDT 2015
On Tue, Aug 25, 2015 at 2:19 PM, RAH <rene.heymans at gmail.com> wrote:
> Dear All,
>
> I experienced an incomprehensible behavior (I've already spent many hours on this subject): `file.write('string')` raises an error in run mode but not when interpreted at the console. The string must contain non-ASCII characters. If it is all ASCII, there is no error.
>
> The following example shows what I can see. I must be overlooking something, because I cannot think Python makes a difference between interpreted and run modes, and yet ... Can someone please look into this?
>
> Thank you in advance.
> René
>
> Code extract from WSGI application (reply.py)
> =============================================
>
> request_body = environ['wsgi.input'].read(request_body_size) # bytes
> rb = request_body.decode() # string
> d = parse_qs(rb) # dict
>
> f = open('logbytes', 'ab')
> g = open('logstr', 'a')
> h = open('logdict', 'a')
>
> f.write(request_body)
> g.write(str(type(request_body)) + '\t' + str(type(rb)) + '\t' + str(type(d)) + '\n')
> h.write(str(d) + '\n') <--- line 28 of the application
>
> h.close()
> g.close()
> f.close()
>
>
> Tail of Apache2 error.log
> =========================
>
> [Tue Aug 25 20:24:04.657933 2015] [wsgi:error] [pid 3677:tid 3029764928] [remote 192.168.1.5:27575] File "reply.py", line 28, in application
> [Tue Aug 25 20:24:04.658001 2015] [wsgi:error] [pid 3677:tid 3029764928] [remote 192.168.1.5:27575] h.write(str(d) + '\\n')
> [Tue Aug 25 20:24:04.658201 2015] [wsgi:error] [pid 3677:tid 3029764928] [remote 192.168.1.5:27575] UnicodeEncodeError: 'ascii' codec can't encode character '\\xc7' in position 15: ordinal not in range(128)
>
What version of Python is Apache2 using? From the looks of the error,
the file is being opened with ASCII as its encoding. That would happen
under some version of Python 2, or under Python 3 if Apache runs with a
locale (such as C or POSIX) whose preferred encoding is ASCII. Either
way, you'll need to pick an encoding for the file explicitly (via the
encoding argument to the open function in Python 3, or io.open in
Python 2). I'd recommend using UTF-8.

You can log the value of sys.version to find out the version number.
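
As a rough sketch of that suggestion, assuming the WSGI process turns out
to be Python 3: the helper name append_line is made up for illustration,
and the claim that sys.stderr ends up in the Apache error log is what
mod_wsgi normally does, not something verified for this setup.

```python
import sys


def append_line(path, text):
    """Append one line of text to path, always encoded as UTF-8."""
    # Passing encoding= explicitly means the result no longer depends on
    # the locale the server process happens to run under.
    with open(path, 'a', encoding='utf-8') as log_file:
        log_file.write(text + '\n')


# Report the interpreter version; under mod_wsgi this is expected to show
# up in the Apache error log (assumption, see above).
print(sys.version, file=sys.stderr)

# Same kind of data as in the failing h.write() call.
append_line('logdict', str({'userName': ['Ça va !']}))
```

With the encoding spelled out, the write should behave the same way at
the console and under Apache.
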
> Trying similar code within the Python interpreter
> =================================================
>
> rse at Alibaba:~/test$ python
> Python 3.4.0 (default, Jun 19 2015, 14:18:46)
> [GCC 4.8.2] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> di = {'userName': ['Ça va !']} <--- A dictionary
> >>> str(di)
> "{'userName': ['Ça va !']}" <--- and its string representation
> >>> type(str(di))
> <class 'str'> <--- Is a string indeed
> >>> fi = open('essai', 'a')
> >>> fi.write(str(di) + '\n')
> 26 <--- It works well
> >>> fi.close()
> >>>
In this run, you are using Python 3.4, where open() defaults to the
locale's preferred encoding, which is most likely UTF-8 in your
terminal session.
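
To compare the two environments, something like the following could be
run both at the console and from inside the WSGI application. It is only
a sketch: describe_text_defaults is a hypothetical helper name, and the
values it reports will differ from machine to machine.

```python
import locale
import sys


def describe_text_defaults():
    """Return the settings that decide how open() encodes text by default."""
    return {
        'python': sys.version.split()[0],
        # In Python 3, open() in text mode uses this encoding when no
        # encoding argument is given.
        'preferred_encoding': locale.getpreferredencoding(False),
        'filesystem_encoding': sys.getfilesystemencoding(),
    }


print(describe_text_defaults())
```

If the preferred encoding reported under Apache is an ASCII variant while
the console reports UTF-8, that would account for the difference between
the two runs.
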