<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">

<HTML>

<HEAD>

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">


<TITLE>Re: [Web-SIG] Python 3.0 and WSGI 1.0.</TITLE>

</HEAD>

<BODY>



<P><FONT SIZE=2>Graham Dumpleton wrote:<BR>

&gt; 2009/5/5 Armin Ronacher &lt;armin.ronacher@active-4.com&gt;:<BR>

&gt;&gt; Graham Dumpleton wrote:<BR>

&gt;&gt;&gt; I can't see any choice but to pass such settings through as<BR>

&gt;&gt;&gt; strings, or else it would more than likely cause problems for applications.<BR>

&gt;&gt;&gt; The problem is that it isn't clear what encoding the values in the Apache<BR>

&gt;&gt;&gt; configuration can be in. At the moment latin-1 is assumed.<BR>

&gt;&gt; Because that information does not have a specified encoding, I can see<BR>

&gt;&gt; nothing wrong with passing it as bytestrings.&nbsp; I would<BR>

&gt;&gt; have no problem passing *all* values as bytestrings.<BR>

&gt;<BR>

&gt; At what point does that become an inconvenience though? I guess that<BR>

&gt; is my concern, because if one has to do too many manual conversions in<BR>

&gt; an application, people will start to complain it becomes unwieldy to<BR>

&gt; use. In other words, you make it easier or more logical for<BR>

&gt; frameworks, but do you end up putting more burden on applications for<BR>

&gt; stuff outside those core values?<BR>

&gt;<BR>

&gt; So, for those core CGI values which the framework is going to modify<BR>

&gt; even before an application sees them, then fine. Is the framework also<BR>

&gt; going to set the rules as to what encoding is used for other values in<BR>

&gt; the WSGI environment and convert them per that encoding when an<BR>

&gt; application requests them, or is the application always going to have<BR>

&gt; to deal with them as bytes?<BR>

&gt;<BR>

&gt; As I keep saying, you guys who write the frameworks and applications<BR>

&gt; are going to know better than I; I am just challenging the notions as<BR>

&gt; a way of making people think about it, so that the end result is the<BR>

&gt; most logical thing to do. ;-)<BR>

<BR>

In short: it's pretty easy for a framework to default to utf-8 for<BR>

everything, yet give application developers ways to override that. See,<BR>

for example, the cherrypy.tools.encoding Tool in our python3<BR>

branch: it has moved from running &quot;sometime&quot; after the page handler to<BR>

wrapping the page handler, so that all page handlers emit bytes. That makes it<BR>

possible for everyone to use unicode strings everywhere, yet still allow<BR>

some to specify exact bytes as necessary. In shorter: don't worry about<BR>

that part, we've got it covered. ;)<BR>

<BR>
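To make the wrapping idea concrete, here is a minimal sketch. It is not the<BR>
actual cherrypy.tools.encoding code: encode_output is a hypothetical decorator<BR>
invented for illustration, and the handler names are made up. It defaults to<BR>
utf-8, lets a handler override the encoding, and passes through any handler<BR>
that already returns exact bytes.<BR>
<BR>

```python
import functools


def encode_output(encoding="utf-8", errors="strict"):
    """Hypothetical decorator: wrap a page handler so it always emits bytes."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(*args, **kwargs):
            body = handler(*args, **kwargs)
            if isinstance(body, bytes):
                # The handler chose its exact bytes; leave them untouched.
                return body
            # Otherwise encode the unicode string with the configured encoding.
            return body.encode(encoding, errors)
        return wrapper
    return decorator


@encode_output()  # framework default: utf-8
def index():
    return "caf\u00e9"


@encode_output(encoding="iso-8859-1")  # application-level override
def legacy():
    return "caf\u00e9"


@encode_output()
def raw():
    return b"\x00\x01"  # exact bytes, passed through as-is
```

<BR>
With this shape, handlers can use unicode strings everywhere, and only the<BR>
few that need a different encoding or raw bytes have to say so.<BR>
<BR>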

<BR>

Robert Brewer<BR>

fumanchu@aminus.org<BR>

<BR>

<BR>

</FONT>

</P>


</BODY>

</HTML>