[Web-SIG] parsing of urlencoded data and Unicode
janssen at parc.com
Tue Jul 29 22:22:51 CEST 2008
> Common practice is by now long established, and cannot simply be
> changed 10 years after the fact to conform to what the standard says
> it should've been. Therefore, it *is* now a problem with the standard:
> the standard is wrong. If you follow it, you're going to create
> totally broken software.
> For instance, treating form posts as being 7bit unless they have a
> Content-Transfer-Encoding. The RFC says you should do that. But it's
> an absolutely nonsensical thing to do. Your code would not work with
> any existing web browser if you did. Or, if you're writing a web
> browser: don't even think of using Content-Transfer-Encoding to encode
> your response. Few servers/frameworks would understand your submission
> if you tried.
I had many charset errors with UpLib while I was trying to guess
"common practice" and follow it, as people tried various broken
browsers. Then I actually read the RFCs and made the server follow
them. Since then, almost all of those errors have gone away. So my
experience seems to differ from yours.
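
To make "follow the RFCs" concrete, here is a minimal sketch (not
UpLib's actual code) of decoding one multipart/form-data part by its
own headers. The function name decode_part and the default_charset
parameter are hypothetical. It assumes the part is handed over as raw
bytes, headers included, and uses only Python's standard email package:

```python
from email import policy
from email.parser import BytesParser


def decode_part(raw_part: bytes, default_charset: str = "us-ascii") -> str:
    """Decode one form-data part using the headers it declares.

    raw_part is the part's header block, a blank line, and its body.
    default_charset is an assumption: it is used only when the part
    carries no charset parameter on its Content-Type header.
    """
    msg = BytesParser(policy=policy.default).parsebytes(raw_part)
    # Honor the declared charset instead of guessing one.
    charset = msg.get_content_charset() or default_charset
    # decode=True undoes base64 or quoted-printable if the part declares
    # a Content-Transfer-Encoding; otherwise the bytes pass through as-is.
    payload = msg.get_payload(decode=True)
    return payload.decode(charset)
```

A part that declares both headers is decoded as labeled, and a part
that declares only a charset is read as raw 8-bit data in that charset.
Parts with no charset at all fall back to default_charset, which is
exactly where the disagreement above comes in.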