[Bug 1779445] Re: edithtml.py saves en templates using html entity reference with raw iso-8859-1 character

Yasuhito FUTATSUKI at POEM futatuki at poem.co.jp
Sat Jul 7 22:50:30 EDT 2018


I understand that your fix preserves character entity references in the
text of the TextArea through the POST method, and I have confirmed that
it is fixed in Rev 1788. Thank you.

I think there is one more problem, concerning the charset of query
strings from Text or TextArea fields, which are not restricted to ASCII
text in any language. If the text contains a raw non-ASCII character,
its charset depends on the browser's implementation, even though the
HTML 4.01 specification says that the default value of accept-charset
is "UNKNOWN", which means "User agents may interpret this
value as the character encoding that was used to transmit the document
containing this FORM element."
(https://www.w3.org/TR/html401/interact/forms.html)
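
To illustrate why the charset matters, here is a rough sketch in Python 3
(not Mailman code). It only shows that the same character produces
different percent-encoded bytes depending on which charset the browser
picks for the submission; the server sees the bytes alone and has to
guess how to decode them.

```python
from urllib.parse import quote, unquote_to_bytes

NBSP = "\u00a0"  # non-breaking space

# Percent-encoding of the same character under two candidate charsets.
as_latin1 = quote(NBSP, encoding="iso-8859-1")
as_utf8 = quote(NBSP, encoding="utf-8")

print(as_latin1)  # %A0
print(as_utf8)    # %C2%A0

# The server receives only bytes and must choose a charset to decode them.
raw = unquote_to_bytes(as_latin1)
print(raw.decode("iso-8859-1") == NBSP)  # True
```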

It seems that this is not a problem in most cases with current browsers
that respect the specification, but it is still a problem in some cases.
At least, when I put a non-breaking space character ('\xa0' in
iso-8859-1) into a Text field in a us-ascii form using Firefox 61 on
FreeBSD, it was encoded as '%A0' in the query string, although Unicode
characters are otherwise encoded as numeric character references. The
code that handles this special case for 'us-ascii' is found in
Utils.canonstr(), so it may need to be used in some places, including
the TextArea in edithtml.py. (Though using non-ASCII characters in a
us-ascii form is irregular, of course.)
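
A minimal sketch of the kind of handling described above, in Python 3
and for illustration only. The helper name to_ascii_with_refs is
hypothetical, and this is not the actual code of Utils.canonstr(); it
assumes that any stray non-ASCII bytes are iso-8859-1, as in the Firefox
case above.

```python
from urllib.parse import unquote_to_bytes


def to_ascii_with_refs(field_value, fallback="iso-8859-1"):
    """Hypothetical helper: return pure-ASCII text, turning any raw
    non-ASCII character into a numeric character reference."""
    raw = unquote_to_bytes(field_value)
    # Assumption: bytes outside ASCII were sent in the fallback charset.
    text = raw.decode(fallback)
    # xmlcharrefreplace emits &#NNN; for characters ASCII cannot represent.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# A raw non-breaking space becomes a numeric character reference.
print(to_ascii_with_refs("a%A0b"))          # a&#160;b
# A value that already uses a reference passes through unchanged.
print(to_ascii_with_refs("%26%238364%3B"))  # &#8364;
```

With something along these lines, the '%A0' case and the already-encoded
numeric references would end up in the same ASCII-only form before the
template is saved.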

-- 
You received this bug notification because you are a member of Mailman
Coders, which is subscribed to GNU Mailman.
https://bugs.launchpad.net/bugs/1779445
