
Doesn't xml.sax.saxutils.escape do what you want (together with htmlentitydefs)? I was going to say that this is quite a small change to warrant a PEP - but there are two obvious approaches (working from scratch, or working on top of xml.sax.saxutils.escape - perhaps modifying and relocating that function), so *some* design probably needs to be recorded in a PEP.

Regards,
Martin
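For readers following along, here is a minimal sketch of the "escape plus htmlentitydefs" combination suggested above. It assumes a current Python 3, where htmlentitydefs lives in html.entities; the helper name escape_html is made up for illustration and is not part of any library.

```python
from html.entities import codepoint2name  # Python 3 home of htmlentitydefs
from xml.sax.saxutils import escape

# escape() already rewrites &, < and > itself, so leave them out of the
# extra table; otherwise the "&" in "&amp;" would be escaped a second time.
_NAMED_ENTITIES = {
    chr(codepoint): "&%s;" % name
    for codepoint, name in codepoint2name.items()
    if chr(codepoint) not in "&<>"
}


def escape_html(text):
    """Hypothetical helper: escape markup and use named HTML entities."""
    return escape(text, _NAMED_ENTITIES)


print(escape_html('café <b> & "x"'))
```

Characters that have no named HTML entity pass through unchanged here, so this only covers part of what the thread is asking for.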

* Martin von Loewis
|
| Doesn't xml.sax.saxutils.escape do what you want (together with
| htmlentitydefs)? I was going to say that this is quite a small
| change to warrant a PEP - but there are two obvious approaches
| (working from scratch, or working on top of xml.sax.saxutils.escape
| - perhaps modifying and relocating that function), so *some* design
| probably needs to be recorded in a PEP.

It would probably help, but now that Python has Unicode support there should be some way to convert string data to a legacy encoding and represent all characters not available in that encoding using numeric character references. This would be very useful for both XML and HTML.

The difficulty, I assume, lies in figuring out which encodings support what characters.

--Lars M.
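As a rough illustration of the conversion described above: in current Python versions the codec machinery decides which characters the target encoding can hold, and the built-in "xmlcharrefreplace" error handler turns the rest into numeric character references. The helper name to_legacy and its default encoding are assumptions made for this sketch.

```python
from xml.sax.saxutils import escape


def to_legacy(text, encoding="iso-8859-1"):
    """Hypothetical helper: escape markup, then encode to a legacy charset.

    Characters the codec cannot represent become &#NNNN; references.
    """
    return escape(text).encode(encoding, "xmlcharrefreplace")


# The euro sign is not in ISO-8859-1, so it becomes a numeric reference,
# while the umlaut and sharp s are encoded directly.
print(to_legacy("Grüße < 5 €"))
# With ASCII as the target, every non-ASCII character is replaced.
print(to_legacy("Grüße < 5 €", "ascii"))
```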

Note that cgi.escape() does this too.
Indeed. That justifies a PEP that proposes to concentrate the functionality in one place, deprecating the other places. One approach would be to enhance the codec facilities, so that "string".encode("iso-entities") becomes possible - but that already takes us into the middle of the PEP discussion.

Regards,
Martin
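The "iso-entities" codec named above is only a suggestion in this thread. One way to approximate the idea in current Python, sketched here under that caveat, is a custom encode error handler registered with codecs.register_error. The handler name "namedentityreplace" is invented for this example, and the handler does not escape &, < or >.

```python
import codecs
from html.entities import codepoint2name


def named_entity_replace(exc):
    """Replace unencodable characters with named entities where one exists,
    falling back to numeric character references otherwise."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    pieces = []
    for ch in exc.object[exc.start:exc.end]:
        name = codepoint2name.get(ord(ch))
        pieces.append("&%s;" % name if name else "&#%d;" % ord(ch))
    # Return the replacement text and the position at which to resume.
    return "".join(pieces), exc.end


# "namedentityreplace" is a made-up handler name, not a standard one.
codecs.register_error("namedentityreplace", named_entity_replace)

print("café €".encode("ascii", "namedentityreplace"))
print("\u4e2d".encode("ascii", "namedentityreplace"))
```

Keeping this as an error handler rather than a full codec means it works with any target encoding, at the cost of leaving markup escaping to a separate step.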

participants (3)
- Guido van Rossum
- Lars Marius Garshol
- Martin von Loewis