amk at amk.ca
Wed Feb 18 17:06:49 CET 2004
On Wed, 18 Feb 2004 17:20:38 +0300 (MSK),
Denis S. Otkidach <ods at strana.ru> wrote:
> I have the same question as stated in comments: should we really
> enforce this and forget the idea to define some specialized
> encodings like 'html'?
I suppose it depends on what the codecs system is *for*. If it's an
interface that goes between the abstract world of Unicode code
points and the concrete world of 8-bit characters that represent those code
points, then the idea of returning anything but an 8-bit string from
.encode() doesn't make sense. If codecs are for arbitrary string-to-string
transformations, then the restriction should be relaxed.
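To make the two readings concrete, here is a small sketch of how the split looks in present-day Python 3. That is an assumption on my part, since it is a later interpreter than the one under discussion here. As far as I know, str.encode() there accepts only text encodings that produce bytes, while the generic codecs.encode() function still allows string-to-string transforms such as rot_13.

```python
import codecs

text = "caf\u00e9"

# Reading 1: a codec maps Unicode code points to 8-bit data.
encoded = text.encode("utf-8")
print(type(encoded).__name__, encoded)

# str.encode() rejects codecs that are not text encodings.
try:
    "hello".encode("rot_13")
except LookupError as exc:
    print("str.encode refused:", exc)

# Reading 2: a codec is an arbitrary transformation. The generic
# codecs.encode() function still permits a str -> str codec.
rotated = codecs.encode("hello", "rot_13")
print(type(rotated).__name__, rotated)
```

An 'html' codec would fall under the second reading, so under this arrangement it would have to go through codecs.encode() rather than the .encode() method.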
In any case, it's straightforward to define a separate string-like class
that escapes the string, e.g. as Quixote does:
>>> from quixote.html import htmltext as h
>>> h('<h1>%s</h1>') % 'Page title'
<htmltext '<h1>Page title</h1>'>
>>> h('<h1>%s</h1>') % 'Page title with <, &, > in it'
<htmltext '<h1>Page title with &lt;, &amp;, &gt; in it</h1>'>
>>> h('<p>This is a test.') + '<a href="http://example.com>'
<htmltext '<p>This is a test.&lt;a href=&quot;http://example.com&gt;'>
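
For anyone curious how such a class might work, below is a minimal sketch of the idea, written for Python 3. It is not Quixote's implementation: the name HtmlText and its methods are hypothetical, and it leans on the standard library's html.escape(). The premise is that text passed to the constructor is trusted markup, and anything combined with it through % or + is escaped first.

```python
from html import escape


class HtmlText:
    """Hypothetical stand-in for an htmltext-like class (not Quixote's code)."""

    def __init__(self, markup):
        # The constructor argument is trusted markup and is stored as-is.
        self.markup = str(markup)

    @staticmethod
    def _quote(value):
        # Trusted fragments pass through; everything else is escaped.
        if isinstance(value, HtmlText):
            return value.markup
        return escape(str(value), quote=True)

    def __mod__(self, args):
        # Arguments are converted to escaped strings, so this sketch
        # supports only %s-style placeholders.
        if isinstance(args, tuple):
            quoted = tuple(self._quote(a) for a in args)
        else:
            quoted = self._quote(args)
        return HtmlText(self.markup % quoted)

    def __add__(self, other):
        return HtmlText(self.markup + self._quote(other))

    def __radd__(self, other):
        return HtmlText(self._quote(other) + self.markup)

    def __str__(self):
        return self.markup

    def __repr__(self):
        return "<HtmlText %r>" % self.markup


title = HtmlText("<h1>%s</h1>") % "Page title with <, &, > in it"
link = HtmlText("<p>This is a test.") + '<a href="http://example.com>'
print(title)
print(link)
```

Because the result of each operation is again an HtmlText, already-escaped text is not escaped a second time when fragments are combined further.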