[Python-Dev] Docs for string methods ?
Wed, 05 Jul 2000 13:59:34 +0200
Fredrik Lundh wrote:
> mal wrote:
> > > > > talking about string methods: how about providing an
> > > > > "encode" method for 8-bit strings too?
> > > >
> > > > I've tossed that idea around a few times too... it could
> > > > have the same interface as the Unicode one (without default
> > > > encoding though). The only problem is that there are currently
> > > > no codecs which could deal with strings on input.
> > >
> > > imho, a consistent interface is more important than a truly
> > > optimal implementation (strings are strings yada yada). or in
> > > other words,
> > >
> > > def encode(self, encoding):
> > >     if encoding is close enough:
> > >         return self
> > >     return unicode(self).encode(encoding)
> > >
> > > ought to be good enough for now.
> > Note that 'abc'.encode('utf8') would fail because the UTF-8
> > codec expects Unicode on input to its encode method (hmm, perhaps
> > I ought to make the method use the 'u' parser marker instead
> > of 'U' -- that way, the method would auto-convert the 'abc'
> > string to Unicode using the default encoding and then proceed
> > to encode it in UTF-8).
> sorry, I wasn't clear: the "def encode" snippet above should be
> a string method, not a function.
> "abc".encode("utf8") would be "return self" if the default encoding
> is "ascii" or "utf8", and "return unicode("abc").encode("utf8")" otherwise.
I've just checked in modifications to the builtin codecs
which allow them to accept 8-bit strings too. They will convert
the strings to Unicode and then encode them as usual.
So given that the .encode() method gets accepted (I haven't
heard any other opinions yet), "abc".encode("utf8") will
work just like any other builtin codec (the 8-bit string
will be interpreted under the default encoding assumption).
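
To illustrate the behaviour described above, here is a minimal sketch rather than the actual codec code. It assumes a modern Python, where `bytes` stands in for the 8-bit string type, and both the helper name `encode_8bit` and its `default_encoding` parameter are hypothetical, invented only for this example. The 8-bit string is first interpreted under the default encoding and the resulting Unicode text is then encoded as usual.

```python
def encode_8bit(data: bytes, encoding: str, default_encoding: str = "ascii") -> bytes:
    """Hypothetical helper: encode an 8-bit string by going through Unicode."""
    # Step 1: interpret the 8-bit string under the default encoding assumption.
    text = data.decode(default_encoding)
    # Step 2: encode the resulting Unicode text with the requested codec.
    return text.encode(encoding)


# Pure ASCII data is unchanged by UTF-8, so the round trip returns the same bytes.
print(encode_8bit(b"abc", "utf8"))
# A different target encoding produces different bytes.
print(encode_8bit(b"abc", "utf-16-le"))
# With a Latin-1 default, the byte 0xE9 becomes the two-byte UTF-8 sequence for U+00E9.
print(encode_8bit(b"caf\xe9", "utf8", default_encoding="latin-1"))
```

Under this sketch, a string containing non-ASCII bytes raises UnicodeDecodeError when the default encoding is "ascii", which matches the idea that the input is only meaningful relative to the default encoding.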
Python Pages: http://www.lemburg.com/python/