[Python-Dev] Codecs and StreamCodecs
M.-A. Lemburg
mal@lemburg.com
Wed, 17 Nov 1999 10:55:05 +0100
"Fred L. Drake, Jr." wrote:
>
> M.-A. Lemburg writes:
> > Wouldn't it be possible to have the read/write methods set up
> > the state when called for the first time ?
>
> That slows them down; the constructor should handle initialization.
> Perhaps what gets registered should be: encoding function, decoding
> function, stream encoder factory (can be a class), stream decoder
> factory (again, can be a class).
Guido proposed the factory approach too, though not separated
into these 4 APIs (note that your proposal looks very much like
what I had in the early version of my proposal).
Anyway, I think that factory functions are the way to go,
because they offer more flexibility with respect to reusing already
instantiated codecs, importing modules on-the-fly as was
suggested in another thread (thereby making codec module
import lazy) or mapping encoder and decoder requests all
to one class.
So here's a new registry approach:
    unicodec.register(encoding,factory_function,action)
with
    encoding         - name of the supported encoding, e.g. Shift_JIS
    factory_function - a function that returns an object
                       or function ready to be used for action
    action           - a string stating the supported action:
                         'encode'
                         'decode'
                         'stream write'
                         'stream read'
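To make the registration call concrete, here is a rough sketch of how such a registry could be kept in a plain dictionary. It is written for present-day Python, where str is the Unicode type and bytes plays the role of the "Python string". Only register() and the four action strings come from the proposal; lookup(), the case-insensitive name matching and ascii_encoder_factory are assumptions made up for the illustration.

```python
# Hypothetical sketch of the proposed registry; names other than
# register() and the four action strings are illustrative only.
ACTIONS = ('encode', 'decode', 'stream write', 'stream read')

_registry = {}  # (lowercased encoding name, action) -> factory_function

def register(encoding, factory_function, action):
    """Record factory_function as the provider of action for encoding."""
    if action not in ACTIONS:
        raise ValueError('unsupported action: %r' % (action,))
    # Assumption: encoding names are matched case-insensitively.
    _registry[(encoding.lower(), action)] = factory_function

def lookup(encoding, action):
    """Return the factory registered for (encoding, action)."""
    try:
        return _registry[(encoding.lower(), action)]
    except KeyError:
        raise LookupError('no %r factory registered for %r'
                          % (action, encoding)) from None

def ascii_encoder_factory(errors='strict'):
    # A stateless encoding can hand back a plain function.
    def encode(u, slice=None):
        # The slice argument is accepted but ignored in this sketch.
        return u.encode('ascii', errors)
    return encode

register('US-ASCII', ascii_encoder_factory, 'encode')
```

Because the registry stores factories rather than codec instances, a factory is free to import its module lazily or to return a cached object.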
The factory_function API depends on the implementation of
the codec. The returned object's interface depends on the value of action:
Codecs:
-------
    obj = factory_function_for_<action>(errors='strict')
    'encode': obj(u,slice=None) -> Python string
    'decode': obj(s,offset=0,chunksize=0) -> (Unicode object, bytes consumed)
factory_functions are free to return simple function objects
for stateless encodings.
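As an illustration of the stateless case, a pair of factories for Latin-1 might look roughly like this in present-day Python (str for Unicode objects, bytes for Python strings). The meaning of slice and chunksize is not pinned down above, so the sketch assumes that slice is an optional (start, stop) pair and that chunksize=0 means "everything from offset onwards"; the factory names are hypothetical.

```python
def latin1_encoder_factory(errors='strict'):
    """Return a plain function implementing the 'encode' action."""
    def encode(u, slice=None):
        # Assumption: slice is an optional (start, stop) pair into u.
        if slice is not None:
            start, stop = slice
            u = u[start:stop]
        return u.encode('latin-1', errors)
    return encode

def latin1_decoder_factory(errors='strict'):
    """Return a plain function implementing the 'decode' action."""
    def decode(s, offset=0, chunksize=0):
        # Assumption: chunksize=0 means "decode up to the end of s".
        end = len(s) if chunksize == 0 else offset + chunksize
        chunk = s[offset:end]
        # Latin-1 maps one byte to one character, so the whole
        # chunk is always consumed.
        return chunk.decode('latin-1', errors), len(chunk)
    return decode
```

The errors argument is captured by the closure, so the returned function carries no other state and can be reused freely.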
StreamCodecs:
-------------
    obj = factory_function_for_<action>(stream,errors='strict')
obj should provide access to all methods defined for the stream
object, overriding these:
    'stream write': obj.write(u,slice=None) -> bytes written to stream
                    obj.flush() -> ???
    'stream read':  obj.read(chunksize=0) -> (Unicode object, bytes read)
                    obj.flush() -> ???
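A minimal sketch of such wrapper objects, again for Latin-1 and present-day Python, could use __getattr__ to pass every method that is not overridden through to the underlying stream. The class names are hypothetical, slice is assumed to be a (start, stop) pair and chunksize=0 is assumed to mean "read to the end". Since the meaning of .flush() is still open, the sketch does not override it and simply lets it fall through to the stream.

```python
import io

class Latin1StreamWriter:
    """Hypothetical 'stream write' object wrapping a byte stream."""
    def __init__(self, stream, errors='strict'):
        self.stream = stream
        self.errors = errors

    def write(self, u, slice=None):
        # Assumption: slice is an optional (start, stop) pair into u.
        if slice is not None:
            start, stop = slice
            u = u[start:stop]
        data = u.encode('latin-1', self.errors)
        self.stream.write(data)
        return len(data)  # bytes written to the stream

    def __getattr__(self, name):
        # Only reached for attributes the wrapper does not define
        # itself, so everything else is delegated to the stream.
        return getattr(self.stream, name)

class Latin1StreamReader:
    """Hypothetical 'stream read' object wrapping a byte stream."""
    def __init__(self, stream, errors='strict'):
        self.stream = stream
        self.errors = errors

    def read(self, chunksize=0):
        # Assumption: chunksize=0 means "read everything that is left".
        if chunksize == 0:
            data = self.stream.read()
        else:
            data = self.stream.read(chunksize)
        return data.decode('latin-1', self.errors), len(data)

    def __getattr__(self, name):
        return getattr(self.stream, name)

writer = Latin1StreamWriter(io.BytesIO())
writer.write('h\xe9llo')
reader = Latin1StreamReader(io.BytesIO(writer.getvalue()))
```

Note that writer.getvalue() works only because of the delegation: the wrapper itself defines no such method.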
errors is defined as in my Codec spec. The codecs are
expected to use this argument to handle error conditions.
I'm not sure what Fredrik intended with the .flush() methods,
so the definition is still open. I would expect it to do some
finalization of state.
Perhaps we need another set of actions for the .feed()/.close()
approach...
As in the earlier version of the proposal:
The registry should provide default implementations for
missing action factory_functions using the other registered
functions, e.g. 'stream write' can be emulated using
'encode' and 'stream read' using 'decode'. The same probably
holds for the feed approach.
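One way the registry could build such defaults is sketched below in present-day Python: given only an 'encode' or 'decode' factory, it derives a stream factory by wrapping the stream and calling the plain codec on each write or read. All names here are hypothetical. The reader assumes the decoder consumes every chunk completely, which is true for single-byte encodings but would need buffering for multi-byte ones.

```python
import io

def stream_writer_from_encoder(encoder_factory):
    """Derive a 'stream write' factory from an 'encode' factory."""
    class EmulatedStreamWriter:
        def __init__(self, stream, errors='strict'):
            self.stream = stream
            self.encode = encoder_factory(errors)

        def write(self, u, slice=None):
            data = self.encode(u, slice)
            self.stream.write(data)
            return len(data)  # bytes written to the stream

        def __getattr__(self, name):
            # Everything not overridden is delegated to the stream.
            return getattr(self.stream, name)
    return EmulatedStreamWriter

def stream_reader_from_decoder(decoder_factory):
    """Derive a 'stream read' factory from a 'decode' factory."""
    class EmulatedStreamReader:
        def __init__(self, stream, errors='strict'):
            self.stream = stream
            self.decode = decoder_factory(errors)

        def read(self, chunksize=0):
            # Assumption: chunksize=0 means "read everything left".
            if chunksize == 0:
                data = self.stream.read()
            else:
                data = self.stream.read(chunksize)
            # Assumption: the decoder consumes the whole chunk.
            text, consumed = self.decode(data)
            return text, consumed

        def __getattr__(self, name):
            return getattr(self.stream, name)
    return EmulatedStreamReader

# Example plain codec factories (Latin-1, stateless) to wrap.
def latin1_encoder_factory(errors='strict'):
    def encode(u, slice=None):
        if slice is not None:
            u = u[slice[0]:slice[1]]
        return u.encode('latin-1', errors)
    return encode

def latin1_decoder_factory(errors='strict'):
    def decode(s, offset=0, chunksize=0):
        end = len(s) if chunksize == 0 else offset + chunksize
        chunk = s[offset:end]
        return chunk.decode('latin-1', errors), len(chunk)
    return decode

make_writer = stream_writer_from_encoder(latin1_encoder_factory)
make_reader = stream_reader_from_decoder(latin1_decoder_factory)
```

The derived factories have the same (stream, errors='strict') signature as hand-written ones, so callers would not need to know whether a codec supplied its own stream support.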
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 44 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/