[Tutor] subtyping builtin type
Steven D'Aprano
steve at pearwood.info
Wed Jan 1 01:26:20 CET 2014
On Tue, Dec 31, 2013 at 03:35:55PM +0100, spir wrote:
> Hello,
>
> I don't remember exactly how to do that. As an example:
>
> class Source (str):
>     __slots__ = ['i', 'n']
>     def __init__ (self, string):
>         self.i = 0 # current matching index in source
>         self.n = len(string) # number of ucodes (Unicode code points)
>         #~ str.__init__(self, string)
The easiest way to do that is:

    class Source(str):
        def __init__(self, *args, **kwargs):
            self.i = 0
            self.n = len(self)
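A quick check of that subclass, as a sketch under Python 3 (the sample text "spam" is only an illustration). By the time __init__ runs, str.__new__ has already built the string from the argument, which is why len(self) is available:

```python
class Source(str):
    def __init__(self, *args, **kwargs):
        # The string value was already set by str.__new__, so __init__
        # only needs to attach the extra attributes.
        self.i = 0
        self.n = len(self)

src = Source("spam")      # "spam" is just sample input
print(src.i, src.n)       # 0 4
print(src == "spam")      # True: it still compares like the plain string
```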
As a (premature) memory optimization, you can use __slots__ to reduce
the amount of memory per instance. But subclassing str is (probably) the
wrong way to solve this problem. Your design makes Source a kind of string:
    issubclass(Source, str)
    => True
I expect that it should not be. (Obviously I'm making some assumptions
about the design here.) To decide whether you should use subclassing
here, ask yourself a few questions:
* Does it make sense to call string methods on Source objects? In
  Python 3.3, there are over 40 public string methods. If *just one*
  of them makes no sense for a Source object, then Source should not
  be a subclass of str.

  e.g. source.isnumeric(), source.isidentifier()

* Do you expect to pass Source objects to arbitrary functions which
  expect strings, and have the result be meaningful?

* Does it make sense for Source methods to return plain strings?
  source.upper() returns a str, not a Source object.

* Is a Source thing a kind of string? If so, what's the difference
  between a Source and a str? Why not just use a str?

  If all you want is to decorate a string with a couple of extra
  pieces of information, then a limitation of Python is that you
  can only do so by subclassing.

* Or does a Source thing *include* a string as a component part of
  it? If that is the case -- and I think it is -- then composition
  is the right approach.
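To make the point about return types concrete, here is a small sketch under Python 3, using the str subclass from earlier. The inherited methods hand back plain str objects, so the extra attributes are lost as soon as you call one:

```python
class Source(str):
    def __init__(self, *args, **kwargs):
        self.i = 0
        self.n = len(self)

src = Source("spam")              # "spam" is just sample input

upper = src.upper()
print(type(upper).__name__)       # str, not Source
print(hasattr(upper, "i"))        # False: the index did not come along

sliced = src[1:]
print(type(sliced).__name__)      # str again
```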
The difference between has-a and is-a relationships is critical. I
expect that the right relationship should be:

    a Source object has a string

rather than "is a string". That makes composition a better design than
inheritance. Here's a lightweight mutable solution, where all three
attributes are public and free to be varied after initialisation:
    class Source:
        def __init__(self, string, i=0, n=None):
            if n is None:
                n = len(string)
            self.i = i
            self.n = n
            self.string = string
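As a rough usage sketch (Python 3; the sample text and the step of advancing i by one are assumptions for illustration, not part of the original design), the composed object keeps its own index and exposes the text through .string:

```python
class Source:
    def __init__(self, string, i=0, n=None):
        if n is None:
            n = len(string)
        self.i = i
        self.n = n
        self.string = string

src = Source("spam")              # "spam" is just sample input
src.i += 1                        # advance the matching index
print(src.string[src.i])          # p
print(isinstance(src, str))       # False: it has a string, it is not one
```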
An immutable solution is nearly as easy:

    from collections import namedtuple

    class Source(namedtuple("Source", "string i n")):
        def __new__(cls, string, i=0, n=None):
            if n is None:
                n = len(string)
            return super(Source, cls).__new__(cls, string, i, n)
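A sketch of how the immutable version might be used (Python 3; the names rebind_failed and advanced are made up for this example). Assigning to a field fails, so "moving" the index means building a new object, for instance with the namedtuple _replace method:

```python
from collections import namedtuple

class Source(namedtuple("Source", "string i n")):
    def __new__(cls, string, i=0, n=None):
        if n is None:
            n = len(string)
        return super(Source, cls).__new__(cls, string, i, n)

src = Source("spam")              # "spam" is just sample input

# Fields of a namedtuple cannot be rebound on an existing instance.
try:
    src.i = 1
except AttributeError:
    rebind_failed = True
else:
    rebind_failed = False
print(rebind_failed)              # True

# _replace returns a new Source with the chosen fields changed.
advanced = src._replace(i=src.i + 1)
print(advanced.i, src.i)          # 1 0
```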
Here's a version which makes the string attribute immutable, and the i
and n attributes mutable:

    class Source:
        def __init__(self, string, i=0, n=None):
            if n is None:
                n = len(string)
            self.i = i
            self.n = n
            self._string = string

        @property
        def string(self):
            return self._string
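A short sketch of that last version in use (Python 3; the flag name string_is_read_only is made up for the example). The property has no setter, so assigning to .string raises AttributeError while i and n stay freely assignable. Note that the underscore name _string is private by convention only; nothing stops code from rebinding it directly.

```python
class Source:
    def __init__(self, string, i=0, n=None):
        if n is None:
            n = len(string)
        self.i = i
        self.n = n
        self._string = string

    @property
    def string(self):
        return self._string

src = Source("spam")              # "spam" is just sample input
src.i = 2                         # fine: i is an ordinary attribute

# The property defines no setter, so this assignment is rejected.
try:
    src.string = "eggs"
except AttributeError:
    string_is_read_only = True
else:
    string_is_read_only = False

print(string_is_read_only)        # True
print(src.string, src.i)          # spam 2
```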
--
Steven