[Tutor] subclassing strings
Kent Johnson
kent37 at tds.net
Wed Jan 9 04:49:59 CET 2008
Eric Abrahamsen wrote:
> When I create a string like so:
>
> x = 'myvalue'
>
> my understanding is that this is equivalent to:
>
> x = str('myvalue')
>
> and that this second form is more fundamental: the first is a
> shorthand for the second.
The second does nothing that the first doesn't already do.
'myvalue' is a string:
In [4]: s='myvalue'
In [5]: type(s)
Out[5]: <type 'str'>
So is str('myvalue'):
In [6]: t=str(s)
In [7]: type(t)
Out[7]: <type 'str'>
In fact they are the *same* string - str(s) is the same as s if s is
already a string:
In [8]: s is t
Out[8]: True
> What is 'str()' exactly? Is it a class name?
Close; str is a type name. str() is an invocation of the type.
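A minimal sketch of what "invocation of the type" means. It is written so that it also runs under Python 3, where the type prints as <class 'str'> rather than <type 'str'>. The last check relies on CPython behaviour rather than on a language guarantee.

```python
# str is an object whose own type is `type`, just like a user-defined class.
assert isinstance(str, type)

# Calling the type builds a new string from some other object.
n = str(42)
assert n == '42'
assert type(n) is str

# A literal is already an instance of str; no call is involved.
s = 'myvalue'
assert isinstance(s, str)

# Passing an exact str through str() hands back the same object
# (a CPython implementation detail, matching the `s is t` check above).
t = str(s)
assert t is s
```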
> If so, is the string value I pass in assigned to an attribute, the way
> I might create a "self.value =" statement in the __init__ function of
> a class I made myself? If so, does that interior attribute have a
> name? I've gone poking in the python lib, but haven't found anything
> enlightening.
No, not really. At the C level, IIUC there is a structure containing a
pointer to a byte array, but there is no access to this level of
internals from Python. For Python, strings are fundamental types like
integers and floats. The internal representation is not available.
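As a rough illustration of "not available": a plain string has no per-instance attribute dictionary and cannot be modified in place, so there is no attribute holding the value that you could rebind. The flag names below are made up for the example.

```python
s = 'myvalue'

# A plain str instance has no __dict__, so there is no "self.value" to find.
assert not hasattr(s, '__dict__')

# Trying to attach an attribute to a plain str fails ...
try:
    s.value = 'other'
except AttributeError:
    attribute_rejected = True
else:
    attribute_rejected = False

# ... and so does trying to change a character in place.
try:
    s[0] = 'M'
except TypeError:
    item_assignment_rejected = True
else:
    item_assignment_rejected = False

print(attribute_rejected, item_assignment_rejected)
```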
I guess you may have a background in C++ where a char array is different
from an instance of the string class. Python does not have this
distinction; you don't have access to a bare char array that is not
wrapped in some class.
> I started out wanting to subclass str so I could add metadata to
> objects which would otherwise behave exactly like strings. But then I
> started wondering where the actual value of the string was stored,
> since I wasn't doing it myself, and whether I'd need to be careful of
> __repr__ and __str__ so as not to interfere with the basic string
> functioning of the object. As far as I can tell the object functions
> normally as a string without my doing anything – where does the string
> value 'go', and is there any way I might inadvertently step on it by
> overriding the wrong attribute or method?
No, you can't access the actual byte array from Python and you can't
damage it.
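Here is a minimal sketch of a str subclass that carries metadata. TaggedStr and its source attribute are hypothetical names chosen for the example, not taken from any library. Because strings are immutable, the value has to be passed to str.__new__; by the time __init__ would run, it is already fixed.

```python
class TaggedStr(str):
    """A string that carries one extra piece of metadata (hypothetical example)."""

    def __new__(cls, value, source=None):
        # The string value is set here, by str.__new__, not by an attribute.
        self = str.__new__(cls, value)
        # Instances of the subclass do get a __dict__, so metadata can live there.
        self.source = source
        return self

    def __repr__(self):
        # Overriding __repr__ changes only how the object is displayed.
        return 'TaggedStr(%s, source=%r)' % (str.__repr__(self), self.source)


t = TaggedStr('myvalue', source='config.ini')

# It still behaves as an ordinary string ...
assert t == 'myvalue'
assert t.startswith('my')
assert len(t) == 7
assert isinstance(t, str)

# ... while carrying the extra attribute and the custom repr.
assert t.source == 'config.ini'
assert repr(t) == "TaggedStr('myvalue', source='config.ini')"

# Caveat: inherited methods return plain str, so the metadata is dropped.
u = t.upper()
assert type(u) is str
assert not hasattr(u, 'source')
```

If the metadata has to survive operations such as upper() or slicing, each of those methods would need to be overridden to re-wrap the result.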
You might want to take a look at BeautifulSoup, which subclasses unicode
to create a page element, and path.py, which subclasses str to add
file path manipulation operations.
http://www.crummy.com/software/BeautifulSoup/
file://localhost/Users/kent/Desktop/Downloads/Python/path-2.1/index.html
The actual string object implementation is in stringobject.h & .c:
http://svn.python.org/view/python/trunk/Include/stringobject.h?rev=59564&view=markup
http://svn.python.org/view/python/trunk/Objects/stringobject.c?rev=59564&view=markup
Kent