[1.6]: UserList, Dict: Do we need a UserString class?

On Mon, 27 Mar 2000 12:00:31 -0500 (EST), Peter Funk wrote:
Do we need a UserString class?
This will probably be useful on top of the i18n stuff in due course, so I'd like it. Something Mike Da Silva and I have discussed a lot is implementing a higher-level 'typed string' library on top of the Unicode stuff. A 'typed string' is like a string, but knows what encoding it is in (possibly Unicode, possibly a native encoding) and embodies some basic type safety and convenience notions, like not being able to add a Shift-JIS and an EUC string together. Iteration would always be per character, not per byte; and a certain amount of magic would say that if the string was (say) Japanese, it would acquire a few extra methods for doing some Japan-specific things like expanding half-width katakana. Of course, we can do this anyway, but I think defining the API clearly in UserString is a great idea.
- Andy Robinson
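
To make the 'typed string' idea concrete, here is a minimal sketch in modern Python 3. It is not code from this thread: the names TypedString, EncodingMismatchError, from_bytes and to_bytes are all hypothetical, and the design (decode once on input, remember the encoding label) is just one assumption about how such a class could work.

```python
class EncodingMismatchError(TypeError):
    """Raised when typed strings with different encodings are combined."""


class TypedString:
    """Hypothetical 'typed string': text plus the name of its encoding."""

    def __init__(self, text, encoding):
        self.text = text          # decoded characters (a Python 3 str)
        self.encoding = encoding  # codec label such as "shift_jis" or "euc_jp"

    @classmethod
    def from_bytes(cls, raw, encoding):
        # Decode once on the way in, so iteration is per character, not per byte.
        return cls(raw.decode(encoding), encoding)

    def __add__(self, other):
        if not isinstance(other, TypedString):
            return NotImplemented
        if other.encoding != self.encoding:
            # Basic type safety: refuse to mix, say, Shift-JIS and EUC strings.
            raise EncodingMismatchError(
                "cannot add %s and %s strings" % (self.encoding, other.encoding))
        return TypedString(self.text + other.text, self.encoding)

    def __iter__(self):
        return iter(self.text)

    def __len__(self):
        return len(self.text)

    def to_bytes(self):
        # Re-encode using the encoding this string was tagged with.
        return self.text.encode(self.encoding)
```

Adding two strings tagged "shift_jis" works, while adding a "shift_jis" string to an "euc_jp" string raises EncodingMismatchError. Language-specific extras such as katakana expansion are left out of this sketch.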

Agreed. Please somebody send a patch! --Guido van Rossum (home page: http://www.python.org/~guido/)

I wrote:
Guido van Rossum:
Agreed. Please somebody send a patch!
I feel unable to do what Andy proposed. What I had in mind was a simple wrapper class around the builtin string type, similar to UserDict and UserList, from which other classes can be derived. I use UserList and UserDict quite often and find them very useful. They are simple and powerful and easy to extend. Maybe the things Andy Robinson proposed above belong in a subclass which inherits from a simple UserString class? Do we need an additional UserUnicode class for Unicode string objects?
Regards, Peter
--
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen)
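
As an illustration of deriving from a simple wrapper, here is a short sketch using collections.UserString from the modern Python 3 standard library, which is not the draft discussed in this thread. The Slug class and its slugify method are hypothetical names chosen for the example.

```python
from collections import UserString


class Slug(UserString):
    """Hypothetical subclass: a string with one extra convenience method."""

    def slugify(self):
        # self.data holds the wrapped built-in string.
        words = self.data.lower().split()
        # Build the result through self.__class__ so subclasses stay subclasses.
        return self.__class__("-".join(words))


title = Slug("Do We Need a UserString Class")
slug = title.slugify()
```

Because the wrapper's own methods also build their results via self.__class__, a call such as title.upper() yields a Slug again rather than a plain string, which is what makes this style easy to extend.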

[Peter Funk]
[PF]
Yes. I think Andy wanted his class to be a subclass of UserString.
I use UserList and UserDict quite often and find them very useful. They are simple and powerful and easy to extend.
Agreed.
It would be great if there was a single UserString class which would work with either Unicode or 8-bit strings. I think that shouldn't be too hard, since it's just a wrapper. So why don't you give the UserString.py a try and leave Andy's wish alone? --Guido van Rossum (home page: http://www.python.org/~guido/)
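
The point that a pure wrapper need not care which string type it holds can be sketched as follows. This is Python 3, where the two types are str and bytes rather than Unicode and 8-bit strings, and the Wrapper class is a hypothetical stand-in, not the UserString that was later posted.

```python
class Wrapper:
    """Hypothetical minimal wrapper that only delegates to self.data."""

    def __init__(self, data=""):
        self.data = data

    def __repr__(self):
        return repr(self.data)

    def __len__(self):
        return len(self.data)

    def upper(self):
        # Works for any wrapped type that has an upper() method.
        return self.__class__(self.data.upper())

    def __add__(self, other):
        if isinstance(other, Wrapper):
            other = other.data
        # Concatenation succeeds when both sides hold the same underlying type.
        return self.__class__(self.data + other)


text = Wrapper("abc").upper()
raw = Wrapper(b"abc").upper()
```

The same class body serves both types because every method simply forwards to the wrapped object. Mixing the two types in one addition still fails, as it does for the built-in types.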

Hi!
Okay. Here we go. Could someone please take a close look at this? I've hacked it up in a hurry.

---- 8< ---- 8< ---- cut here ---- 8< ---- schnipp ---- 8< ---- schnapp ----
#!/usr/bin/env python
"""A user-defined wrapper around string objects

Note: string objects have grown methods in Python 1.6
This module requires Python 1.6 or later.
"""
import sys

# XXX Totally untested and hacked up until 2:00 am with too little sleep ;-)

class UserString:
    def __init__(self, string=""):
        self.data = string
    def __repr__(self):
        return repr(self.data)
    def __cmp__(self, string):
        if isinstance(string, UserString):
            return cmp(self.data, string.data)
        else:
            return cmp(self.data, string)
    def __len__(self):
        return len(self.data)

    # methods defined in alphabetical order
    def capitalize(self):
        return self.__class__(self.data.capitalize())
    def center(self, width):
        return self.__class__(self.data.center(width))
    def count(self, sub, start=0, end=sys.maxint):
        return self.data.count(sub, start, end)
    def encode(self, encoding=None, errors=None): # XXX improve this?
        if encoding:
            if errors:
                return self.__class__(self.data.encode(encoding, errors))
            else:
                return self.__class__(self.data.encode(encoding))
        else:
            return self.__class__(self.data.encode())
    def endswith(self, suffix, start=0, end=sys.maxint):
        return self.data.endswith(suffix, start, end)
    def find(self, sub, start=0, end=sys.maxint):
        return self.data.find(sub, start, end)
    def index(self, sub, start=0, end=sys.maxint):
        return self.data.index(sub, start, end)
    def isdecimal(self): return self.data.isdecimal()
    def isdigit(self): return self.data.isdigit()
    def islower(self): return self.data.islower()
    def isnumeric(self): return self.data.isnumeric()
    def isspace(self): return self.data.isspace()
    def istitle(self): return self.data.istitle()
    def isupper(self): return self.data.isupper()
    def join(self, seq):
        return self.data.join(seq)
    def ljust(self, width):
        return self.__class__(self.data.ljust(width))
    def lower(self):
        return self.__class__(self.data.lower())
    def lstrip(self):
        return self.__class__(self.data.lstrip())
    def replace(self, old, new, maxsplit=-1):
        return self.__class__(self.data.replace(old, new, maxsplit))
    def rfind(self, sub, start=0, end=sys.maxint):
        return self.data.rfind(sub, start, end)
    def rindex(self, sub, start=0, end=sys.maxint):
        return self.data.rindex(sub, start, end)
    def rjust(self, width):
        return self.__class__(self.data.rjust(width))
    def rstrip(self):
        return self.__class__(self.data.rstrip())
    def split(self, sep=None, maxsplit=-1):
        return self.data.split(sep, maxsplit)
    def splitlines(self, maxsplit=-1):
        return self.data.splitlines(maxsplit)
    def startswith(self, prefix, start=0, end=sys.maxint):
        return self.data.startswith(prefix, start, end)
    def strip(self):
        return self.__class__(self.data.strip())
    def swapcase(self):
        return self.__class__(self.data.swapcase())
    def title(self):
        return self.__class__(self.data.title())
    def translate(self, table, deletechars=""):
        return self.__class__(self.data.translate(table, deletechars))
    def upper(self):
        return self.__class__(self.data.upper())

    def __add__(self, other):
        if isinstance(other, UserString):
            return self.__class__(self.data + other.data)
        elif isinstance(other, type(self.data)):
            return self.__class__(self.data + other)
        else:
            return self.__class__(self.data + str(other))
    def __radd__(self, other):
        if isinstance(other, type(self.data)):
            return self.__class__(other + self.data)
        else:
            return self.__class__(str(other) + self.data)
    def __mul__(self, n):
        return self.__class__(self.data*n)
    __rmul__ = __mul__

def _test():
    s = UserString("abc")
    u = UserString(u"efg")
    # XXX add some real tests here?
    return [0]

if __name__ == "__main__":
    import sys
    sys.exit(_test()[0])

Good job! Go get some sleep, and tomorrow morning when you're fresh, compare it to UserList. From visual inspection, you seem to be missing __getitem__ and __getslice__, and maybe more (of course not __set*__). --Guido van Rossum (home page: http://www.python.org/~guido/)
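
The missing item access can be sketched as below. This is written for modern Python 3, where __getslice__ no longer exists and slices reach __getitem__ as slice objects, so it is not the Python 1.6 fix itself. SliceableString is a hypothetical name, and no __setitem__ is defined because strings are immutable.

```python
class SliceableString:
    """Hypothetical wrapper showing read-only indexing and slicing."""

    def __init__(self, data=""):
        self.data = data

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        # index may be an int (one character) or a slice (a substring);
        # either way the result is wrapped in the same class again.
        return self.__class__(self.data[index])


s = SliceableString("abcdef")
first = s[0]
middle = s[1:4]
```

Here first wraps "a" and middle wraps "bcd", and both are SliceableString instances, mirroring how the other methods in the draft return self.__class__(...).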

participants (3)
- andy@reportlab.com
- Guido van Rossum
- pf@artcom-gmbh.de