
On 21Dec2022 17:00, Steven D'Aprano <steve@pearwood.info> wrote:
On Wed, Dec 21, 2022 at 09:42:51AM +1100, Cameron Simpson wrote:
With str subtypes, the case that comes to my mind is mixing str subtypes. [...] So, yes, for many methods I might reasonably expect a new html(str). But I can contrive situations where I'd want a plain str
The key word there is *contrive*.
Surely. I think my notion is that most of the ad hoc lexical str methods don't know anything about a str-with-special-semantics and therefore may well generally want to return a plain str, so it isn't the disastrous starting point I think you're suggesting. Obviously that's a generalisation.
Obviously there are methods that are expected to return plain old strings. If you have a html.extract_content() method which extracts the body of the html document as plain text, stripping out all markup, there is no point returning a html object and a str will do. But most methods will need to keep the markup, and so they will need to return a html object.
Hypothetical. I'm not sure I entirely agree. I think we can both agree there will be methods which _should_ return a str and methods which should return the same type as the source object. How the mix plays out depends on the class.
[...] The status quo mostly hurts *lightweight* subclasses:
    class TurkishString(str):
        def upper(self):
            return TurkishString(str.upper(self.replace('i', 'İ')))
        def lower(self):
            return TurkishString(str.lower(self.replace('I', 'ı')))
That's fine so long as the *only* operations you do to a TurkishString are upper or lower. As soon as you do concatenation, substring replacement, stripping, joining, etc., you get a regular string.
So we've gone from a lightweight subclass that needs to override two methods, to a heavyweight subclass that needs to override 30+ methods.
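To make the complaint concrete, here is a small demonstration of the status quo using the TurkishString class quoted above. It is only an illustration; the sample value "istanbul" is an arbitrary choice and not from the thread.

```python
class TurkishString(str):
    def upper(self):
        return TurkishString(str.upper(self.replace('i', 'İ')))

    def lower(self):
        return TurkishString(str.lower(self.replace('I', 'ı')))


s = TurkishString("istanbul")

# The two overridden methods keep the subclass.
kept = type(s.upper()).__name__

# Every inherited method quietly falls back to a plain str.
after_concat = type(s + "!").__name__
after_strip = type(s.strip()).__name__
after_replace = type(s.replace("bul", "BUL")).__name__
after_join = type(TurkishString("-").join(["a", "b"])).__name__

print(kept, after_concat, after_strip, after_replace, after_join)
```

Only the first name printed is TurkishString; the rest are str, which is why the Turkish casing rules are lost after almost any intermediate operation.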
I think __getattribute__ may be the go here. There's a calling cost of course, but you could fairly easily write a __getattribute__ which (a) checked for a matching superclass method and (b) special-cased a few methods, and otherwise made all methods return either the same class (TurkishString) or plain str depending on the majority method flavour. In fact, if I were doing this for real I might make a mixin or intermediate class with such a __getattribute__, provided there was a handy TurkishString(str)-like call to promote a plain str back into the subclass. (My personal preference is solidifying to a .promote(anything) method, which is a discussion for elsewhere.)
This is probably why we don't rely on subclassing that much. Easier to just write a top-level function and forget about subclassing.
Oooh, I do a _lot_ of subclassing :-)

Cheers,
Cameron Simpson <cs@cskk.id.au>