Protected Methods and Python

Wed Apr 16 05:08:26 EDT 2003

Venkatesh Prasad Ranganath wrote:
   ...
> All said, how do you ensure "design by interface" in Python?   If every
> feature of a class is open to the user then he/she
> will use them as he/she sees fit.  However, the writer of the class could
> have coded some methods for ancillary purposes and
> later decides to do away with them in the next version of the class.  How
> should the user deal with this situation?

Quite simple, really: a name that starts with an underscore is there
"for ancillary purposes", as you put it -- it's not in the "published
public interface" of the module, class, etc.  A name _without_ a
leading underscore is part of the public interface, and won't be
"done away with" in the next version of the package, module, class,
and so forth.

Thus, the user should deal with this situation by using only names
exposed in the public interface of the package, module, class, etc --
names that don't start with a leading underscore -- unless of course
said user *WANTS* to tie himself or herself to the specific version
of the reusable component.
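
A minimal sketch of that convention, assuming a made-up library class
(Account and its methods are hypothetical names, used only for
illustration):

```python
class Account:
    """Hypothetical library class, for illustration only."""

    def __init__(self, balance):
        self._balance = balance       # leading underscore: internal detail

    def balance(self):                # no underscore: published interface
        return self._balance

    def _audit(self):                 # ancillary helper, may vanish next release
        return "audited %s" % self._balance


acct = Account(100)
print(acct.balance())     # supported use of the public interface
print(acct._audit())      # works, but ties the caller to this version

# The published names of the class are simply those without the underscore.
public = [name for name in dir(Account) if not name.startswith("_")]
print(public)
```

Nothing stops the call to _audit; the underscore just tells the user
which names come with a promise of stability and which don't.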

> In my opinion, protected and
> private as in C++ or Java empowers the library developer to expose just
> the interface that he/she
> intends the user to see.  Nothing more and nothing less.  (It's an issue
> dealing with abstraction.)  

Not quite true in C++, since casts allow the user to work around this
(in most implementations) -- just as, in Python, simply using the
marked-as-internal names lets the user work around exactly the same
attempts by the library designer to impose his or her views of
appropriate abstraction.

In Python and Java the user can also work around such limitations by
introspection, unless the user's code is treated as untrusted and
is run inside a sandbox that tries to limit the damage the user can
do.  That was the design intent of standard library modules rexec and
Bastion in Python: enforcing those restrictions that are normally
just respected by convention, by treating user code as untrusted and
potentially intended to be actively harmful.  They're currently broken,
but equivalent alternatives such as package Sandbox are being explored.
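
As a rough illustration of the introspection route in Python (the
class Vault and its attribute are hypothetical, invented for this
sketch), even a double-underscore attribute stays reachable once you
look at how the instance actually stores it:

```python
class Vault:
    """Hypothetical class with a name-mangled attribute."""

    def __init__(self):
        self.__secret = 42            # stored on the instance as _Vault__secret


v = Vault()

# Outside the class body no mangling happens, so the plain spelling fails.
try:
    v.__secret
    plain_access_worked = True
except AttributeError:
    plain_access_worked = False

# Introspection shows the real attribute name...
stored = vars(v)
print(stored)

# ...and that name can then be used like any other.
recovered = getattr(v, "_Vault__secret")
print(recovered)
```

So the mangling protects against accidents, not against a user who
is determined to get in.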

But issues of security are really very different from issues of
abstraction, IMHO, even though Java does its best to conflate them --
if you must think of user code as actively trying to do maximum damage
to the system, you've got a LOT of things to worry about.  For normal
use, having the library designer consider the future library user as
a nasty adversary to be foiled at every turn is really an unproductive
mindset, even though it's the one that accessibility restrictions suggest.

> As a result, the user can be more care-free
> while using the library code as he cannot screw up the internals if the
> library code doesn't do it on its
> own.  On the other hand, if all members of a class are visible to the user
> then the user needs to be more diligent while using
> the library.  I prefer the first option.

The user can rather easily "screw up" things if and when he or she
does it DELIBERATELY -- and in fact, even an accident is quite likely.
E.g., consider the hypersimplified code where the library has
something like:

// Library code
class Base {
  private:
    // internal hook, overridden elsewhere inside the library
    virtual int another() {
      return 100;
    }
  public:
    virtual int amethod() {
      return 23 + another();
    }
};

Method another is virtual because class Base is actually derived from
others and that method is overridden inside the library here and there,
etc, etc.  But it's private, so the user can't tamper accidentally
with it, right?  Yeah, right...:-).  So, way over there in userland,
the user, not even "seeing" (you do keep saying "visible" while what's
supposed to happen has to do with "accessible" instead...) the private
parts of the class Base it gets from the library, happily codes:

#include <iostream>

// User code: written without any knowledge of Base's private parts
class Derived: public Base {
  public: virtual int another() { return 1000; }
};
int xx(Base* d)
{
  return d->amethod();
}

int main()
{
  Derived dd;

  std::cout << xx(&dd) << '\n';
  return 0;
}

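Guess what this prints...?  It's 1023, not 123: "private" restricts
who may *call* Base::another, not who may *override* it, so the user's
unrelated method of the same name silently replaces the library's.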

In Python, you can choose between naming the meant-to-be-internal
method _amethod (the user then must KNOW not to use it) or naming it
__amethod (so it gets name-mangled by the Python compiler to help
avoid _accidental_ clashes).  Not foolproof, yet (homonymous classes
in different modules are still *possible*, unless the designer takes
care of that...), but quite a bit better than what pure
*accessibility* restrictions achieve in C++ in cases such as this.
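
Here is a sketch of both choices in Python, reusing the method names
from the C++ example above (the class names SingleUnderscore and
Clobbers are made up for the illustration):

```python
class Base:
    def __another(self):              # mangled to _Base__another
        return 100

    def amethod(self):
        return 23 + self.__another()  # compiled as self._Base__another()


class Derived(Base):
    def another(self):                # unrelated user method: no clash
        return 1000

    def __another(self):              # mangled to _Derived__another: no clash
        return 2000


class SingleUnderscore:
    def _another(self):               # internal by convention only
        return 100

    def amethod(self):
        return 23 + self._another()


class Clobbers(SingleUnderscore):
    def _another(self):               # same name, so this DOES override
        return 1000


print(Derived().amethod())    # 123
print(Clobbers().amethod())   # 1023
```

With the double underscore, the user's subclass can't hit the
library's internal method by accident; with the single underscore
the override goes through just as it does in the C++ program, and
the user is expected to know to leave that name alone.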

> On the issue of working around bugs in a library via runtime protected
> attributes, it would be better if the skillful coder
> fixed the bug ;-) or use another library.  This is just what I would do.

Modifying source code inside a component that is maintained by
another party -- basically "forking" that component -- even where
_possible_ at all because you do get source, is a grievous mistake
(correctly identified as such in the important book "AntiPatterns").

By making your own fixes to the component (i.e., forking it), you make
your situation *UNTENABLE* when new releases of that component come
out -- you're condemned to keep maintaining your fixes forever, porting
them to each successive release, rebuilding from source every time,
distributing the .DLL for YOUR fixed component (when you even
have the LICENSE to do so...) rather than relying on system-provided
versions...

The option to avoid buggy libraries isn't always there -- sometimes
one IS condemned to e.g. develop on top of Microsoft MFC, because
that's part of the specs for your component.  Then, you start cursing
a language which has let the designer of those libraries bar you
from a zillion things you desperately DO need to access.  Or, you
start doing heavy-duty casting, but there are still some
limitations that will stump you quite badly, such as the use of
concrete classes rather than factory idioms where you NEED to be
able to interpose your own subclass of some widget and the deuced
framework insists on generating the concrete base class instead...
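
To make the concrete-class complaint tangible, here is a small Python
sketch (every class name in it is hypothetical and has nothing to do
with MFC's actual API) contrasting a framework class that hard-wires
the widget it creates with one that goes through a factory hook:

```python
class Button:
    def label(self):
        return "OK"


class RigidDialog:
    """Framework class that hard-wires the concrete widget class."""

    def __init__(self):
        self.button = Button()              # no way to interpose a subclass


class FlexibleDialog:
    """Framework class that creates its widget through a factory hook."""

    button_class = Button                   # subclasses may rebind this

    def __init__(self):
        self.button = self.button_class()   # instantiate whatever is bound


class LoudButton(Button):
    """User's own widget subclass."""

    def label(self):
        return "OK!!!"


class MyDialog(FlexibleDialog):
    button_class = LoudButton               # interpose the subclass


print(RigidDialog().button.label())
print(MyDialog().button.label())
```

The factory hook costs the framework designer one line, and it is
exactly the kind of opening that the user of the rigid version ends
up wishing for.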

Alex