[Python-ideas] Make Python code read-only
Eric Snow
ericsnowcurrently at gmail.com
Wed May 21 00:04:22 CEST 2014
An interesting idea. Comments below.
On May 20, 2014 10:58 AM, "Victor Stinner" <victor.stinner at gmail.com> wrote:
> Make Python code read-only
> ==========================
>
> I propose to add an option to Python to make the code read-only. In
> this mode, module namespace, class namespace and function attributes
> become read-only. It is still possible to add a "__readonly__ =
> False" marker to keep a module, a class and/or a function modifiable.
Make __readonly__ a data descriptor (getset in the C-API) on
ModuleType, type, and FunctionType and people could toggle it as
needed. The descriptor could look something like this (in pure
Python):
    import os

    class ReadonlyDescriptor:
        # Read once at class creation, i.e. ignore later changes to
        # PYTHONREADONLY.  Any non-empty value counts as true.
        DEFAULT = os.environ.get('PYTHONREADONLY', False)

        def __init__(self, *, default=None):
            if default is None:
                default = self.DEFAULT
            self.default = default

        def __get__(self, obj, cls):
            if obj is None:
                return self
            try:
                return obj.__dict__['__readonly__']
            except KeyError:
                readonly = bool(self.default)
                obj.__dict__['__readonly__'] = readonly
                return readonly

        def __set__(self, obj, value):
            obj.__dict__['__readonly__'] = value
Alternately, the object structs for the 3 types (e.g. PyModuleObject)
could each grow a "readonly" field (or an extra flag option if there
is an appropriate flag). The descriptor (in C) would use that instead
of obj.__dict__['__readonly__']. However, I'd prefer going through
__dict__.
Either way, the 3 types would share a tp_setattro implementation that
checked the read-only flag. That way there's no need to make sweeping
changes to the 3 types, nor to the dict type.
    def __setattr__(self, name, value):
        # Let the flag itself through so read-only can be toggled off again.
        if name != '__readonly__' and self.__readonly__:
            raise AttributeError('readonly')
        super().__setattr__(name, value)
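To see the two pieces working together, here is a minimal runnable sketch. It assumes a ModuleType subclass (the hypothetical ReadonlyModule) as a stand-in for a patched ModuleType, and it drops the environment-variable default for brevity. It only guards assignment; a real tp_setattro would also cover deletion.

```python
import types


class ReadonlyDescriptor:
    """Data descriptor that keeps the flag in the instance __dict__."""

    def __init__(self, default=False):
        self.default = default

    def __get__(self, obj, cls):
        if obj is None:
            return self
        # Store the default on first access, then always read from __dict__.
        return obj.__dict__.setdefault('__readonly__', bool(self.default))

    def __set__(self, obj, value):
        obj.__dict__['__readonly__'] = bool(value)


class ReadonlyModule(types.ModuleType):
    # Hypothetical stand-in for a ModuleType that has the descriptor built in.
    __readonly__ = ReadonlyDescriptor()

    def __setattr__(self, name, value):
        # The flag itself stays writable so read-only can be toggled.
        if name != '__readonly__' and self.__readonly__:
            raise AttributeError('readonly')
        super().__setattr__(name, value)


mod = ReadonlyModule('demo')
mod.x = 1                  # allowed: the flag defaults to False
mod.__readonly__ = True    # goes through ReadonlyDescriptor.__set__

try:
    mod.x = 2
    blocked = False
except AttributeError:
    blocked = True         # assignment is refused while read-only

mod.__readonly__ = False   # toggle back
mod.x = 3                  # allowed again
```

The same pattern would not carry over to classes unchanged in pure Python, since a class __dict__ is a read-only mappingproxy; there the flag would have to be stored some other way.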
FWIW, the idea of a flag for read-only could be applied to objects in
general, particularly in a future language addition. "__readonly__"
is a good name for the flag so the precedent set by the three types in
this proposal would be a good one.
>
> I chose to make the code read-only by default instead of the opposite.
> In my tests, almost all code can be made read-only without major
> issues; little code requires the "__readonly__ = False" marker.
Read-only by default would be backwards-incompatible, but having a
command-line flag (and/or env var) to enable it would be useful.
For classes a decorator could be nice, though it should wait until it
is more obviously worth doing. I'm not sure it would matter for
functions, though the same decorator would probably work.
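A rough sketch of what such a decorator might look like. Everything here is an assumption for illustration: the decorator name readonly is made up, and the metaclass ReadonlyMeta stands in for a `type` that already checks the flag in its tp_setattro.

```python
class ReadonlyMeta(type):
    # Hypothetical stand-in for a `type` that honours __readonly__.
    def __setattr__(cls, name, value):
        if name != '__readonly__' and getattr(cls, '__readonly__', False):
            raise AttributeError('readonly')
        super().__setattr__(name, value)


def readonly(obj):
    """Mark a class (or function) read-only by setting the flag."""
    obj.__readonly__ = True
    return obj


@readonly
class Config(metaclass=ReadonlyMeta):
    debug = False


try:
    Config.debug = True
    blocked = False
except AttributeError:
    blocked = True   # class namespace is frozen after decoration


@readonly
def helper():
    return 42
# On a plain function this only records the flag; nothing enforces it
# unless FunctionType also checks it.
```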
>
> A module is only made read-only by importlib after the module is
> loaded. The module is still modifiable when code is executed until
> importlib has set all its attributes (ex: __loader__).
With a data descriptor and __setattr__ like I described above, there
is no need to make any changes to importlib.
> Optimizations possible when the code is read-only
> =================================================
...
> More optimizations
> ==================
+1
> One point remains unclear to me. There is a short time window between
> when a module is loaded and when it is made read-only. During this
> window, we cannot rely on the read-only property of the code.
> Specialized code cannot be used safely before the module is known to
> be read-only.
How big a problem would this be in practice?
> Issues with read-only code
> ==========================
>
> * Currently, to keep my implementation simple, a module, class or
> function cannot be made modifiable again. With a registry of
> callbacks, it may be possible to re-enable modification and call
> code to disable optimizations.
With the data descriptor approach, toggling read-only would work.
Enabling/disabling optimizations at that point would depend on how
they were implemented.
> * Lazy initialization of module variables does not work anymore. A
> workaround is to use a mutable type. It can be a dict used as a
> namespace for module modifiable variables.
What do you mean by "lazy initialization of module variables"?
> * It is not possible yet to make the namespace of packages read-only.
> For example, "import encodings.utf_8" adds the symbol "utf_8" to the
> encodings namespace. A workaround is to load all submodules before
> making the namespace read-only. This cannot be done for some large
> modules. For example, the encodings package has a lot of submodules,
> of which only a few are needed.
If read-only is only enforced via __setattr__ then the workaround is
to bind the submodule directly via pkg.__dict__.
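A small sketch of that workaround, assuming a hypothetical GuardedModule whose __setattr__ always refuses (standing in for a package that is already read-only). Writing to the namespace dict never goes through __setattr__, so the binding succeeds.

```python
import types


class GuardedModule(types.ModuleType):
    # Hypothetical read-only package: every attribute assignment is refused.
    def __setattr__(self, name, value):
        raise AttributeError('readonly')


pkg = GuardedModule('pkg')
sub = types.ModuleType('pkg.utf_8')

try:
    pkg.utf_8 = sub                 # normal binding is blocked
    setattr_blocked = False
except AttributeError:
    setattr_blocked = True

pkg.__dict__['utf_8'] = sub         # direct dict write bypasses __setattr__
```

The import machinery would need to bind submodules this way for the workaround to apply to "import pkg.utf_8".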
-eric