When implementing __getstate__ in coöperative inheritance, the typical thing
to do is to call super() to get the state dictionary and add the appropriate
entries. __setstate__ is similar: you extract what you need out of the
dictionary and call super() with the remaining entries. Unfortunately,
object does not have a default implementation, so you need a base class
that defines default __getstate__ and __setstate__ for use in coöperative
inheritance.
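A minimal sketch of such an anchor base class (the class names and the exact
state handling here are illustrative assumptions, not the original poster's
code):

```python
import pickle


class GetSetStateBase:
    """Anchor default __getstate__/__setstate__ for coöperative inheritance."""

    def __getstate__(self):
        # End of the super() chain: start from a copy of the instance dict.
        return dict(self.__dict__)

    def __setstate__(self, state):
        # End of the super() chain: consume whatever entries remain.
        self.__dict__.update(state)


class Point(GetSetStateBase):
    def __init__(self, x):
        self.x = x

    def __getstate__(self):
        state = super().__getstate__()
        state['x'] = self.x           # add/override this class's entries
        return state

    def __setstate__(self, state):
        self.x = state.pop('x')       # extract what this class needs
        super().__setstate__(state)   # pass the rest up the chain


p = pickle.loads(pickle.dumps(Point(3)))
```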
I suggest that this be added to object.
That would be great.
On Friday, May 16, 2014 12:16:52 AM UTC-4, Antony Lee wrote:
> Actually, a more reasonable solution would be to have range handle keyword
> arguments and map "range(start=x)" to "count(x)". Or, perhaps more simply,
> "range(x, None)" (so that no keyword arguments are needed).
>> Now that I think about it, I would ideally want `itertools.count` to be
>> deprecated in favor of `range(float('inf'))`, but I know that would never
>> happen.
>> On Thursday, May 15, 2014 11:02:56 PM UTC+3, Ram Rachum wrote:
>>> I suggest exposing `itertools.count.start` and implementing
>>> `itertools.count.__eq__` based on it. This'll provide the same benefits
>>> that `range` got by exposing `range.start` and allowing `range.__eq__`.
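A sketch of what the proposed API shape might look like; the wrapper class
below is hypothetical, since `itertools.count` does not actually expose
`.start` or value-based equality:

```python
import itertools


class Count:
    """Hypothetical: a count that exposes .start/.step and supports ==,
    mirroring what range gained from exposing range.start."""

    def __init__(self, start=0, step=1):
        self.start = start
        self.step = step

    def __iter__(self):
        return itertools.count(self.start, self.step)

    def __eq__(self, other):
        if not isinstance(other, Count):
            return NotImplemented
        return (self.start, self.step) == (other.start, other.step)

    def __hash__(self):
        return hash((Count, self.start, self.step))


first_three = list(itertools.islice(Count(5), 3))
```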
>> Python-ideas mailing list
>> Code of Conduct: http://python.org/psf/codeofconduct/
On 7 June 2014 16:05, Neil Girdhar <mistersheik(a)gmail.com> wrote:
> I use cooperative multiple inheritance throughout my (large-ish) project,
> and I find it very comfortable and powerful. I am currently using the class
> below to serve as an anchor point. The thing is that this behavior is
> already implemented somewhere in Python (where?) since it is the default
> behaviour if __getstate__ or __setstate__ don't exist. Why not make it
> explicitly available so that a super() call can reach it?
There is fallback behaviour in the pickle and copy modules that
doesn't rely on the getstate/setstate APIs. Those fallbacks are
defined by the protocols, not by the object model.
https://docs.python.org/3/library/pickle.html#pickle-inst covers the
available protocols for instance pickling.
https://docs.python.org/3/library/copy.html covers (towards the end)
some of the options for making class instances copyable.
https://docs.python.org/3/library/copyreg.html is an additional
registry that allows third parties to make instances of classes
defined elsewhere support pickling and copying without relying on
those classes providing pickling support themselves.
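As an illustration of the copyreg approach described above (the class and
reduction function names are invented for the example):

```python
import copyreg
import pickle


class ThirdPartyPoint:
    """Stands in for a class defined elsewhere that we cannot modify."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def pickle_point(p):
    # Tell pickle how to reconstruct the instance: a callable plus its args.
    return ThirdPartyPoint, (p.x, p.y)


# Register the reduction function without touching the class itself.
copyreg.pickle(ThirdPartyPoint, pickle_point)

q = pickle.loads(pickle.dumps(ThirdPartyPoint(1, 2)))
```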
> I think I saw or got an email from Guido that I can't seem to find that
> rightly points out that object doesn't have __dict__ so this can't be done.
> I'm curious why object doesn't have __dict__? Where does the __dict__ come
> into existence? I assume that objects of type object and instantiated
> objects of other types have the same metaclass; does the metaclass treat
> them differently?
Types defined in C extensions and those defined dynamically on the
heap share a metaclass at runtime, but their initialisation code is
different. You can also define Python level types without a __dict__
by declaring a __slots__ attribute with no __dict__ entry (for
example, collections.namedtuple uses that to ensure namedtuple
instances are exactly the same size as ordinary tuples - the mapping
from field names to tuple indices is maintained on the class).
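The __slots__ behaviour described above can be demonstrated directly (the
class names are illustrative):

```python
class Slotted:
    __slots__ = ('x',)  # no '__dict__' entry, so instances get no __dict__


class Regular:
    pass


s = Slotted()
s.x = 1              # slot attributes still work
try:
    s.y = 2          # but arbitrary attributes do not
except AttributeError:
    could_add_attr = False
else:
    could_add_attr = True

has_dict = hasattr(s, '__dict__')                   # no per-instance dict
regular_has_dict = hasattr(Regular(), '__dict__')   # ordinary class has one
```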
P.S. Posting through Google Groups doesn't work properly - it messes
up the reply headers completely. gmane does a better job of
interoperating with the mailing list software (as far as I am aware,
Google just doesn't care whether or not interaction with non-Google
lists actually works).
Nick Coghlan | ncoghlan(a)gmail.com | Brisbane, Australia
I'm trying to find the best option to make CPython faster. I would
like to discuss here a first idea of making the Python code read-only
to allow new optimizations.
Make Python code read-only
I propose to add an option to Python to make the code read-only. In
this mode, module namespaces, class namespaces and function attributes
become read-only. It would still be possible to add a "__readonly__ =
False" marker to keep a module, a class and/or a function modifiable.
I chose to make the code read-only by default instead of the opposite.
In my tests, almost all code can be made read-only without major issues;
only a little code requires the "__readonly__ = False" marker.
A module is only made read-only by importlib after the module is
loaded. The module is still modifiable while its code is executed, until
importlib has set all its attributes (ex: __loader__).
I have a proof of concept: a fork of Python 3.5 making code read-only
if the PYTHONREADONLY environment variable is set to 1. Commands to
try it:
hg clone http://hg.python.org/sandbox/readonly
cd readonly && ./configure && make
PYTHONREADONLY=1 ./python -c 'import os; os.x = 1'
# ValueError: read-only dictionary
Status of the standard library (Lib/*.py): 139 modules are read-only,
25 are modifiable. Except for the sys module, all modules written in C
can be made read-only.
I'm surprised that so little code relies on the ability to modify
everything. Most of the code can be read-only.
Optimizations possible when the code is read-only
* Inline calls to functions.
* Replace calls to pure functions (those without side effects) with their
result. For example, len("abc") can be replaced with 3.
* Constants can be replaced with their values (at least for simple
types like bytes, int and str).
It is for example possible to implement these optimizations by
manipulating the Abstract Syntax Tree (AST) during the compilation
from the source code to bytecode. See my astoptimizer project which
already implements similar optimizations:
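The len("abc") → 3 folding mentioned above can be sketched as an AST
transformation; the transformer below is an illustrative toy, not
astoptimizer's actual code:

```python
import ast


class FoldLen(ast.NodeTransformer):
    """Replace len(<string constant>) with the precomputed length."""

    def visit_Call(self, node):
        self.generic_visit(node)
        if (isinstance(node.func, ast.Name) and node.func.id == 'len'
                and not node.keywords and len(node.args) == 1
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, (str, bytes))):
            # Substitute the call with a constant, keeping source locations.
            return ast.copy_location(
                ast.Constant(len(node.args[0].value)), node)
        return node


tree = FoldLen().visit(ast.parse("n = len('abc')"))
ast.fix_missing_locations(tree)
ns = {}
exec(compile(tree, '<folded>', 'exec'), ns)
# The call to len() is gone from the AST: it was computed at compile time.
call_remains = 'Call' in ast.dump(tree)
```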
My main motivation to make code read-only is to specialize a function:
optimize a function for a specific environment (type of parameters,
external symbols like other functions, etc). Checking the type of
parameters can be fast (especially when implemented in C), but it
would be expensive to check that all global variables used in the
function were not modified since the function has been "specialized".
For example, if os.path.isabs(path) is called: you have to check that
the "os.path" and "os.path.isabs" attributes were not modified since the
function was specialized. If we know that globals are read-only,
these checks are no longer needed, and so it becomes cheap to decide
whether the specialized function can be used or not.
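The guard checks that read-only namespaces would make unnecessary can be
sketched like this (the function and variable names are invented for
illustration):

```python
import os.path

# Capture the objects the "specialized" code was compiled against.
_orig_path = os.path
_orig_isabs = os.path.isabs


def guarded_isabs(path):
    # Without read-only namespaces, every call must verify that nothing
    # was monkey-patched before trusting the specialized fast path.
    if os.path is _orig_path and os.path.isabs is _orig_isabs:
        return _orig_isabs(path)   # specialized fast path
    return os.path.isabs(path)     # generic fallback


result = guarded_isabs('/tmp')
```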
It becomes possible to "learn" types (trace the execution of the
application, and then compile for the recorded types). Knowing the
type of function parameters, result and local variables opens an
interesting class of new optimizations, but I prefer to discuss this
later, after discussing the idea of making the code read-only.
One point remains unclear to me. There is a short time window between
the moment a module is loaded and the moment it is made read-only. During
this window, we cannot rely on the read-only property of the code.
Specialized code cannot be used safely before the module is known to
be read-only. I don't know yet how the switch from "slow" code to
optimized code should be implemented.
Issues with read-only code
* Currently, it's not possible to make a module, class or function
modifiable again; I left this out to keep my implementation simple. With
a registry of callbacks, it may be possible to re-enable modification
and call code to disable the optimizations.
* PyPy supports this, but thanks to its JIT it can re-optimize the
modified code during execution. Writing a JIT is very complex;
I'm trying to find a compromise between the fast PyPy and the slow
CPython. Adding a JIT to CPython is out of my scope; it requires too
many modifications of the code.
* With read-only code, monkey-patching cannot be used anymore, which is
annoying for running tests. An obvious solution is to disable read-only
mode when running tests, but this can be seen as unsafe, since tests are
usually what establishes trust in the code.
* The sys module cannot be made read-only because modifying sys.stdout
and sys.ps1 is a common use case.
* The warnings module tries to add a __warningregistry__ global
variable in the module where the warning was emitted, to avoid repeating
warnings that should only be emitted once. The problem is that the
module namespace is made read-only before this variable is added. A
workaround would be to maintain these dictionaries in the warnings
module directly, but it becomes harder to clear the dictionary when a
module is unloaded or reloaded. Another workaround is to add
__warningregistry__ before making a module read-only.
* Lazy initialization of module variables does not work anymore. A
workaround is to use a mutable type, such as a dict serving as a
namespace for the module's modifiable variables.
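The mutable-namespace workaround above might look like the following
(the names are illustrative; this relies on the proposal's assumption that
the contents of a dict bound in a read-only namespace remain mutable):

```python
# In read-only mode the module namespace itself could not be rebound,
# but the dict object it references can still be mutated.
_state = {}


def get_cache():
    # Lazy initialization mutates the dict, not the module namespace.
    if 'cache' not in _state:
        _state['cache'] = {}
    return _state['cache']


cache = get_cache()
```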
* The interactive interpreter sets a "_" variable in the builtins
namespace. I have no workaround for this: the "_" variable is no longer
created in read-only mode. Don't run the interactive interpreter in
read-only mode.
* It is not possible yet to make the namespace of packages read-only.
For example, "import encodings.utf_8" adds the symbol "utf_8" to the
encodings namespace. A workaround is to load all submodules before
making the namespace read-only. This cannot be done for some large
packages. For example, encodings has a lot of submodules, and only a
few of them are needed.
Read the documentation for more information:
See my notes for all ideas to optimize CPython:
I explain there why I prefer to optimize CPython instead of working on
PyPy or another Python implementation like Pyston, Numba or similar
projects.
From the "idle speculation" files (inspired by the recent thread on
python-dev): has anyone ever experimented with offering string methods like
find() on StringIO objects?
I don't work in any sufficiently memory-constrained environments these days
for that style of API to be worth the hassle relative to a normal
string; it just struck me as a potentially interesting approach to the
notion of a string manipulation type that didn't generally copy data around
and could use different code point sizes internally for different parts of
the text data.
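The API shape being mused about might look like the following naive sketch.
The subclass and its find() are invented for illustration; a real
implementation would search the internal buffer directly rather than paying
for the getvalue() copy:

```python
import io


class SearchableStringIO(io.StringIO):
    """Hypothetical StringIO exposing a str-like find()."""

    def find(self, sub, start=0, end=None):
        # Naive version: copies the buffer via getvalue(). A zero-copy
        # implementation would scan the internal storage instead.
        buf = self.getvalue()
        if end is None:
            end = len(buf)
        return buf.find(sub, start, end)


s = SearchableStringIO()
s.write("hello world")
pos = s.find("world")
```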