Mailman 3 June 2014 - Python-ideas

Put default setstate and getstate on object for use in coöperative inheritance.
by Neil Girdhar June 8, 2014

June 8, 2014

When implementing getstate in coöoerative inheritance, the typical thing to do is to call super to get dictionary and add the appropriate entries. Setstate is similar: you extract what you need out of the dictionary and call super with the remaining entries. Unfortunately, object does not have a default implementation, so you need a base class like so: class DefaultSetstateAndGetstate: """ Define default getstate and setstate for use in coöperative inheritance. """ def … [View More]

3 6

Re: [Python-ideas] Expose `itertools.count.start` and implement `itertools.count.__eq__` based on it, like `range`.
by Neil Girdhar June 8, 2014

June 8, 2014

That would be great. On Friday, May 16, 2014 12:16:52 AM UTC-4, Antony Lee wrote: > > Actually, a more reasonable solution would be to have range handle keyword > arguments and map "range(start=x)" to "count(x)". Or, perhaps more simply, > "range(x, None)" (so that no keyword arguments are needed). > > > 2014-05-15 13:04 GMT-07:00 Ram Rachum <ram.r...(a)gmail.com <javascript:>>: > >> Now that I think about it, I would ideally want `itertools.count` … [View More]

4 4

Re: [Python-ideas] Put default setstate and getstate on object for use in coöperative inheritance.
by Nick Coghlan June 7, 2014

June 7, 2014

On 7 June 2014 16:05, Neil Girdhar <mistersheik(a)gmail.com> wrote: > I use cooperative multiple inheritance throughout my (large-ish) project, > and I find it very comfortable and powerful. I am currently using the class > below to serve as an anchor point. The thing is that this behavior is > already implemented somewhere in Python (where?) since it is the default > behaviour if getstate or setstate don't exist. Why not explicitly make it > available to call super? … [View More]

5 10

Make Python code read-only
by Victor Stinner June 5, 2014

June 5, 2014

Hi, I'm trying to find the best option to make CPython faster. I would like to discuss here a first idea of making the Python code read-only to allow new optimizations. Make Python code read-only ========================== I propose to add an option to Python to make the code read-only. In this mode, module namespace, class namespace and function attributes become read-only. It is still be possible to add a "__readonly__ = False" marker to keep a module, a class and/or a function modifiable.… [View More] I chose to make the code read-only by default instead of the opposite. In my test, almost all code can be made read-only without major issue, few code requires the "__readonly__ = False" marker. A module is only made read-only by importlib after the module is loaded. The module is stil modifiable when code is executed until importlib has set all its attributes (ex: __loader__). I have a proof of concept: a fork of Python 3.5 making code read-only if the PYTHONREADONLY environment variable is set to 1. Commands to try it: hg clone http://hg.python.org/sandbox/readonly cd readonly && ./configure && make PYTHONREADONLY=1 ./python -c 'import os; os.x = 1' # ValueError: read-only dictionary Status of the standard library (Lib/*.py): 139 modules are read-only, 25 are modifiable. Except of the sys module, all modules writen in C are read-only. I'm surprised that so few code rely on the ability to modify everything. Most of the code can be read-only. Optimizations possible when the code is read-only ================================================= * Inline calls to functions. * Replace calls to pure functions (without side effect) with the result. For example, len("abc") can be replaced with 3. * Constants can be replaced with their values (at least for simple types like bytes, int and str). It is for example possible to implement these optimizations by manipulating the Abstract Syntax Tree (AST) during the compilation from the source code to bytecode. See my astoptimizer project which already implements similar optimizations: https://bitbucket.org/haypo/astoptimizer More optimizations ================== My main motivation to make code read-only is to specialize a function: optimize a function for a specific environment (type of parameters, external symbols like other functions, etc). Checking the type of parameters can be fast (especially when implemented in C), but it would be expensive to check that all global variables used in the function were not modified since the function has been "specialized". For example, if os.path.isabs(path) is called: you have to check that "os.path" and "os.path.isabs" attributes were not modified and that the isabs() was not modified. If we know that globals are read-only, these checks are no more needed and so it becomes cheap to decide if the specialized function can be used or not. It becomes possible to "learn" types (trace the execution of the application, and then compile for the recorded types). Knowing the type of function parameters, result and local variables opens an interesting class of new optimizations, but I prefer to discuss this later, after discussing the idea of making the code read-only. One point remains unclear to me. There is a short time window between a module is loaded and the module is made read-only. During this window, we cannot rely on the read-only property of the code. Specialized code cannot be used safetly before the module is known to be read-only. I don't know yet how the switch from "slow" code to optimized code should be implemented. Issues with read-only code ========================== * Currently, it's not possible to allow again to modify a module, class or function to keep my implementation simple. With a registry of callbacks, it may be possible to enable again modification and call code to disable optimizations. * PyPy implements this but thanks to its JIT, it can optimize again the modified code during the execution. Writing a JIT is very complex, I'm trying to find a compromise between the fast PyPy and the slow CPython. Add a JIT to CPython is out of my scope, it requires too much modifications of the code. * With read-only code, monkey-patching cannot be used anymore. It's annoying to run tests. An obvious solution is to disable read-only mode to run tests, which can be seen as unsafe since tests are usually used to trust the code. * The sys module cannot be made read-only because modifying sys.stdout and sys.ps1 is a common use case. * The warnings module tries to add a __warningregistry__ global variable in the module where the warning was emited to not repeat warnings that should only be emited once. The problem is that the module namespace is made read-only before this variable is added. A workaround would be to maintain these dictionaries in the warnings module directly, but it becomes harder to clear the dictionary when a module is unloaded or reloaded. Another workaround is to add __warningregistry__ before making a module read-only. * Lazy initialization of module variables does not work anymore. A workaround is to use a mutable type. It can be a dict used as a namespace for module modifiable variables. * The interactive interpreter sets a "_" variable in the builtins namespace. I have no workaround for this. The "_" variable is no more created in read-only mode. Don't run the interactive interpreter in read-only mode. * It is not possible yet to make the namespace of packages read-only. For example, "import encodings.utf_8" adds the symbol "utf_8" to the encodings namespace. A workaround is to load all submodules before making the namespace read-only. This cannot be done for some large modules. For example, the encodings has a lot of submodules, only a few are needed. Read the documentation for more information: http://hg.python.org/sandbox/readonly/file/tip/READONLY.txt More optimizations ================== See my notes for all ideas to optimize CPython: http://haypo-notes.readthedocs.org/faster_cpython.html I explain there why I prefer to optimize CPython instead of working on PyPy or another Python implementation like Pyston, Numba or similar projects. Victor [View Less]

14 28

String-like methods on StringIO objects?
by Nick Coghlan June 5, 2014

June 5, 2014

>From the "idle speculation" files (inspired by the recent thread on python-dev): has anyone ever experimented with offering string methods like find() on StringIO objects? I don't work in any sufficiently memory constrained environments these days that that style of API would be worth the hassle relative to a normal string, it just struck me as a potentially interesting approach to the notion of a string manipulation type that didn't generally copy data around and could use different code … [View More]

2 1