Hi,

I noticed that people have started rushing for the next season on Python's GSoC mailing lists. Do we have any interested developers here, or general ideas about suitable topics? I would expect that we'll do as in previous years and participate under Python's umbrella.

Please reply on the cython-devel mailing list.

Stefan
On 8 March 2012 14:27, Stefan Behnel <stefan_ml@behnel.de> wrote:
Hi,
I noticed that people start rushing for the next season on Python's GSoC mailing lists. Do we have any interested developers here, or general ideas about suitable topics? I would expect that we'll do as in the last years and participate under Python's umbrella.
I will likely be submitting a proposal for the OpenCL support CEP. As for other proposals, anyone can come up with something themselves or choose a suitable CEP. Some ideas:

- fused cdef classes (probably not as an entire gsoc project)
- profile guided optimizations (using python's profilers and/or a custom profiler that collects data such as types etc, which can be used to specialize variables inside loops (or entire functions) with a fallback to normal mode in case the type changes)
- cython library to contain common functionality (although that might be a bit boring and rather involved)
- better type inference, that would be enabled by default and again handle things like reassignments of variables and fallbacks to the default object type. With entry caching Cython could build a database of types ((extension) classes, functions, variables) used in the modules and functions that are compiled (also def functions), and infer the types used and specialize on those. Maybe a switch should be added to cython to handle circular dependencies, or maybe with the distutils preprocessing it can run all the type inference first and keep track of unresolved entries, and try to fill those in after building the database. For bonus points the user can be allowed to write plugins to aid the process.
- llvm based JIT :), i.e. have Cython instrument the generated code to record information and use that to create specializations at runtime (probably far out for a gsoc)

I'd be willing to (co)mentor if wanted and possible within the constraints of the gsoc program.
Please reply on the cython-devel mailing list.
Stefan
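To make the profile-guided idea in the message above a little more concrete, here is a minimal sketch of how argument types could be collected with Python's standard profiling hook. It is only an illustration, not anything Cython does: the helper name `collect_arg_types` and the example functions `scale` and `workload` are made up for this sketch, and a real implementation would have to feed the recorded types back into the compiler.

```python
import sys
from collections import Counter, defaultdict


def collect_arg_types(func, *args, **kwargs):
    """Run func and record the argument types of every Python-level call.

    Hypothetical helper: returns {function name: Counter of type tuples}.
    """
    observed = defaultdict(Counter)

    def profiler(frame, event, arg):
        # Only Python function entries are interesting here.
        if event == "call":
            code = frame.f_code
            names = code.co_varnames[:code.co_argcount]
            signature = tuple(type(frame.f_locals[n]).__name__ for n in names)
            observed[code.co_name][signature] += 1

    sys.setprofile(profiler)
    try:
        func(*args, **kwargs)
    finally:
        sys.setprofile(None)  # always remove the hook again
    return observed


def scale(x, factor):
    return x * factor


def workload():
    scale(2, 3)
    scale(2.0, 3)
    scale(1, 1)


observed_types = collect_arg_types(workload)
```

The most common signature for a function (here two ints for `scale`) would be the natural candidate for a specialized version, with the generic code kept as the fallback.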
I'm pretty sure you can't be a student and mentor at the same time... Something to keep in mind...

mark florisson <markflorisson88@gmail.com> wrote:
On 8 March 2012 14:27, Stefan Behnel <stefan_ml@behnel.de> wrote:
Hi,
I noticed that people start rushing for the next season on Python's GSoC mailing lists. Do we have any interested developers here, or general ideas about suitable topics? I would expect that we'll do as in the last years and participate under Python's umbrella.
I will likely be submitting a proposal for the OpenCL support CEP. As for other proposals, anyone can come up with something themselves or choose a suitable CEP. Some ideas: - fused cdef classes (probably not as an entire gsoc project) - profile guided optimizations (using python's profilers and/or a custom profiler that collects data such as types etc, which can be used to specialize variables inside loops (or entire functions) with a fallback to normal mode in case the type changes) - cython library to contain common functionality (although that might be a bit boring and rather involved) - better type inference, that would be enabled by default and again handle thing like reassignments of variables and fallbacks to the default object type. With entry caching Cython could build a database of types ((extension) classes, functions, variables) used in the modules and functions that are compiled (also def functions), and infer the types used and specialize on those. Maybe a switch should be added to cython to handle circular dependencies, or maybe with the distutils preprocessing it can run all the type inference first and keep track of unresolved entries, and try to fill those in after building the database. For bonus points the user can be allowed to write plugins to aid the process. - llvm based JIT :), i.e. have Cython instrument the generated code to record information and use that to create specializations at runtime (probably far out for a gsoc) I'd be willing to (co)mentor if wanted and possible within the constraints of the gsoc program.
Please reply on the cython-devel mailing list.
Stefan
mark florisson, 11.03.2012 07:44:
- better type inference, that would be enabled by default and again handle thing like reassignments of variables and fallbacks to the default object type. With entry caching Cython could build a database of types ((extension) classes, functions, variables) used in the modules and functions that are compiled (also def functions), and infer the types used and specialize on those. Maybe a switch should be added to cython to handle circular dependencies, or maybe with the distutils preprocessing it can run all the type inference first and keep track of unresolved entries, and try to fill those in after building the database. For bonus points the user can be allowed to write plugins to aid the process.
That would be my favourite. We definitely need control flow driven type inference, local type specialisation, variable renaming, etc. Maybe even whole program (or at least module) analysis, like ShedSkin and PyPy do for their restricted Python dialects. Any serious step towards that goal would be a good outcome of a GSoC. There's also better support for PyPy through its cpyext C-API layer, but that currently involves much more work on PyPy than on Cython, including a lot of performance optimisation on their side. And there doesn't seem to be much interest in the PyPy project for doing this. Stefan
2012/3/11 Stefan Behnel <stefan_ml@behnel.de>:
mark florisson, 11.03.2012 07:44:
- better type inference, that would be enabled by default and again handle thing like reassignments of variables and fallbacks to the default object type. With entry caching Cython could build a database of types ((extension) classes, functions, variables) used in the modules and functions that are compiled (also def functions), and infer the types used and specialize on those. Maybe a switch should be added to cython to handle circular dependencies, or maybe with the distutils preprocessing it can run all the type inference first and keep track of unresolved entries, and try to fill those in after building the database. For bonus points the user can be allowed to write plugins to aid the process.
That would be my favourite. We definitely need control flow driven type inference, local type specialisation, variable renaming, etc. Maybe even whole program (or at least module) analysis, like ShedSkin and PyPy do for their restricted Python dialects. Any serious step towards that goal would be a good outcome of a GSoC.
I think we should be careful here and try to avoid making Cython code more complicated.
There's also better support for PyPy through its cpyext C-API layer, but that currently involves much more work on PyPy than on Cython, including a lot of performance optimisation on their side. And there doesn't seem to be much interest in the PyPy project for doing this.
I'm interested in function/method call inlining based on CF analysis and on the generic cyfunction's signature. I'll do some benchmarks to see how much we gain from this optimization.

-- vitja.
Vitja Makarov, 11.03.2012 09:51:
2012/3/11 Stefan Behnel:
mark florisson, 11.03.2012 07:44:
- better type inference, that would be enabled by default and again handle thing like reassignments of variables and fallbacks to the default object type. With entry caching Cython could build a database of types ((extension) classes, functions, variables) used in the modules and functions that are compiled (also def functions), and infer the types used and specialize on those. Maybe a switch should be added to cython to handle circular dependencies, or maybe with the distutils preprocessing it can run all the type inference first and keep track of unresolved entries, and try to fill those in after building the database. For bonus points the user can be allowed to write plugins to aid the process.
That would be my favourite. We definitely need control flow driven type inference, local type specialisation, variable renaming, etc. Maybe even whole program (or at least module) analysis, like ShedSkin and PyPy do for their restricted Python dialects. Any serious step towards that goal would be a good outcome of a GSoC.
I think we should be careful here and try to avoid making Cython code more complicated.
I agree that WPA is probably way out of scope. However, control flow driven type inference would allow us to infer the type of a variable in a given block, e.g. for code like this:

    if isinstance(x, list):
        ...
    else:
        ...

or handle cases like this:

    def test(x):
        x = list(x)
        # ... do read-only stuff with x below this point ...

Here, we currently infer that x is an unknown object that is being assigned to twice, even though it's obviously a list in all interesting parts of the function.

Stefan
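As a rough illustration of what flow-sensitive inference means for the two cases above, the following sketch walks a function body with Python's `ast` module and tracks a per-statement type environment. It is a toy under stated assumptions, not Cython's control flow analysis: the names `KNOWN_TYPES`, `infer_function` and friends are invented here, builtins are assumed not to be shadowed, and only plain assignments and `isinstance` tests are understood.

```python
import ast

# Assumption for this sketch: these builtins are never rebound.
KNOWN_TYPES = {"list", "dict", "set", "tuple", "str", "int", "float"}


def type_of_expr(node, env):
    """Return a type name for an expression, or "object" if unknown."""
    if isinstance(node, ast.Name):
        return env.get(node.id, "object")
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id in KNOWN_TYPES):
        return node.func.id  # e.g. list(x) yields a list
    return "object"


def narrowing_from_test(test):
    """Return {name: type} implied by an `isinstance(name, type)` test."""
    if (isinstance(test, ast.Call) and isinstance(test.func, ast.Name)
            and test.func.id == "isinstance" and len(test.args) == 2
            and isinstance(test.args[0], ast.Name)
            and isinstance(test.args[1], ast.Name)
            and test.args[1].id in KNOWN_TYPES):
        return {test.args[0].id: test.args[1].id}
    return {}


def infer_block(stmts, env, types_at):
    """Update env statement by statement; snapshot it per line number."""
    for stmt in stmts:
        if isinstance(stmt, ast.If):
            then_env = {**env, **narrowing_from_test(stmt.test)}
            else_env = dict(env)
            infer_block(stmt.body, then_env, types_at)
            infer_block(stmt.orelse, else_env, types_at)
            # Merge point: keep a type only if both branches agree on it.
            for name in set(then_env) | set(else_env):
                a = then_env.get(name, "object")
                b = else_env.get(name, "object")
                env[name] = a if a == b else "object"
            continue
        if (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)):
            env[stmt.targets[0].id] = type_of_expr(stmt.value, env)
        types_at[stmt.lineno] = dict(env)


def infer_function(source):
    """Map line number -> {variable: inferred type} for one function."""
    func = ast.parse(source).body[0]
    env = {arg.arg: "object" for arg in func.args.args}
    types_at = {}
    infer_block(func.body, env, types_at)
    return types_at


REASSIGN = "def test(x):\n    x = list(x)\n    return len(x)\n"
BRANCH = (
    "def f(x):\n"
    "    if isinstance(x, list):\n"
    "        y = x\n"
    "    else:\n"
    "        y = None\n"
    "    return x\n"
)
```

For `REASSIGN`, `x` is seen as a list from the assignment onwards; for `BRANCH`, `x` is a list only inside the `isinstance` branch and falls back to a generic object after the branches merge.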
2012/4/2 Stefan Behnel <stefan_ml@behnel.de>:
Vitja Makarov, 11.03.2012 09:51:
2012/3/11 Stefan Behnel:
mark florisson, 11.03.2012 07:44:
- better type inference, that would be enabled by default and again handle thing like reassignments of variables and fallbacks to the default object type. With entry caching Cython could build a database of types ((extension) classes, functions, variables) used in the modules and functions that are compiled (also def functions), and infer the types used and specialize on those. Maybe a switch should be added to cython to handle circular dependencies, or maybe with the distutils preprocessing it can run all the type inference first and keep track of unresolved entries, and try to fill those in after building the database. For bonus points the user can be allowed to write plugins to aid the process.
That would be my favourite. We definitely need control flow driven type inference, local type specialisation, variable renaming, etc. Maybe even whole program (or at least module) analysis, like ShedSkin and PyPy do for their restricted Python dialects. Any serious step towards that goal would be a good outcome of a GSoC.
I think we should be careful here and try to avoid making Cython code more complicated.
I agree that WPA is probably way out of scope. However, control flow driven type inference would allow us to infer the type of a variable in a given block, e.g. for code like this:
    if isinstance(x, list):
        ...
    else:
        ...
or handle cases like this:
    def test(x):
        x = list(x)
        # ... do read-only stuff with x below this point ...
Here, we currently infer that x is an unknown object that is being assigned to twice, even though it's obviously a list in all interesting parts of the function.
What to do if an entry is of PyObject type in some block and of some C-type in another? Should it be split into two different entries?

-- vitja.
Vitja Makarov, 02.04.2012 14:14:
2012/4/2 Stefan Behnel:
Vitja Makarov, 11.03.2012 09:51:
2012/3/11 Stefan Behnel:
mark florisson, 11.03.2012 07:44:
- better type inference, that would be enabled by default and again handle thing like reassignments of variables and fallbacks to the default object type. With entry caching Cython could build a database of types ((extension) classes, functions, variables) used in the modules and functions that are compiled (also def functions), and infer the types used and specialize on those. Maybe a switch should be added to cython to handle circular dependencies, or maybe with the distutils preprocessing it can run all the type inference first and keep track of unresolved entries, and try to fill those in after building the database. For bonus points the user can be allowed to write plugins to aid the process.
That would be my favourite. We definitely need control flow driven type inference, local type specialisation, variable renaming, etc. Maybe even whole program (or at least module) analysis, like ShedSkin and PyPy do for their restricted Python dialects. Any serious step towards that goal would be a good outcome of a GSoC.
I think we should be careful here and try to avoid making Cython code more complicated.
I agree that WPA is probably way out of scope. However, control flow driven type inference would allow us to infer the type of a variable in a given block, e.g. for code like this:
    if isinstance(x, list):
        ...
    else:
        ...
or handle cases like this:
    def test(x):
        x = list(x)
        # ... do read-only stuff with x below this point ...
Here, we currently infer that x is an unknown object that is being assigned to twice, even though it's obviously a list in all interesting parts of the function.
What to do if an entry is of PyObject type in some block and of some C-type in another?
Should it be split into two different entries?
Yes, that's what I meant with "variable renaming". I admit that I have no idea how complex that would be, though... Stefan
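One way to picture the "variable renaming" mentioned above: give every assignment to a name a fresh version, so that each version has exactly one type and can get its own entry. The sketch below does this for straight-line code only, using Python's `ast` module (`ast.unparse` needs Python 3.9 or later). The class name `VersionRenamer` and the `name_N` scheme are assumptions of this sketch; branches, loops and augmented assignments would need merge handling that is not shown.

```python
import ast


class VersionRenamer(ast.NodeTransformer):
    """Rename each plain assignment target to a fresh versioned name."""

    def __init__(self):
        self.version = {}  # variable name -> current version number

    def visit_FunctionDef(self, node):
        # Parameters are version 0 of their variable.
        for arg in node.args.args:
            self.version[arg.arg] = 0
            arg.arg = f"{arg.arg}_0"
        node.body = [self.visit(stmt) for stmt in node.body]
        return node

    def visit_Assign(self, node):
        # Visit the right-hand side first so reads see the old version.
        node.value = self.visit(node.value)
        node.targets = [self.visit(target) for target in node.targets]
        return node

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Store):
            self.version[node.id] = self.version.get(node.id, -1) + 1
        current = self.version.get(node.id)
        if current is None:
            return node  # global or builtin, leave untouched
        renamed = ast.Name(id=f"{node.id}_{current}", ctx=node.ctx)
        return ast.copy_location(renamed, node)


def rename_versions(source):
    """Return the source with every local variable split into versions."""
    tree = VersionRenamer().visit(ast.parse(source))
    return ast.unparse(tree)


renamed_source = rename_versions(
    "def test(x):\n    x = list(x)\n    return len(x)\n"
)
```

In the result, the parameter becomes `x_0` (an unknown object) and the reassigned value becomes `x_1` (a list), which is the split into two entries asked about above.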
On Sat, Mar 10, 2012 at 10:44 PM, mark florisson <markflorisson88@gmail.com> wrote:
On 8 March 2012 14:27, Stefan Behnel <stefan_ml@behnel.de> wrote:
Hi,
I noticed that people start rushing for the next season on Python's GSoC mailing lists. Do we have any interested developers here, or general ideas about suitable topics? I would expect that we'll do as in the last years and participate under Python's umbrella.
Also, we'd like to see patches from anyone interested in being a GSoC student, as this will be a requirement as in past years.
I will likely be submitting a proposal for the OpenCL support CEP.
OpenCL would be an interesting experiment, but I think it still has limited utility. Dag and I were talking the other day about the challenge of generating the best possible code for evaluating array expressions (think inlined memoryview arithmetic), taking into account memory layout, blocking, etc. Fortran does this really well, and it could be an interesting direction.
As for other proposals, anyone can come up with something themselves or choose a suitable CEP. Some ideas:
Numbering items for clarity:

1. fused cdef classes (probably not as an entire gsoc project)
2. profile guided optimizations (using python's profilers and/or a custom profiler that collects data such as types etc, which can be used to specialize variables inside loops (or entire functions) with a fallback to normal mode in case the type changes)
3. cython library to contain common functionality (although that might be a bit boring and rather involved)
4. better type inference, that would be enabled by default and again handle things like reassignments of variables and fallbacks to the default object type. With entry caching Cython could build a database of types ((extension) classes, functions, variables) used in the modules and functions that are compiled (also def functions), and infer the types used and specialize on those. Maybe a switch should be added to cython to handle circular dependencies, or maybe with the distutils preprocessing it can run all the type inference first and keep track of unresolved entries, and try to fill those in after building the database. For bonus points the user can be allowed to write plugins to aid the process.
5. llvm based JIT :), i.e. have Cython instrument the generated code to record information and use that to create specializations at runtime (probably far out for a gsoc)
What I would most like to see is the common component in 2 and 4, i.e. the ability to generate optimized code for imperfectly-inferred types, with transparent fallback to generic code if conditions are not met at runtime (during as well as at entering the optimized code path). Too many times we have to decide between being safe for all cases or fast for the common case. That being said, there are few people I'd trust with such an ambitious project, and you're on that short list :). - Robert
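The "optimized code with transparent fallback" pattern described above can be sketched in plain Python as a pair of functions: a specialized path that checks its assumptions both on entry and inside the loop, and a generic path that can resume from wherever the specialized one gave up. The function names and the choice of a summing loop are purely illustrative assumptions; in Cython the fast path would be typed C code rather than Python.

```python
def sum_generic(items, start=0, total=0):
    """Generic path: works for any indexable sequence of addable values."""
    for i in range(start, len(items)):
        total = total + items[i]
    return total


def sum_specialized(items):
    """Fast path that assumes a list of ints, with transparent fallback."""
    # Guard on entry: the whole container has the wrong type.
    if type(items) is not list:
        return sum_generic(items)
    total = 0
    for i, item in enumerate(items):
        # Guard during execution: hand over the partial result and the
        # current position so the generic code continues where we stopped.
        if type(item) is not int:
            return sum_generic(items, i, total)
        total += item
    return total
```

Callers only ever see `sum_specialized`; whether the fast path ran to completion or bailed out part-way is invisible to them, which is the property that avoids choosing between "safe for all cases" and "fast for the common case".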
On 03/21/2012 01:56 PM, Robert Bradshaw wrote:
On Sat, Mar 10, 2012 at 10:44 PM, mark florisson <markflorisson88@gmail.com> wrote:
On 8 March 2012 14:27, Stefan Behnel<stefan_ml@behnel.de> wrote:
Hi,
I noticed that people start rushing for the next season on Python's GSoC mailing lists. Do we have any interested developers here, or general ideas about suitable topics? I would expect that we'll do as in the last years and participate under Python's umbrella.
Also, we'd like to see patches from anyone interested in being a GSoC student, as this will be a requirement as in past years.
I will likely be submitting a proposal for the OpenCL support CEP.
OpenCL would be an interesting experiment, but I think still has limited utility. Dag and I were talking the other day about the challenge of generating the best possible code for evaluating array expressions (think inlined memoryview arithmetic) taking into account memory layout, blocking, etc. which Fortran does really well which could be an interesting direction.
Yes, a lot of water has run in the river since March 8 here. Anyone interested in reading up on current ideas on what Mark is thinking for his GSoC proposal should read up on the numpy-discussion thread "Looking for people interested in helping with Python compiler to LLVM". Dag
As for other proposals, anyone can come up with something themselves or choose a suitable CEP. Some ideas:
Numbering items for clarity:
1.
- fused cdef classes (probably not as an entire gsoc project)
2.
- profile guided optimizations (using python's profilers and/or a custom profiler that collects data such as types etc, which can be used to specialize variables inside loops (or entire functions) with a fallback to normal mode in case the type changes)
3.
- cython library to contain common functionality (although that might be a bit boring and rather involved)
4.
- better type inference, that would be enabled by default and again handle thing like reassignments of variables and fallbacks to the default object type. With entry caching Cython could build a database of types ((extension) classes, functions, variables) used in the modules and functions that are compiled (also def functions), and infer the types used and specialize on those. Maybe a switch should be added to cython to handle circular dependencies, or maybe with the distutils preprocessing it can run all the type inference first and keep track of unresolved entries, and try to fill those in after building the database. For bonus points the user can be allowed to write plugins to aid the process.
5.
- llvm based JIT :), i.e. have Cython instrument the generated code to record information and use that to create specializations at runtime (probably far out for a gsoc)
What I would most like to see is the common component in 2 and 4, i.e. the ability to generate optimized code for imperfectly-inferred types, with transparent fallback to generic code if conditions are not met at runtime (during as well as at entering the optimized code path). Too many times we have to decide between being safe for all cases or fast for the common case.
That being said, there are few people I'd trust with such an ambitious project, and you're on that short list :).
- Robert
On Wed, Mar 21, 2012 at 2:11 PM, Dag Sverre Seljebotn <d.s.seljebotn@astro.uio.no> wrote:
On 03/21/2012 01:56 PM, Robert Bradshaw wrote:
On Sat, Mar 10, 2012 at 10:44 PM, mark florisson <markflorisson88@gmail.com> wrote:
On 8 March 2012 14:27, Stefan Behnel<stefan_ml@behnel.de> wrote:
Hi,
I noticed that people start rushing for the next season on Python's GSoC mailing lists. Do we have any interested developers here, or general ideas about suitable topics? I would expect that we'll do as in the last years and participate under Python's umbrella.
Also, we'd like to see patches from anyone interested in being a GSoC student, as this will be a requirement as in past years.
I will likely be submitting a proposal for the OpenCL support CEP.
OpenCL would be an interesting experiment, but I think still has limited utility. Dag and I were talking the other day about the challenge of generating the best possible code for evaluating array expressions (think inlined memoryview arithmetic) taking into account memory layout, blocking, etc. which Fortran does really well which could be an interesting direction.
Yes, a lot of water has run in the river since March 8 here. Anyone interested in reading up on current ideas on what Mark is thinking for his GSoC proposal should read up on the numpy-discussion thread "Looking for people interested in helping with Python compiler to LLVM".
Thanks for the pointer. Clearly I'm not subscribed to all the relevant lists (and, admittedly, just finally starting to catch up on the ones I am subscribed to). - Robert
Robert Bradshaw, 21.03.2012 21:56:
On Sat, Mar 10, 2012 at 10:44 PM, mark florisson wrote:
2. profile guided optimizations (using python's profilers and/or a custom profiler that collects data such as types etc, which can be used to specialize variables inside loops (or entire functions) with a fallback to normal mode in case the type changes)
4. better type inference, that would be enabled by default and again handle thing like reassignments of variables and fallbacks to the default object type. With entry caching Cython could build a database of types ((extension) classes, functions, variables) used in the modules and functions that are compiled (also def functions), and infer the types used and specialize on those. Maybe a switch should be added to cython to handle circular dependencies, or maybe with the distutils preprocessing it can run all the type inference first and keep track of unresolved entries, and try to fill those in after building the database. For bonus points the user can be allowed to write plugins to aid the process.
What I would most like to see is the common component in 2 and 4, i.e. the ability to generate optimized code for imperfectly-inferred types, with transparent fallback to generic code if conditions are not met at runtime (during as well as at entering the optimized code path).
Absolutely. The dict iteration change is pointing exactly in that direction, but there are so many other places where the same thing applies.

Basically, it has separate C implementations for different cases all folded into an inlined helper function with flag parameters, and then sets some of the flags to 1/0 constants at compile time and determines others at runtime. The C compiler can then just drop any inaccessible code and use the inlined remainder of the function to infer a good way of streamlining the rest of the surrounding code.

This also turned out to be a perfect way to enable or disable certain optimisations and/or implementation details for different backends (PyPy and CPython currently). Much better than actually generating different C code for them.

Stefan
On Thu, Mar 8, 2012 at 10:27 PM, Stefan Behnel <stefan_ml@behnel.de> wrote:
I noticed that people start rushing for the next season on Python's GSoC mailing lists. Do we have any interested developers here, or general ideas about suitable topics? I would expect that we'll do as in the last years and participate under Python's umbrella.
Function overloading would be nice: http://wiki.cython.org/enhancements/overloading Mathieu
Mathieu Blondel, 11.03.2012 09:19:
On Thu, Mar 8, 2012 at 10:27 PM, Stefan Behnel wrote:
I noticed that people start rushing for the next season on Python's GSoC mailing lists. Do we have any interested developers here, or general ideas about suitable topics? I would expect that we'll do as in the last years and participate under Python's umbrella.
Function overloading would be nice:
Don't we have (most of) that already? At least the complete infrastructure should be there now, given that we have C++ support and fused types. Doesn't sound non-trivial enough for a GSoC to me. Stefan
participants (6)
- Dag Sverre Seljebotn
- mark florisson
- Mathieu Blondel
- Robert Bradshaw
- Stefan Behnel
- Vitja Makarov