From stefan_ml at behnel.de  Tue Jan  2 10:23:31 2018
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 2 Jan 2018 16:23:31 +0100
Subject: [Cython] Type Inference: Inter Procedural Analysis
In-Reply-To:
References:
Message-ID:

Hi!

Thanks for working on this!

usama hameed wrote on 29.12.2017 at 19:31:
> I recently suggested implementing Inter-Procedural Analysis to infer
> function types and made the following Github issue, and I was advised to
> communicate on this channel.
>
> I went through the code base, and have implemented a rudimentary type
> inference system with inter procedural analysis of function types and
> arguments, and have handled recursive cases. However, the code base needs
> to be cleaned up a lot and is quite buggy right now. Furthermore, I am
> pretty sure a lot of edge cases need to be handled, e.g. closures etc.

I guess you are referring to this repository:

https://github.com/usama54321/Cython/commits/master

> The reason I am sending out this email is to get some suggestions. Right
> now, the code I have written is pretty hacky, since the current code base
> of the project does not accommodate much flexibility to perform inter
> procedural analysis.

Interesting. Could you elaborate on what you found missing or badly
designed? Would be interesting to know for us.

Here are a couple of comments on your changes:

1) The functionality looks really nice. Since you weren't accustomed to
the code base before, it's understandable that things aren't perfectly
integrated with the existing architecture. That can be cleaned up.

2) I was surprised to see that you didn't git-clone the existing repository
but created a new one from a source copy instead. But that's probably ok
for getting started because (I think) you wrote the code experimentally and
didn't focus that much on ready-to-merge commits anyway. Also, you
accidentally added .pyc and .so files. Those shouldn't be under version
control. It would probably be best to start over from a fresh clone and
apply your changes as a patch.

3) The commits are a bit difficult to follow because the commit comments
are essentially free of information. It would help if you had used them to
explain the steps you took and what your intentions were.

4) Is there a reason why you didn't merge the Graph building with the
control flow analysis in FlowControl.py?

5) I can't see any test code, but since you implemented this in multiple
iterations, I'm sure that you had test code on your side that you tried to
compile. Could you add some examples that show how this change improves
things? There are hints on writing tests in the hacker guide:

https://github.com/cython/cython/wiki/HackerGuide#getting-started

Specifically, look at the "*infer*.pyx" file tests in tests/run/. I think
it would be best to add a new one.

> I found an enhancement suggestion on the
> GitHub project, and was wondering whether this should be done first in
> order to make a more flexible type inference system before trying to
> properly implement inter procedural analysis into the project.

Type inference was implemented long ago and has been improved a couple of
times since then. It's not perfect, but it's actually quite good and can
further be improved in gradual steps. Inter-procedural analysis seems like
one such improvement.

> I just started on this as part of a university course project, but I want
> to continue working on this. I am not really familiar with the project's
> development ecosystem, and it would be really helpful if I'm given some
> guidance.

It would certainly be great to have this feature added. Could you explain
some of your design decisions? That would help me understand what you did
and why, so that I can start giving advice on where to go from here.

Generally speaking, I think it would be good if we could make it reuse more
of the existing infrastructure for type inference and control flow
analysis. I would only want to diverge from those if you could convince me
that this feature is fundamentally independent from what's there, but that
would surprise me. Correct me if I'm wrong, but what I would expect is
basically an incremental type inference that (partially) re-infers
functions when their dependencies (i.e. the functions that they call)
change their return type. Was the incremental part here something that you
found difficult?

Stefan

From drsalists at gmail.com  Sun Jan  7 01:57:17 2018
From: drsalists at gmail.com (Dan Stromberg)
Date: Sat, 6 Jan 2018 22:57:17 -0800
Subject: [Cython] Segfault with large cdef'd list
Message-ID:

I'm getting a weird segfault from a tiny function (SSCCE) using cython
with python 2.7. I'm seeing something similar with cython and python
3.5, though I did not create an SSCCE for 3.5.

This same code used to work with slightly older cythons and pythons,
and a slightly older version of Linux Mint.

The code is at http://stromberg.dnsalias.org/svn/why-is-python-slow/trunk
(more specifically at
http://stromberg.dnsalias.org/svn/why-is-python-slow/trunk/tst.pyx )

In short, cdef'ing a list of doubles with about a million elements,
and using only the 0th element once, segfaults - but cdef'ing a
slightly smaller array does not segfault under otherwise identical
conditions.

Any suggestions? Does Cython have a limit on the max size of a stack frame?

Thanks! I quite like cython.

From robertwb at math.washington.edu  Sun Jan  7 03:48:43 2018
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Sun, 7 Jan 2018 00:48:43 -0800
Subject: [Cython] Segfault with large cdef'd list
In-Reply-To:
References:
Message-ID:

Cython itself doesn't impose any limits, but it does inherit whatever
limit exists in the C compiler and runtime. The variance may be due to
whatever else happens to be placed on the stack.
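A rough way to see why the array size matters, sketched under assumptions:
the array is taken to be a C array local (something like
`cdef double values[1000000]`, which is a guess at what tst.pyx does), and
the process is assumed to run under the usual Unix soft stack limit. With
8-byte doubles, a million elements need 8,000,000 bytes, which is just
under the 8 MiB (8,388,608 bytes) default that many Linux systems use, so
very little room is left for everything else on the stack. The snippet
below only does that arithmetic and reads the current limit;
`stack_soft_limit` is a hypothetical helper name, not part of Cython.

```python
import struct

# Assumption: the local was declared roughly as `cdef double values[1000000]`,
# i.e. a C array that lives in the function's stack frame.
N_ELEMENTS = 1000000
DOUBLE_SIZE = struct.calcsize("d")  # size of a native C double in bytes

array_bytes = N_ELEMENTS * DOUBLE_SIZE


def stack_soft_limit():
    """Return the soft stack limit in bytes, or None if unknown or unlimited."""
    try:
        import resource  # only available on Unix-like systems
    except ImportError:
        return None
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    return None if soft == resource.RLIM_INFINITY else soft


limit = stack_soft_limit()
print("array needs", array_bytes, "bytes")
print("soft stack limit:", limit)
if limit is not None:
    # The frame also holds other locals and every caller's frame,
    # so "barely fits" is not the same as "safe".
    print("array alone is below the limit:", array_bytes < limit)
```

If the array alone is already close to the limit, whether a given run
crashes can plausibly depend on how deep the call stack happens to be at
that point.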
From stefan_ml at behnel.de  Sun Jan  7 05:18:09 2018
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 7 Jan 2018 11:18:09 +0100
Subject: [Cython] Segfault with large cdef'd list
In-Reply-To:
References:
Message-ID: <463c1b55-7264-e680-4bb5-4b509aa60b3b@behnel.de>

Robert Bradshaw wrote on 07.01.2018 at 09:48:
> On Sat, Jan 6, 2018 at 10:57 PM, Dan Stromberg wrote:
>> Any suggestions? Does Cython have a limit on the max size of a stack frame?
>
> Cython itself doesn't impose any limits, but it does inherit whatever
> limit exists in the C compiler and runtime. The variance may be due to
> whatever else happens to be placed on the stack.

Let me add that I wouldn't consider it a good idea to allocate large chunks
of memory on the stack. If it's meant to hold substantial amounts of data
(which also suggests that there is a substantial amount of processing
and/or copying involved), it's probably also worth a [PyMem_]malloc() call.
Heap allocation allows you to respond to allocation failures with a
MemoryError rather than a crash, as you get now. How much stack space you
have left is user controlled through call depth and recursion, which makes
it a somewhat easy target.

Stefan

From usama54321 at gmail.com  Mon Jan  8 01:21:34 2018
From: usama54321 at gmail.com (usama hameed)
Date: Mon, 8 Jan 2018 11:21:34 +0500
Subject: [Cython] Type Inference: Inter Procedural Analysis
In-Reply-To:
References:
Message-ID:

Hey!

Thank you for the reply.

> Interesting. Could you elaborate on what you found missing or badly
> designed? Would be interesting to know for us.

I'll elaborate a bit on what I had in mind while implementing inter
procedural analysis, and what I found to be a bit inflexible. Also, I am
attaching a progress report I wrote as part of the coursework, which
elaborates my overall strategy in a bit more depth.

Right now, I'm storing incoming and outgoing callsites of a function at the
function scope level.
I think it makes more sense to store this information at the AST node.
However, that results in some limitations later on during the Type
Inference stage, as only information about the scope is passed to the
Inference System. Furthermore, I think I needed to add a transform between
the infer_types and the analyse_expressions stage, which are done in a
single Transform, but I think that's because my way of doing things was
hackish, and could be done in a much better way.

After storing the callsites information at the scope level, I made a
separate Inter Procedural Inferer that handles recursive functions
separately from non recursive functions. The non recursive case is handled
by traversing the call graph to the first function with no incoming nodes
(I have still not handled cycles in the graph, except recursion), and
starting type inference from there, and traversing down the graph until all
the descendants of this ancestor function have been inferred.

The recursive case is handled a bit separately, where I take all the return
statements in the function before the first recursive call, infer their
type if it's consistent, and then re-run type inference on the whole
function. If the result is consistent, i.e. all the return nodes have the
same type, then the type is inferred. Otherwise, the code falls back to
PyObjects.

My current code breaks the compiler in certain cases, and I'm working on
fixing that. Furthermore, I think I'll need to mark return statements in
the FlowControl stage too, in addition to the assignment statements, as
they can contain expressions too which I'll need to infer.

My overall plan had been to change as little of the core code as possible
to get a solution up and running, so as not to break anything. I think I
have commented out only about 4 or 5 lines in the original repo.

> 1) The functionality looks really nice. Since you weren't accustomed to
> the code base before, it's understandable that things aren't perfectly
> integrated with the existing architecture. That can be cleaned up.
>
> 2) I was surprised to see that you didn't git-clone the existing repository
> but created a new one from a source copy instead. But that's probably ok
> for getting started because (I think) you wrote the code experimentally and
> didn't focus that much on ready-to-merge commits anyway. Also, you
> accidentally added .pyc and .so files. Those shouldn't be under version
> control. It would probably be best to start over from a fresh clone and
> apply your changes as a patch.
>
> 3) The commits are a bit difficult to follow because the commit comments
> are essentially free of information. It would help if you had used them to
> explain the steps you took and what your intentions were.

That's the repository I was working on. However, the code base is pretty
hacky now, and the commits aren't really consistent with their
descriptions. I was just developing experimentally, as I was not really
familiar with the code base. I'll fork the repo, and make some commits with
comments, and clean up the code. Once I've done that, and my code is a bit
more understandable, I'll

> 4) Is there a reason why you didn't merge the Graph building with the
> control flow analysis in FlowControl.py?

My overall strategy was to change/edit as little of the core code as
possible. I'll merge it in FlowControl in the updated commits.

> 5) I can't see any test code, but since you implemented this in multiple
> iterations, I'm sure that you had test code on your side that you tried to
> compile. Could you add some examples that show how this change improves
> things? There are hints on writing tests in the hacker guide:

I'll add some of the test files I used locally in the tests in the new
commits.

Lastly, the Type Inference System I implemented is pretty simple, and might
have limitations that I'm not aware of.
However, I would love to work on this further, to fix/improve on the whole
system. I do not know about Incremental Type Inference, but if that's the
way to go and my above strategy is overly simplistic, I'll be happy to work
on that too.

Usama
-------------- next part --------------
A non-text attachment was scrubbed...
Name: report2.odt
Type: application/vnd.oasis.opendocument.text
Size: 42030 bytes
Desc: not available
URL:

From drsalists at gmail.com  Mon Jan  8 15:36:24 2018
From: drsalists at gmail.com (Dan Stromberg)
Date: Mon, 8 Jan 2018 12:36:24 -0800
Subject: [Cython] Segfault with large cdef'd list
In-Reply-To: <463c1b55-7264-e680-4bb5-4b509aa60b3b@behnel.de>
References: <463c1b55-7264-e680-4bb5-4b509aa60b3b@behnel.de>
Message-ID:

On Sun, Jan 7, 2018 at 2:18 AM, Stefan Behnel wrote:
> Robert Bradshaw wrote on 07.01.2018 at 09:48:
>> Cython itself doesn't impose any limits, but it does inherit whatever
>> limit exists in the C compiler and runtime. The variance may be due to
>> whatever else happens to be placed on the stack.
>
> Let me add that I wouldn't consider it a good idea to allocate large chunks
> of memory on the stack. If it's meant to hold substantial amounts of data
> (which also suggests that there is a substantial amount of processing
> and/or copying involved), it's probably also worth a [PyMem_]malloc() call.
> Heap allocation allows you to respond to allocation failures with a
> MemoryError rather than a crash, as you get now. How much stack space you
> have left is user controlled through call depth and recursion, which makes
> it a somewhat easy target.

Thanks - it's working now with malloc() and free().

Code at:
http://stromberg.dnsalias.org/svn/why-is-python-slow/trunk/cython3_types_t.pyx

It turns out the 2.x and 3.x versions are identical. :)

From J.Demeyer at UGent.be  Thu Jan 25 06:12:14 2018
From: J.Demeyer at UGent.be (Jeroen Demeyer)
Date: Thu, 25 Jan 2018 12:12:14 +0100
Subject: [Cython] Multiple inheritance with old-style classes in Python 2?
Message-ID: <5A69BB8E.50601@UGent.be>

Do we want to support Python 2 old-style classes for multiple
inheritance? Personally, I don't think that we should, but it's
something that has to be decided.

The reason I ask is that my code from
https://github.com/cython/cython/pull/2033 is actually broken when given
an old-style class.

From elizabeth.fischer at columbia.edu  Thu Jan 25 12:40:41 2018
From: elizabeth.fischer at columbia.edu (Elizabeth A. Fischer)
Date: Thu, 25 Jan 2018 17:40:41 +0000
Subject: [Cython] Multiple inheritance with old-style classes in Python 2?
In-Reply-To: <5A69BB8E.50601@UGent.be>
References: <5A69BB8E.50601@UGent.be>
Message-ID:

No. Python2 is obsolete, old style classes even more so.

From robertwb at gmail.com  Thu Jan 25 12:54:21 2018
From: robertwb at gmail.com (Robert Bradshaw)
Date: Thu, 25 Jan 2018 09:54:21 -0800
Subject: [Cython] Multiple inheritance with old-style classes in Python 2?
In-Reply-To:
References: <5A69BB8E.50601@UGent.be>
Message-ID:

No, we don't care about supporting this, but we should detect and
reject it informatively when possible.

From J.Demeyer at UGent.be  Thu Jan 25 15:25:48 2018
From: J.Demeyer at UGent.be (Jeroen Demeyer)
Date: Thu, 25 Jan 2018 21:25:48 +0100
Subject: [Cython] Multiple inheritance with old-style classes in Python 2?
In-Reply-To: <6271c8c05e154bf087c39050d85e2ef8@xmail201.UGent.be>
References: <5A69BB8E.50601@UGent.be>
 <6271c8c05e154bf087c39050d85e2ef8@xmail201.UGent.be>
Message-ID: <5A6A3D4C.4000605@UGent.be>

On 2018-01-25 18:54, Robert Bradshaw wrote:
> No, we don't care about supporting this, but we should detect and
> reject it informatively when possible.

Good. I'll create a PR.
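
For illustration only, "detect and reject it informatively" could look
roughly like the check below. This is a sketch in plain Python, not the
code from the pull request mentioned in this thread, and
`check_new_style_bases` is a hypothetical helper name. It relies on the
fact that new-style classes are instances of `type`, while Python 2
old-style classes are instances of `types.ClassType`. On Python 3 every
class is new-style, so there the check only rejects non-class objects.

```python
def check_new_style_bases(bases):
    """Return the bases as a tuple, or raise TypeError naming the offender."""
    for base in bases:
        # New-style classes are instances of `type` (or of a metaclass
        # derived from it). Python 2 old-style classes are not, so they
        # fail this test and get a readable error instead of a crash.
        if not isinstance(base, type):
            raise TypeError(
                "multiple inheritance requires new-style classes, "
                "got %r of type %s" % (base, type(base).__name__)
            )
    return tuple(bases)


class A(object):
    pass


class B(object):
    pass


print(check_new_style_bases([A, B]))

try:
    check_new_style_bases([A, 42])
except TypeError as exc:
    print("rejected:", exc)
```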