
A couple of days ago I heard about the Parallella [1] project, which is an open hardware platform similar to the Raspberry Pi but with much higher capabilities. It has a Zynq Z-7010, which combines a dual-core ARM A9 (800 MHz) processor with an Artix-7 FPGA, plus a 16-core Epiphany multicore accelerator and 1 GB of RAM (see [2] for more info), and it currently boots into Ubuntu. The goal of the Parallella project is to develop an open parallel hardware platform and development tools. Recently they announced support for Python, with Mark Dewing [3] leading the effort. I asked Mark whether he had considered PyPy, but at this time he doesn't have time for this investigation, and he reposted my comment on the forum [4] with a couple of questions. Maybe one of you could answer them. Working with the Parallella project may be a good opportunity for the PyPy project, both from a PR perspective and for the technical challenges it would present. On the technical side it would give the opportunity to test STM on a reasonable number of cores while also dealing with cores of different architectures (ARM and Epiphany). I could see all the JITting occurring on the ARM cores, producing output for both architectures based on which type of core STM decides to use for a chunk of work. Of course there is also the challenge of bridging between the two architectures. Maybe even some of the more expensive STM operations could be offloaded to the FPGA, or a limited number of very hot sections of code could be JITted to the FPGA (although this might be more work than it's worth). [...] the niche market that would get PyPy to take off. But there have been a couple of issues with this approach. There is a tremendous amount of work that needs to be done so that PyPy can look attractive to this niche market. It requires supporting both NumPy and SciPy, and there was an expectation that if PyPy supported NumPy, others would come to help out with the SciPy support.
The problem is that there don't seem to be many who are eager to pitch in for the SciPy effort, and there also have not been many willing to help with the ongoing NumPy work. I think in general the proportion of NumPy and SciPy users who are willing to contribute is quite small. So while going after this market was a good idea and could definitely show the strength of the PyPy project, it hasn't done much to improve the image of the PyPy project. It also doesn't help that some commercial interests have popped up recently that have decided to play hardball against PyPy by spreading FUD. Unlike the Raspberry Pi hardware, which can only support hobbyists, the Parallella hardware can support both hobbyists and commercial interests. The boards cost $100, which is more than the $35 for a Raspberry Pi but still within reach of most hobbyists, and they didn't cut out the many features that are needed for commercial use. The Parallella project raised nearly $0.9 million on Kickstarter [5] with nearly 5000 backers. Since many who will use the Parallella hardware also have experience with embedded systems, they are more likely to be used to writing low-level code in assembly, programming FPGAs, and even writing lots of C code, and I'm sure they have hit many issues with parallel/multithreaded programming and would welcome a better developer experience. I bet many of them would be willing to contribute both money and time to supporting such an effort. I believe the architecture of PyPy could lend itself to becoming the core of such a development system and would allow Python to be used in this space. This could provide a lot of good PR for the PyPy project. Now I'm not saying PyPy shouldn't devote any more time to supporting NumPy, as I'm sure that when PyPy has very good support for both NumPy and SciPy it's going to be a very good day for all Python supporters.
I just think that the PyPy team needs to think about a strategy that in the end will help its PR and gain support from a much larger community. This project is doing a lot of good things technically, and now it just needs to get the attention of the development community at large. Now I can't predict whether working with the Parallella project would be the breakthrough in PR that PyPy needs, but it's at least an option that's out there. BTW, I don't have any commercial interests in the Parallella project. If some time in the future I use their hardware it would likely be as a hobbyist, and it would be nice to program it in Python. My real objective with this post is to see the PyPy project gain wider interest, as that would be a good thing for Python.

[1] - http://www.parallella.org/
[2] - http://www.parallella.org/board/
[3] - http://forums.parallella.org/memberlist.php?mode=viewprofile&u=3344
[4] - http://forums.parallella.org/viewtopic.php?f=26&t=139
[5] - http://www.kickstarter.com/projects/adapteva/parallella-a-supercomputer-for-...

John

On Fri, Feb 1, 2013 at 12:01 AM, John Camara <john.m.camara@gmail.com> wrote:
Hi John, To answer the question from the forum: the JIT emits assembler (x86, ARM); it does not emit C code. As far as PR is concerned, there is no such thing as the PyPy team meeting and deciding where to go. Everyone works on what they feel like doing where volunteer time is concerned. Obviously things are a little different when there is a commercial interest in something. From my own perspective PyPy should excel at one thing: providing a kick-ass Python VM that's universally fast. We're missing quite a few things (like library support), but the situation has improved quite drastically thanks to things like cffi. Startup time is another item on the list to consider, and it affects ARM even more (since ARM is slower in general). Cheers, fijal

Hi John, Sorry if I misread you, but you seem to be only saying "it would be nice if the PyPy team worked on the support for <X> rather than <Y>". While this might be true from some point of view, it is not constructive. What would be nice is if *you* seriously proposed to work on <X>, or helped us raise commercial interest, or otherwise contributed towards <X>. If you're not up to it, and nobody steps up, then it's the end of the story (but thanks anyway for the nice description of Parallella). A bientôt, Armin.

Hi Armin, It's even worse: I'm asking you to support <X> and I don't even need it. When I posted this thread it was getting rather long, and unfortunately I didn't really make all the points I wanted to make. At this point, and for some time now, PyPy has had a great foundation, but its use remains low. Every now and then it's good to step back a little and reflect on the current situation and come up with a strategy that helps the project's popularity grow. I know that PyPy has done things to help with growth, such as writing blog posts, being quick to fix bugs, helping others with their performance issues and even rapidly adding optimizations to PyPy, presenting at conferences, and often actively engaging with any posts or comments made about PyPy. So PyPy is doing a lot of things right to help its PR, and yet there is this issue of slow growth. Now we know that the main obstacle to its growth is the fact that the Python ecosystem relies on a lot of libraries that use the CPython API, and PyPy just doesn't have full support for this interface. I understand the reasons why PyPy is not going to support the full interface, and PyPy has come up with the cffi library as a way to bridge the gap. And of course I don't expect the PyPy project to take on the responsibility of porting all the popular third-party libraries that use the CPython API to cffi. It's going to have to be a community effort. One thing that could help would be more marketing of cffi, as very few Python developers know it exists. But that alone is not going to be enough. History tells us that most products/projects that become popular do so by first supporting the needs of some niche market. As time goes by, that niche market starts providing PR that helps other markets discover the product/project, and the cycle can sometimes continue until there is mass adoption.
Now when PyPy started to place a focus on NumPy I had hoped that the market it serves would turn out to be the market that would help PyPy grow. But at this point in time it does not appear that this is going to happen. For a while I have been trying to think of a niche market that may be helpful. But to do so you have to consider the current state of PyPy, which means eliminating markets that rely heavily on libraries that use the CPython API. I'm also going to avoid the NumPy market, as that's currently being worked on; there is the mobile market, but that's a tough one to get into; maybe the gaming market could be a good one; etc. It turns out that with the current state of PyPy many markets need to be eliminated if you're looking for one that is going to help with growth. The Parallella project, on the other hand, looks like it could be a promising one, and I'll share some thoughts a little later in this post as to why I feel this way. Right now you have been putting a lot of effort into STM, in which you're trying to solve what is likely the biggest challenge that the developer community is facing: how to write software that effectively leverages many cores in a way that is straightforward and in the spirit of Python. When you solve this problem, and I have faith that you will, most would think that it would cause PyPy's popularity to skyrocket. What most likely will happen is that PyPy gets a temporary boost in popularity, as there is another lesson in history to be concerned about. Often the first to solve a problem does not become popular in the long run. Usually the first to solve the problem does so via a unique solution, but once people start using it, issues with the approach get discovered. Then many others will often use the original solution as a starting point and modify it to eliminate these new issues. Then one of the second-generation solutions ends up being the de facto standard.
Now PyPy is able to move fairly quickly in terms of implementing new approaches, so it may in fact be able to compete just fine against other 2nd-generation solutions. But there may be some benefit to exposing STM to a smaller market first, to help PyPy buy some additional time before releasing it as a solution for the general developer community. So why the Parallella project? Well, I think it can be helpful in a number of ways. First, I don't believe that this market is going to need much from the libraries that use the CPython API. Many who are in this market are used to programming embedded systems and are more likely to have the skills to help out the PyPy project in a number of areas. They would likely also have a financial incentive to contribute back to PyPy, such as by helping keep various backends up to date (ARM, PPC, and additional architectures). Some in this market are used to using a number of graphical languages to program their devices, but unfortunately for them some of the new products that need to enter the market can't be built fully with these graphical languages. Well, with the PyPy framework it's possible for them to implement a VM for that graphical language and create products that contain elements programmed in both the graphical languages and text-based languages. Also, the VMs on many embedded systems are typically simple and don't have a JIT. PyPy can help with this, but I don't believe anyone who maintains these VMs is aware of the PyPy project. As far as STM is concerned, working with embedded systems will force finding solutions to the many issues that arise with various hardware architectures, which would help STM become a more general solution. Right now you're writing STM in a way that will support multiple cores on a single processor well. I know you have to start somewhere.
But soon you will have to deal with issues that arise once you span multiple processors, such as dealing more often with the slower L3 cache and its sync issues, and local vs. remote memory issues. On the embedded side you have to deal with processors of multiple architectures on the same system, plus FPGAs, and you have to consider the various issues that arise from the various buses involved, which makes the STM problem quite a bit harder in terms of how it gets optimized to handle all these variations. Of course many of these same issues exist if you want STM to support GPUs in a normal computing device. The embedded side just adds complications, as embedded systems come in more complex configurations. The Raspberry Pi has become popular because many want to hack on these devices, and the Raspberry Pi happens to be the first device that is both cheap and allows programming at a high level. Previously, cheap meant you needed to program using a low-level approach; otherwise you had to buy an expensive solution to program at a high level. Many who get interested in the Raspberry Pi soon find themselves in the position where they have an idea and want to create a product to sell. But they realize they can't use a Raspberry Pi for production, as it is missing many features they would require. They still like the idea of programming at a high level, but the traditional embedded systems that support this may be too expensive for their product. That's where the Parallella project comes into play. They see there is a market for low-cost devices that can be programmed with higher-level tools to build production systems. This market values programming at a high level and would highly appreciate being able to program these devices in Python. They also need to support multiple cores and thus could use STM, and it would be incredibly useful if the STM approach could seamlessly support multiple architectures.
There is a lot of value here for the companies that want to produce these devices, and PyPy should try to tap into it. This new market segment using these low-cost devices is going to have a large impact and will also play a role in the manufacturing revolution that is about to take place. This manufacturing revolution is likely to be on the same scale as the Internet revolution. Just think about the effect 3D printing is going to have. It will be huge. PyPy getting a foothold in this market before it takes off would be huge for PyPy as well as for Python in general. Also, there are some big players who currently sell the more expensive embedded systems who are not going to be happy about these cheaper alternatives and are also going to want a piece of the action. I think many of them, who may not be able to quickly change their development and run-time processes, may decide it's much easier to port their VMs over to PyPy to get in on the action. Hopefully this gives some better insight as to why I feel it may be a good strategy to consider supporting the Parallella project. The possibility of getting a foothold in a market that is about to take off doesn't come around too often. All I know is that if PyPy would like to support this market, right now is the best time to get started. This might be the ticket PyPy needs to get its growth up, which could then lead to additional markets taking notice and more of the Python ecosystem becoming compatible with PyPy. Of course this is just my opinion, and maybe someone else could come up with another strategy that can help PyPy grow faster. Even an open source project can use a strategy.

John

On Tue, Feb 5, 2013 at 9:47 AM, Armin Rigo <arigo@tunes.org> wrote:

Hi John. Let me summarize your long post as I understood it: "You guys should bet everything on platform <X>, which both does not need PyPy and has expressed no real interest. The reason is that PyPy is not growing fast enough and we need a niche market. On top of that we should answer a lot of unanswered questions, like memory and warmup requirements on embedded devices". So, I think you're wrong in very many regards here. I think we should try to excel at providing a kick-ass Python VM, but also I seriously have no say in what people work on (except me). We already have some niche markets, notably people who are willing to invest in R&D and need serious power (but are unable or unwilling to use C or C++ for that). You just don't know about it, because those are typically not people writing blog posts. Having a dedicated web stack is another good step, and we'll eventually get there. I don't know why you think this particular niche market is better than any other, but it really does not matter all that much. There is no way you can convince people to do something else in their volunteer time than what they already feel like doing. Things you can do if you're interested:

* do the work yourself
* work with the Parallella project to get first-class PyPy support, if they care about performance
* spark commercial interest

However, trying to convince volunteers that they should do what you think they should do is not really one of the helpful things you can be doing. Cheers, fijal

Fijal, In the past you have complained about it being hard to make money in open source. One way to make it easier for you is to grow the popularity of PyPy. So I would think you would at least have some interest in thinking of ways to accomplish that. I'm not trying to dictate what PyPy should do, but merely offering my opinion that I see an opportunity that could potentially be a great thing for PyPy. A year ago, if someone had asked me whether PyPy should support embedded systems I would have given a firm no, but I see the market changing in ways I didn't expect. The people hacking on these devices are fairly similar to open source developers, and in some cases they even do open source development. They do things differently from the establishment, which has provided a new way to think about manufacturing. Their ways are so different from the establishment's, and such a game changer, that they have ignited what is becoming a manufacturing revolution. Now because many who are involved in hacking on this hardware have no prior experience with the established ways of doing this type of business, they are moving in directions that differ in how these devices get programmed. They are also in need of tools and new infrastructure, and I feel that what PyPy has to offer can give them a starting point. Now at the end of the day I don't believe many of their requirements are going to be much different from the requirements of other markets, and not likely too different from the direction PyPy will likely take anyway. So why not go where all the big money is going to be? OK, enough of that. Let's take a look at your example of a web stack. I believe right now PyPy is in a position to be used in this market.
Sure, PyPy could use some additional optimizations to improve the situation, but I think in general it's already able to kick ass compared to CPython in terms of performance when a light web framework is used, which is becoming increasingly popular as web apps push the front ends to do most of the layout/presentation work. Also, with the web becoming more dynamic and the number of requests increasing at a substantial rate, it becomes more important to reduce latencies, which tends to give PyPy an advantage. This is all great while the web stacks are running on traditional servers, but servers are changing. There are some servers being sold today that have hundreds of small cores, and in the not too distant future there will be systems that have a number of full cores and a much larger number of smaller cores, which may or may not have similar architectures. For instance, servers with Phi coprocessors (8 GB of memory, 60 cores at 1 GHz with, I believe, 4 threads each, and a PCIe3 interface) have recently become available. How is PyPy going to handle this? Is this any different from the needs of embedded systems? No. PyPy is going to have to start paying attention to how data is accessed and will have to make optimizations based on the access patterns. That is, you have to make sure computational loads can offset the data transfer overhead. Today PyPy does not take this overhead cost into account, which is not required when running on one core. For a web application it would be nice to run multiple sessions on a given core and save session-related data locally to that core so as to minimize data transfer to the smaller cores, which means directing all requests for a session to the same core, doing any necessary encryption on these small cores, etc. But there may also be some work for a particular request which might not be appropriate to run on a small core and may have to run on the main core, maybe because it requires access to too much data. How is this going to work?
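As a rough illustration of the session-affinity idea above, here is a minimal Python sketch. It is not something PyPy provides: the names `core_for_session`, `route_request`, `MAIN_CORE`, and the 64 KiB threshold are all hypothetical, and the core count of 60 is simply borrowed from the Phi example. It only shows the routing decision (same session always lands on the same small core, oversized work falls back to the main core), not any actual data placement.

```python
import hashlib

NUM_SMALL_CORES = 60            # assumption: core count from the Phi example
MAIN_CORE = -1                  # hypothetical marker for "run on a main core"
MAX_SMALL_CORE_BYTES = 64 * 1024  # arbitrary illustrative threshold


def core_for_session(session_id, num_cores=NUM_SMALL_CORES):
    """Map a session ID to a core index that is stable across requests."""
    # A cryptographic hash is used only because it is deterministic across
    # processes (the built-in hash() of str is randomized per process).
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_cores


def route_request(session_id, data_bytes):
    """Pick a small core by session, or the main core for data-heavy work."""
    if data_bytes > MAX_SMALL_CORE_BYTES:
        return MAIN_CORE
    return core_for_session(session_id)


if __name__ == "__main__":
    for size in (512, 2048, 10 * 1024 * 1024):
        print("alice", size, "->", route_request("alice", size))
```

Because the mapping depends only on the session ID, session data cached on a core stays useful for later requests; the open questions raised here (who schedules, who cleans up the cached data) are exactly what this sketch leaves out.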
Is PyPy going to do all the analysis itself, or will the programmer provide some hints to PyPy as to how to break up the work? Who is going to be responsible for scheduling, for cleaning up the session data that is cached locally to the cores, and for a boatload of other issues? I'm not sure. It's a tough problem, and one that is just around the corner. Another option would be to run an HTTP load balancer on the main cores and PyPy web stacks on, say, dedicated Phi cores, with the HTTP requests forwarded over the PCIe bus. That way each Phi core acts like an independent web server. But running 60-240 PyPy processes in 8 GB of memory is quite the challenge. Maybe some sort of PyPy hypervisor could run virtualized PyPy instances so that the instances share all the JITted code but each has its own data. I'm sure many issues and questions exist, such as who would do the JITting: the hypervisor or the virtualized PyPy instances? Now even if you feel right now is not the time to start worrying about these new server architectures, there are still other issues PyPy will start to run into in the web stack market. Typically, for a web application that is being accessed from the Internet there is a certain amount of latency that is acceptable. But what happens when the same web stack technology is deployed in local environments (i.e. on a LAN) with heavy dynamic requests, some requiring near-real-time performance? When operating in a networked environment with low latencies, people are going to expect more from a web server (actually not just people: systems talking to other systems will require it too). This ends up being a problem for Python in general, as the garbage collector is going to be an issue. This is going to require a concurrent garbage collector. The concurrent garbage collector is also needed by the embedded market, as well as the gaming market and many others. Anyway, this is just food for thought.
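To put a number on the "60-240 PyPy processes in 8 GB" concern above, the per-process budget can be worked out directly. This is only back-of-the-envelope arithmetic under stated assumptions: it treats 8 GB as 8 GiB, splits memory evenly, and ignores whatever the operating system and any shared pages would take. The function name `per_process_mib` is made up for the illustration.

```python
def per_process_mib(total_gib, num_processes):
    """Even share of memory per process, in MiB (assumes no other users)."""
    return total_gib * 1024 / num_processes


if __name__ == "__main__":
    # One process per core (60) up to one per hardware thread (240).
    for count in (60, 120, 240):
        print(count, "processes:", round(per_process_mib(8, count), 1), "MiB each")
```

At one process per core that is roughly 137 MiB each, and at one per hardware thread roughly 34 MiB each, which is the motivation for sharing JITted code between instances.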
I'm not going to keep on giving more examples in more replies. In the end this is where the world is headed, and it's going to take a lot of work and resources to get PyPy to handle these situations, and only strong growth can make that possible. If you want PyPy to get there, I hope you can see why a strategy for growth is necessary. On a side note, I'm not all that comfortable writing these posts when I know that at this particular time I don't have the spare time to contribute. Right now I work 7 days a week from the time I wake up until I go to sleep. But I wrote it anyway, as I do believe there is a good opportunity here for PyPy.

John

On Wed, Feb 6, 2013 at 6:11 AM, Maciej Fijalkowski <fijall@gmail.com> wrote:

On Thu, Feb 7, 2013 at 6:41 AM, John Camara <john.m.camara@gmail.com> wrote:
Before even reading further - how is being popular making money? Being popular is being popular. Do you know any CPython developer working full time on CPython? CPython is definitely popular by my standards.

Fijal, Whether someone works full time on a project is a separate issue. Being popular helps attract additional resources, and PyPy is a project that could use additional resources. How many additional optimizations would PyPy need to add to reach a level of optimization similar to, say, the JVM? We are talking many, many man-years of work. How much additional work is it to develop and maintain backends for the various ARM, PPC, MIPS, etc. processors? How much work would it take to have PyPy support multiple cores? What if RPython needs to be significantly refactored or replaced? And we can go on and on. Typically every 10 years or so a new language becomes dominant, but that hasn't happened lately. Java has been in that role for quite some time, and for quite a few years it has been on the decline, yet no language has taken its place in terms of dominance. The main reason this hasn't happened so far is that no language has successfully dealt with the multi-core issue in a way that also keeps the other desirable features we currently have in popular languages. But at some point a language will prevail and become dominant, and when that happens there will be a mass migration to it. That doesn't mean that Python and other currently popular languages are just going to go away; it's just that their use will decline. If Python's popularity declines significantly, it will in turn impact PyPy. Also, many of the early adopters of PyPy are more likely to move on to the new dominant language. So where does that leave you? I expect you earn a living by doing PyPy consulting, and thus you need PyPy to be popular. Now you don't have to believe that a new dominant language will emerge, but history says otherwise, and many have been fooled into thinking otherwise in the past. I feel PyPy is Python's best chance of surviving this change in language dominance, as it has the best chance of being able to do something about the multi-core situation.
I'm glad you mentioned the web stack the other day, as otherwise I likely would not have thought about the PyPy hypervisor scenario. I'm starting to believe that approach may have some decent merit and offer a way to kick the can down the road on the multi-core issues. I don't have the time to get into it right now, but I will start a new thread on the topic, maybe within the next few days.

John

On Thu, Feb 7, 2013 at 4:33 AM, Maciej Fijalkowski <fijall@gmail.com> wrote:

Hi John, Thanks for your lengthy analysis. I'm sure that it can be interesting for some to read. Unfortunately, I'm personally an Open Source hobbyist who happens to come from a university background, and I'm still attached to some of the ideas behind it. You say about my hacking on STM: "Often the first to solve a problem does not become popular in the long run". That is true, and I have no problem with that. My guess is that in the end STM will end up being common in programming languages. So I would like to help along the way --- by showing that it works in complicated languages like Python, using the unlimited flexibility of Software TM rather than as an exercise to fit it around some Hardware TM. It would be nice if PyPy also became the de facto 2nd-generation standard, but that's less realistic --- and not a problem for me. My goal is *not* to write and sell the final product. What would also be nice is if this final product was Python, but unfortunately, it seems unlikely at this point that CPython will ever convert to STM. I guess that besides PyPy, Python as a whole will lag behind, and likely only end up using some HTM solution in 10-15 years when it's fully ready. (I consider the HTM that we have this year as preliminary at best.) That is my current analysis on the future of STM. It doesn't include huge monetary benefits for PyPy :-) but it doesn't change anything about my own research motivation: 1st-generation research, as you call it. Obviously, PyPy as a whole is such a 1st-generation project. What I would actually like a lot is to see the emergence of other 2nd-generation platforms that apply the same principles as PyPy --- for example, it would be a first step to see an efficient JavaScript JIT compiler not manually written from scratch. A bientôt, Armin.

There is a lot of value here for the companies that want to produce these devices and PyPy should try to tap into it. This new market segment using these low cost devices are going to have a large impact and also will play a role in the manufacturing revolution that is about to take place. This manufacturing revolution is likely to be on the same scale as the Internet revolution. Just think about what the effect 3D printing is going to have. It will be huge. PyPy getting a foot hold into this market before it takes off would be huge for PyPy as well as in general for Python. Also there are some big players who currently sell these more expensive embedded systems who are not going to be happy about these cheaper alternatives and are also going to want a piece of the action. I think for many of them who may not be able to quickly change their development and run-time processes may decide it's much easier for them to port their VMs over to PyPy to get into the action. Hopefully this gives some better insight as to why I feel it may be a good strategy to consider supporting the Parallella project. The possibility of getting a foot hold into a market that is about to take off doesn't come around too often. All I know is if PyPy would like to support this market right now is the best time to get started. This might be the ticket PyPy needs to gets it growth up which could then lead to additional markets taking notice and more of the Python ecosystem becoming compatible with PyPy. Of course this is just my opinion and maybe someone else could come up with another strategy that can help PyPy grow faster. Even an Open Source project can use a strategy. John On Tue, Feb 5, 2013 at 9:47 AM, Armin Rigo <arigo@tunes.org> wrote:

Hi John.

Let me summarize your long post as I understood it.

"You guys should bet everything on platform <X> that both does not need PyPy and expressed no real interest. The reason why is because PyPy is not growing fast enough and we need a niche market. On top of that we should answer a lot of unanswered questions, like memory and warmup requirements on embedded devices".

So, I think you're wrong in very many regards here. I think we should try to excel at providing a kick-ass Python VM, but also I have seriously no say in what people work on (except me). We already have some niche markets, notably people who are willing to invest in R&D and need serious power (but are unable or unwilling to use C or C++ for that). You just don't know about it, because those are typically not people writing blog posts. Having a dedicated web stack is another good step and we'll eventually get there.

I don't know why you think this particular niche market is better than any other, but it really does not matter all that much. There is no way you can convince people to do something else in their volunteer time than what they already feel like doing. Things you can do if you're interested:

* do the work yourself
* work with the Parallella project to have first-class PyPy support, if they care about performance
* spark commercial interest

However, trying to convince volunteers that they should do what you think they should do is not really one of the helpful things you can be doing.

Cheers,
fijal

Fijal,

In the past you have complained about it being hard to make money in open source. One way to make it easier for you is to grow the popularity of PyPy, so I would think you would at least have some interest in thinking of ways to accomplish that. I'm not trying to dictate what PyPy should do; I'm merely offering my opinion that I see an opportunity that could potentially be a great thing for PyPy.

A year ago, if someone had asked me whether PyPy should support embedded systems, I would have given a firm no, but I see the market changing in ways I didn't expect. The people hacking on these devices are fairly similar to open source developers, and in some cases they even do open source development. They do things differently from the establishment, which has provided a new way to think about manufacturing. Their ways are so different, and have become such a game changer, that they have ignited what is becoming a manufacturing revolution. Because many who are involved in hacking with this hardware have no prior experience with the established ways of doing this type of business, they are moving in directions that differ in how these devices get programmed. They are also in need of tools and new infrastructure, and I feel that what PyPy has to offer can give them a starting point. At the end of the day I don't believe many of their requirements are going to be much different from the requirements of other markets, or too different from the direction PyPy will likely take anyway. So why not go where all the big money is going to be?

OK, enough of that. Let's take a look at your example of a web stack. I believe PyPy is already in a position to be used in this market. Sure, PyPy could use some additional optimizations to improve the situation, but in general I think it already kicks ass compared to CPython in terms of performance when a light web framework is used, which is becoming increasingly popular as web apps push the front ends to do most of the layout/presentation work. Also, with the web becoming more dynamic and the number of requests increasing at a substantial rate, it becomes more important to reduce latencies, which tends to give PyPy an advantage.

This is all great while the web stacks are running on traditional servers, but servers are changing. There are some servers being sold today that have hundreds of small cores, and in the not too distant future there will be systems that have a number of full cores and a much larger number of smaller cores, which may or may not have similar architectures. For instance, servers with Phi coprocessors (8 GB of memory, 60 1 GHz cores with, I believe, 4 threads each, and a PCIe3 interface) have recently become available. How is PyPy going to handle this? Is this any different from the needs of embedded systems? No. PyPy is going to have to start paying attention to how data is accessed and will have to make optimizations based on the access patterns. That is, you have to make sure the computational load can offset the data transfer overhead. Today PyPy does not take this overhead cost into account, which is not required when running on one core.

For a web application it would be nice to run multiple sessions on a given core and save session-related data locally to that core, so as to minimize data transfer to the smaller cores. That means directing all requests for a session to the same core, doing any necessary encryption on these small cores, etc. But there may also be some work for a particular request that is not appropriate to run on a small core and has to run on the main core, maybe because it requires access to too much data. How is this going to work? Is PyPy going to do all the analysis itself, or will the programmer provide some hints to PyPy as to how to break up the work? Who is going to be responsible for scheduling, for cleaning up the session data that is cached locally on the cores, and for a boatload of other issues? I'm not sure. It's a tough problem, and one that is just around the corner.

Another option would be to run an HTTP load balancer on the main cores and PyPy web stacks on, say, dedicated Phi cores, with the HTTP requests forwarded over the PCIe bus. That way each Phi core acts like an independent web server. But running 60-240 PyPy processes in 8 GB of memory is quite the challenge. Maybe what is needed is some sort of PyPy hypervisor that is able to run virtualized PyPy instances, so that each instance can share all the JITted code but have its own data. I'm sure many issues and questions exist, such as who would do the JITting: the hypervisor or the virtualized PyPy instances?

Even if you feel that right now is not the time to start worrying about these new server architectures, there are still other issues PyPy will start to run into in the web stack market. Typically, for a web application that is being accessed from the Internet, there is a certain amount of latency that is acceptable. But what happens when the same web stack technology is deployed in local environments (i.e. on a LAN) with heavy dynamic requests, some of them requiring near-real-time performance? When operating in a networked environment with low latencies, people are going to expect more from a web server (actually not just people, but systems talking to other systems will require it). This ends up being a problem for Python in general, as the garbage collector is going to be an issue. It is going to require a concurrent garbage collector. The concurrent garbage collector is also needed by the embedded market, the gaming market, and many others.

Anyway, this is just food for thought.
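To put a rough number on the memory challenge mentioned above, here is a back-of-the-envelope sketch. It uses only the figures quoted in this post (8 GB, 60 cores, 4 threads per core) and assumes, purely for illustration, that memory is split evenly between processes and that nothing is reserved for the card's operating system or shared between processes. The function name is hypothetical.

```python
# Back-of-the-envelope memory budget: one PyPy process per core or per thread.
# Figures come from the post: 8 GB of memory, 60 cores, up to 4 threads each.
# Assumption: an even split, with nothing reserved for the OS or shared pages.

TOTAL_MEMORY_MIB = 8 * 1024
CORES = 60
THREADS_PER_CORE = 4


def per_process_budget_mib(total_mib: int, processes: int) -> float:
    """Return the even share of memory, in MiB, available to each process."""
    if processes <= 0:
        raise ValueError("processes must be positive")
    return total_mib / processes


if __name__ == "__main__":
    # 60 processes (one per core) up to 240 processes (one per thread).
    for processes in (CORES, CORES * THREADS_PER_CORE):
        budget = per_process_budget_mib(TOTAL_MEMORY_MIB, processes)
        print(f"{processes} processes: about {budget:.1f} MiB each")
```

Under these assumptions each process gets somewhere between roughly 137 MiB (60 processes) and 34 MiB (240 processes), which is why sharing JITted code between instances looks attractive.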
I'm not going to keep on giving more examples in more replies. In the end this is where the world is headed, and it's going to take a lot of work and resources to get PyPy to handle these situations. Only strong growth can make it possible. If you want PyPy to get there, I hope you can see why a strategy for growth is necessary.

On a side note, I'm not all that comfortable writing these posts when I know that at this particular time I don't have the spare time to contribute. Right now I work 7 days a week from the time I wake up until I go to sleep. But I wrote it anyway, as I do believe there is a good opportunity for PyPy.

John

On Wed, Feb 6, 2013 at 6:11 AM, Maciej Fijalkowski <fijall@gmail.com> wrote:

On Thu, Feb 7, 2013 at 6:41 AM, John Camara <john.m.camara@gmail.com> wrote:
Before even reading further: how does being popular make money? Being popular is being popular. Do you know any CPython developer working full time on CPython? CPython is definitely popular by my standards.

Fijal,

Whether someone works full time on a project is a separate issue. Being popular helps attract additional resources, and PyPy is a project that could use additional resources. How many additional optimizations would PyPy need to reach a level of optimization similar to, say, the JVM? We are talking many, many man-years of work. How much additional work is it to develop and maintain back ends for the various ARM, PPC, MIPS, etc. processors? How much work would it take to have PyPy support multiple cores? What if RPython needs to be significantly refactored or replaced? And we can go on and on.

Typically a new language becomes dominant every 10 years or so, but that hasn't happened lately. Java was in that role for quite some time, and for quite a few years it has been on the decline, yet no language has taken its place in terms of dominance. The main reason this hasn't happened so far is that no language has successfully dealt with the multi-core issue in a way that also keeps the other desirable features we currently have in popular languages. But at some point a language will prevail and become dominant, and when that happens there will be a mass migration to it. It doesn't mean that Python and other currently popular languages are just going to go away, only that their use will decline. If Python's popularity declines significantly, it will in turn impact PyPy. Also, many of the early adopters of PyPy are more likely to move on to the new dominant language.

So where does that leave you? I expect you earn a living by doing PyPy consulting, and thus you need PyPy to be popular. You don't have to believe that a new dominant language will emerge, but history says otherwise, and many have been fooled into thinking otherwise in the past. I feel PyPy is Python's best chance of surviving this change in language dominance, as it has the best chance of being able to do something about the multi-core situation.

I'm glad you mentioned the web stack the other day, because if you hadn't, I likely would not have thought about the PyPy hypervisor scenario. I'm starting to believe that approach may have some decent merit and may offer a way to kick the can down the road on the multi-core issues. I don't have the time to get into it right now, but I'll start a new thread on the topic, maybe within the next few days.

John

On Thu, Feb 7, 2013 at 4:33 AM, Maciej Fijalkowski <fijall@gmail.com> wrote:

Hi John,

Thanks for your lengthy analysis. I'm sure that it can be interesting for some to read. Unfortunately, I'm personally an Open Source hobbyist that happens to come from a university background and I'm still attached to some ideas behind it.

You say about my hacking STM: "Often the first to solve a problem does not become popular in the long run". That is true, and I have no problem with that. My guess is that in the end STM will end up being common in programming languages. So I would like to help along the way --- by showing that it works in complicated languages like Python, using the unlimited flexibility of Software TM rather than as an exercise to fit it around some Hardware TM. It would be nice if PyPy also becomes the de-facto 2nd-generation standard, but that's less realistic --- and not a problem for me. My goal is *not* to write and sell the final product.

What would also be nice is if this final product was Python, but unfortunately, it seems unlikely at this point that CPython will ever convert to STM. I guess that besides PyPy, Python as a whole will lag behind, and likely only end up using some HTM solution in 10-15 years when it's fully ready. (I consider the HTM that we have this year as preliminary at best.)

That is my current analysis on the future of STM. It doesn't include huge monetary benefits for PyPy :-) but it doesn't change anything about my own research motivation: 1st-generation research, as you call it. Obviously, PyPy as a whole is such a 1st-generation project. What I would actually like a lot is to see the emergence of other 2nd-generation platforms that apply the same principles as PyPy --- for example, it would be a first step to see an efficient JavaScript JIT compiler not manually written from scratch.

A bientôt,

Armin.
participants (4)
- Amaury Forgeot d'Arc
- Armin Rigo
- John Camara
- Maciej Fijalkowski