[pypy-dev] Parallella open hardware platform
John Camara
john.m.camara at gmail.com
Thu Feb 7 05:41:43 CET 2013
Fijal,
In the past you have complained about it being hard to make money in open
source. One way to make it easier for you is to grow the popularity of PyPy.
So I would think you would at least have some interest in thinking of ways
to accomplish that.
I'm not trying to dictate what PyPy should do, merely offering my opinion
that I see an opportunity that could potentially be a great thing for
PyPy.
A year ago, if someone had asked me whether PyPy should support embedded
systems, I would have given a firm no, but I see the market changing in ways
I didn't expect. The people hacking on these devices are fairly similar to
open source developers, and in some cases they even do open source
development. They do things differently from the establishment, which has
provided a new way to think about manufacturing. Their ways are so different
from the establishment's that they have become a game changer and ignited
what is becoming a manufacturing revolution. Because many of those hacking
on this hardware have no prior experience with the established ways of doing
this type of business, they are moving in directions that differ in how
these devices get programmed. They are also in need of tools and new
infrastructure, and I feel that what PyPy has to offer can give them a
starting point.
Now at the end of the day I don't believe many of their requirements are
going to be much different from the requirements of other markets, nor too
different from the direction PyPy will likely take anyway. So why not go
where all the big money is going to be?
OK, enough of that. Let's take a look at your example of a web stack. I
believe right now PyPy is in a position to be used in this market. Sure,
PyPy could use some additional optimizations to improve the situation, but I
think in general it's already able to kick ass compared to CPython in terms
of performance when a light web framework is used, which is becoming
increasingly popular as web apps push the front ends to do most of the
layout/presentation work. Also, with the web becoming more dynamic and the
number of requests increasing at a substantial rate, it becomes more
important to reduce latencies, which tends to give PyPy an advantage.
This is all great while the web stacks are running on traditional servers,
but servers are changing. There are some servers being sold today that have
hundreds of small cores, and in the not too distant future there will be
systems that have a number of full cores and a much larger number of smaller
cores, which may or may not have similar architectures. For instance,
servers with Phi coprocessors (8 GB of memory, 60 1 GHz cores with, I
believe, 4 threads each, and a PCIe3 interface) have recently become
available. How is PyPy going to handle this? Is this any different from the
needs of embedded systems? No. PyPy is going to have to start paying
attention to how data is accessed and will have to make optimizations based
on the access patterns. That is, you have to make sure the computational
load can offset the data transfer overhead. Today PyPy does not take this
overhead cost into account, which is not required when running on one core.
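
To make that trade-off concrete, here is a rough back-of-the-envelope sketch.
The function names and every number in it are made up for illustration, not
measurements of a Phi or of PyPy, and it assumes the work splits perfectly
across cores that are each as fast as the main core, which is optimistic.

```python
def offload_time(compute_s, n_cores, bytes_moved, bandwidth_bytes_per_s, latency_s):
    """Estimated wall time if the work is shipped to n_cores small cores.

    Assumes perfect parallel scaling and ignores scheduling costs.
    """
    transfer_s = latency_s + bytes_moved / bandwidth_bytes_per_s
    return transfer_s + compute_s / n_cores


def worth_offloading(compute_s, n_cores, bytes_moved, bandwidth_bytes_per_s, latency_s):
    """True when the offloaded estimate beats running on a single main core."""
    estimate = offload_time(compute_s, n_cores, bytes_moved,
                            bandwidth_bytes_per_s, latency_s)
    return estimate < compute_s


# Hypothetical link: 8e9 bytes/s of bandwidth and 20 microseconds of latency.
# A 10 ms job that moves 1 MB of data.
heavy = worth_offloading(0.010, 60, 1_000_000, 8e9, 20e-6)
# A 0.1 ms job that moves 8 MB of data.
light = worth_offloading(0.0001, 60, 8_000_000, 8e9, 20e-6)
print(heavy, light)
```

Under those assumed numbers the heavy job comes out ahead when offloaded and
the light one does not, because moving its data costs more than the
computation itself.
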
For a web application it would be nice to run multiple sessions on a given
core and save session-related data locally to that core so as to minimize
data transfer to the smaller cores, which means directing all requests for a
session to the same core, doing any necessary encryption on these small
cores, etc. But there may also be some work for a particular request which
might not be appropriate to run on a small core and may have to run on the
main core, maybe because it requires access to too much data. How is this
going to work? Is PyPy going to do all the analysis itself, or will the
programmer provide some hints to PyPy as to how to break up the work? Who is
going to be responsible for the scheduling, for cleaning up the session data
that is cached locally to the cores, and for a boatload of other issues? I'm
not sure. It's a tough problem, and one that is just around the corner.
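
One simple way to picture the "same session, same core" idea is to hash the
session id to a core number. This is only a sketch: `core_for_session` and
`Dispatcher` are hypothetical names, the per-core dictionaries stand in for
memory that would really live on each small core, and nothing like this
exists in PyPy.

```python
import hashlib


def core_for_session(session_id, n_cores):
    """Map a session id to a core index in a stable, repeatable way."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_cores


class Dispatcher:
    """Toy dispatcher that keeps each session's data with one core."""

    def __init__(self, n_cores):
        self.n_cores = n_cores
        # One dict per core, standing in for that core's local memory.
        self.local_sessions = [dict() for _ in range(n_cores)]

    def handle(self, session_id, key, value):
        """Store a value for the session on its core and return the core index."""
        core = core_for_session(session_id, self.n_cores)
        self.local_sessions[core].setdefault(session_id, {})[key] = value
        return core

    def expire(self, session_id):
        """Drop the cached session data; someone has to do this cleanup."""
        core = core_for_session(session_id, self.n_cores)
        self.local_sessions[core].pop(session_id, None)


dispatcher = Dispatcher(60)
first = dispatcher.handle("alice", "cart", 1)
second = dispatcher.handle("alice", "user", "a")
print(first == second)
```

Hashing leaves the harder questions open: it does not balance load, it does
not decide which requests are too data-heavy for a small core, and the
`expire` call still has to be triggered by something.
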
Another option would be to run an HTTP load balancer on the main cores and
PyPy web stacks on, say, dedicated Phi cores, with the HTTP requests
forwarded over the PCIe bus. That way each Phi core acts like an independent
web server. But running 60-240 PyPy processes in 8 GB of memory is quite the
challenge. Maybe some sort of PyPy hypervisor could run virtualized PyPy
instances so that each instance can share all the JITed code but have its
own data. I'm sure many issues and questions exist, like who would do the
JITting: the hypervisor or the virtualized PyPy instances?
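
The arithmetic behind that memory squeeze is simple enough to write down.
The sketch below treats 8 GB as 8192 MiB and ignores whatever the card's own
operating system uses; the `shared_mib` parameter is a hypothetical amount of
code shared once across all instances, included only to show how sharing
changes the per-instance budget.

```python
TOTAL_MIB = 8 * 1024  # 8 GB of coprocessor memory, taken as 8192 MiB


def per_process_mib(n_processes, total_mib=TOTAL_MIB, shared_mib=0):
    """Memory left per process after setting aside a shared region."""
    return (total_mib - shared_mib) / n_processes


for n in (60, 240):
    print(n, "processes:", round(per_process_mib(n), 1), "MiB each")
```

With nothing shared, that comes to roughly 136.5 MiB per process at 60
processes and roughly 34.1 MiB at 240.
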
Now even if you feel right now is not the time to start worrying about
these new server architectures, there are still other issues PyPy will start
to run into in the web stack market. Typically, for a web application that
is being accessed from the Internet, there is a certain amount of latency
that is acceptable. But what happens when the same web stack technology is
deployed in local environments (i.e. on a LAN) with heavy dynamic requests,
some requiring near real time performance? When operating in a networked
environment with low latencies, people are going to expect more from web
servers (actually, not just the people; systems talking to other systems
will require it). This ends up being a problem for Python in general, as the
garbage collector is going to be an issue. This is going to require a
concurrent garbage collector.
The concurrent garbage collector is also needed by the embedded market, as
well as the gaming market, and many others.
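
For anyone who wants a feel for the pause in question, a crude sketch is to
time a forced full collection after creating some cyclic garbage. It uses
only the standard `gc` and `time` modules; `make_cycles` and
`time_full_collection` are names invented here, and the result depends
entirely on the interpreter, the heap size, and the machine.

```python
import gc
import time


def make_cycles(n):
    """Create n small reference cycles that become garbage immediately."""
    for _ in range(n):
        a = []
        b = [a]
        a.append(b)


def time_full_collection():
    """Return how long a forced full collection takes, in seconds."""
    start = time.perf_counter()
    gc.collect()
    return time.perf_counter() - start


make_cycles(100_000)
pause = time_full_collection()
print("full collection took %.6f seconds" % pause)
```

A request that arrives during such a collection waits for it to finish, which
is the latency a concurrent collector is meant to avoid.
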
Anyway, this is just food for thought. I'm not going to keep on giving
more examples in more replies. In the end this is where the world is
headed and it's going to take a lot of work and resources to get PyPy to
handle these situations and only strong growth can make it possible. If
you want PyPy to get there I hope you can see why a strategy for growth is
necessary.
On a side note, I'm not all that comfortable writing these posts when I
know that at this particular time I don't have the spare time to
contribute. Right now I work 7 days a week from the time I wake up until I
go to sleep. But I wrote it anyway, as I do believe it's a
good opportunity for PyPy.
John
On Wed, Feb 6, 2013 at 6:11 AM, Maciej Fijalkowski <fijall at gmail.com> wrote:
> Hi John.
>
> Let me summarize your long post how I understood it. "You guys should
> bet everything on platform <X> that both does not need PyPy and
> expressed no real interest. The reason why is because PyPy is not
> growing fast enough and we need a niche market. On top of that we
> should answer a lot of unanswered questions, like memory and warmup
> requirements on embedded devices".
>
> So, I think you're wrong in very many regards here. I think we should
> try to excel at providing a kick ass Python VM, but also I have
> seriously no say in what people work on (except me). We already have
> some niche markets, notably people who are willing to invest R&D and
> need serious power (but are unable or unwilling to use C or C++ for
> that). You just don't know about it, because those are typically not
> people writing blog posts. Having a dedicated web stack is another
> good step and we'll eventually get there. I don't know why you think
> this particular niche market is better than any other, but it really
> does not matter all that much. There is no way you can convince people
> to do something else in their volunteer time than what they already
> feel like doing. Things you can do if you're interested:
>
> * do the work yourself
>
> * work with parallela project to have a first-class pypy support if
> they care about performance
>
> * spark commercial interest
>
> however, trying to convince volunteers that they should do what you
> think they should do is not really one of the helpful things you can
> be doing.
>
> Cheers,
> fijal
>