[Web-SIG] Nodejs cluster

est electronixtar at gmail.com
Tue Mar 18 09:20:06 CET 2014


> the client is limited to send operations only and to the HTTP(S) protocol
> only. Is that true? All other parts of the system can communicate with
> whatever protocols they like.

Yes! Every incoming request is HTTP.

There also needs to be a long-running process to hold all the fd pools for
DB connections, server-sent events (text/event-stream) connections, iOS APNS
push connection pools, XMPP connections, etc. And there may be multiple nodes
holding these fds open for a long time.
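
To make the idea a bit more concrete, here is a minimal sketch (not taken from
any existing library) of what such a long-running process could hold: one pool
per external service, lending connections out and taking them back. The names
`ConnectionPool`, `PoolHolder` and the `factory` callables are made up for
illustration; a real factory would open a DB socket, an APNS connection, etc.

```python
import contextlib
import queue


class ConnectionPool:
    """Holds a fixed number of long-lived connections and lends them out."""

    def __init__(self, factory, size):
        # `factory` is a hypothetical callable that opens one connection.
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(factory())

    @contextlib.contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back so it stays open and reusable.
            self._idle.put(conn)


class PoolHolder:
    """State of the long-running process: one pool per external service."""

    def __init__(self):
        self._pools = {}

    def add(self, name, factory, size):
        self._pools[name] = ConnectionPool(factory, size)

    def connection(self, name, timeout=None):
        return self._pools[name].connection(timeout=timeout)
```

Usage would be something like `with holder.connection("db") as conn: ...`, so
request handlers never open or close the underlying fd themselves.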

Example:

 * client HTTP requests are handled and finished on NodeA
 * NodeA notifies NodeB of events; NodeB gathers the events for statistical
analysis by writing some data to the DB
 * NodeA also notifies NodeC to send Server-Sent Events to some other
connection
 * next, NodeA notifies NodeC to send out an iOS push message
 * last, NodeA notifies NodeD, which sends out an XMPP message.

Stuff goes like that.
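
The flow in the example could be sketched roughly like this. It is only an
in-process stand-in: `Bus`, `handle_request` and the event names are
hypothetical, and a real cluster would serialize the payload and send it to
another process or machine instead of calling a handler directly.

```python
class Bus:
    """In-process stand-in for the cluster IPC (hypothetical)."""

    def __init__(self):
        self._nodes = {}
        self.delivered = []

    def register(self, node, handler):
        self._nodes[node] = handler

    def notify(self, node, event, payload):
        # A real bus would serialize the payload and send it over a socket.
        self.delivered.append((node, event))
        self._nodes[node](event, payload)


def handle_request(bus, user, path):
    """NodeA: finish the HTTP request, then notify the other nodes."""
    response = "200 OK " + path
    info = {"user": user, "path": path}
    bus.notify("NodeB", "stats", info)      # written to the DB for statistics
    bus.notify("NodeC", "sse", info)        # pushed over text/event-stream
    bus.notify("NodeC", "ios_push", info)   # sent through the APNS pool
    bus.notify("NodeD", "xmpp", info)       # sent out as an XMPP message
    return response
```

The point is that NodeA only speaks HTTP to the client and one small internal
protocol to the other nodes; it never touches the long-lived connections.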

> XML-RPC is a bad way of operation and I am going to promote that belief.

Completely agree. But currently Telegraphy is the only one of its kind.

Other options include brubeck.io with mongrel2, webalchemy, etc.

> the generic clustering problem will include management of DNS infrastructure

For a running cluster, yes. But Nodejs has its cluster batteries included,
so it's easier to bootstrap a cluster that way than to, say, build one with
multiprocessing/billiard from scratch.
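
For comparison, this is roughly the first step one has to write by hand with
multiprocessing: a master that starts N workers and waits for them to report
in. It is a minimal sketch under assumptions: `worker` and `start_workers` are
made-up names, and the "fork" start method used here is Unix-only.

```python
import multiprocessing
import os


def worker(index, reports):
    # A real worker would run its event loop here; this one only reports in.
    reports.put((index, os.getpid()))


def start_workers(count):
    """Master: start `count` workers and wait until each has reported."""
    # The "fork" start method is Unix-only.
    ctx = multiprocessing.get_context("fork")
    reports = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(i, reports)) for i in range(count)]
    for proc in procs:
        proc.start()
    # Drain the queue before joining so no worker blocks on a full pipe.
    result = sorted(reports.get(timeout=10) for _ in procs)
    for proc in procs:
        proc.join()
    return result
```

Sharing the listening socket, restarting dead workers and passing messages
between workers all still have to be built on top of this.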



On Tue, Mar 18, 2014 at 3:12 PM, anatoly techtonik <techtonik at gmail.com> wrote:

> On Tue, Mar 18, 2014 at 5:16 AM, est <electronixtar at gmail.com> wrote:
>
>>
>> IPython.parallel
>>
>>
>> http://ipython.org/ipython-doc/stable/install/install.html#dependencies-for-ipython-parallel-parallel-computing
>>
>> It's based on ZeroMQ (PyZMQ) and the `ssh` command. I don't think that's
>> lightweight enough for busy web clusters.
>>
>
> You will need to secure your web cluster computations anyway. SSH may be
> slower than HTTPS, I agree, but I'd still like to see the benchmarks. IPython
> is good for handling long processing tasks. For a myriad of tiny code+data
> workers I'd choose Stackless. Not sure about the web server part.
>
>
>> By QMachine I assume that's
>>
>> https://github.com/wilkinson/qmachine
>>
>> For a web server cluster it's really not a good idea to amplify HTTP
>> requests, where one client request turns into several other HTTP requests
>> inside the server cluster.
>>
>
> Right. Because your workers are not trusted you need to distribute the
> load and validate results with multiple passes.
>
>
>> What I propose is something like Zed Shaw's Mongrel2 project (
>> http://mongrel2.org/): use a very lightweight server-side serialization
>> protocol as cluster IPC, so you can pass state/data between nodes (workers)
>> easily. It should be agnostic to frameworks and libraries; the objective is
>> to unite Python modules in the realtime web world. For the
>> request-response web world, a synchronous gateway like WSGI is good
>> enough: between requests, share nothing
>> <https://docs.djangoproject.com/en/dev/faq/general/#does-django-scale>.
>>
>> But for the realtime web, server-side state is very much required. There
>> needs to be an fd pool for DBs, external services, and things like
>> server-side push technologies.
>>
>
> "realtime web" is a very broad term; it needs a more precise definition. I
> see only one difference in "web" over a standard protocol: the client is
> limited to send operations only and to the HTTP(S) protocol only. Is that
> true? All other parts of the system can communicate with whatever
> protocols they like.
>
> So, to unify the network under some standard, we need common base. Stick
> to limitations of client to make all nodes work the same. Limit choice to
> bare minimum and extend where it is needed.
>
>> Let's assume the following scenario:
>>
>> One user submits a blog post and his followers get a browser/iOS/Android
>> push notification. Because users are connected to different nodes in one
>> big cluster, we need some kind of mechanism to broadcast this message.
>>
>> In such an architecture we can write simpler code like this:
>>
>> from django.db.models.signals import post_save
>> from django.dispatch import receiver
>>
>> # BlogPostModel and the *_followers channels are placeholders.
>> @receiver(post_save, sender=BlogPostModel)
>> def my_handler(sender, **kwargs):
>>     msg = "User X just posted a new blog, check it out at http://..."
>>     browser_followers.send(msg)
>>     ios_followers.send(msg)
>>     android_followers.send(msg)
>>
>> Currently this library really shines.
>>
>> https://pypi.python.org/pypi/telegraphy/
>>
>> Telegraphy architecture is like this:
>>
>> [image: Inline image 1]
>>
>> What I propose is to merge Web-app part and the AutobahnPython Gateway
>> part into *one* based on a community honored standard.
>>
>
> Just a side note - XML-RPC is a bad way of operation and I am going to
> promote that belief.
>
> The key component here that is not depicted is the client's limitations (it
> is able only to request events, and to accept events after a websocket
> connection is established with a single server). The channel descriptions
> (WS, HTTP) are not informative enough to capture the limitation that this
> architecture should deal with.
>
> When a client (browser) establishes a connection to an HTTP site, can it
> open a websocket to a site in another domain? If not, then cross-domain
> interaction should also be included in the problem description before
> unifying Django and Autobahn. If this limitation exists, the generic
> clustering problem will include management of DNS infrastructure (to make
> sure the client can send requests to any node in the cluster), or clustering
> will require frontends on servers to reroute requests on established
> websocket connections to the appropriate cluster nodes.
>
> Not sure I got the positioning of NodeJS cluster right, so feel free to
> fix that.
>