Re: [Twisted-Python] Large project (IMS) architecture
On Thursday 16 January 2003 09:48 am, Itamar Shtull-Trauring wrote:
On Thu, 16 Jan 2003 08:44:35 -0600, "Patrick K. O'Brien" wrote:
PyPerSyst provides simple, robust persistence for Python objects. Note that the files are currently available only in CVS, and SourceForge has anonymous CVS shut down at the moment.
Twisted includes its own version of the idea behind PyPerSyst (www.prevayler.org) in twisted.persisted.journal.
Yes, I've looked at journal. I don't fully understand your implementation, because I don't fully understand Twisted yet. And, I had already started PyPerSyst before I knew about your implementation. Now I can't let it go. I'm sure you know how that is. :-)
We recently added support for Pyro, which gives you client/server RMI-like remote access. You can find Pyro here:
I dislike Pyro since it uses pickles over the network. This is insecure, and even if that is acceptable limits you to Python clients only.
I'm not too worried about the limitation to Python clients. The security issue is a different matter. It looks like Pyro now supports XML pickles:
<quote> Whether the marshaling is done using the safe xml pickling (from Gnosis_utils or PyXML) or the default pickle. The xml_pickle is not vulnerable for the pickle trojan problem mentioned in Chapter 7, but it is an order of a magnitude slower, and requires more bandwidth. Use "any" for any implementation (defaults to PyXML), "pyxml" for PyXML, "gnosis" for Gnosis. PyXML seems to be about three times faster than Gnosis. You need to have installed Gnosis_Utils (at least version 1.0.2, latest is 1.0.5 at the time of writing). For PyXML, you need at least version 0.8, latest is 0.8.1 at the time of writing. </quote>
Twisted's remote object protocol Perspective Broker on the other hand does not use pickles, and can in fact be used with other languages (a Java implementation exists), while still offering many of the features Pyro provides.
PyPerSyst works with Pyro, but doesn't depend on it. I'd like to accomplish the same with Twisted. I started with Pyro because it was easier for me to figure out.
Any suggestions for getting up-to-speed on Perspective Broker? Anything else I need to be aware of for this kind of application?
--
Patrick K. O'Brien
Orbtech http://www.orbtech.com/web/pobrien
-----------------------------------------------
"Your source for Python programming expertise."
-----------------------------------------------
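To make the pickle security concern from this exchange concrete, here is a minimal sketch of why unpickling data from an untrusted peer is unsafe. It assumes a current Python 3 rather than the 2.2-era interpreter used in this thread. The class name `Trojan` is hypothetical, and `print` is only a harmless stand-in for whatever callable an attacker would really choose.

```python
import pickle


class Trojan:
    """Hypothetical payload class; the name is an illustration only."""

    def __reduce__(self):
        # __reduce__ tells pickle how to rebuild the object on load:
        # "call this callable with these arguments".
        # print is a harmless stand-in for an attacker-chosen callable.
        return (print, ("this ran inside pickle.loads",))


# The sender controls the bytes on the wire.
payload = pickle.dumps(Trojan())

# The receiver only calls loads(), yet the callable named in the
# payload runs, and its return value replaces the expected object.
result = pickle.loads(payload)
```

The receiver never sees a `Trojan` instance at all; `result` is simply whatever the chosen callable returned. This is the kind of problem the Pyro documentation quoted above calls the "pickle trojan problem".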
On Thu, 16 Jan 2003 10:48:40 -0600, "Patrick K. O'Brien" wrote:
Yes, I've looked at journal. I don't fully understand your implementation, because I don't fully understand Twisted yet. And, I had already started PyPerSyst before I knew about your implementation. Now I can't let it go. I'm sure you know how that is. :-)
Yah. And I need to write docs if I want people to use it :)
Whether the marshaling is done using the safe xml pickling (from Gnosis_utils or PyXML) or the default pickle. The xml_pickle is not vulnerable for the pickle trojan problem mentioned in Chapter 7, but it is an order of a magnitude slower, and requires more bandwidth.
PB actually implements its own version of Pickle as well (jelly), except that it does *not* use a whole order of magnitude more of bandwidth - it's probably the same bandwidth usage as pickle. It's still not as fast as it could be, but we're working on it.
PyPerSyst works with Pyro, but doesn't depend on it. I'd like to accomplish the same with Twisted. I started with Pyro because it was easier for me to figure out. Any suggestions for getting up-to-speed on Perspective Broker?
http://twistedmatrix.com/documents/howto/pb-intro
http://twistedmatrix.com/documents/howto/pb-usage
http://twistedmatrix.com/documents/howto/pb-cred
Version 1.0.2 will have additional documentation.
On Thursday 16 January 2003 12:01 pm, Steve Waterbury wrote:
Itamar Shtull-Trauring wrote:
PB actually implements its own version of Pickle as well (jelly)...
And jelly does nested objects, which Pickle can't.
I'm not sure what you mean by that. Could you explain in what way Pickle can't handle nested objects? Thanks.
-- Patrick K. O'Brien
On Thu, 16 Jan 2003 12:38:40 -0600, "Patrick K. O'Brien" wrote:
I'm not sure what you mean by that. Could you explain in what way Pickle can't handle nested objects? Thanks.
Yes, what is a nested object?
--
Itamar Shtull-Trauring http://itamarst.org/
Available for Python, Twisted, Zope and Java consulting
***> http://VoteNoWar.org -- vote/donate/volunteer <***
"Patrick K. O'Brien" wrote:
On Thursday 16 January 2003 12:01 pm, Steve Waterbury wrote:
And jelly does nested objects, which Pickle can't.
I'm not sure what you mean by that. Could you explain in what way Pickle can't handle nested objects? Thanks.
Sorry -- I didn't mean "nested objects", but nested functions of certain kinds, such as lambdas or curried functions. E.g.:
>>> import pickle
>>> f = lambda x: x in (1,2,3)
>>> pf = pickle.dumps(f)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.2/pickle.py", line 978, in dumps
    Pickler(file, bin).dump(object)
  File "/usr/local/lib/python2.2/pickle.py", line 115, in dump
    self.save(object)
  File "/usr/local/lib/python2.2/pickle.py", line 225, in save
    f(self, object)
  File "/usr/local/lib/python2.2/pickle.py", line 519, in save_global
    raise PicklingError(
pickle.PicklingError: Can't pickle
>>> from twisted.spread.jelly import jelly, unjelly
>>> jf = jelly(f)
>>> jf
['function', '__main__.<lambda>']
>>> newf = unjelly(f)
>>> newf(1)
1
>>> newf(4)
0
-- Steve.
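For readers following along today, the distinction drawn in this message can be checked with the standard library alone. The sketch below assumes a current Python 3, not the Python 2.2 shown in the traceback, and all variable names are illustrative. It shows that pickle round-trips nested and even self-referential containers, while a lambda still fails because pickle stores functions by importable name.

```python
import pickle

# Nested and self-referential containers round-trip through pickle.
inner = ["hello"]
inner.append(inner)                  # a list that contains itself
outer = {"items": [inner, inner]}    # the same list referenced twice

copy = pickle.loads(pickle.dumps(outer))
first, second = copy["items"]
# Both the cycle and the shared reference survive the round trip.
nested_ok = first[1] is first and first is second

# A lambda has no importable name, so pickling it by reference fails.
f = lambda x: x in (1, 2, 3)
try:
    pickle.dumps(f)
    lambda_error = None
except (pickle.PicklingError, AttributeError) as exc:
    # The exact exception type varies between Python versions.
    lambda_error = type(exc).__name__
```

So "nested objects" as such are not the limitation; the limitation is callables that cannot be looked up by module and name.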
On Thursday 16 January 2003 11:08 am, Itamar Shtull-Trauring wrote:
PyPerSyst works with Pyro, but doesn't depend on it. I'd like to accomplish the same with Twisted. I started with Pyro because it was easier for me to figure out. Any suggestions for getting up-to-speed on Perspective Broker?
http://twistedmatrix.com/documents/howto/pb-intro
http://twistedmatrix.com/documents/howto/pb-usage
http://twistedmatrix.com/documents/howto/pb-cred
Very nice. I can see how these capabilities would be useful. And now that I've read these I think I'm starting to understand Twisted. Thanks.
-- Patrick K. O'Brien
On Thursday 16 January 2003 11:08 am, Itamar Shtull-Trauring wrote:
http://twistedmatrix.com/documents/howto/pb-intro
http://twistedmatrix.com/documents/howto/pb-usage
http://twistedmatrix.com/documents/howto/pb-cred
Version 1.0.2 will have additional documentation.
I just updated from CVS and read doc/howto/pb-copyable.html. Wow! That is very cool stuff. I'm convinced that Twisted is the way to go.
-- Patrick K. O'Brien
On Thu, 16 Jan 2003 18:03:11 -0600, "Patrick K. O'Brien" wrote:
I just updated from CVS and read doc/howto/pb-copyable.html. Wow! That is very cool stuff. I'm convinced that Twisted is the way to go.
Okay, since this has been gnawing on my conscience every time somebody mentioned PB in the last month, let me try to at least state this publicly once for the record :-)
There is still one protocol-breaking change that I think we will be making to PB. The details are still a little hazy, but I am pretty sure that we can get an order of magnitude performance increase (with no appreciable increase in memory usage) by memoizing every object created from a Banana'd list rather than creating explicit (reference 1 (foo ...)) expressions.
I find it hard to summarize what I mean by this, but for those of you familiar with Jelly's implementation details, here's an attempt.
Currently, whenever an object is serialized, if it points to itself, it must be "memo-ized" in order to give it a unique ID for the future. This is best shown with an example:
>>> from twisted.spread.jelly import jelly
>>> l = ['hello']
>>> l.append(l)
>>> jelly(l)
['reference', 1, ['list', 'hello', ['dereference', 1]]]
I believe that the technique we have been using to identify objects that participate in circular references is slow and unnecessary. Each object corresponds to a Jelly expression (Python list) whose start has some position in the banana stream relative to every other list start. So in this example, the resulting list could be:
['list', 'hello', ['dereference', 1]]
and the decoder could still quite easily recognize what the ['dereference', 1] was pointing to because the list is the first in the stream.
For a more complex example, this is what currently happens.
>>> from twisted.spread.jelly import jelly
>>> m = ['one']
>>> m.append(m)
>>> n = ['two']
>>> n.append(n)
>>> l = [n, m]
>>> l = [m, n]
>>> jelly(l)
['list', ['reference', 1, ['list', 'one', ['dereference', 1]]], ['reference', 2, ['list', 'two', ['dereference', 2]]]]
I'd prefer that produce this instead:
['list', ['list', 'one', ['dereference', 2]], ['list', 'two', ['dereference', 4]]]
To see why the numbers "2" and "4" make sense, count left brackets :).
This is technically a change to Jelly, not to PB, but it has ramifications on the wire protocol. In the process of doing this, I plan to add a version-negotiation step to PB, similar to what currently exists for Jelly, so that any potential future changes of this variety do not cause similar problems.
What this MAY break:
* Version Compatibility
  If you have a running PB server, older versions of PB clients will no longer be able to connect. This may be correctable but is probably not worth the effort, given the dearth of deployed PB services, and the fact that the primary incompatibility will be improved version negotiation so that this won't happen again :).
* Things that use jellyFor directly
  The internal Jellier APIs may change in subtly incompatible ways. Basic usage of jellyFor will probably still continue to work.
* Alternate language implementations of PB
  These will need to be updated. Emacs-PB looks to be in a non-working state already :-\ though I've already talked to Itamar about making similar changes to Java-PB, and they shouldn't be too hard. It already reflects the Python code.
* Applications using Jelly directly
  If you are using Jelly to store data in files or something like that, this will cause your new versions to be incompatible. If there are really people doing this then perhaps we need version information for these files as well.
What this WILL NOT break:
* Applications using PB at the high level
  If you have an app that just has objects which are Referenceable, Cacheable and so on, the existing semantics will continue to work, and if you upgrade Twisted on both server and client, both will continue to work with each other as they did before.
* Applications using Banana directly
  This will have no impact on the Banana APIs or protocol and those look stable for the foreseeable future.
Please keep in mind that nobody is working on this yet :-). It may be that it will not break any of the things above, but it will certainly not break the last two.
There are other infrastructural changes to PB that I'd like to see (for example, making the internal dictionaries use weakrefs rather than __del__) but those should be completely transparent to end-users.
I believe this will also make the unified banana+jelly streaming optimization (that Bruce has mentioned on IRC a few times) possible, but I'll have to leave that to him when I've actually got some code/specs to describe this more clearly.
--
| <`'> | Glyph Lefkowitz: Travelling Sorcerer |
| < _/ > | Lead Developer, the Twisted project |
| < ___/ > | http://www.twistedmatrix.com |
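The "count left brackets" rule in this message can be tried out with a toy encoder. What follows is a hypothetical sketch under stated assumptions, not Twisted's Jellier and not the eventual wire format: the function name `jelly_implicit` is made up, it handles only strings and lists, and it numbers every emitted expression (including each ['dereference', n] itself) in the order its left bracket would appear.

```python
def jelly_implicit(obj):
    """Toy position-numbered encoder for str and list values only.

    Hypothetical illustration of the proposal, not Twisted code:
    each emitted expression takes the next number, and a list seen
    before is replaced by ['dereference', n], where n is the number
    of the expression that first encoded it.
    """
    seen = {}     # id(list) -> number of the expression that encoded it
    counter = 0   # how many left brackets have been emitted so far

    def encode(value):
        nonlocal counter
        if isinstance(value, str):
            return value
        if not isinstance(value, list):
            raise TypeError("this sketch only handles str and list")
        counter += 1                      # this expression's left bracket
        if id(value) in seen:
            return ["dereference", seen[id(value)]]
        seen[id(value)] = counter         # remember before descending
        return ["list"] + [encode(item) for item in value]

    return encode(obj)


# The two examples from the message above.
simple = ["hello"]
simple.append(simple)

m = ["one"]
m.append(m)
n = ["two"]
n.append(n)
nested = [m, n]

simple_out = jelly_implicit(simple)
nested_out = jelly_implicit(nested)
```

With this numbering, `simple_out` is ['list', 'hello', ['dereference', 1]] and `nested_out` matches the preferred output given above, with references 2 and 4. No explicit ['reference', n, ...] wrapper is needed because the position alone identifies the target.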
On Friday 17 January 2003 03:01 am, Glyph Lefkowitz wrote:
On Thu, 16 Jan 2003 18:03:11 -0600, "Patrick K. O'Brien" wrote:
I just updated from CVS and read doc/howto/pb-copyable.html. Wow! That is very cool stuff. I'm convinced that Twisted is the way to go.
Okay, since this has been gnawing on my conscience every time somebody mentioned PB in the last month, let me try to at least state this publicly once for the record :-)
There is still one protocol-breaking change that I think we will be making to PB. The details are still a little hazy, but I am pretty sure that we can get an order of magnitude performance increase (with no appreciable increase in memory usage) by memoizing every object created from a Banana'd list rather than creating explicit (reference 1 (foo ...)) expressions.
Sounds reasonable. Thanks for the heads-up.
-- Patrick K. O'Brien
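On the receiving side, "memoizing every object created from a Banana'd list" could look roughly like the sketch below. This is an assumption-laden illustration, not Twisted's unjellier: the function name `unjelly_implicit` is hypothetical, the input is a plain nested Python list rather than a real banana stream, and only 'list' and 'dereference' expressions are understood.

```python
def unjelly_implicit(expr):
    """Toy decoder for position-numbered 'list'/'dereference' expressions.

    Hypothetical illustration only. Every expression is numbered in the
    order its left bracket appears, and every list that gets built is
    memoized under that number so a later dereference can find it.
    """
    memo = {}     # expression number -> list object built for it
    counter = 0   # how many left brackets have been read so far

    def decode(e):
        nonlocal counter
        if isinstance(e, str):
            return e
        counter += 1                      # this expression's left bracket
        tag = e[0]
        if tag == "dereference":
            return memo[e[1]]
        if tag != "list":
            raise ValueError("unsupported expression type: %r" % (tag,))
        result = []
        memo[counter] = result            # register first so cycles resolve
        for item in e[1:]:
            result.append(decode(item))
        return result

    return decode(expr)


# Input in the position-based format proposed in the quoted message.
wire = ["list",
        ["list", "one", ["dereference", 2]],
        ["list", "two", ["dereference", 4]]]
decoded = unjelly_implicit(wire)
first, second = decoded
```

Registering each list in `memo` before decoding its children is what lets a self-reference resolve to the list that is still being built.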
participants (4)
- Glyph Lefkowitz
- Itamar Shtull-Trauring
- Patrick K. O'Brien
- Steve Waterbury