[IPython-dev] [sympy] Re: using reST for representing the notebook cells+text

Mikhail Terekhov termim at gmail.com
Wed Feb 24 18:35:38 EST 2010


On Wed, Feb 24, 2010 at 4:04 PM, Robert Kern <robert.kern at gmail.com> wrote:
> [Did you mean to post this to one of the lists?]
>

Robert, yes I meant to send this to the list, sorry - hit the wrong button.
Apologies to the list members if this discussion is already a closed
matter; I just thought it was interesting. I'm attaching my
conversation with Robert in the hope that it will be useful.



On Wed, Feb 24, 2010 at 14:31, Mikhail Terekhov <termim at gmail.com>  wrote:
> On Wed, Feb 24, 2010 at 10:49 AM, Robert Kern <robert.kern at gmail.com> wrote:
>> On Wed, Feb 24, 2010 at 01:52, Brian Granger <ellisonbg.net at gmail.com> wrote:
>>
>>> In many respects it seems almost perfect.  But, one day recently I was
>>> thinking about how successful Mathematica's notebook continues to be.  So I
>>> began to look more at the Mathematica notebook format.  Amazingly, the
>>> Mathematica notebook format is a plain text file that itself is *valid
>>> Mathematica code*.
>>
>> The problem with transferring this to Python is that Mathematica's
>> language is a very Lispy one. ExpressionCell[] contains the actual
>> expression, not a string representing a Mathematica expression. You
>
> Not quite so, see for example:
> http://reference.wolfram.com/mathematica/ref/format/NB.html
>
> It is true that the nb format contains valid Mathematica code, but it is
> not the code a user usually deals with. It is lower-level code in which
> expressions are constructed from strings and API calls. This is exactly what
> Brian suggests, AFAICS.
>
>> can't do that with Python. You would have to put the Python code into
>> strings. You just have a "dumb" tree structure with text leaf nodes.
>
> That is exactly how Mathematica does it, at least as it is described in
> the link above.

Quite so.
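
To make the point concrete, here is a minimal sketch of such a "dumb" tree
in Python. The names Notebook and Cell and the "kind"/"text" fields are
hypothetical, chosen only for illustration; they are not taken from Brian's
proposal. The code cell holds its source as a plain string, and the repr()
of the tree is itself valid Python.

```python
from collections import namedtuple

# Hypothetical node types: a notebook is a list of cells, and every
# cell is a leaf that carries its content as plain text.
Cell = namedtuple("Cell", "kind text")
Notebook = namedtuple("Notebook", "cells")

nb = Notebook([
    Cell("text", "Compute a sum."),
    Cell("code", "total = 1 + 2"),   # Python source kept as a string
])

# The file format would simply be repr(nb); evaluating it rebuilds the tree.
source = repr(nb)
roundtrip = eval(source)

# Running the notebook means exec()-ing the strings in the code cells.
namespace = {}
for cell in roundtrip.cells:
    if cell.kind == "code":
        exec(cell.text, namespace)
```

Nothing in this structure is an actual Python expression until exec() is
called on a leaf, which is why the same tree could be written in any other
container syntax.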

>> At that point, you might as well use XML to describe the exact same
>> structure.
>
> Sure you can but you have to pay a price for that and the price is XML
> parsing. While it looks negligible when your nb is just a couple of dozen
> lines it will be huge when you have to load many big nbs at once (think
> library) and when performance and/or memory are important. Subversion
> actually is a good example here. They started to use XML to store their
> internal workspace data but then were forced to switch to some simple
> text format due to memory/performance issues.

I am almost certain that their use cases and workloads are much
different than the notebook's would be. Python's parser isn't exactly
a speed demon, either. A general statement like "XML is slow" followed
by an unrelated anecdote is not terribly convincing. Show me
experiments. I've attached mine. Python ends up being about 3 times
slower than the equivalent XML for a variety of file sizes.
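
For readers without the attachment, an experiment of this kind could look
roughly like the sketch below. This is not the attached notebook.py, whose
contents are not reproduced here; the layout of the two formats and the
names make_sources, cells_from_xml, cells_from_python and compare are all
assumptions made for illustration. It only times parsing, and the ratio it
reports will depend on the Python version and the file contents.

```python
import ast
import timeit
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def make_sources(n_cells):
    """Build the same notebook twice: as XML text and as Python source."""
    codes = ["x%d = %d + 1" % (i, i) for i in range(n_cells)]
    xml_src = "<notebook>%s</notebook>" % "".join(
        "<cell>%s</cell>" % escape(code) for code in codes)
    py_src = "Notebook([%s])" % ", ".join("Cell(%r)" % code for code in codes)
    return xml_src, py_src


def cells_from_xml(xml_src):
    """Parse the XML form and return the text of every cell."""
    return [el.text for el in ET.fromstring(xml_src).iter("cell")]


def cells_from_python(py_src):
    """Parse the Python form with Python's own parser, without executing it."""
    tree = ast.parse(py_src, mode="eval")
    cell_calls = tree.body.args[0].elts          # the list inside Notebook([...])
    return [ast.literal_eval(call.args[0]) for call in cell_calls]


def compare(n_cells=1000, repeat=5):
    """Return (best XML parse time, best Python parse time) in seconds."""
    xml_src, py_src = make_sources(n_cells)
    assert cells_from_xml(xml_src) == cells_from_python(py_src)
    t_xml = min(timeit.repeat(lambda: cells_from_xml(xml_src),
                              number=1, repeat=repeat))
    t_py = min(timeit.repeat(lambda: cells_from_python(py_src),
                             number=1, repeat=repeat))
    return t_xml, t_py
```

Calling compare() for several values of n_cells gives the two timings to
put side by side.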

>> With XML, you can push it across to other languages,
>
> This is very ambitious and most of the time is just an illusion :(
> The chances that other projects throw away their own formats (XML
> or not) and embrace this one are quite slim IMHO.

I'm not talking about other projects adopting anything. I'm talking
about basic capabilities of other languages, like JavaScript's builtin
support for parsing XML. That enables *us* to build things in
JavaScript.

> BTW the fact that
> everyone can parse XML doesn't mean that everyone can _use_ the
> data right away.

Nor am I saying that. I am saying that it is enormously easier to
build the JavaScript parser for the XML representation rather than the
Python one.

> One has to have an internal logic/library/API specific
> to the data represented by some particular XML document. If you take
> this into account then the value of the exchange document format
> is somewhat reduced. It is still not zero though, and IMHO it is easy to
> teach the classes proposed by Brian to produce an XML representation just
> for the mythical interchange with something :)

The need for interchange is not at all mythical. Web frontends are
exactly what we are talking about in this thread.

>> JavaScript being the hugely important player here. Certainly, you are
>
> Again, it is important to define to what degree the interoperability with
> something like JavaScript is needed. If you plan to work on/modify/execute
> the same nbs in Python and in JavaScript then you have to implement a
> compatible engine/API in Python _and_ in JavaScript. Are you sure you
> want to do that? If only the representation or "computed" notebook is
> needed for display purposes by JavaScript, then it is something different
> and could be implemented through specialized repr methods.

Or you could use the same mechanism for both instead of duplicating efforts.

>> going to have a Python API that will represent that tree of text nodes
>> as Python objects, but I just don't see the point of making the repr()
>> of that be the lingua franca format of the notebook file. It's just a
>> wasted opportunity.
>
> The point is that the nb becomes a first-class Python object - just a
> module. There is no need for a specialized parser, and you can work with it
> as with a regular Python module - just import and use it. The only
> difference is that the nb is mutable - if you modify it, you have to save it.

I really don't see why having the file format be Python code makes it
any more of a first class object. The objects are the first class
objects. As long as loading to those objects is easy, the format just
doesn't matter. Loading an object by importing is actually a very
inflexible and difficult to work with method compared to a function
call.
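
As a minimal sketch of what loading through a function call allows
(load_notebook, Notebook and Cell are hypothetical names, and the XML
layout is assumed for the example): the same function accepts a filename
or any open file-like object, so the notebook does not have to live on
sys.path or have a module-style name, and loading never executes its
contents.

```python
import io
import xml.etree.ElementTree as ET


class Cell:
    """One notebook cell holding its source text."""
    def __init__(self, source):
        self.source = source


class Notebook:
    """The first-class object, independent of how it was stored."""
    def __init__(self, cells):
        self.cells = list(cells)


def load_notebook(source):
    """Load a notebook from a filename or a file-like object."""
    root = ET.parse(source).getroot()
    return Notebook(Cell(el.text or "") for el in root.iter("cell"))


# Works on an in-memory stream just as it would on a path on disk.
stream = io.BytesIO(b"<notebook><cell>print(1)</cell></notebook>")
nb = load_notebook(stream)
```

An import-based loader would need the file to be importable by name and
would run whatever code the file contains as a side effect of loading it.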

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco
-------------- next part --------------
A non-text attachment was scrubbed...
Name: notebook.py
Type: text/x-python
Size: 1558 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/ipython-dev/attachments/20100224/ee6d2613/attachment.py>

