<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Hi all, many responses in one mail.<div><br></div><div><br><div><div>On 3 Jul 2013, at 11:48, Russell Neches wrote:</div><div><br></div><blockquote type="cite"><div>In response to Matthias' points...<br><br><blockquote type="cite">Right now I thing storing the notebook as a directed graph is<br></blockquote><blockquote type="cite">problematic in a few way, the first being that it is incompatible with<br></blockquote><blockquote type="cite">the fact that people want to be able to run notebook in a headless<br></blockquote><blockquote type="cite">manner, which if you add explicit choice is not possible. <br></blockquote><br>If by "headless" execution you mean converting the notebook into a<br>regular .py script, I think the directed graph model isn't a problem. In<br>fact, it actually elegantly solves a number of thorny problems involved<br>with transforming a notebook into a script.<br></div></blockquote><div><br></div><div>Not exactly: I mean stripping cells of their output and regenerating the output</div><div>without an open browser. But if you say the notebook keeps a reference to the selected choice,</div><div>then this is not really a problem.</div><div><br></div><blockquote type="cite"><div>…<br></div></blockquote><br><blockquote type="cite"><div>Gabe is proposing to allow (but not force!) the user to build non-linear<br>code paths into their notebooks. This way, it is possible to SPECIFY a<br>PARTICULAR path through the code cells, and then output it in a linear<br>form. <br><br>The example he offered is a wonderful illustration of why this is<br>awesome. There are eighteen possible .py files (or PDFs, or HTML files,<br>etc etc) one could generate, and the user simply needs to choose one. 
If<br>the user wants to proceed to make another script representing a<br>different path, they can just activate one of the alternative paths and<br>output another script. This is extremely powerful. The user could<br>perhaps elect to generate all possible linear scripts for a given<br>collection of alternatives, and then dispatch them to different nodes to<br>compare and contrast the results. <br><br></div></blockquote><div><br></div><div>I understand what you want to do, but first, IMHO this is too complicated.</div><div>Users are already confused by the fact that code cells can be executed out of order. </div><div>I also think this is close to a discussion we had about the %with magic, </div><div>in the sense that most of the use cases could be solved by using features of </div><div>the underlying programming language. In this case, probably an if statement. </div><div><br></div><br><blockquote type="cite"><div>There is even a sensible default behavior; simply linearize according to<br>the current selection state of the task cell. The ability to swap around<br>chunks of a workflow and then output linearized scripts makes the<br>concept of "headless" execution vastly more powerful and interesting. If<br>you've ever had to work with things like Sun Grid Engine (a lot of us<br>scientists are pretty much stuck with it), this would be a Life Changing<br>Ability.</div></blockquote><div><br></div><blockquote type="cite"><div>There is also the matter of incompatible assumptions. I often create<br>notebooks that begin with a bunch of code cells, each of which loads a<br>different data set and sets parameters related to that particular data<br>set. When I use these notebooks, I execute ONE of these, skip the rest,<br>and then proceed to the actual analysis. At the moment there is NO WAY<br>to correctly output this kind of notebook as a script without modifying<br>it. 
<br></div></blockquote><div><br></div><div><div>I get that, and I myself would really like to execute notebooks in a headless</div><div>manner to generate a report based on input data. That still has to be done; </div><div>even if it is not that hard, it has to be designed.</div><div><br></div><div>It does not take more than 50 lines to do the basics[1]: you just have to start the</div><div>kernel yourself before evaluating the notebook, and execute the data loading </div><div>before evaluating each cell of the notebook. You can even dynamically inject</div><div> the code cells you would like to run, and save the final notebook.</div><div><br></div></div><blockquote type="cite"><div><font class="Apple-style-span" color="#000000"><br></font>The directed graph model makes this problem simply go away. I can stick<br>these incompatible alternatives into a task, and just pick one of them. <br><br>Again, nothing forces me to use these features. As Gabe pointed out, a<br>linear document is a subset of a directed graph. It should be possible<br>to load old notebooks as (rather dull) directed graphs without making a<br>giant mess of the JavaScript.<br></div></blockquote><div><br></div><div>From a theoretical point of view I agree. Nonetheless, inserting, searching and other</div><div>common operations would rapidly become rather difficult, and even if the cost were</div><div>low here, this would mean that any software that wants to work with ipynb</div><div>would have to support directed graphs.</div><div><br></div><blockquote type="cite"><div><br><blockquote type="cite">This also contradict the fact that the notebook capture both the input<br></blockquote><blockquote type="cite">and the output of the computation. As you showed there is actually 18<br></blockquote><blockquote type="cite">different combinations of data analysis, and they are not all stored<br></blockquote><blockquote type="cite">in the notebook. 
<br></blockquote><br>I haven't dug into Gabe's code, but this doesn't seem to be a problem. A<br>task cell has ONE input, ONE output, and at any given time, ONE selected<br>execution pathway. From the outside, it works just like a regular code<br>cell. It's just got some private state information about which execution<br>pathway is currently active.<br><br><blockquote type="cite">To sum up, I don't think the current JS client is in it's current<br></blockquote><blockquote type="cite">state the place to implement such an idea. The Dag for cell order<br></blockquote><blockquote type="cite">might be an idea for future notebook format but need to be well<br></blockquote><blockquote type="cite">thought, and wait for cell IDs.<br></blockquote><blockquote type="cite"><br></blockquote>You mean the JS client was PREVIOUSLY not in a sate to implement such an<br>idea, and so Gabe fixed it. Hooray! ;-)<br><br><br>Russell<br></div></blockquote><br></div><div><br></div><div><br></div><div><div>On 4 Jul 2013, at 03:04, Gabriel Becker wrote:</div><div><br></div><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex; position: static; z-index: auto; "><div style="word-wrap: break-word; "><div><div></div><div>This added to the fact that each cell can support arbitrary metadata, you</div><div>should be able to arrange preexisting in structure that work together. 
It might</div><div>be a little difficult to do it right now as our javascript is not yet modular</div><div>enough to be easily reused, but we are moving toward it.</div></div></div></blockquote><div><br></div><div>Respectfully, rolling my own frontend for ipynb files given all the work the IPython team</div></div></div></div></blockquote><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> has done on the excellent notebook browser interface would be an enormous and extremely</div></div></div></div></blockquote><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> wasteful duplication of effort. I don't think its the right way to pursue these features.<br></div></div></div></div></blockquote><div><br></div><div>With the current architecture, I agree, but in the end you should be able to include only one JavaScript file,</div><div>and the rest should be pulled in with require.js, so you would just need to override what you need. </div><div>In a perfect world the notebook would just be a JS library you can use, so you wouldn't have to patch what we do,</div><div>but instead pass a list of (extra) cell types you want to support, and maybe custom read methods for the cells the core</div><div>doesn't know about. 
I'm not sure how far we would support that, but it should be pretty easy to build a custom format on top</div><div>of ipynb.</div><br><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>Furthermore, if I were going to write an application offering the types of features I am talking about from scratch,</div></div></div></div></blockquote><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> there wouldn't be any good reason to base it on the unaltered ipynb format, as they don't easily support the structure</div></div></div></div></blockquote><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> required by those features without the additional cell types I implemented in my forked version. <br><br></div><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex; position: static; z-index: auto; "><div style="word-wrap: break-word; "><div><div><br></div><div>Right now I thing storing the notebook as a directed graph is problematic in a</div><div>few way,</div></div></div></blockquote><div><br></div><div>I'm not talking about storing the notebook as an actual directed graph data structure. 
There would be benefits</div></div></div></div></blockquote><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> to that but its not necessary and it isn't want I did in my forked version.<br><br></div><div>The ability to have nested cells (cells which contain other cells) gets us everything we need structure wise,</div></div></div></div></blockquote><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> and is the basis of everything seen in both the video (other than interactive code cell stuff) and screenshots</div></div></div></div></blockquote><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> I posted. The ipynb file for the notebook pictured in the screenshot looks exactly like a normal ipynb file</div></div></div></div></blockquote><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> except that in the json there are cell declarations which have a cells field which contains the json descriptions</div></div></div></div></blockquote><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> of the cells contained in that cell.<br></div></div></div></div></blockquote><div><br></div><div>This is more or less what I called storing as a DAG (or more like a tree here, I guess). This looks a lot like what </div><div>we had with worksheets, and we are moving away from that data structure because of the complexity of handling some cases.</div><br><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>…</div></div></div></div></blockquote><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br></div><div>I apologize for not being clear. 
As I said in a response above, the directed graph idea was intended to be conceptual</div></div></div></div></blockquote><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> for thinking about the documents, not structural for actually storing them.<br></div></div></div></div></blockquote><div><br></div><div>I don't think the two are unrelated. Thinking about and storing documents as graphs could make sense.</div><br><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>What I actually did was simply allow cell nesting and change indexing so that it is with respect to the parent/container</div></div></div></div></blockquote><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> (cell or notebook) instead of always with respect to the notebook. This required some machinery changes but not too</div></div></div></div></blockquote><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> many and it is an extension in the mathematical sense in that indexing will behave identically to the old system for</div></div></div></div></blockquote><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> notebooks without any nesting while now meaningfully functioning for notebooks with nesting.<br></div></div></div></div></blockquote><div><br></div><div>I'm still curious about that, and would be a little worried about how you handle things in the UI.</div><div><br></div><div><br></div><div><div>On 4 Jul
2013, at 03:59, Brian Granger wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div>Gabriel,<br>...<br><br>Second, while it is tempting to generalize the notion of input to<br>include widgety things, it is more appropriate to put these things in<br>the output:<br></div></blockquote><div><br></div><br><blockquote type="cite"><div>* Putting widgets in the input area forces you to do regular<br>expression matching to replace those variables in the code. This<br>limits you to an extremely simple event model where the only possible<br>event you can know about is substitute the regular expression and run<br>all the code. What if you want different UI controls in the browser<br>to trigger different bits of code in the kernels when different fine<br>grained events happen. Making the UI controls live on the Python and<br>JS side allows us to build this in a natural way.<br></div></blockquote><div><br></div><div>This is one place where I sometimes disagree with Brian: I think</div><div>input widgets for CodeMirror would be great. Compared with Gabriel's 'interactive'</div><div>code cell, I would be more inclined to provide the ability to bind widgets to CodeMirror,</div><div>like in <a href="http://livecoding.io/3419309">http://livecoding.io/3419309</a>.</div><div>Through regular expressions, it binds to any variable in the code cell and pops up a widget to change the value;</div><div>you don't have to explicitly state which variable should be bound. </div><div>As an implementation detail, CM provides a method to get the token at the cursor, which helps a lot…</div><div>It also has the advantage of working without changing the cell type.</div><div><br></div><br><blockquote type="cite"><div>Th alt-cells you show bring up the issue of providence. We have some<br>very initial thoughts about that, but it is way out of scope for the<br>project right now - we have a plates 10x overfull already. 
We will<br>get there though eventually.<br></div></blockquote><div><br></div><div>Personally I'm torn about alt cells. I feel the same thing should be done with functions and conditionals in the underlying language, </div><div>but they have a definite advantage for teaching and exploring.</div><br><blockquote type="cite"><div>Thanks for sharing your ideas.<br></div></blockquote></div></div><br></div><div><br></div><div>Final thoughts:</div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>Seeing interactive widgets is definitely awesome, and they look fantastic.</div><div>I think we can avoid having a specific cell type for that altogether,</div><div>and IMHO input methods (like livecoding.io) and interactive widgets are </div><div>two complementary approaches. I think Gabriel's way of adding specific widgets</div><div>that are bound to specific lines could also be done without a specific cell type, </div><div>using metadata. I can't wait to be able to select my matplotlib colors in a color picker</div><div>directly in CM, for example.</div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>The task cells are nice, but I think they should be covered by a different mechanism.</div><div>As Brian pointed out, we are thinking of implicit grouping based on headers.</div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>As I already said, I'm torn about alt cells. The UI is nice for teaching, but I think it covers</div><div>too small a use case. In particular, to change the path you use, you need user interaction, </div><div>or a specific tool that runs the notebook by selecting a path. </div><div>This, IMHO, goes against modularity. 
I am seeing this these weeks because I'm polishing a notebook</div><div>for a publication: alt cells would definitely have been useful for testing, but now I'm cursing myself for not having</div><div>written functions I could have reused, and added if statements to select the case.</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>So I would be inclined toward a semi-linear approach where you write the functions you need, </div><div>and the path is selected using conditionals in Python, why not with a radio-selector UI that sets the value of a variable</div><div>which determines the subsequent path, but without actually selecting a cell at the ipynb level. This also has the advantage of being </div><div>compatible with pure Python, without having to generate multiple scripts.</div><div><br></div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>I'm still really impressed by the work, and I really think this is a good start for more discussion, and a nice starting point</div><div>for designing things we will add later.</div><div><br></div><div>Sorry for the length, and if I missed anything.</div><div>-- </div><div>Matthias</div><div><br></div><div><br></div><div><br></div></body></html>