[IPython-dev] Notebook format "incompatible" changes

Nicholas Bollweg nick.bollweg at gmail.com
Mon Nov 3 17:51:51 EST 2014


Re: JSON-LD:

One of the goals of JSON-LD is to be able to add context to a well-defined
corpus of data without changing its native format: in this case, the
developers mean what they mean, and it's important that it be meant this
way... so the challenge is how to extract a useful semantic model of the
document out of the existing format.

The simplest approach is to use the context of:

> {
>   "@context": {
>     "@vocab": "http://ipython.org/nbformat/v4/"
>   }
> }
>

This will map all object keys, and any values that are declared to be @ids,
to URIs in that namespace: example
<http://json-ld.org/playground/index.html#startTab=tab-expanded&json-ld=https%3A%2F%2Fgist.githubusercontent.com%2Fbollwyvl%2F09f0241b45fc3a98df8b%2Fraw%2F8e8928d9ae471f23af2eaba9365d0ac1d770a4f6%2Ftest4.base.vocab.ipynb&context=%7B%22%40context%22%3A%7B%22%40base%22%3A%22http%3A%2F%2Fipython.org%2Fnbformat%2Fv4%2F%22%2C%22%40vocab%22%3A%22http%3A%2F%2Fipython.org%2Fnbformat%2Fv4%2F%22%7D%7D>
.

(the playground doesn't have a way to do out-of-band expansion contexts,
but it wouldn't have to be embedded like that)
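
To make the effect of @vocab on keys concrete, here is a rough Python sketch.
It is not the JSON-LD expansion algorithm (a real processor also wraps values,
drops nulls, and handles keywords); the helper name expand_keys and the trimmed
notebook fragment are made up for illustration.

```python
import json

VOCAB = "http://ipython.org/nbformat/v4/"


def expand_keys(node, vocab=VOCAB):
    """Prefix plain object keys with the vocabulary IRI.

    Only an approximation of what "@vocab" does to keys: JSON-LD keywords
    ("@...") and keys that already look like (compact) IRIs are left alone.
    """
    if isinstance(node, dict):
        return {
            (key if key.startswith("@") or ":" in key else vocab + key):
                expand_keys(value, vocab)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [expand_keys(item, vocab) for item in node]
    return node  # scalars are returned unchanged in this sketch


# Hypothetical, heavily trimmed notebook fragment (not a valid notebook).
notebook = {
    "nbformat": 4,
    "cells": [
        {"cell_type": "markdown", "source": "# Title"},
        {"cell_type": "code", "outputs": [{"data": {"text/html": "<b>hi</b>"}}]},
    ],
}

expanded = expand_keys(notebook)
print(json.dumps(expanded, indent=2))
```

Note that the mimebundle key text/html ends up directly under the namespace
root, and that cell_type values stay plain strings.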


A more thorough-going approach might yield some more interesting things,
but this is a good starting point, and just a few additions to the above
context would get really close to reflecting what is being said in a
given notebook.

IPython notebook nbformat v4 JSONSchema:
> https://github.com/minrk/ipython/blob/nbformat4/IPython/nbformat/v4/nbformat.v4.schema.json
>

The schema is a great start: I haven't tried it, but there are some tools
to automatically generate a context from a schema
<https://www.npmjs.org/package/schema-jsonld-context>.

Some things that might be *interesting* to map to JSON-LD:

   - patternProperties in mimebundle
      - In this case, though, it's referring to a large but agreed-upon
      enumeration of values (and not, like package.json's dependencies, an
      infinite number of package names).
      - with the above context, these would be lumped into the root of the
      namespace: http://ipython.org/nbformat/v4/text/html
         - this isn't so bad, probably
   - all the enums, e.g. cell_type: markdown
      - with the naive context above, it will map to a string
      - by setting the @type of cell_type to be @id in the context, markdown
      would expand to the URI
      http://ipython.org/nbformat/v4/markdown
      (with @base set to the same namespace, as in the example above;
      with only @vocab set, the @type to use is @vocab)
         - this isn't so bad
      - Another option is to treat enums as more xml-like literals of a
      specific type, by setting @type of cell_type to be something like
      CellType
         - also not so bad
      - The advantage to having URIs vs. literals (they are both
      queryable) is that URIs can be the subject of something, and not just the
      object... not sure what we'd want to say in this case.
   - the wild west of metadata
      - JSON-LD can't tell the difference between cell metadata and
      notebook metadata
         - this is not so bad, as it is always "isolated" within the
         context of the <thing>'s metadata, and wouldn't "pollute" the parent
      - with the naive context, everything will just fall into the main
      namespace.
         - this is bad. I don't see anything that can be done about it
   - all the lists
      - in JSON-LD, one can specify @container: @list
         - these are pretty bad in RDF, as it uses a bizarre lisp-like first
         and rest to represent them
   - nothing in the root that can map to an @id or @type
      - @id: not going there today
      - @type: nbformat is close, but there's nothing but duck typing to
      say, "I am a notebook"
      - loading these up into a graph would be *interesting*, as they would
      all just be blank nodes knocking around
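
A couple of the points above (typed enums and ordered lists) could be sketched
as a slightly richer context. This is only an illustration: the choice of
cells and outputs as the list-valued keys is mine, and as_rdf_list is a
made-up helper that just shows the first/rest shape RDF uses for collections.
I use "@type": "@vocab" here so the enum value resolves against the vocabulary
without needing an @base.

```python
import json

NS = "http://ipython.org/nbformat/v4/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Sketch of an extended context; which terms to type is an assumption.
context = {
    "@context": {
        "@vocab": NS,
        # Enum values such as "markdown" become vocabulary-relative IRIs.
        "cell_type": {"@type": "@vocab"},
        # Keep array order instead of treating arrays as unordered sets.
        "cells": {"@container": "@list"},
        "outputs": {"@container": "@list"},
    }
}


def as_rdf_list(items):
    """Nest items the way an RDF collection chains them: first/rest ... nil."""
    node = {"@id": RDF + "nil"}
    for item in reversed(items):
        node = {RDF + "first": item, RDF + "rest": node}
    return node


print(json.dumps(context, indent=2))
print(json.dumps(as_rdf_list(["cell-a", "cell-b"]), indent=2))
```

The second print shows why ordered lists are awkward in RDF: a two-item list
already needs two nested nodes plus the terminating rdf:nil.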

I'll do some more poking around, but I think this is worth having!

On Mon, Nov 3, 2014 at 4:21 AM, Wes Turner <wes.turner at gmail.com> wrote:

> Tardy to the party!
>
> IPython notebook nbformat v4 JSONSchema:
> https://github.com/minrk/ipython/blob/nbformat4/IPython/nbformat/v4/nbformat.v4.schema.json
>
>
> Metadata Documentation:
> https://github.com/minrk/ipython/blob/nbformat4/docs/source/notebook/nbformat.rst#metadata
>
>
> JSON-LD JSONSchema:
> https://github.com/json-ld/json-ld.org/blob/master/schemas/jsonld-schema.json
>
> On Mon, Nov 3, 2014 at 3:05 AM, Matthias Bussonnier <
> bussonniermatthias at gmail.com> wrote:
>
>> Hi,
>> On Nov 3, 2014, at 09:32, Wes Turner <wes.turner at gmail.com> wrote:
>>
>> Thanks!
>>
>> 1. Is there a link to the jsonschema? In the documentation?
>>
>>
>> Please see the PR. https://github.com/ipython/ipython/pull/6045
>>
>> 2. How feasible would it be to write a JSON-LD context?
>>
>>
>> This should have been discussed while we were refactoring the notebook
>> format, not once it's ready to merge. Metadata is free-format, so you can
>> add things like that if you like, though.
>>
>> 3. Where/how can/could Dublin Core metadata be added? It would be great
>> to be able to index these documents with a title and authors.
>>
>>
>> Same as above.
>>
>> The last two issues, about extra metadata, have been extensively
>> discussed; the new format does not change the discussions/problems that
>> have been raised. You can also refer to those.
>>
>> Cheers,
>> --
>> M
>>
>>
>>
>>
>>
>> https://en.wikipedia.org/wiki/Dublin_Core#DCMI_Metadata_Terms
>>
>> On Sun, Nov 2, 2014 at 5:23 PM, Fernando Perez <fperez.net at gmail.com>
>> wrote:
>>
>>>
>>> On Sat, Nov 1, 2014 at 11:01 AM, Brian Granger <ellisonbg at gmail.com>
>>> wrote:
>>>
>>>> And thanks to the whole team (esp Min) for working on all of this!
>>>>
>>>
>>> +lots. This has been a huge amount of slow, careful, not-very-sexy
>>> work.  I am really thankful for the patience the whole team has had working
>>> on this, to help us set the notebook format as a solid foundation for
>>> long-term sharing and archival of computational work.
>>>
>>> This kind of effort provides few immediate rewards, but can have very
>>> significant long-term value. Thanks a lot to everyone who pitched in on
>>> that PR...
>>>
>>> Cheers,
>>>
>>> f
>>>
>>>
>>> --
>>> Fernando Perez (@fperez_org; http://fperez.org)
>>> fperez.net-at-gmail: mailing lists only (I ignore this when swamped!)
>>> fernando.perez-at-berkeley: contact me here for any direct mail
>>>
>>> _______________________________________________
>>> IPython-dev mailing list
>>> IPython-dev at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/ipython-dev
>>>
>>>
>>
>>
>> --
>> Wes Turner
>> https://westurner.github.io/
>
>
> --
> Wes Turner
> https://westurner.github.io/
>