[Neuroimaging] JSON-LD and DICOM?
matthew.brett at gmail.com
Mon Jul 3 15:58:34 EDT 2017
On Mon, Jul 3, 2017 at 6:58 PM, Satrajit Ghosh <satra at mit.edu> wrote:
> hi all,
> sorry, ohbm was busy, just catching up with this! it seems there are a few
> threads in here that i will attempt to summarize and comment on.
> 1. the original question: is there a json-ld context that can be used or
> included in a nifti-header extension.
> this can be easily created based on both the work done by david clunie to
> expose each dicom term and the extensive curation by karl helmer. we
> can create a simple json-ld context definition that behaves as a lexicon.
> there are pieces of dicom where this will get complicated, but for our needs
> a vocabulary may be a good starting point.
> here are the terms in neurolex: http://neurolex.org/wiki/Category:DICOM_term
> they are being transitioned off to scicrunch/interlex and i'm sure karl and
> i can put together a basic context for us.
That would be very useful indeed. Any idea of a time scale for that?
> 2. nibabel addons and the metadata extension in nifti.
> for those who are unfamiliar, we have been using a non-standardized
> extension based on dcmstack in nifti for several years now in the heudiconv
> tool. there is an opportunity to make this part of nibabel and create a
> standard extension. as with many extensions, most software tools may choose
> to ignore an extension, but the value of this extension to keep dicom
> metadata around with the raw converted nifti file is immense. currently, we
> simply discard this information (wait till point 3 for the dicom-nifti
> dimension). as we create this standard, it would be good to leverage json-ld
> to simply point to a context file such as this:
> "@context": "http://nipy.org/nibabel/dicom-context.jsonld"
> we don't have to expand this out in each embedded header.
Right - or, as you noted in the current JSON header draft, we can make
``dcm`` a prefix with its own context, within the context for the JSON
where jhe-context.jsonld contains:
and the JSON header goes something like:
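A rough sketch of the prefix idea in Python, for illustration only. The
``dcm`` namespace IRI, the context contents, and the header fields below are
made-up placeholders, not the actual draft, and the expansion helper handles
only flat keys with simple string prefixes; it is not a JSON-LD processor.

```python
import json

# Hypothetical contents of a context file; the namespace IRI is a placeholder.
JHE_CONTEXT = {
    "@context": {
        "dcm": "http://example.org/dicom-terms#",
    }
}

# Hypothetical JSON header using the ``dcm`` prefix for DICOM-derived fields.
header = {
    "@context": JHE_CONTEXT["@context"],
    "dcm:EchoTime": 30.0,
    "dcm:RepetitionTime": 2000.0,
}


def expand_keys(doc):
    """Expand ``prefix:term`` keys using the prefixes in ``doc['@context']``.

    Only flat documents with simple string prefixes are handled.
    """
    prefixes = doc.get("@context", {})
    expanded = {}
    for key, value in doc.items():
        if key == "@context":
            continue
        prefix, sep, term = key.partition(":")
        if sep and prefix in prefixes:
            # Replace the short prefix with the full namespace IRI.
            key = prefixes[prefix] + term
        expanded[key] = value
    return expanded


if __name__ == "__main__":
    print(json.dumps(expand_keys(header), indent=2))
```

The point of the prefix is that each embedded header stays short, while a
reader who cares can still resolve every ``dcm:`` key to a full term.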
> 3. the dicom-nifti dimension:
> a. state of the field.
> this dicom-nifti dimension reflects the reality of our field in many ways.
> most of us neuroimagers live in a research/exploratory space and mostly do
> not have any clinical applications that need to be embedded into hospital
> systems. the clinical imaging community is trying to make their algorithms
> work for clinical decision systems in the clinical enterprise, hence their
> need for dicom operators. much of cognitive neuroscience is not applicable
> to the clinic, hence there is very little incentive for people to think
> about dicoms.
> b. the variations in dicom and nifti
> as nate noted there are some big differences in scanners as they apply to
> research institutions. trying to standardize dicom output across scanners is
> itself a big undertaking and not in the interest of many centers. i'm not
> even talking about metadata standardization here, i'm simply saying let all
> dicom scanners output multi-frame dicom. if this is something the community
> can achieve it would be a big step towards a common framework. however, if
> it requires every center to change their mode of operation, it's a huge
> barrier at present. nifti on the other hand is a compact format and fits
> easily into current filesystem views.
> c. software support
> as has been well noted in this thread, the brain imaging community for the
> most part relies on a set of software packages that support nifti extensively.
> updating these tools to support dicom i/o is a resource intensive
> undertaking. if magically, through a week long hackathon, every software
> package supported dicom objects, i don't think we would be having this
> conversation.
> in addition better dicom support in nibabel could be very useful to a subset
> of the community developing tools in python. for example, from a memory
> representation perspective, it doesn't matter what the disk file format is
> as long as there is a nice api to read it.
> we view dicom through the same lens that we saw it through in the nineties.
> perhaps we can be educated on the diffs in the last 20 years.
> d. metadata maintenance
> as an algorithm developer, one would have to decide what metadata to keep
> and what new pieces of metadata to add to the dicom object. i know andrey,
> steve, and others have done this for segmentation objects and structured
> reports, but the field would have to do this for connectomes, surfaces, and
> blindly copying dicom metadata is analogous to blindly copying nifti header
> extensions. so in both spaces, one has to decide what to keep and what to
> modify. while we can be careful about this for new algorithms, doing so for
> the existing ones is a lot of work.
> e. a view of information that reduces cognitive load
> as algorithm developers we care about the view, the minimal set of
> information that is needed to write a function/solve a problem. nifti-1 was
> an agglomeration of those views when it was created, together with some
> backward compatibility decisions with analyze. people were not thinking of
> large databases, diffusion imaging, and other areas that we now consider
> important. and hence nifti is a view of the underlying information that is
> already out of date. yes the extensions were part of the solution, but how
> many people use the diffusion extension over bvecs and bval files (a la
I think that is partly because of the relative opacity of the
extension formats. Even a small binary format needs some software in
your language to unpack it in a reliable way. Because it takes this
work to look at the information, people often don't bother to check if
the information is what they want, and just throw it away. That's the
big advantage of text files and text formats - people can just open the
file in a text editor to see whether the information is useful.
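To make the contrast concrete, here is a minimal sketch of what unpacking
even a simple binary extension involves. It follows the NIfTI-1 extension
layout (an int32 size that includes its own 8 header bytes and is a multiple
of 16, an int32 code, then the data). Little-endian byte order, the JSON
payload, and the function names are assumptions for the example; a real
reader would also need to locate the extensions in the file and handle
byte-swapped headers.

```python
import json
import struct

EXAMPLE_ECODE = 6  # treated as an arbitrary extension code in this sketch


def pack_extension(ecode, payload):
    """Pack one extension: int32 esize, int32 ecode, then NUL-padded data."""
    esize = 8 + len(payload)
    esize += -esize % 16  # round up to a multiple of 16
    return struct.pack("<ii", esize, ecode) + payload.ljust(esize - 8, b"\x00")


def unpack_extensions(raw):
    """Yield (ecode, data) pairs from a run of packed extensions."""
    offset = 0
    while offset + 8 <= len(raw):
        esize, ecode = struct.unpack_from("<ii", raw, offset)
        if esize < 16 or esize % 16:
            raise ValueError("bad extension size: %d" % esize)
        # The data follows the two int32 fields and runs to the end of esize.
        yield ecode, raw[offset + 8 : offset + esize]
        offset += esize


if __name__ == "__main__":
    payload = json.dumps({"EchoTime": 30.0}).encode("utf-8")
    raw = pack_extension(EXAMPLE_ECODE, payload)
    for ecode, data in unpack_extensions(raw):
        # Strip the NUL padding before decoding the text payload.
        print(ecode, json.loads(data.rstrip(b"\x00").decode("utf-8")))
```

None of this is hard, but it is code someone has to write and trust before
they can even look at the metadata, whereas a bvals file opens in any editor.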
> the dicom object stores much more information, but it is also a view. it
> does not store the raw sensor data (think nikon RAW vs JPEG) in most cases,
> because people thought it was excessive. as we now seem to find with
> simultaneous multislice sequences in fMRI and dMRI, the reconstruction
> algorithm has a huge impact on the SNR of the combined channel data. hence
> more people are preserving k-space data in projects that use such sequences.
> at some point neither dicom nor nifti will be the appropriate format. we are
> not there yet but there are many pointers in that direction as connected
> information aggregates (genetics, imaging, behavior, ehr, etc). at present,
> in our resource constrained development environments, perhaps we can
> preserve information and make it useful when we can.
Thanks for the useful summary,