The IO library and image file formats -- compare with PIL
Hi, Could someone here please summarize the plans for the io library? I saw that parts of it are going into numpy already, and more stuff is already in SciPy, IIRC. What are the status and plans regarding image formats like TIFF? Are you guys planning to duplicate the efforts of the Python Imaging Library (PIL)? Or can you just copy-paste some of the code? The reason I'm asking now is that I just submitted a patch to PIL that adds writing capabilities for MultiPage TIFF files. I did not get any response on the PIL list, and I'm actually still not clear on their handling of free vs. commercial. The commercial PIL license sounds like you have to pay $2000 to get access to svn, i.e. the current development version of PIL. Anyone here who could comment on this? Thanks, Sebastian Haase
On Fri, Apr 18, 2008 at 3:36 AM, Sebastian Haase <haase@msg.ucsf.edu> wrote:
Hi,
Could someone here please summarize the plans for the io library !?
I saw that there are parts of it going into numpy already. And more stuff is already in SciPy -- IIRC.
What are the status and plans regarding image formats like TIFF?
There are no such plans.
Are you guys planning to duplicate the efforts of the Python Imaging Library ( PIL ) ?
No.
Or can you just copy-paste some of the code ?
We don't intend to, no.
The reason I'm asking now, is that I just submitted a patch to PIL that adds writing capabilities for MultiPage TIFF files. I did not get any response on the PIL list, and I'm actually still not clear on their handling of free vs. commercial. The commercial PIL license sounds like you have to pay $2000 to get access to svn, i.e. the current development version of PIL. Anyone here who could comment on this ?
I doubt it. Fredrik Lundh is the person you have to ask about such things. You might try emailing him personally. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
On 18/04/2008, Sebastian Haase <haase@msg.ucsf.edu> wrote:
The reason I'm asking now, is that I just submitted a patch to PIL that adds writing capabilities for MultiPage TIFF files. I did not get any response on the PIL list,
If you *do* figure out how to get their attention, please let me know (maybe I should try the route Robert suggested). Zachary Pincus (IIRC) and I both have patches we'd like to have applied. We mailed them to the Image SIG, without any response. Regards Stéfan
On Fri, Apr 18, 2008 at 11:18 AM, Stéfan van der Walt <stefan@sun.ac.za> wrote:
On 18/04/2008, Sebastian Haase <haase@msg.ucsf.edu> wrote:
The reason I'm asking now, is that I just submitted a patch to PIL that adds writing capabilities for MultiPage TIFF files. I did not get any response on the PIL list,
If you *do* figure out how to get their attention, please let me know (maybe I should try the route Robert suggested). Zachary Pincus (IIRC) and I both have patches we'd like to have applied. We mailed them to the Image SIG, without any response.
Regards Stéfan
Ultimately we have to consider a fork of PIL. Do you guys know if this is allowed per the PIL license? Of course this would be super suboptimal, but then, it's effectively what I have right now -- and you have your own version... To summarize: numpy can probably already do many of the things PIL does, and do them better -- I'm talking about all the transformation stuff, of course. So only the file IO would have to get forked out -- to scipy, for example ;-) Cheers -Sebastian
Ultimately we have to consider a fork of PIL. Do you guys know if this is allowed per the PIL license?
Of course this would be super suboptimal, but then, it's effectively what I have right now -- and you have your own version...
To summarize: numpy can probably already do many of the things PIL does, and do them better -- I'm talking about all the transformation stuff, of course. So only the file IO would have to get forked out -- to scipy, for example ;-)
I have my own "internal fork" of PIL that I've been calling "PIL-lite". I tore out everything except the file IO, and I fixed that to handle 16-bit files correctly on all endian machines, and to have a more robust array interface. IIRC, PIL is BSD-licensed (or BSD-compatible), so the fork should be OK to re-distribute. Now, part of the reason that we may have heard nothing about the PIL patches we've submitted variously is that I understand that they're doing a big re-write of PIL, and in particular, its memory handling, that should address these sorts of issues. However, we all know how well "big rewrites" go... If people wanted to make a proper "fork" of PIL into a numpy-compatible image IO layer, I would be all for that. I'd be happy to donate "PIL-lite" as a starting point. Now, the file IO in PIL is a bit circuitous -- files are initially read by pure-Python code that determines the file type, etc. This information is then passed to (brittle and ugly) C code to unpack and swizzle the bits as necessary, and pack them into the PIL structs in memory. I think that basically all of what PIL does, bit-twiddling-wise, could be done with numpy. So really, what's needed is to take the pure-Python "file format reading" functionality from PIL (with my modifications thereof to handle 16-bit files better, and Stéfan and Sebastian's modifications for other functionality, etc), and then attach it to a layer that uses Python and numpy to actually read the bits out of the files and directly into numpy arrays. I've been meaning to do this for a while, but just haven't gotten around to it. I think it will be a surprisingly small amount of code needed around PIL's python file format readers. Zach
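As an illustration of the 16-bit, any-endianness handling mentioned above, here is a minimal sketch of how numpy alone can interpret raw pixel bytes in a stated byte order. The function name `unpack_uint16` and its parameters are hypothetical, not part of PIL or "PIL-lite"; it assumes the file header has already told us the byte order and image shape.

```python
import numpy as np

def unpack_uint16(raw, shape, big_endian):
    """Interpret raw bytes as 16-bit unsigned pixels stored in the given byte order."""
    # ">u2" is big-endian uint16, "<u2" is little-endian uint16.
    dtype = np.dtype(">u2" if big_endian else "<u2")
    pixels = np.frombuffer(raw, dtype=dtype).reshape(shape)
    # Convert to the machine's native byte order so later code need not care.
    return pixels.astype(np.uint16)

# Two big-endian pixels: 0x0100 (256) and 0x00FF (255).
raw_be = bytes([0x01, 0x00, 0x00, 0xFF])
image = unpack_uint16(raw_be, (1, 2), big_endian=True)
```

The same bytes read with `big_endian=False` would give different values, which is why the byte order has to come from the file header rather than from the host machine.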
On 18/04/2008, Zachary Pincus <zachary.pincus@yale.edu> wrote:
I have my own "internal fork" of PIL that I've been calling "PIL-lite". I tore out everything except the file IO, and I fixed that to handle 16-bit files correctly on all endian machines, and to have a more robust array interface.
If people wanted to make a proper "fork" of PIL into a numpy-compatible image IO layer, I would be all for that. I'd be happy to donate "PIL-lite" as a starting point. Now, the file IO in PIL is a bit circuitous -- files are initially read by pure-Python code that determines the file type, etc. This information is then passed to (brittle and ugly) C code to unpack and swizzle the bits as necessary, and pack them into the PIL structs in memory.
I would really try and avoid the forking route, if we could. Each extra dependency (i.e. libpng, libjpeg etc.) is a potential build problem, and PIL already comes packaged everywhere. My changes can easily be included in SciPy, rather than in PIL. Could we do the same for yours? Then we could rather build scipy.image (Travis' and Robert's colour-space codes can be incorporated there, as well?) on top of the PIL. I'm really unhappy about the current state of ndimage. It's written in (Python API) C, so no one wants to touch the code. Much of it can be rewritten in equivalent pure Python, using modern NumPy constructs that weren't available to Peter. What we really need is to get knowledgeable people together for a week and hack on this (ndimage is an extremely useful module!), but I don't know when we're going to have that chance. Who fancies a visit to South Africa? :) Cheers Stéfan
On Apr 20, 2008, at 1:42 PM, Stéfan van der Walt wrote:
On 18/04/2008, Zachary Pincus <zachary.pincus@yale.edu> wrote:
I have my own "internal fork" of PIL that I've been calling "PIL-lite". I tore out everything except the file IO, and I fixed that to handle 16-bit files correctly on all endian machines, and to have a more robust array interface.
If people wanted to make a proper "fork" of PIL into a numpy-compatible image IO layer, I would be all for that. I'd be happy to donate "PIL-lite" as a starting point. Now, the file IO in PIL is a bit circuitous -- files are initially read by pure-Python code that determines the file type, etc. This information is then passed to (brittle and ugly) C code to unpack and swizzle the bits as necessary, and pack them into the PIL structs in memory.
I would really try and avoid the forking route, if we could. Each extra dependency (i.e. libpng, libjpeg etc.) is a potential build problem, and PIL already comes packaged everywhere. My changes can easily be included in SciPy, rather than in PIL. Could we do the same for yours? Then we could rather build scipy.image (Travis' and Robert's colour-space codes can be incorporated there, as well?) on top of the PIL.
Nothing should be built "on top" of PIL, or any other image IO library, IMO. Just build things to work with numpy arrays (or things that have an array interface, so can be converted by numpy), and let the user decide what package is best for getting bits into and out of files on disk. Any explicit PIL dependencies should be really discouraged, because of that library's continued unsuitability for dealing with scientifically-relevant file formats and data types. As to the problems with PIL that I've addressed (and several others), these are deep-seated issues that won't be fixed without a major overhaul. My thought was thus to take the pure-python file-sniffing part of PIL and marry it to numpy tools for taking in byte sequences and interpreting them as necessary. This would have no library dependencies, and really wouldn't be a "fork" of PIL so much as using a small amount of non-broken PIL file-format-reading code that's there and abandoning the awkward/broken byte-IO and memory-model. I can't promise I have any time to work on this -- but I'll look into it, maybe -- and if anyone else wants to look into it as well, I'm happy to provide some code to start with.
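To make the "work with anything that has an array interface" point concrete, here is a small sketch. `RawFrame` is a hypothetical stand-in for whatever object an image IO package might hand back, and `invert` is an illustrative processing routine; neither comes from PIL, numpy, or SciPy. The routine only calls `np.asarray`, so it does not care which package read the file.

```python
import numpy as np

class RawFrame:
    """Hypothetical image container that exposes only the array interface."""

    def __init__(self, data):
        self._data = np.asarray(data, dtype=np.uint16)

    @property
    def __array_interface__(self):
        # Delegate to the wrapped array; numpy reads shape, dtype and data pointer from this.
        return self._data.__array_interface__

def invert(image):
    """Invert an unsigned-integer image; accepts anything numpy can convert."""
    pixels = np.asarray(image)
    return np.iinfo(pixels.dtype).max - pixels

frame = RawFrame([[0, 65535], [1, 2]])
inverted = invert(frame)
```

With this split, swapping the reader (PIL, a pure-Python reader, or something else) requires no change to the processing code.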
I'm really unhappy about the current state of ndimage. It's written in (Python API) C, so no one wants to touch the code. Much of it can be rewritten in equivalent pure Python, using modern NumPy constructs that weren't available to Peter. What we really need is to get knowledgeable people together for a week and hack on this (ndimage is an extremely useful module!), but I don't know when we're going to have that chance. Who fancies a visit to South Africa? :)
A major difficulty with ndimage, beyond the hairy C-code, is the spline-interpolation model that nearly everything is built on. While it's technically a nice infrastructure, it's quite dissimilar from what a lot of people (well, especially me) are used to with regard to how image resampling systems are generally constructed. So that makes it a lot harder to hack on or track down and fix bugs. I don't really have a good suggestion for addressing this, though, because the spline model is really quite nice when it works. Zach
On 21/04/2008, Zachary Pincus <zachary.pincus@yale.edu> wrote:
I would really try and avoid the forking route, if we could. Each extra dependency (i.e. libpng, libjpeg etc.) is a potential build problem, and PIL already comes packaged everywhere. My changes can easily be included in SciPy, rather than in PIL. Could we do the same for yours? Then we could rather build scipy.image (Travis' and Robert's colour-space codes can be incorporated there, as well?) on top of the PIL.
Nothing should be built "on top" of PIL, or any other image IO library, IMO. Just build things to work with numpy arrays (or things that have an array interface, so can be converted by numpy), and let the user decide what package is best for getting bits into and out of files on disk. Any explicit PIL dependencies should be really discouraged, because of that library's continued unsuitability for dealing with scientifically-relevant file formats and data types.
I agree with you, but we still need to provide the user with an easy way to access images on disk (SciPy comes pretty much batteries included).
My thought was thus to take the pure-python file-sniffing part of PIL and marry it to numpy tools for taking in byte sequences and interpreting them as necessary. This would have no library dependencies, and really wouldn't be a "fork" of PIL so much as using a small amount of non-broken PIL file-format-reading code that's there and abandoning the awkward/broken byte-IO and memory-model. I can't promise I have any time to work on this -- but I'll look into it, maybe -- and if anyone else wants to look into it as well, I'm happy to provide some code to start with.
How would this code be used in practice? I'm just trying to form a mental image of how the parts fit together.
A major difficulty with ndimage, beyond the hairy C-code, is the spline-interpolation model that nearly everything is built on. While it's technically a nice infrastructure, it's quite dissimilar from what a lot of people (well, especially me) are used to with regard to how image resampling systems are generally constructed. So that makes it a lot harder to hack on or track down and fix bugs. I don't really have a good suggestion for addressing this, though, because the spline model is really quite nice when it works.
I have the article on which this is based (I think?). It is "Splines: A Perfect Fit for Signal/Image Processing" by Michael Unser (http://citeseer.ist.psu.edu/unser99splines.html) The irony is that we have something like three or four spline implementations in SciPy! We should re-factor that into a standard location, but as you can imagine it is no small task. Regards Stéfan
On Mon, Apr 21, 2008 at 9:38 AM, Stéfan van der Walt <stefan@sun.ac.za> wrote:
On 21/04/2008, Zachary Pincus <zachary.pincus@yale.edu> wrote:
I would really try and avoid the forking route, if we could. Each extra dependency (i.e. libpng, libjpeg etc.) is a potential build problem, and PIL already comes packaged everywhere. My changes can easily be included in SciPy, rather than in PIL. Could we do the same for yours? Then we could rather build scipy.image (Travis' and Robert's colour-space codes can be incorporated there, as well?) on top of the PIL.
Nothing should be built "on top" of PIL, or any other image IO library, IMO. Just build things to work with numpy arrays (or things that have an array interface, so can be converted by numpy), and let the user decide what package is best for getting bits into and out of files on disk. Any explicit PIL dependencies should be really discouraged, because of that library's continued unsuitability for dealing with scientifically-relevant file formats and data types.
I agree with you, but we still need to provide the user with an easy way to access images on disk (SciPy comes pretty much batteries included).
My thought was thus to take the pure-python file-sniffing part of PIL and marry it to numpy tools for taking in byte sequences and interpreting them as necessary. This would have no library dependencies, and really wouldn't be a "fork" of PIL so much as using a small amount of non-broken PIL file-format-reading code that's there and abandoning the awkward/broken byte-IO and memory-model. I can't promise I have any time to work on this -- but I'll look into it, maybe -- and if anyone else wants to look into it as well, I'm happy to provide some code to start with.
How would this code be used in practice? I'm just trying to form a mental image of how the parts fit together.
A major difficulty with ndimage, beyond the hairy C-code, is the spline-interpolation model that nearly everything is built on. While it's technically a nice infrastructure, it's quite dissimilar from what a lot of people (well, especially me) are used to with regard to how image resampling systems are generally constructed. So that makes it a lot harder to hack on or track down and fix bugs. I don't really have a good suggestion for addressing this, though, because the spline model is really quite nice when it works.
I have the article on which this is based (I think?). It is
"Splines: A Perfect Fit for Signal/Image Processing" by Michael Unser (http://citeseer.ist.psu.edu/unser99splines.html)
The irony is that we have something like three or four spline implementations in SciPy! We should re-factor that into a standard location, but as you can imagine it is no small task.
Regards Stéfan
Hi all, please don't discuss ndimage and image file-IO (a.k.a. PIL) in the same thread!!! The "image" in "ndimage" has nothing (!!) to do with jpeg or tiff -- you might know this... So, I summarize from the recent discussion here that PIL could be divided into five parts:
a) file format handling based on external libs such as libjpeg, libpng (not libtiff, I think, please confirm!!)
b) file format handling based on PIL's python code
c) image processing, such as contrast change, pixelwise mapping, transformations like rotation, ...
d) image drawing, like adding text into an image
e) image display
I like some of PIL's "d" features. I don't use "e" at all (I have written my own OpenGL based 2d-section viewer [BSD lic.]); I think this is minimal tk code, and some "calling to external OS viewer programs". "c" should all be done using numpy (+ here is a connection to ndimage, but don't dwell on it...). "a" might be harder to build, because of dependencies, but this is also optional, and setup.py exists (???). "b" is mainly the "annoying" part, where patches seem to get stuck and lie unused... I hope you guys can agree with my summary -- now I'm waiting for comments... -Sebastian
Hello all,
please don't discuss ndimage and image file-IO (a.k.a. PIL) in the same thread!!! The "image" in "ndimage" has nothing (!!) to do with jpeg or tiff -- you might know this...
Ha ha, agreed. Sorry!
So, I summarize from the recent discussion here that PIL could be divided into five parts: a) file format handling based on external libs such as libjpeg, libpng (not libtiff, I think, please confirm!!) b) file format handling based on PIL's python code c) image processing, such as contrast change, pixelwise mapping, transformations like rotation, ... d) image drawing, like adding text into an image e) image display
One note: (b) should be divided into: (b.1) File _format_ interpretation in pure-python code (that is, figuring out where in the file the image pixels are, in what format they are stored, and in what format they need to be unpacked); and (b.2) Pixel unpacking in C-code. I never use (a) -- the pure-python code for interpreting files is just fine, and much more flexible. The only thing that the pure-python code doesn't handle, as far as I know, is JPEG files -- those need an external library, I fear. (b.1) is quite good, but there have been a few things I've needed to patch with regard to 16-bit TIFFs and PNGs, and 32-bit TIFFs. (b.2) is not so good, mostly because it's written around the PIL's memory model, which is trickier to work with than numpy, and has much less flexibility with regard to data type. I'm with you on the interpretation of the rest. To answer Stéfan's earlier question of how I see things fitting together, I *think* that the pure-python file format interpretation code could be used (either by importing from PIL or using patched copies as needed) to figure out what a given image file type is, and where in the file the pixels are stored. Then the relevant region of the file would be passed through python.zipfile/deflate/etc. if needed to decompress the pixels, and sent to numpy for unpacking the bits from the string. I think this isn't the perfect solution for everyone: it doesn't use external libs, so it will be a bit slower and won't support all the strange corners of the format specifications (which PIL doesn't either). Also JPEG will be hard to support. But what this would be is a very light-weight python-and-numpy image reader with no external dependencies, which has merits. Zach
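The "(b.1)" step, working out what kind of file this is and where to look next, can be sketched with the standard library alone. This is an illustration, not PIL's code: the function names `sniff_format` and `read_tiff_header` are hypothetical, and only the well-known signatures for PNG, classic TIFF, and JPEG are checked.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def sniff_format(header):
    """Guess the image format from the first bytes of a file, or return None."""
    if header.startswith(PNG_SIGNATURE):
        return "PNG"
    if header[:4] in (b"II*\x00", b"MM\x00*"):
        return "TIFF"
    if header.startswith(b"\xff\xd8"):
        return "JPEG"
    return None

def read_tiff_header(header):
    """Return (struct byte-order prefix, offset of the first IFD) for a classic TIFF."""
    order_mark = header[:2]
    if order_mark == b"II":
        prefix = "<"  # little-endian ("Intel") byte order
    elif order_mark == b"MM":
        prefix = ">"  # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF byte-order mark")
    # Bytes 2-3 hold the magic number 42, bytes 4-7 the offset of the first IFD.
    magic, ifd_offset = struct.unpack(prefix + "HI", header[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF header")
    return prefix, ifd_offset

kind = sniff_format(b"II*\x00\x08\x00\x00\x00")
prefix, ifd_offset = read_tiff_header(b"II*\x00\x08\x00\x00\x00")
```

A real reader would go on to parse the IFD entries at `ifd_offset` to find the image size, sample format, compression, and strip locations; that part is omitted here.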
On 21/04/2008, Zachary Pincus <zachary.pincus@yale.edu> wrote:
To answer Stéfan's earlier question of how I see things fitting together, I *think* that the pure-python file format interpretation code could be used (either by importing from PIL or using patched copies as needed) to figure out what a given image file type is, and where in the file the pixels are stored. Then the relevant region of the file would be passed through python.zipfile/deflate/etc. if needed to decompress the pixels, and sent to numpy for unpacking the bits from the string.
So this is the bit that I don't understand. Those pixel values are encoded, so which component do you use to take the data chunk and convert it to actual pixel values? Regards Stéfan
On 21/04/2008, Zachary Pincus <zachary.pincus@yale.edu> wrote:
To answer Stéfan's earlier question of how I see things fitting together, I *think* that the pure-python file format interpretation code could be used (either by importing from PIL or using patched copies as needed) to figure out what a given image file type is, and where in the file the pixels are stored. Then the relevant region of the file would be passed through python.zipfile/deflate/etc. if needed to decompress the pixels, and sent to numpy for unpacking the bits from the string.
So this is the bit that I don't understand. Those pixel values are encoded, so which component do you use to take the data chunk and convert it to actual pixel values?
numpy.fromstring takes a byte sequence and unpacks it into a flat array of a specified data type, which can then be reshaped to the image dimensions. Most image file formats are just different ways of putting byte sequences on disk and specifying how they were compressed, if at all. Most formats have either no compression, or LZW/Deflate/zlib-style compression, for which there are already python libraries. So for example, reading a TIFF file would consist of looking at the header to determine the pixel format, image size, and compression, then rooting around in the file to assemble the relevant bytes, then running that through deflate (most often), and passing the resulting string to numpy.fromstring. Same for PNG, or most anything that's not JPEG. Writing is similar. Again, what I'm imagining wouldn't be a full-featured image IO library, but something lightweight with no dependencies outside of numpy, and potentially (if JPEG decoding isn't desired), no C-extensions. (One could conceivably use numpy to do JPEG encoding and decoding, but I've no interest in doing that...) This is all just an idea, and I'm not convinced whether it's a great idea. But I just wanted to put the suggestion out there... Zach
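A minimal sketch of the decompress-then-unpack step described above. It uses `numpy.frombuffer` rather than `numpy.fromstring`, since later NumPy releases deprecate `fromstring` for binary data; the helper name `decode_pixels` is hypothetical. It assumes the pixel bytes form one zlib/Deflate stream and that shape and dtype were already read from the header. Real formats add details this ignores, for example PNG's per-scanline filter byte or TIFF's division into strips.

```python
import zlib
import numpy as np

def decode_pixels(compressed, shape, dtype):
    """Decompress a zlib stream and view the result as an array of the given shape."""
    raw = zlib.decompress(compressed)
    flat = np.frombuffer(raw, dtype=dtype)  # 1-D, read-only view of the bytes
    return flat.reshape(shape).copy()       # copy so the caller gets a writable array

# Round trip on synthetic data standing in for an image's pixel block.
original = np.arange(12, dtype="<u2").reshape(3, 4)
compressed = zlib.compress(original.tobytes())
decoded = decode_pixels(compressed, (3, 4), "<u2")
```

Writing goes the other way: `array.tobytes()` followed by `zlib.compress`, with the header fields written around the result.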
On 21/04/2008, Zachary Pincus <zachary.pincus@yale.edu> wrote:
Again, what I'm imagining wouldn't be a full-featured image IO library, but something lightweight with no dependencies outside of numpy, and potentially (if JPEG decoding isn't desired), no C-extensions. (One could conceivably use numpy to do JPEG encoding and decoding, but I've no interest in doing that...)
I love it -- let's do it (if everyone agrees, of course). Having a Python reference implementation is the way to go. Should we separate the io and the image processing routines, or put it all in scipy.image? Regards Stéfan
On Mon, Apr 21, 2008 at 12:13 PM, Stéfan van der Walt <stefan@sun.ac.za> wrote:
On 21/04/2008, Zachary Pincus <zachary.pincus@yale.edu> wrote:
Again, what I'm imagining wouldn't be a full-featured image IO library, but something lightweight with no dependencies outside of numpy, and potentially (if JPEG decoding isn't desired), no C-extensions. (One could conceivably use numpy to do JPEG encoding and decoding, but I've no interest in doing that...)
I love it -- let's do it (if everyone agrees, of course). Having a Python reference implementation is the way to go. Should we separate the io and the image processing routines, or put it all in scipy.image?
If you are thinking about using PIL's code, I would prefer that it not go into scipy. It smells too much like a fork, and I just don't want scipy to get involved. -- Robert Kern
On 21/04/2008, Robert Kern <robert.kern@gmail.com> wrote:
If you are thinking about using PIL's code, I would prefer that it not go into scipy. It smells too much like a fork, and I just don't want scipy to get involved.
It sounded like Zachary said we could do it without PIL (or that was my understanding, at least). I certainly don't want to be involved in any PILfering. Cheers Stéfan
On Mon, 21 Apr 2008, Robert Kern apparently wrote:
If you are thinking about using PIL's code, I would prefer that it not go into scipy.
1. Maybe the PIL developers would like to see this *narrowly* targeted development happen no matter where it happens. Possibly even for use in PIL. It never hurts to ask them. 2. Of course, getting a response has been problematic. That is one key driver of this push, I believe. So if they do not respond, how about having this as a SciKit, so that it is outside of SciPy proper? Cheers, Alan Isaac
On Apr 21, 2008, at 1:20 PM, Robert Kern wrote:
On Mon, Apr 21, 2008 at 12:13 PM, Stéfan van der Walt <stefan@sun.ac.za> wrote:
On 21/04/2008, Zachary Pincus <zachary.pincus@yale.edu> wrote:
Again, what I'm imagining wouldn't be a full-featured image IO library, but something lightweight with no dependencies outside of numpy, and potentially (if JPEG decoding isn't desired), no C-extensions. (One could conceivably use numpy to do JPEG encoding and decoding, but I've no interest in doing that...)
I love it -- let's do it (if everyone agrees, of course). Having a Python reference implementation is the way to go. Should we separate the io and the image processing routines, or put it all in scipy.image?
If you are thinking about using PIL's code, I would prefer that it not go into scipy. It smells too much like a fork, and I just don't want scipy to get involved.
Understandable. It does seem pretty clear to me that the most expeditious way to proceed with a lightweight library would be to start by modifying (perhaps heavily, as in beyond-recognition, or perhaps lightly to not-at-all) some of the pure-python PIL code that is responsible for reading image file headers. While I wouldn't call this a "fork" so much as the kind of horizontal code transfer between related projects that is one of the major rationales for open-source, there's no reason to be potentially provocative. Let me look into whether this idea is at all feasible, and if it is we can revisit the issue of whether it belongs anywhere near scipy. (Would getting Fredrik Lundh's OK to use various bits in this way make things easier? He does seem much more responsive to direct queries than patch-submissions.) I'm glad that there's some support for the idea, so let me look at the code that I have and see if it's worth pursuing. Zach
On Mon, Apr 21, 2008 at 3:37 PM, Zachary Pincus <zachary.pincus@yale.edu> wrote:
Let me look into whether this idea is at all feasible, and if it is we can revisit the issue of whether it belongs anywhere near scipy. (Would getting Fredrik Lundh's OK to use various bits in this way make things easier? He does seem much more responsive to direct queries than patch-submissions.)
That would alleviate my concerns, yes. -- Robert Kern
Zach et al, hi, has there been any development on this? I.e. has anyone gotten in contact with Fredrik Lundh? I recently had more problems reading a Zeiss Confocal Microscope (LSM) file -- as far as I can tell, LSM files are TIFF files, and ImageMagick recognizes the file in question as such... The point here is that I don't think the Image-SIG mailing list is very helpful (it is only a fraction as good as what we are used to over here at SciPy). So, (again,) it would be great if we could unite our image-io interests over here at SciPy... Thanks, Sebastian
Hi, I've made little progress on this as I've been really quite busy with my research, I'm afraid. Sorry. I will try to contact Fredrik today, though.
I recently got more problems with reading a Zeiss Confocal Microscope (LSM) file -- as far as I can tell, LSM files are TIFF files, and ImageMagick recognizes the file in question as such ....
I recently have been looking at some numpy bindings for ImageMagick that were sent to me by an acquaintance at CMU. I can send them over (once I make sure there's no problem re-distributing them), if that would be helpful for anyone before we get a dependency-free solution. Also on the microscopy-file-format question (a most vexed issue, to be sure), perhaps the tools at http://www.loci.wisc.edu/ome/formats.html will be useful. I've been using the bfconvert tool to turn Zeiss ZVI files into normal tiffs, and it handles a ton of other microscopy formats. (The bad news is that it's all in Java, so direct python bindings are rather unlikely.) Zach
On Sun, Jun 15, 2008 at 5:33 PM, Zachary Pincus <zachary.pincus@yale.edu> wrote:
Hi,
I've made little progress on this as I've been really quite busy with my research, I'm afraid. Sorry.
I will try to contact Fredrik today, though.
great - thanks.
I recently got more problems with reading a Zeiss Confocal Microscope (LSM) file -- as far as I can tell, LSM files are TIFF files, and ImageMagick recognizes the file in question as such ....
I recently have been looking at some numpy bindings for ImageMagick that were sent to me by an acquaintance at CMU. I can send them over (once I make sure there's no problem re-distributing them), if that would be helpful for anyone before we get a dependency-free solution.
What I have seen of ImageMagick in terms of bindings was very discouraging... (Mostly abandoned stuff, because I.M. seems to be a moving target; with every release a new API -- maybe that has changed by now.) Main point though: PIL is really close to what we need...
Also on the microscopy-file-format question (a most vexed issue, to be sure), perhaps the tools at http://www.loci.wisc.edu/ome/formats.html will be useful. I've been using the bfconvert tool to turn Zeiss ZVI files into normal tiffs, and it handles a ton of other microscopy formats. (The bad news is that it's all in Java, so direct python bindings are rather unlikely.)
Yeah, I know about this. I actually tried some Jython on it and made a nice script that converts whole directory trees full of LOCI-readable files into my preferred (memory-mappable) file format (it's close to the Deltavision format, a derivative of the MRC format). - Sebastian
Zach
On Jun 15, 2008, at 3:00 AM, Sebastian Haase wrote:
Zach et al,
Hi, has there been any development on this ... ? I.e., has anyone gotten in contact with Fredrik Lundh?
I recently got more problems with reading a Zeiss Confocal Microscope (LSM) file -- as far as I can tell, LSM files are TIFF files, and ImageMagick recognizes the file in question as such .... The point here is that I don't think the Image-SIG mailing list is very helpful .... (it is only a fraction as good as what we are used to over here at SciPy ....)
So, (again,) it would be great if we could unite our image-io interests over here at SciPy...
Thanks, Sebastian
On Mon, Apr 21, 2008 at 10:49 PM, Robert Kern <robert.kern@gmail.com> wrote:
On Mon, Apr 21, 2008 at 3:37 PM, Zachary Pincus <zachary.pincus@yale.edu> wrote:
Let me look into whether this idea is at all feasible, and if it is we can revisit the issue of whether it belongs anywhere near scipy. (Would getting Fredrik Lundh's OK to use various bits in this way make things easier? He does seem much more responsive to direct queries than patch-submissions.)
That would alleviate my concerns, yes.
-- Robert Kern
Hi all,
I will try to contact Fredrik today, though.
great - thanks.
I'll let you know what I learn. Anyhow, I think that I can't really take the lead on any project right now because I have a ton of other stuff on my plate for the next month or two. If I do hear back from Fredrik in the positive, what I can do is send an interested party what I have so far, which is called "PIL-Lite" and is a private fork of PIL. From there, it should be possible to tear everything out except the image header IO (in the *ImagePlugin files) and graft that onto numpy/python decompression/pixel decoding. (At that point, what we'd have is less a fork of PIL, which rightly concerned Robert, and more a separate entity that happens to share a bit of code with PIL.) But it won't be super-easy: image formats are a real bear, and there are things like palette modes to consider (do we want to support them?) as well as the issue of JPEG decoding. Zach
Hi all, I just heard back from Fredrik. He's supportive of the idea, and made some helpful suggestions for how to proceed. It might be simpler than I had thought, actually... Here's his email:
hi zachary,
Our general idea is to use the python standard library to handle most compressed data, and use numpy's internal unpacking features to decode the uncompressed and assembled data streams. Anyhow, it struck me that the *ImagePlugin.py files from the PIL would be pretty useful (either unmodified or, more likely, slightly modified) as the front end of this IO system, much as they are used in the PIL today.
that definitely makes sense. to avoid fragmentation, I'd prefer if you use unmodified versions (and submit any bug fixes etc upstream). the ImagePlugin modules have very few dependencies, on purpose; you should be able to create a light-weight "pil emulator" simply by plugging in Image, ImageFile, and ImagePalette objects in sys.modules, and then use the modules right away. e.g.
    class ImageEmulator:
        ... stuff that implements necessary portions of the Image interface ...

    class ImageFileEmulator:
        ... etc

    class ImagePaletteEmulator:
        ...

    sys.modules["Image"] = ImageEmulator()
    sys.modules["ImageFile"] = ImageFileEmulator()
    sys.modules["ImagePalette"] = ImagePaletteEmulator()
import PngImagePlugin
see the "open" and "save" code in Image.py to get some ideas on how to use the plugins.
(and feel free to mail me if you want further integration ideas)
cheers /F
Best, Zach
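For readers who want to see the sys.modules trick in action, here is a minimal sketch of the idea Fredrik describes. It is an illustration under stated assumptions, not PIL's actual code: the stub modules implement only two registry hooks and a bare ImageFile base class, and "ToyImagePlugin" with its "TOY1" file layout is a made-up stand-in for a real plugin such as PngImagePlugin, so that the example runs without PIL installed. It also uses types.ModuleType objects rather than class instances, which is a substitution of convenience.

```python
import io
import sys
import types

# Stand-in for PIL's Image module: only the registry hooks a plugin calls
# at import time. (A hypothetical minimal subset, not PIL's full interface.)
image_stub = types.ModuleType("Image")
image_stub.OPEN = {}       # format name -> (factory, accept)
image_stub.EXTENSION = {}  # file extension -> format name


def register_open(fmt, factory, accept=None):
    image_stub.OPEN[fmt.upper()] = (factory, accept)


def register_extension(fmt, extension):
    image_stub.EXTENSION[extension.lower()] = fmt.upper()


image_stub.register_open = register_open
image_stub.register_extension = register_extension

# Stand-in for PIL's ImageFile module: plugins subclass ImageFile.ImageFile.
imagefile_stub = types.ModuleType("ImageFile")


class ImageFile:
    def __init__(self, fp):
        self.fp = fp
        self._open()  # the plugin overrides _open() to parse the header


imagefile_stub.ImageFile = ImageFile

# Once the stubs are in sys.modules, "import Image" resolves to them.
sys.modules["Image"] = image_stub
sys.modules["ImageFile"] = imagefile_stub

# Made-up plugin source, written the way an old-style plugin imports PIL.
TOY_PLUGIN_SOURCE = '''
import Image, ImageFile

def _accept(prefix):
    return prefix[:4] == b"TOY1"

class ToyImageFile(ImageFile.ImageFile):
    format = "TOY"
    def _open(self):
        header = self.fp.read(6)
        self.size = (header[4], header[5])  # width, height as single bytes

Image.register_open("TOY", ToyImageFile, _accept)
Image.register_extension("TOY", ".toy")
'''
toy_plugin = types.ModuleType("ToyImagePlugin")
exec(TOY_PLUGIN_SOURCE, toy_plugin.__dict__)

# Use the registry the way an "open" function might: sniff, then construct.
data = b"TOY1" + bytes([3, 2])
factory, accept = image_stub.OPEN["TOY"]
if accept(data[:16]):
    im = factory(io.BytesIO(data))
    print(im.format, im.size)
```

A real emulator would need whatever else the unmodified plugins touch (for example ImagePalette and the save registry), which is where the actual work lies.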
On Mon, Jun 16, 2008 at 14:13, Zachary Pincus <zachary.pincus@yale.edu> wrote:
Hi all,
I just heard back from Fredrik. He's supportive of the idea, and made some helpful suggestions for how to proceed. It might be simpler than I had thought, actually...
Excellent. All of my concerns are addressed. Thank you. -- Robert Kern
Zachary Pincus wrote:
numpy.fromstring takes a byte sequence and unpacks it into an array of a specified shape and data type. Most image file formats are just different ways of putting byte sequences on disk and specifying how they were compressed, if at all. Most formats have either no compression, or LZW/Deflate/zlib-style compression, for which there are already python libraries.
So for example, reading a TIFF file would consist of looking at the header to determine the pixel format, image size, and compression, then rooting around in the file to assemble the relevant bytes, then running that through deflate (most often), and passing the resulting string to numpy.fromstring. Same for PNG, or most anything that's not JPEG. Writing is similar.
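The header-then-decompress-then-unpack pipeline described above can be sketched in a few lines. This is only an illustration under assumptions: the "TOY1" container (magic bytes, height, width, then deflated 8-bit pixels) is invented for the example and is not TIFF or PNG, and the function names are hypothetical. Note also that current NumPy deprecates the binary mode of numpy.fromstring in favor of numpy.frombuffer, which is used here, and that the shape is applied with a separate reshape.

```python
import struct
import zlib

import numpy as np


def pack_toy_image(pixels):
    """Serialize a 2-D uint8 array in the made-up TOY1 layout."""
    height, width = pixels.shape
    header = b"TOY1" + struct.pack("<HH", height, width)
    return header + zlib.compress(pixels.astype(np.uint8).tobytes())


def unpack_toy_image(data):
    """Read the header, inflate the payload, and view it as an array."""
    if data[:4] != b"TOY1":
        raise ValueError("not a TOY1 image")
    height, width = struct.unpack("<HH", data[4:8])
    raw = zlib.decompress(data[8:])
    # frombuffer yields a flat, read-only array; reshape gives it rows/columns.
    return np.frombuffer(raw, dtype=np.uint8).reshape(height, width)


original = np.arange(12, dtype=np.uint8).reshape(3, 4)
decoded = unpack_toy_image(pack_toy_image(original))
print(np.array_equal(original, decoded))
```

Real formats add strips or tiles, byte order, and multi-byte samples on top of this, but the decoding step stays the same shape: assemble bytes, decompress, hand them to NumPy.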
Again, what I'm imagining wouldn't be a full-featured image IO library, but something lightweight with no dependencies outside of numpy, and potentially (if JPEG decoding isn't desired), no C extensions. (One could conceivably use numpy to do JPEG encoding and decoding, but I've no interest in doing that...)
This is all just an idea, and I'm not convinced whether it's a great idea. But I just wanted to put the suggestion out there...
I've wanted to have native image readers in SciPy for a long time for a lot of reasons (teaching being one of them so I like this approach). I'd rather not have a PIL dependency to do such things. But, that is just my point of view. So, I'm very supportive of this project generally. -Travis
On Mon, Apr 21, 2008 at 1:43 PM, Travis E. Oliphant <oliphant@enthought.com> wrote:
I've wanted to have native image readers in SciPy for a long time for a lot of reasons (teaching being one of them so I like this approach). I'd rather not have a PIL dependency to do such things. But, that is just my point of view. So, I'm very supportive of this project generally.
+1. I would also like to see basic image/movie readers in scipy.io that just give you a numpy array and not a special image object. We should avoid depending on PIL for this. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/
On Mon, Apr 21, 2008 at 2:44 PM, Jarrod Millman <millman@berkeley.edu> wrote:
+1. I would also like to see basic image/movie readers in scipy.io that just give you a numpy array and not a special image object. We should avoid depending on PIL for this.
I meant to say readers/writers. Sorry. -- Jarrod Millman
participants (7)
- Alan G Isaac
- Jarrod Millman
- Robert Kern
- Sebastian Haase
- Stéfan van der Walt
- Travis E. Oliphant
- Zachary Pincus