Mailman 3 OpenImageIO - scikit-image

OpenImageIO

Zachary Pincus

Oct. 3, 2009

10:25 a.m.

Stéfan:

...

I wonder if we shouldn't take the plunge and add OpenImageIO as a dependency?

...

- Supported on Linux, OS X, and Windows. All available under the BSD license, so you may modify it and use it in both open source or proprietary apps.

I really don't have much hope for PIL. The development process is closed and slow. Once you ignore your community, you are pretty much done for. The only reason PIL still exists is because it is useful, but let's face it: we can easily rewrite 80% of its capabilities at a multi-day sprint. Perhaps we should.

The only downside to OpenImageIO is that it has some not-always-standard dependencies, such as boost and cmake (neither of which is mentioned in the build instructions), which make it a bit tricky to install, at least from an end-user perspective (especially as a replacement for PIL, which is just a "python setup.py install" away). The situation on Windows is also not super-simple. Perhaps a streamlined build could be shoehorned into distutils (but the boost thing is still a bit of a pain)...

Any thoughts on this matter? I think it looks like a great library, and it would be great to have some ctypes wrappers for it, but I'm not sure how simple a dependency it will wind up being, especially given that they don't have platform binaries available yet for OpenImageIO...

Zach

Stéfan van der Walt

October 2009

1:27 p.m.

Hey Zach

2009/10/3 Zachary Pincus <zachary.pincus@yale.edu>:

...

The only downside to OpenImageIO is that it has some not-always-standard dependencies, such as boost and cmake (neither of which

I didn't realise there was a boost dependency -- I'd rather avoid boost if at all possible.

You spent some time writing a replacement IO reader in pure Python, if I recall correctly; did you have any practically usable results?

Another option may be GraphicsMagick:

http://www.graphicsmagick.org/

Regards
Stéfan

Damian Eads

1:36 p.m.

One alternative is LIBCVD, which reads and writes many common formats including BMP, PNG, PPM, JPEG, etc. It has a simple, easy-to-use image loading function, img_load:

Image <float> img(img_load("image.png"));

It also reads and writes video. All of its dependencies are optional, so a reader/writer is only compiled if the development library is available during ./configure.

Damian

-- -----------------------------------------------------
Damian Eads Ph.D. Candidate University of California Computer Science 1156 High Street Machine Learning Lab, E2-489 Santa Cruz, CA 95064 http://www.soe.ucsc.edu/~eads

Chris Colbert

1:55 p.m.

There is also ImageMagick, which is included in the Ubuntu repos:

http://www.imagemagick.org/script/formats.php

It supports a ton of formats, and also has Python bindings...

Chris Colbert

1:56 p.m.

But then again, if we want to include anything from OpenCV, we might as well use its image IO, because it supports quite a bit as well.

Stéfan van der Walt

2:31 p.m.

2009/10/3 Chris Colbert <sccolbert@gmail.com>:

...

but then again, if we want to include anything from OpenCV we might as well use that imageIO because it supports quite a bit as well..

This is an important issue that we should clarify: "general" vs. "specific" dependencies.

With a "general" dependency, I refer to a library that developers are encouraged to use throughout the scikit code. If OpenCV is chosen as such a library, we can use its image loading, processing, vision etc. routines.

On the other hand, a specific dependency states that only a certain function depends on, for example, OpenCV. We could say: "If you want to execute optical_flow(...) you'll have to have OpenCV installed."

With a general dependency, code becomes inextricably intertwined, and you won't be able to get rid of the dependency without invasive surgery. A specific dependency is much more easily removed.

My personal feeling is that we should stay away from general dependencies, if possible. I don't intend for scikits.image to become a wrapper around libcvd or opencv -- those wrappers already exist. Rather, I want to focus on implementing novel image processing techniques that are not easily available elsewhere. [Of course, if a function is easy enough to implement and useful for general purpose image processing (such as the color conversion routines), there's little reason to exclude it.]

So, for image reading, I'm OK following a path such as:

1. Attempt to use ImageMagick
2. Not found, attempt to use PIL
3. Try built-in png reader (we can adapt matplotlib's)
4. Give up

I'd like to hear your further opinions regarding dependencies.

Thanks
Stéfan
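A minimal sketch of the "specific dependency" idea from the message above, under stated assumptions: the helper name require is made up for illustration, "cv2" is only an assumed module name for the OpenCV bindings, and the body of optical_flow is intentionally left as a placeholder. The point is only that the import happens inside the one function that needs it.

```python
import importlib


def require(module_name, feature):
    """Import an optional module, or say which feature needs it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} requires the optional package {module_name!r}"
        ) from exc


def optical_flow(frame_a, frame_b):
    """Hypothetical function with a specific dependency on OpenCV."""
    # Imported only when this function is called, so the rest of the
    # package keeps working on machines without OpenCV.
    # "cv2" is an assumed module name for the OpenCV bindings.
    cv2 = require("cv2", "optical_flow()")
    # The actual computation is deliberately left out of this sketch.
    raise NotImplementedError("illustrative placeholder")
```

Because nothing outside optical_flow touches the optional module, dropping the dependency later would mean changing that one function rather than the whole package.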

Chris Colbert

3:11 p.m.

I agree that the core functionality should not have dependencies, and I feel that IO falls under this core functionality. So if we choose an existing library for IO, I think it should be statically linked.

It then becomes a question of which of the existing libraries would be the easiest to separate the IO code out of in order to link it statically. I'm not familiar with other libraries, but all IO functionality in OpenCV is performed by libhighgui. This includes all video IO as well... and also some basic routines for creating minimal GUI windows and widgets.

-Chris

Damian Eads

3:47 p.m.

2009/10/3 Stéfan van der Walt <stefan@sun.ac.za>:

...

2009/10/3 Chris Colbert <sccolbert@gmail.com>:

...
but then again, if we want to include anything from OpenCV we might as well use that imageIO because it supports quite a bit as well..

This is an important issue that we should clarify: "general" vs. "specific" dependencies.

It is an important distinction. Along these lines, LIBCVD has no general dependencies other than a C++ compiler and compiles on both GCC and Visual Studio. If chosen as a specific dependency, it wouldn't increase the size of our dependency DAG by very much at all.

...

With a "general" dependency, I refer to a library that developers are encouraged to use throughout the scikit code. If OpenCV is chosen as such a library, we can use its image loading, processing, vision etc. routines.

On the other hand, a specific dependency states that only a certain function depends on, for example, OpenCV. We could say: "If you want to execute optical_flow(...) you'll have to have OpenCV installed."

With a general dependency, code becomes inextricably intertwined, and you won't be able to get rid of the dependency without invasive surgery. A specific dependency is much more easily removed.

My personal feeling is that we should stay away from general dependencies, if possible. I don't intend for scikits.image to become a wrapper around libcvd or opencv -- those wrappers already exist. Rather, I want to focus on implementing novel image processing techniques that are not easily available elsewhere. [Of course, if a function is easy enough to implement and useful for general purpose image processing (such as the color conversion routines), there's little reason to exclude it.]

Novel image processing algorithms not available elsewhere? Like what? Can you give examples? If we restrict our attention to novel algorithms, then we greatly limit the breadth of functionality, and scikits.image is less likely to be adopted by other researchers.

Python is not currently the preferred language for Computer Vision or Image Processing. Most researchers use MATLAB, C++, or a combination of both. We should think of ways to broaden the appeal of Python to such researchers and the development of scikits.image should reflect it.

Damian

Stéfan van der Walt

4:49 p.m.

2009/10/4 Damian Eads <eads@soe.ucsc.edu>:

...

It is an important distinction. Along these lines, LIBCVD has no general dependencies other than a C++ compiler and compiles on both GCC and Visual Studio. If chosen as a specific dependency, it wouldn't increase the size of our dependency DAG by very much at all.

I downloaded both ImageMagick and CVD earlier this evening and started to compile both. ImageMagick completed fairly quickly, but CVD seems to take extremely long (could be a platform/compiler-specific issue, I'm not sure. Are they making heavy use of templates?).

We could look at extracting the IO part of CVD or ImageMagick to keep things light-weight. But like I mentioned earlier, we could just wrap existing solutions -- the user is bound to have PIL or matplotlib or imagemagick or ... installed (and we can encourage them to do so in the readme, for example).

...

...
My personal feeling is that we should stay away from general dependencies, if possible. I don't intend for scikits.image to become a wrapper around libcvd or opencv -- those wrappers already exist. Rather, I want to focus on implementing novel image processing techniques that are not easily available elsewhere. [Of course, if a function is easy enough to implement and useful for general purpose image processing (such as the color conversion routines), there's little reason to exclude it.]

Novel image processing algorithms not available elsewhere? Like what?

Sorry, I should have said "novel OR not easily available elsewhere". My main thought was that we should not try to replicate the wrappers for OpenCV, for example.

...

Image Processing. Most researchers use MATLAB, C++, or a combination of both. We should think of ways to broaden the appeal of Python to such researchers and the development of scikits.image should reflect it.

Absolutely, but since we can't be everything to all people, I'd rather make a difference where it is needed: adding algorithms not already easily accessible to Python users.

Cheers
Stéfan

Damian Eads

5:17 p.m.

2009/10/4 Stéfan van der Walt <stefan@sun.ac.za>:

...

2009/10/4 Damian Eads <eads@soe.ucsc.edu>:

...
It is an important distinction. Along these lines, LIBCVD has no general dependencies other than a C++ compiler and compiles on both GCC and Visual Studio. If chosen as a specific dependency, it wouldn't increase the size of our dependency DAG by very much at all.

I downloaded both ImageMagick and CVD earlier this evening and started to compile both. ImageMagick completed fairly quickly, but CVD seems to take extremely long (could be a platform/compiler-specific issue, I'm not sure. Are they making heavy use of templates?).

It's not a template issue in this case. The FAST corner detector, which compiles by default in LIBCVD, requires a lot of memory and computation. To disable it, try ./configure --disable-fast7 --disable-fast8 --disable-fast9. This should greatly speed up compilation.

...

We could look at extracting the IO part of CVD or ImageMagick to keep things light-weight.

Could do. LIBCVD is pretty lightweight and much smaller than OpenCV. For example, there aren't high-level systems like face detectors in LIBCVD nor will there ever be. LIBCVD just contains basic data structures (an Image and ImageRef class), basic image/video loaders, and easy-to-use, highly optimized image processing operators. Its interface is designed to be functional rather than object-oriented.

...

But like I mentioned earlier, we could just wrap existing solutions -- the user is bound to have PIL or matplotlib or imagemagick or ... installed (and we can encourage them to do so in the readme, for example).

That could work too. :)

...

...
...
My personal feeling is that we should stay away from general dependencies, if possible. I don't intend for scikits.image to become a wrapper around libcvd or opencv -- those wrappers already exist. Rather, I want to focus on implementing novel image processing techniques that are not easily available elsewhere. [Of course, if a function is easy enough to implement and useful for general purpose image processing (such as the color conversion routines), there's little reason to exclude it.]

Novel image processing algorithms not available elsewhere? Like what?

Sorry, I should have said "novel OR not easily available elsewhere". My main thought was that we should not try to replicate the wrappers for OpenCV, for example.

Agreed, I don't think we want to replicate efforts to wrap existing libraries. However, I think rewrapping is acceptable if we offer a much simpler, functional interface than what has been done before. One of the reasons why MATLAB is so popular is its functional style and use of arrays to represent most data. If we can greatly reduce boilerplating then duplicating efforts may be worthwhile.

...

...
Image Processing. Most researchers use MATLAB, C++, or a combination of both. We should think of ways to broaden the appeal of Python to such researchers and the development of scikits.image should reflect it.

Absolutely, but since we can't be everything to all people, I'd rather make a difference where it is needed: adding algorithms not already easily accessible to Python users.

Yes, I still need to integrate the morphology code into your branch, once I get around to figuring out how Git works.

Damian

-----------------------------------------------------
Damian Eads Ph.D. Candidate University of California Computer Science 1156 High Street Machine Learning Lab, E2-489 Santa Cruz, CA 95064 http://www.soe.ucsc.edu/~eads

Stéfan van der Walt

5:53 p.m.

2009/10/4 Damian Eads <eads@soe.ucsc.edu>:

...

Agreed, I don't think we want to replicate efforts to wrap existing libraries. However, I think rewrapping is acceptable if we offer a much simpler, functional interface than what has been done before. One of the reasons why MATLAB is so popular is its functional style and use of arrays to represent most data. If we can greatly reduce boilerplating then duplicating efforts may be worthwhile.

Right on.

...

Yes, I still need to integrate the morphology code into your branch, once I get around figuring out how GIT works.

The easiest way may be to:

1. Branch off the current master
2. Copy your changes in and commit as necessary
3. Push back to the server using
   git push origin <your_current_branch_name>
4. Click on "Request Merge"

The most important thing is not to merge with my or other branches while developing. If you feel you'd like to provide a patch that would apply cleanly, you can rebase, as long as you are aware of the problems that can cause (for example, never rebase published changes).

Re: morphology -- should we consider including the code from the other Python library as well?

Regards
Stéfan

Gary Ruben

7:39 p.m.

Re: IO, has anyone looked into any of the binary file parser libraries for Python? For example, there's pyffi, construct and bdec.

Pyffi
http://pyffi.sourceforge.net/
looks to me like the best candidate if this approach were to be considered, and it's BSD licensed. The advantages are that this approach should be robust against faulty files, there's a GUI file editor, it provides access to all the file contents (not just the image planes) and it may provide a nice general way to read more general (non-image) binary files in numpy. A possible disadvantage is that it doesn't take advantage of any of numpy's binary file machinery so it may be slower, but maybe this could be improved. It's not clear whether specifying the file format with something like this makes life easier, but I thought I'd put it out there.

Construct may be worth a look, but I can't see any license info.
http://construct.wikispaces.com/

There's also bdec, but it's LGPL'ed so not a candidate:
http://www.hl.id.au/projects/bdec/

Gary

Zachary Pincus

7:04 a.m.

...

You spent some time writing a replacement IO reader in pure Python, if I recall correctly; did you have any practically usable results?

I looked into this for a while, and came to the conclusion that it would be very annoying yet technically simple to write a bunch of basic image format parsers in pure python (using the PIL image plugins as a guide). Any compression beyond what exists in the python stdlib (which is to say, zlib, basically), though, becomes rather more of a pain -- either you'd have to disallow jpeg IO, or write/wrap a jpeg decoder -- neither of which sounds particularly fun.

That is, I think that one could write simple PNG and TIFF decoders (which do not support all the corners of the spec, but neither do those in the PIL) in pure python / numpy in a day or so. This would be useful for many people, but lacking jpeg would be a big issue. Perhaps we could grab just the C core of some jpeg decoder/encoder somewhere and use that?

Otherwise, I think the best option is to find a simple, dependency-free C image IO library to wrap. CVD looks OK here.

Zach
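To give a feel for what a "simple PNG decoder that skips the corners of the spec" might look like with only zlib from the stdlib, here is a rough sketch. It is not anyone's actual implementation from this thread: the function names are made up, it handles only 8-bit, non-interlaced greyscale/RGB images (with or without alpha), ignores palettes and ancillary chunks, and returns rows of raw bytes rather than a numpy array so that it stays stdlib-only.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Samples per pixel for the colour types this sketch handles
# (0 grey, 2 RGB, 4 grey+alpha, 6 RGBA); palette images are skipped.
CHANNELS = {0: 1, 2: 3, 4: 2, 6: 4}


def iter_chunks(data):
    """Yield (chunk_type, payload) pairs after checking each CRC."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        payload = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[pos + 8 + length:pos + 12 + length])
        if zlib.crc32(ctype + payload) & 0xFFFFFFFF != crc:
            raise ValueError("corrupt chunk")
        yield ctype, payload
        pos += 12 + length


def paeth(a, b, c):
    """Paeth predictor from the PNG specification."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    return b if pb <= pc else c


def read_png(data):
    """Decode PNG bytes into (width, height, channels, rows)."""
    header, idat = None, []
    for ctype, payload in iter_chunks(data):
        if ctype == b"IHDR":
            header = struct.unpack(">IIBBBBB", payload)
        elif ctype == b"IDAT":
            idat.append(payload)
    if header is None:
        raise ValueError("missing IHDR chunk")
    width, height, depth, colour, _comp, _filt, interlace = header
    if depth != 8 or interlace != 0 or colour not in CHANNELS:
        raise ValueError("PNG variant not supported by this sketch")
    bpp = CHANNELS[colour]           # bytes per pixel at 8-bit depth
    stride = width * bpp
    raw = zlib.decompress(b"".join(idat))
    if len(raw) < height * (stride + 1):
        raise ValueError("truncated image data")
    rows, prev = [], bytearray(stride)
    for y in range(height):
        start = y * (stride + 1)
        ftype = raw[start]           # each scanline starts with a filter byte
        if ftype > 4:
            raise ValueError("unknown filter type")
        line = bytearray(raw[start + 1:start + 1 + stride])
        for i in range(stride):
            a = line[i - bpp] if i >= bpp else 0   # left
            b = prev[i]                            # above
            c = prev[i - bpp] if i >= bpp else 0   # above-left
            pred = (0, a, b, (a + b) // 2, paeth(a, b, c))[ftype]
            line[i] = (line[i] + pred) & 0xFF
        rows.append(bytes(line))
        prev = line
    return width, height, bpp, rows
```

Calling read_png(open(fname, "rb").read()) on a supported file gives back the decoded scanlines; the per-byte Python loop would be slow on large images, which is where numpy (or a C core) would come in.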

Stéfan van der Walt

9:14 a.m.

Hi Zach

2009/10/5 Zachary Pincus <zachary.pincus@yale.edu>:

...

...
You spent some time writing a replacement IO reader in pure Python, if I recall correctly; did you have any practically usable results?

[...]

...

That is, I think that one could write simple PNG and TIFF decoders (which do not support all the corners of the spec, but neither do those in the PIL) in pure python / numpy in a day or so. This would be useful for many people, but lacking jpeg would be a big issue. Perhaps we could grab just the C core of some jpeg decoder/encoder somewhere and use that?

libjpeg and libpng are both fairly easy to wrap with a couple of cython / ctypes calls, so I might just do that at a next sprint.

Looking back at this conversation, I believe a plugin system would be a practical solution that can be implemented right away. For example: plugins (be it for PIL, CVD, Magick, etc.) are asked to load an image. If a plugin fails because the format is not supported, it raises a FormatError and the next plugin is used.

With a plugin system in place, we can later replace as much of the functionality under the hood as we want, while having developed a consistent interface that can be exposed to the user right away (via imread).

Let me know your thoughts.

Regards
Stéfan
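The fall-through behaviour described above could be sketched roughly as follows. This is only an illustration under assumptions: FormatError and imread are the names used in the message, but the signature (passing the plugin list explicitly) and the two toy plugins are invented here so that the example runs without PIL, CVD or ImageMagick installed.

```python
class FormatError(Exception):
    """Raised by a plugin that does not support the requested file."""


def imread(fname, plugins):
    """Try each plugin in turn and return the first successful result."""
    errors = []
    for plugin in plugins:
        try:
            return plugin(fname)
        except FormatError as exc:
            # Remember why this plugin declined, then try the next one.
            errors.append(f"{plugin.__name__}: {exc}")
    details = "; ".join(errors)
    raise FormatError(f"no plugin could read {fname!r} ({details})")


def ascii_pgm_plugin(fname):
    """Toy plugin: reads plain-text PGM ("P2") files without comments."""
    with open(fname, "rb") as f:
        tokens = f.read().split()
    if not tokens or tokens[0] != b"P2":
        raise FormatError("not an ASCII PGM file")
    width, height = int(tokens[1]), int(tokens[2])
    # tokens[3] is the maximum grey value; pixel values follow it.
    pixels = [int(t) for t in tokens[4:4 + width * height]]
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


def always_unsupported_plugin(fname):
    """Toy plugin standing in for a backend that lacks this format."""
    raise FormatError("format not supported")
```

With imread(path, [always_unsupported_plugin, ascii_pgm_plugin]), the first plugin declines and the second one reads the file; a real PIL or ImageMagick backend would simply be another callable in that list.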

Ralf Gommers

2:48 a.m.

2009/10/5 Stéfan van der Walt <stefan@sun.ac.za>

...

Hi Zach

2009/10/5 Zachary Pincus <zachary.pincus@yale.edu>:

...
...
You spent some time writing a replacement IO reader in pure Python, if I recall correctly; did you have any practically usable results?

[...]

...
That is, I think that one could write simple PNG and TIFF decoders (which do not support all the corners of the spec, but neither do those in the PIL) in pure python / numpy in a day or so. This would be useful for many people, but lacking jpeg would be a big issue. Perhaps we could grab just the C core of some jpeg decoder/encoder somewhere and use that?

libjpeg and libpng are both fairly easy to wrap with a couple of cython / ctypes calls, so I might just do that at a next sprint.

Looking back at this conversation, I believe a plugin system would be a practical solution that can be implemented right away. For example: plugins (be it for PIL, CVD, Magick, etc.) are asked to load an image. If a plugin fails because the format is not supported, it raises a FormatError and the next plugin is used.

A plugin system sounds like a good idea. Maybe it needs a little more than waiting for a format error, because it is possible for a format to be supported but in a buggy way. Then you'd get back an array filled with garbage. It should be possible for the user to specify the order in which libraries are tried, to exclude libraries completely, as well as easily register their own library as a plugin.

...

WIth a plugin system in place, we can later replace as much of the functionality under the hood as we want, while having developed a consistent interface that can be exposed to the user right away (via imread).

Do you want a single function for everything, or different functions for single-page / multi-page images? Having to do something like:

img = open(fname)
img2d = imread(img)
img.seek()
img2d = imread(img)
img.seek()

would be less than ideal.

Anyway, a big thumbs up for a plugin system no matter what the interface will look like exactly.

Cheers,
Ralf
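Ralf's requests in this message (a user-specified order, excluding a library completely, and registering your own plugin) suggest some kind of registry. A possible shape, purely as a sketch: the class name PluginRegistry and all of its methods are hypothetical and not taken from any scikits.image code.

```python
class PluginRegistry:
    """Keeps track of reader plugins and the order in which to try them."""

    def __init__(self):
        self._readers = {}       # plugin name -> callable(fname)
        self._order = []         # plugin names; the first is tried first
        self._disabled = set()   # names the user has excluded

    def register(self, name, reader, first=False):
        """Add (or replace) a plugin; users can register their own."""
        self._readers[name] = reader
        if name in self._order:
            self._order.remove(name)
        if first:
            self._order.insert(0, name)
        else:
            self._order.append(name)

    def set_order(self, names):
        """Try `names` first, in the given order; the rest keep their order."""
        unknown = [n for n in names if n not in self._readers]
        if unknown:
            raise KeyError(f"unknown plugins: {unknown}")
        rest = [n for n in self._order if n not in names]
        self._order = list(names) + rest

    def disable(self, name):
        """Exclude a plugin, e.g. one that returns garbage for some format."""
        self._disabled.add(name)

    def enable(self, name):
        """Undo a previous disable()."""
        self._disabled.discard(name)

    def active(self):
        """Return (name, reader) pairs in the order they should be tried."""
        return [(n, self._readers[n])
                for n in self._order if n not in self._disabled]
```

A reader function would then loop over registry.active(); disabling a plugin covers the case Ralf mentions where a format is nominally supported but decoded incorrectly, which no exception would catch.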

Stéfan van der Walt

4:34 a.m.

2009/10/7 Ralf Gommers <ralf.gommers@googlemail.com>:

...

Do you want a single function for everything, or different functions for single-page / multi-page images? Having to do something like:

img = open(fname)
img2d = imread(img)
img.seek()
img2d = imread(img)
img.seek()

would be less than ideal.

I have some code waiting to be merged that implements an ImageCollection. Typically, you'd have

ic = ImageCollection('*.png')

where all PNGs are accessed only as necessary, and are cached once they've been read from disk. You can also index into or iterate over an ImageCollection (yielding the image arrays). It sounds like a multi-image could be interpreted as an ImageCollection.

...

Anyway, a big thumbs up for a plugin system no matter what the interface will look like exactly.

OK, I'll implement this over the weekend. If someone else has time, feel free to jump in.

Cheers
Stéfan
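The ImageCollection behaviour described in this message (load on demand, cache after the first read, support indexing and iteration) might look something like the sketch below. It is not the code waiting to be merged: the load_func argument is an assumption added so the example does not depend on any particular image reader, and only integer indices are handled.

```python
import glob


class ImageCollection:
    """Lazily loaded, cached collection of the files matching a pattern."""

    def __init__(self, pattern, load_func):
        # Only the file names are collected here; nothing is read yet.
        self.files = sorted(glob.glob(pattern))
        self._load = load_func     # assumed callable: filename -> image
        self._cache = {}           # filename -> loaded image

    def __len__(self):
        return len(self.files)

    def __getitem__(self, index):
        fname = self.files[index]  # integer indices only in this sketch
        if fname not in self._cache:
            # First access: read from disk and remember the result.
            self._cache[fname] = self._load(fname)
        return self._cache[fname]

    def __iter__(self):
        for index in range(len(self)):
            yield self[index]
```

Something like ImageCollection('*.png', some_reader) would then touch the disk only for the images that are actually indexed or iterated over. The cache here is unbounded; a multi-frame file with many large frames would probably want a limit on how many entries are kept.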

Ralf Gommers

2:35 a.m.

2009/10/7 Stéfan van der Walt <stefan@sun.ac.za>

...

2009/10/7 Ralf Gommers <ralf.gommers@googlemail.com>:

...
Do you want a single function for everything, or different functions for single-page / multi-page images? Having to do something like:

img = open(fname)
img2d = imread(img)
img.seek()
img2d = imread(img)
img.seek()

would be less than ideal.

I have some code waiting to be merged that implements an ImageCollection. Typically, you'd have

ic = ImageCollection('*.png')

where all PNGs are accessed only as necessary, and are cached once they've been read from disk. You can also index into or iterate over an ImageCollection (yielding the image arrays). It sounds like a multi-image could be interpreted as an ImageCollection.

That sounds like a good option. Let me know if you want me to test it / work on it / send you some multi-image files.

Cheers,
Ralf

...

...
Anyway, a big thumbs up for a plugin system no matter what the interface will look like exactly.

OK, I'll implement this over the weekend. If someone else has time, feel free to jump in.

Cheers Stéfan

Stéfan van der Walt

2:48 a.m.

Hey Ralph

2009/10/9 Ralf Gommers <ralf.gommers@googlemail.com>:

...

...
where all PNGs are accessed only as necessary, and are cached once they've been read from disk. You can also index into or iterate over an ImageCollection (yielding the image arrays). It sounds like a multi-image could be interpreted as an ImageCollection.

That sounds like a good option. Let me know if you want me to test it / work on it / send you some multi-image files.

I'd appreciate it if you could investigate a bit further. The code I was referring to is at

http://bazaar.launchpad.net/~stefanv/supreme/main/annotate/head%3A/supreme/m...

As you can see, it is very simplistic. It also returns a bunch of Image objects that we don't need. But the basic idea is there: a container over which you can iterate, that loads images on demand and keeps a cache as necessary. I've never played with loading of multi-layer images, so I hope you can get something going.

Cheers
Stéfan

Ralf Gommers

4:19 a.m.

2009/10/9 Stéfan van der Walt <stefan@sun.ac.za>

...

Hey Ralph

2009/10/9 Ralf Gommers <ralf.gommers@googlemail.com>:

...
...
where all PNGs are accessed only as necessary, and are cached once they've been read from disk. You can also index into or iterate over an ImageCollection (yielding the image arrays). It sounds like a multi-image could be interpreted as an ImageCollection.

That sounds like a good option. Let me know if you want me to test it / work on it / send you some multi-image files.

I'd appreciate it if you could investigate a bit further. The code I was referring to is at

http://bazaar.launchpad.net/~stefanv/supreme/main/annotate/head%3A/supreme/misc/io.py

As you can see, it is very simplistic. It also returns a bunch of Image objects, that we don't need. But the basic idea is there: a container over which you can iterate, that loads images on demand and keeps a cache as necessary. I've never played with loading of multi-layer images, so I hope you can get something going.

Sure, I'll give it a go. I cloned your scikits.image repo on github, will add a new branch and push to my cloned repo once it works. That is the best way to do it, right?

Another git question: for scipy I followed this guide: http://projects.scipy.org/numpy/wiki/GitMirror. Now I have it here: http://github.com/rgommers/scipy. Would it not be better to clone another scipy repo already on github, like David's or Pauli's? Or does it not matter?

Cheers,
Ralf

...

Cheers Stéfan

Stéfan van der Walt

5:04 a.m.

Hi Ralph

2009/10/9 Ralf Gommers <ralf.gommers@googlemail.com>:

...

Sure, I'll give it a go. I cloned your scikits.image repo on github, will add a new branch and push to my cloned repo once it works. That is best way to do it right?

I added some sparse instructions to http://stefanv.github.com/scikits.image/contribute.html#development-process but patches are welcome to flesh out the description.

...

Another git question, for scipy I followed this guide: http://projects.scipy.org/numpy/wiki/GitMirror. Now I have it here: http://github.com/rgommers/scipy. Would it not be better to clone another scipy repo already on github, like David's or Pauli's? Or does it not matter?

The idea is that, eventually, we have an official git repo that everybody clones. As is, it seems we all have our own clones hanging around, but David and Pauli's were probably made from the official scipy.git repo.

I agree, though, that the instructions can be improved -- a lot! Hopefully we'll be switching to git and redmine soon, then these problems will go away.

Cheers
Stéfan

Ralf Gommers

11:36 a.m.

2009/10/9 Stéfan van der Walt <stefan@sun.ac.za>

...

Hey Ralf

2009/10/9 Ralf Gommers <ralf.gommers@googlemail.com>:

...
...
where all PNGs are access only as necessary, and are cached once they've been read from disk. You can also index into or iterate over an ImageCollection (yielding the image arrays). It sounds like a multi-image could be interpreted as an ImageCollection.

That sounds like a good option. Let me know if you want me to test it / work on it / send you some multi-image files.

I'd appreciate it if you could investigate a bit further. The code I was referring to is at

http://bazaar.launchpad.net/~stefanv/supreme/main/annotate/head%3A/supreme/misc/io.py

As you can see, it is very simplistic. It also returns a bunch of Image objects, that we don't need. But the basic idea is there: a container over which you can iterate, that loads images on demand and keeps a cache as necessary. I've never played with loading of multi-layer images, so I hope you can get something going.
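To make the idea concrete, here is a minimal sketch of such a container. It is not the code from supreme or scikits.image: the class name `LazyImageCollection`, the `loader` argument and the `fake_loader` helper are all hypothetical, and only integer indexing is handled.

```python
class LazyImageCollection(object):
    """Container that loads items on demand and caches them.

    `loader` is a hypothetical callable mapping a file name to image
    data; an imread-style function could be passed in its place.
    """

    def __init__(self, filenames, loader, cache=True):
        self._filenames = list(filenames)
        self._loader = loader
        self._use_cache = cache
        self._cache = {}

    def __len__(self):
        return len(self._filenames)

    def __getitem__(self, index):
        # The list lookup raises IndexError for out-of-range indices.
        name = self._filenames[index]
        if name in self._cache:
            return self._cache[name]
        data = self._loader(name)  # read from disk only when first needed
        if self._use_cache:
            self._cache[name] = data
        return data

    def __iter__(self):
        for i in range(len(self)):
            yield self[i]


# Stand-in loader that records which files were actually read.
loads = []

def fake_loader(name):
    loads.append(name)
    return name.upper()

images = LazyImageCollection(["a.png", "b.png"], fake_loader)
first = images[0]
again = images[0]  # served from the cache, loader is not called again
```

Nothing is read when the collection is constructed; each file is loaded the first time it is indexed and then kept in the cache.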

Thanks Stefan, that was a useful start. I added a MultiImg class which is quite similar to your ImgCollection. There are enough differences between a multi-image file and a collection of single image files to justify creating a separate class I think. The code is here: http://github.com/rgommers/scikits.image/blob/imgcollection/scikits/image/io...

It works with my multi-frame TIFF files (only PIL trunk, not 1.1.6), and once I figure out how to create a correct TIFF header/file (does anyone have code for this?) I can add a self-contained example and tests.

Things that would be useful to add:
- caching a configurable number of frames (now 1 or all)
- a dtype keyword
- switch to the new IO plugin system once it's ready
- add a MultiImgCollection
- what else?

Questions:
- do you want to keep the Image class in that form? It seems either a plain ndarray or ndarray + tags dict is enough.
- can I remove the EXIF stuff or move it to a subclass of Image? I don't think it belongs in the base Image class.
- should imread be moved into io.py?

I'd appreciate any feedback on the basic design and new feature suggestions.

Cheers, Ralf

...

Cheers Stéfan

Stéfan van der Walt

11:12 p.m.

Hey Ralf 2009/10/9 Ralf Gommers <ralf.gommers@googlemail.com>:

...

http://github.com/rgommers/scikits.image/blob/imgcollection/scikits/image/io...

Thanks for working on this!

...

Questions: - do you want to keep the Image class in that form? It seems either a plain ndarray or ndarray + tags dict is enough.

I'd like to remove the image class entirely.

...

- can I remove the EXIF stuff or move it to a subclass of Image? I don't think it belongs in the base Image class.

Yes, although having an EXIF reader as a separate function might be handy! That code is BSD-licensed AFAIK.

...

- should imread be moved into io.py?

Let's leave it where it is for now. It is accessible as scikits.image.io.imread, which is fine from a user perspective.

Other notes:

If you require PIL trunk, you need to check that it is available explicitly. Also, the PIL import test is already done by imread.

About naming: I'd prefer if we expand the names, i.e. MultiImage instead of MultiImg. I've learnt this the hard way, but it seems I can never remember my own shorthand :)

The example markup does not require the "::".

The description of MultiImage could be changed to reflect what it is storing, i.e. something like

    class MultiImage(object):
        """Class for loading multi-layer images."""

When using try-except statements, keep the code snippet contained as small as possible. In this case, there's no problem really, because you specifically wait for an EOFError. In general, however, it's safer to use the form:

    i = 0
    while True:
        i += 1
        try:
            img.seek(i)
        except EOFError:
            break
    return i

Not sure whether you'll ever come across images without any frames inside, but in those cases you need a return statement as well, as above.

_getallframes can be simplified using _getframe:

    frames = []
    for i in range(len(self)):
        frames.append(self._getframe(i))
    return frames

The numframes variable should not be exposed, since len() is already available.

The string representation can also include information on the number of frames, e.g.

    cat.tiff [50 frames]

Finally, ensure that read-only members are defined as properties:

    @property
    def filename(self):
        return self._filename

As always with review comments: they may be overly pedantic, so use what you find applicable and discard the rest.

Cheers Stéfan
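Put together, the review points above might look roughly like the sketch below. It is not the code from Ralf's branch. `FakeFrames` is a hypothetical stand-in for a PIL image, so the example runs without PIL or a TIFF file, and the `opener` argument is likewise an assumption made for illustration.

```python
class FakeFrames(object):
    """Stand-in for a PIL image: seek(i) raises EOFError past the last frame."""

    def __init__(self, n_frames):
        self.n_frames = n_frames
        self.position = 0

    def seek(self, i):
        if i >= self.n_frames:
            raise EOFError("no more frames")
        self.position = i


class MultiImage(object):
    """Class for loading multi-layer images."""

    def __init__(self, filename, opener):
        self._filename = filename
        self._opener = opener  # hypothetical: callable returning a seekable image
        self._numframes = self._count_frames()

    def _count_frames(self):
        # Assumes the opened image holds at least one frame (frame 0).
        img = self._opener(self._filename)
        i = 0
        while True:
            i += 1
            try:
                img.seek(i)  # only the call that can raise is guarded
            except EOFError:
                break
        return i

    @property
    def filename(self):
        # Read-only: no setter is defined.
        return self._filename

    def __len__(self):
        return self._numframes

    def __repr__(self):
        return "%s [%d frames]" % (self._filename, len(self))


img = MultiImage("cat.tiff", lambda name: FakeFrames(50))
```

With a real file, the `opener` would presumably be something like PIL's `Image.open`, but that substitution is untested here.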

Ralf Gommers

12:56 a.m.

Hi Stefan, 2009/10/10 Stéfan van der Walt <stefan@sun.ac.za>

...

Hey Ralf

2009/10/9 Ralf Gommers <ralf.gommers@googlemail.com>:

...
http://github.com/rgommers/scikits.image/blob/imgcollection/scikits/image/io...

Thanks for working on this!

...
Questions: - do you want to keep the Image class in that form? It seems either a plain ndarray or ndarray + tags dict is enough.

I'd like to remove the image class entirely.

...
- can I remove the EXIF stuff or move it to a subclass of Image? I don't think it belongs in the base Image class.

Yes, although having an EXIF reader as a separate function might be handy! That code is BSD-licensed AFAIK.

What I added is BSD-licensed as well.

...

...
- should imread be moved into io.py?

Let's leave it where it is for now. It is accessible as scikits.image.io.imread, which is fine from a user perspective.

Other notes:

If you require PIL trunk, you need to check that it is available explicitly. Also, the PIL import test is already done by imread.

OK. I'll move the import into the MultiImg class then, so ImageCollection still works if trunk is not available.

...

About naming: I'd prefer if we expand the names, i.e. MultiImage instead of MultiImg. I've learnt this the hard way, but it seems I can never remember my own shorthand :)

The reason was the Image class, which conflicted with the PIL import. That is solved now, so I'll expand all the names again.

...

The example markup does not require the "::".

I saw Pauli do that for examples that are not self-contained, i.e. can't be run with doctest. Alternatively I can use the #doctest +SKIP markup (ugly as well...).

...

The description of MultiImage could be changed to reflect what it is storing, i.e. something like

    class MultiImage(object):
        """Class for loading multi-layer images."""

When using try-except statements, keep the code snippet contained as small as possible. In this case, there's no problem really, because you specifically wait for an EOFError. In general, however, it's safer to use the form:

    i = 0
    while True:
        i += 1
        try:
            img.seek(i)
        except EOFError:
            break
    return i

Sure, I'll change that.

...

Not sure whether you'll ever come across images without any frames inside, but in those cases you need a return statement as well, as above.

_getallframes can be simplified using _getframe:

    frames = []
    for i in range(len(self)):
        frames.append(self._getframe(i))
    return frames

_getframe opens and closes the file each time, so _getallframes should be a little faster. And it's still simple code, so I think it's worth the few lines of duplication.

...

The numframes variable should not be exposed, since len() is already available.

Sure, I'll make it private.

...

The string representation can also include information on the number of frames, e.g.

cat.tiff [50 frames]

Makes sense.

...

Finally, ensure that read-only members are defined as properties:

    @property
    def filename(self):
        return self._filename

Sure.

...

As always with review comments: they may be overly pedantic, so use what you find applicable and discard the rest.

Don't worry, I find the above very useful. Thanks for the feedback.

Cheers, Ralf

...

Cheers Stéfan

Stéfan van der Walt

1:50 a.m.

2009/10/10 Ralf Gommers <ralf.gommers@googlemail.com>:

...

...
The example markup does not require the "::".

I saw Pauli do that for examples that are not self-contained, i.e. can't be run with doctest. Alternatively I can use the #doctest +SKIP markup (ugly as well...).

OK, that's fine then! Let me know when you're done, then I'll have a look at the patch. Cheers Stéfan

Ralf Gommers

5:41 p.m.

2009/10/11 Stéfan van der Walt <stefan@sun.ac.za>

...

2009/10/10 Ralf Gommers <ralf.gommers@googlemail.com>:

...
...
The example markup does not require the "::".

I saw Pauli do that for examples that are not self-contained, i.e. can't be run with doctest. Alternatively I can use the #doctest +SKIP markup (ugly as well...).

OK, that's fine then!

Let me know when you're done, then I'll have a look at the patch.

It's done. Nitpick away! Also, what do you think about adding a dtype keyword to imread? I find it useful to be able to get images as float for example so you don't have to worry about division problems. Cheers, Ralf

...

Cheers Stéfan

Stéfan van der Walt

11:25 p.m.

2009/10/12 Ralf Gommers <ralf.gommers@googlemail.com>:

...

Also, what do you think about adding a dtype keyword to imread? I find it useful to be able to get images as float for example so you don't have to worry about division problems.

That sounds like a useful addition. It should probably default to int8 or uint8 -- whatever is currently returned. Stéfan
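A minimal sketch of how such a keyword could behave, assuming a default of `None` that leaves the array exactly as the reader returned it. The helper name `as_dtype` is hypothetical and this is not the actual imread signature. The example uses uint8 wraparound rather than division to show why a float result can be convenient.

```python
import numpy as np


def as_dtype(image, dtype=None):
    """Return `image` untouched when dtype is None, else a converted copy."""
    if dtype is None:
        return image
    return image.astype(dtype)


# Small array standing in for image data as it might come back from a reader.
raw = np.array([[200, 100], [50, 25]], dtype=np.uint8)

same = as_dtype(raw)                   # default: nothing changes
as_float = as_dtype(raw, dtype=float)  # explicit request for floats

wrapped = raw + raw           # uint8 arithmetic wraps around modulo 256
summed = as_float + as_float  # float arithmetic keeps the full value
```

Here `wrapped[0, 0]` is 144 rather than 400, which is the kind of surprise a float dtype avoids.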

Stéfan van der Walt

12:09 a.m.

2009/10/12 Ralf Gommers <ralf.gommers@googlemail.com>:

...

...
Let me know when you're done, then I'll have a look at the patch.

It's done. Nitpick away!

Thanks, Ralf! I've merged your changes: http://github.com/stefanv/scikits.image/commits/ Cheers Stéfan

Ralf Gommers

4:59 a.m.

2009/10/12 Stéfan van der Walt <stefan@sun.ac.za>

...

2009/10/12 Ralf Gommers <ralf.gommers@googlemail.com>:

...
...
Let me know when you're done, then I'll have a look at the patch.

It's done. Nitpick away!

Thanks, Ralf! I've merged your changes:

http://github.com/stefanv/scikits.image/commits/

Thanks. I've fixed a test that broke due to the io -> collection rename, and added dtype keywords to imread and MultiImage. Defaults to None, which keeps the current behavior. Can you pull those changes as well? Cheers, Ralf

...

Cheers Stéfan

Stéfan van der Walt

6:32 a.m.

2009/10/12 Ralf Gommers <ralf.gommers@googlemail.com>:

...

Thanks. I've fixed a test that broke due to the io -> collection rename, and added dtype keywords to imread and MultiImage. Defaults to None, which keeps the current behavior.

Can you pull those changes as well?

Thanks, done (will push soon). In the future, it may be easier not to merge with the master branch. I'm still figuring out the best way to do this, but I think that will be easier since I can then just merge your branch, instead of cherry picking out the changes. Thanks! Stéfan

Ralf Gommers

6:43 a.m.

2009/10/12 Stéfan van der Walt <stefan@sun.ac.za>

...

2009/10/12 Ralf Gommers <ralf.gommers@googlemail.com>:

...
Thanks. I've fixed a test that broke due to the io -> collection rename, and added dtype keywords to imread and MultiImage. Defaults to None, which keeps the current behavior.

Can you pull those changes as well?

Thanks, done (will push soon).

In the future, it may be easier not to merge with the master branch. I'm still figuring out the best way to do this, but I think that will be easier since I can then just merge your branch, instead of cherry picking out the changes.

Hmm, not sure how else I would have fixed that test, since it only broke after you renamed io.py in the master branch. Why did you have to cherry pick, instead of just merging back my imgcollection branch into your master? Disclaimer: I am also quite new to this way of doing things. Cheers, Ralf

...

Thanks! Stéfan

Stéfan van der Walt

6:48 a.m.

2009/10/12 Ralf Gommers <ralf.gommers@googlemail.com>:

...

...
In the future, it may be easier not to merge with the master branch. I'm still figuring out the best way to do this, but I think that will be easier since I can then just merge your branch, instead of cherry picking out the changes.

Hmm, not sure how else I would have fixed that test, since it only broke after you renamed io.py in the master branch. Why did you have to cherry pick, instead of just merging back my imgcollection branch into your master?

Disclaimer: I am also quite new to this way of doing things.

You could simply have created a new branch, and made your changes there. One branch per change (or related set of changes) sounds about right.

If I simply merged, we would have had messages in the commit log such as:

Stefan merged Ralf's branch.
Ralf merged Stefan's main branch.
Ralf changes this and that.

Now, we just have:

Stefan merged Ralf's branch.
Ralf changed this and that.

Have a look at these two articles:

http://article.gmane.org/gmane.comp.video.dri.devel/34744
http://lwn.net/Articles/328436/

Cheers Stéfan

Ralf Gommers

7:14 a.m.

2009/10/12 Stéfan van der Walt <stefan@sun.ac.za>

...

2009/10/12 Ralf Gommers <ralf.gommers@googlemail.com>:

...
...
In the future, it may be easier not to merge with the master branch. I'm still figuring out the best way to do this, but I think that will be easier since I can then just merge your branch, instead of cherry picking out the changes.

Hmm, not sure how else I would have fixed that test, since it only broke after you renamed io.py in the master branch. Why did you have to cherry pick, instead of just merging back my imgcollection branch into your master?

Disclaimer: I am also quite new to this way of doing things.

You could simply have created a new branch, and made your changes there. One branch per change (or related set of changes) sounds about right.

If I simply merged, we would have had messages in the commit log such as:

Stefan merged Ralf's branch. Ralf merged Stefan's main branch. Ralf changes this and that.

Now, we just have:

Stefan merged Ralf's branch. Ralf changed this and that.

Have a look at these two articles:

http://article.gmane.org/gmane.comp.video.dri.devel/34744 http://lwn.net/Articles/328436/

Makes sense, thanks for the lesson :) Cheers, Ralf

...

Cheers Stéfan

31 comments

Participants (6):

Chris Colbert
Damian Eads
Gary Ruben
Ralf Gommers
Stéfan van der Walt
Zachary Pincus