[scikit-image] Memory consumption of measure.label (compared to matlab)

Martin Fleck martin.fleck at uni-konstanz.de
Thu Jul 13 07:21:58 EDT 2017


Hi Juan, hi Greg,

Quoting Greg:
> I think the main reason for the increased memory usage is that the
> output type of the label function is int64 while your input is most
> likely uint8.

Indeed, this could be the complete problem already! For the analysis I
use a binary image - so only one bit per pixel.
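To put rough numbers on the dtype point, here is a minimal sketch using only NumPy. The 1024 x 1024 shape is an assumption chosen for illustration, not my real data. Note that a NumPy boolean array stores one byte per pixel, so a true one-bit-per-pixel layout only appears after packing, for example with np.packbits.

```python
import numpy as np

# Hypothetical image shape, chosen only for illustration.
shape = (1024, 1024)

# A binary image held as a NumPy boolean array.
binary = np.zeros(shape, dtype=bool)

# A label image of the same shape with a 64-bit integer dtype.
labels64 = np.zeros(shape, dtype=np.int64)

# Packing eight pixels into each byte gives a one-bit-per-pixel layout.
packed = np.packbits(binary)

bytes_per_pixel = {
    "bool": binary.nbytes / binary.size,
    "int64": labels64.nbytes / labels64.size,
    "packed bits": packed.nbytes / binary.size,
}
print(bytes_per_pixel)
```

So an int64 label image costs 8 times what a byte-per-pixel input does, and 64 times what a bit-packed one does.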

Greg: Regarding your PR and my analysis: with a 1.2GB file, my analysis
already stops with memory problems in
skimage.morphology.remove_small_objects(), even though the major memory
blowup happens in skimage.morphology.label().
So there are problems at multiple steps that hopefully can be improved.
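As a possible stopgap for the later steps, a label image could be cast down once it exists. This is only a sketch under assumptions: shrink_labels is a hypothetical helper, not a scikit-image function, and the array below is synthetic. It does not reduce the peak memory used while labelling, only the size of the array kept afterwards.

```python
import numpy as np

def shrink_labels(labels):
    """Cast a label image to the smallest unsigned dtype that holds its
    largest label. Hypothetical helper, not part of scikit-image."""
    max_label = int(labels.max())
    # np.min_scalar_type picks the smallest dtype able to hold the value.
    small_dtype = np.min_scalar_type(max_label)
    return labels.astype(small_dtype)

# Synthetic stand-in for an int64 label image with labels up to 300.
labels = np.zeros((512, 512), dtype=np.int64)
labels[:10, :10] = 1
labels[100:120, 100:120] = 300

small = shrink_labels(labels)
print(small.dtype, labels.nbytes, small.nbytes)
```

With at most 300 objects, uint16 is enough, so the stored array is a quarter of the int64 size.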

Quoting Juan:
> For example, what are the data types of the outputs in Matlab?

The first steps of my analysis convert the 8-bit input image to a
meaningful binary image. The whole analysis is done on binary images, so
all inputs and outputs in Matlab are of Matlab class "logical".

I will provide you with a minimal example script and data for the
skimage case.
I will also try to produce equivalent memory information in Matlab.

I'll post both here as soon as I'm done.

Thanks so far!

Martin

On 07/13/2017 03:05 AM, Juan Nunez-Iglesias wrote:
> Hi Martin,
>
> No one on this list wants to push you to more Matlab usage, believe me. ;)
>
> Do you think you could provide a script and sample data that we can
> use for troubleshooting? As Greg pointed out, the optimization
> approach *might* have to be data-type dependent. We could, for
> example, provide a dtype= keyword argument that would force the output
> to be of a particular, more memory-efficient type, if you know in
> advance how many objects you expect.
>
> If you can provide something similar to a memory profile, and
> diagnostic information, for your equivalent Matlab script, that would
> be really useful, so we know what we are aiming for. For example, what
> are the data types of the outputs in Matlab?
>
> Juan.
>
> On 13 Jul 2017, 9:59 AM +1000, Gregory Lee <grlee77 at gmail.com>, wrote:
>> Hi Martin,
>>
>>     My problem: my analysis uses much more memory than I expect.
>>     I attached output from the memory_profiler package, with which I
>>     tried to keep track of the memory consumption of my analysis.
>>     You can see that for an ~8MiB file that I used for testing,
>>     skimage.measure.label needs to use 56MiB of memory, which
>>     surprised me.
>>
>>
>> I haven't looked at it in much detail, but I did find what appear to
>> be some unnecessary copies in the top-level Cython routine called by
>> skimage.morphology.label.  I opened a PR to try and avoid this here:
>> https://github.com/scikit-image/scikit-image/pull/2701
>>
>> However, I think that PR is going to give a minor performance
>> improvement, but not help with memory use much if at all.  I think
>> the main reason for the increased memory usage is that the output
>> type of the label function is int64 while your input is most likely
>> uint8.  This means that the labels array requires 8 times the memory
>> usage of the uint8 input.  I don't think there is much way around
>> that without making a version of the routines that allows specifying
>> a smaller integer dtype.
>>
>> - Greg
>> _______________________________________________
>> scikit-image mailing list
>> scikit-image at python.org
>> https://mail.python.org/mailman/listinfo/scikit-image
