[scikit-image] Memory consumption of measure.label (compared to matlab)
Martin Fleck
martin.fleck at uni-konstanz.de
Thu Jul 13 10:03:31 EDT 2017
Hi again,
attached is a file "matlab_memory_info" and again the same
"skiamge_memory_profiler.out" that I showed before.
In the matlab_memory_info file, I added the equivalent skimage call for
every Matlab call.
I don't think it will be needed - the attached files should be enough -
but if someone wants to see the full memory report of matlab, you can
download it here:
https://drive.google.com/open?id=0BzmlODsuIIz0dVRsMk9sT3RuU0E
(it's an html file)
Cheers,
Martin
On 07/13/2017 02:09 PM, Martin Fleck wrote:
>
> Hi again,
>
> here you can download a minimal example:
>
> https://drive.google.com/open?id=0BzmlODsuIIz0elpIcU1kdmpNTlE
> (download button is the arrow on the top right)
>
> In order to run it and get the memory_profiler output you have to
> install memory_profiler
> e.g. with
>
> pip3 install memory_profiler
>
> and run the file with
>
> python3 -m memory_profiler minimal_test.py
>
> If you just want to run the example without memory profiling and
> installing memory_profiler, you have to comment out or remove line 8
> "@profile"
>
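One way to avoid editing the file by hand is to make the decorator optional. What follows is only a sketch, not the content of minimal_test.py: the body of run() is a hypothetical placeholder, and it assumes that memory_profiler makes the name `profile` available when the script is started with `python3 -m memory_profiler`.

```python
# Sketch: keep "@profile" in the script, but fall back to a no-op decorator
# when the script is run directly with "python3 minimal_test.py".
try:
    profile  # expected to be defined when run via "python3 -m memory_profiler"
except NameError:
    def profile(func):
        """No-op stand-in used when memory_profiler is not driving the run."""
        return func


@profile
def run():
    # Hypothetical placeholder body; the real script loads and labels an image.
    values = [0] * 100_000
    return len(values)


if __name__ == "__main__":
    print(run())
```

With this guard the same file should run in both modes, so line 8 would not need to be commented out.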
> Cheers,
> Martin
>
>
>
> On 07/13/2017 01:21 PM, Martin Fleck wrote:
>>
>> Hi Juan, hi Greg,
>>
>> quoting Greg:
>> > I think the main reason for the increased memory usage is that the
>> output type of the label function is int64 while your input is most
>> likely uint8.
>>
>> Indeed, this could be the complete problem already! For the analysis
>> I use a binary image - so only one bit per pixel.
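A caveat on "one bit per pixel" on the Python side: NumPy stores every element of a boolean array in a full byte, so a binary mask needs about as much memory as a uint8 image of the same shape. The sketch below illustrates this with a hypothetical 1024x1024 mask (the shape and contents are made up) and shows that np.packbits can get down to one bit per pixel for storage, although most image routines expect the unpacked array.

```python
import numpy as np

# Hypothetical stand-in for a thresholded image; shape chosen for illustration.
mask = np.zeros((1024, 1024), dtype=bool)
mask[100:200, 100:200] = True

# NumPy uses one byte per boolean element, not one bit.
bytes_per_pixel = mask.itemsize

# Pack 8 pixels into each byte (flattens the array); useful for storage only.
packed = np.packbits(mask, axis=None)

# Unpack again and restore the original shape and dtype.
restored = np.unpackbits(packed)[:mask.size].reshape(mask.shape).astype(bool)

print(bytes_per_pixel, mask.nbytes, packed.nbytes)
```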
>>
>> Greg: Regarding your PR and my analysis: My analysis using a 1.2GB
>> file stops due to memory problems already in
>> skimage.morphology.remove_small_objects() even if the major memory
>> blowup happens with skimage.morphology.label().
>> So there are problems at multiple steps that hopefully can be improved.
>>
>> Quoting Juan:
>> > For example, what are the data types of the outputs in Matlab?
>>
>> the first steps of my analysis are to convert the 8 bit input image
>> to a meaningful binary image. The whole analysis is done on binary
>> images. So all inputs and outputs in Matlab are of Matlab Class
>> "logical".
>>
>> I will provide you with a minimal example script and data for the
>> skimage case.
>> I will try to create equivalent memory information in Matlab.
>>
>> I'll post both here as soon as I'm done with that.
>>
>> Thanks so far!
>>
>> Martin
>>
>> On 07/13/2017 03:05 AM, Juan Nunez-Iglesias wrote:
>>> Hi Martin,
>>>
>>> No one on this list wants to push you to more Matlab usage, believe
>>> me. ;)
>>>
>>> Do you think you could provide a script and sample data that we can
>>> use for troubleshooting? As Greg pointed out, the optimization
>>> approach *might* have to be data-type dependent. We could, for
>>> example, provide a dtype= keyword argument that would force the
>>> output to be of a particular, more memory-efficient type, if you
>>> know in advance how many objects you expect.
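Until such a keyword exists, the idea can be approximated after the fact by downcasting the label array. This is a sketch under assumptions: smallest_label_dtype is a hypothetical helper (not part of skimage), and the int64 array below is a made-up stand-in for the output of label(). Note that the int64 array still has to exist while it is being converted, so this reduces the memory held afterwards, not the peak.

```python
import numpy as np


def smallest_label_dtype(max_label):
    """Hypothetical helper: smallest unsigned dtype that can hold max_label."""
    return np.min_scalar_type(int(max_label))


# Made-up stand-in for the int64 output of a labelling routine.
labels = np.zeros((512, 512), dtype=np.int64)
labels[10:20, 10:20] = 1
labels[30:40, 30:40] = 2

# Downcast to the smallest dtype that still holds every label value.
compact = labels.astype(smallest_label_dtype(labels.max()))

print(labels.dtype, labels.nbytes, compact.dtype, compact.nbytes)
```

With only two objects the labels fit in uint8, one eighth of the int64 size; an image with more than 255 objects would need uint16 or larger.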
>>>
>>> If you can provide something similar to a memory profile, and
>>> diagnostic information, for your equivalent Matlab script, that
>>> would be really useful, so we know what we are aiming for. For
>>> example, what are the data types of the outputs in Matlab?
>>>
>>> Juan.
>>>
>>> On 13 Jul 2017, 9:59 AM +1000, Gregory Lee <grlee77 at gmail.com>, wrote:
>>>> Hi Martin,
>>>>
>>>> > My problem: my analysis uses much more memory than I expect.
>>>> > I attached output from the memory_profiler package, with which I tried
>>>> > to keep track of the memory consumption of my analysis.
>>>> > You can see that for an ~8MiB file that I used for testing,
>>>> > skimage.measure.label needs to use 56MiB of memory, which surprised me.
>>>>
>>>>
>>>> I haven't looked at it in much detail, but I did find what appear
>>>> to be some unnecessary copies in the top-level Cython routine
>>>> called by skimage.morphology.label. I opened a PR to try and avoid
>>>> this here:
>>>> https://github.com/scikit-image/scikit-image/pull/2701
>>>>
>>>> However, I think that PR is going to give a minor performance
>>>> improvement, but not help with memory use much if at all. I think
>>>> the main reason for the increased memory usage is that the output
>>>> type of the label function is int64 while your input is most likely
>>>> uint8. This means that the labels array requires 8 times the
>>>> memory usage of the uint8 input. I don't think there is much way
>>>> around that without making a version of the routines that allows
>>>> specifying a smaller integer dtype.
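The factor of 8 can be checked against the memory_profiler output attached to this message. The sketch below is back-of-the-envelope arithmetic only: it assumes the 6.828 MiB increment for bwSource corresponds to one byte per pixel, and derives a hypothetical pixel count from it.

```python
import numpy as np

MIB = 1024 ** 2

# Assumption: the 6.828 MiB increment for the boolean image is one byte per pixel.
n_pixels = int(6.828 * MIB)

# int64 uses 8 bytes per element, uint8 (and NumPy bool) uses 1.
size_ratio = np.dtype(np.int64).itemsize // np.dtype(np.uint8).itemsize

# Estimated size of an int64 label array with the same number of pixels.
label_mib = n_pixels * np.dtype(np.int64).itemsize / MIB

print(size_ratio, round(label_mib, 1))
```

The estimate of roughly 54.6 MiB is close to the 55.836 MiB increment reported for the label() line, which fits the explanation that the int64 output accounts for most of the increase.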
>>>>
>>>> - Greg
>>>> _______________________________________________
>>>> scikit-image mailing list
>>>> scikit-image at python.org
>>>> https://mail.python.org/mailman/listinfo/scikit-image
-------------- next part --------------
Line #    Mem usage    Increment   Line Contents
================================================
   154  102.523 MiB    0.000 MiB   @profile
   155                             def run():
   156  110.500 MiB    7.977 MiB       image = data.imread(image_filename)
   157  110.500 MiB    0.000 MiB       image = color.rgb2gray(image)
   158
   159                                 # make binary image with threshold using Li's method
   160  117.328 MiB    6.828 MiB       bwSource = image<filters.threshold_minimum(image)
   161
   162                                 # Filter bwSource image -> remove small specks
   163  131.715 MiB   14.387 MiB       bwFiltered = morphology.remove_small_objects(bwSource, min_size=minSingleDislocationArea, connectivity=2)
   164
   165                                 # analyze regions:
   166  187.551 MiB   55.836 MiB       label_img = label(bwFiltered)
   167  187.809 MiB    0.258 MiB       regions = regionprops(label_img)
-------------- next part --------------
FunctionName  EquivalentCallInSkimage                             Calls  TotalTime  SelfTime*  AllocatedMemory  FreedMemory  SelfMemory    PeakMemory
imread        skimage.data.imread()                               1      0.178 s    0.003 s    8770.72 Kb       612.52 Kb    42.67 Kb      7598.20 Kb
imbinarize    IMAGE<THRESHOLD_VALUE                               1      0.035 s    0.001 s    6705.84 Kb       58.42 Kb     14.33 Kb      6472.17 Kb
bwareaopen    skimage.morphology.remove_small_objects()           1      0.720 s    0.032 s    17420.83 Kb      7763.23 Kb   5011.38 Kb    7147.20 Kb
regionprops   skimage.measure.regionprops(skimage.measure.label)  1      19.697 s   0.018 s    93376.56 Kb      96408.75 Kb  -11470.08 Kb  1602.05 Kb