cp wrote:
The image I tested initially is a 2000x2000 RGB TIFF, about 11 MB in size.
I continued testing with the initial PIL approach and three alternative numpy scripts:
# Script 1 - indexing
for i in range(10):
    imarr[:,:,0].mean()
    imarr[:,:,1].mean()
    imarr[:,:,2].mean()

# Script 2 - slicing
for i in range(10):
    imarr[:,:,0:1].mean()
    imarr[:,:,1:2].mean()
    imarr[:,:,2:3].mean()

# Script 3 - reshape
for i in range(10):
    imarr.reshape(-1,3).mean(axis=0)
# Script 4 - PIL
for i in range(10):
    stats = ImageStat.Stat(img)
    stats.mean
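For reference, a minimal setup sketch assumed by the scripts above; the thread does not show this part, so the file name and variable names are placeholders:

import numpy as np
from PIL import Image, ImageStat

img = Image.open("test.tif")     # placeholder: the 2000x2000 RGB TIFF
imarr = np.asarray(img)          # shape (2000, 2000, 3), dtype uint8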
After profiling the four scripts separately I got the following timings:

script 1:  5.432 sec
script 2: 10.234 sec
script 3:  4.980 sec
script 4:  0.741 sec
When I profiled scripts 1-3 without calculating the mean, I got similar results of about 0.45 sec for 1000 cycles, meaning that even if a copy is involved, it accounts for only a small fraction of the total time. Getting back to my initial statement: I cannot explain why PIL is very fast for calculations on whole images, but very slow for calculations on small sub-images.
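The sub-image code is not shown in the thread; purely for illustration, the whole-image vs. sub-image comparison presumably looks something like this (tile size and file name are assumptions):

from PIL import Image, ImageStat

img = Image.open("test.tif")              # placeholder filename

# Whole image: one Stat call over all pixels
whole_mean = ImageStat.Stat(img).mean

# Many small sub-images: one Stat call per 50x50 tile
w, h = img.size
tile_means = []
for y in range(0, h, 50):
    for x in range(0, w, 50):
        tile = img.crop((x, y, x + 50, y + 50))
        tile_means.append(ImageStat.Stat(tile).mean)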
I don't know anything about PIL and its implementation, but I would not be surprised if the cost is mostly accessing items which are not contiguous in memory, plus bounds checking (to check where you are in the sub-image). Conditionals inside loops often kill performance, and the actual computation (one addition per item for a naive average implementation) is negligible in this case. cheers, David
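As a rough illustration of the contiguity point on the numpy side (not of PIL internals), one can compare reducing a strided channel view against a contiguous copy of the same data:

import numpy as np
import timeit

# Array matching the 2000x2000 RGB image discussed above
imarr = np.zeros((2000, 2000, 3), dtype=np.uint8)

strided = imarr[:, :, 0]                # non-contiguous view (element stride of 3 bytes)
contig = np.ascontiguousarray(strided)  # contiguous copy of the same channel

print(timeit.timeit(strided.mean, number=10))   # mean over strided access
print(timeit.timeit(contig.mean, number=10))    # mean over contiguous access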