<div class="gmail_quote">On Fri, Jun 24, 2011 at 7:02 PM, Matthew Brett <span dir="ltr">&lt;<a href="mailto:matthew.brett@gmail.com">matthew.brett@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
Hi,<br>
<br>
On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney &lt;<a href="mailto:wesmckinn@gmail.com">wesmckinn@gmail.com</a>&gt; wrote:<br>
...<br>
<div class="im">> Perhaps we should make a wiki page someplace summarizing pros and cons<br>
> of the various implementation approaches?<br>
<br>
</div>But - we should do this if it really is an open question which one we<br>
go for. If not, then we're just slowing Mark down in getting to the<br>
implementation.<br>
<br>
Assuming the question is still open, here's a starter for the pros and cons:<br>
<br>
array.mask<br>
1) It's easier / neater to implement<br></blockquote><div><br></div><div>Yes</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
2) It can generalize across dtypes<br></blockquote><div><br></div><div>Yes</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
3) You can still get the masked data underneath the mask (allowing you<br>
to unmask etc)<br></blockquote><div><br></div><div>By setting up views appropriately, yes. If you don't have another view to the underlying data, you can't get at it. </div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
nafloat64:<br>
1) No memory overhead<br></blockquote><div><br></div><div>Yes</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
2) Battle-tested implementation already done in R<br></blockquote><div><br></div><div>We can't really use that, though: R is GPL and NumPy is BSD. The low-level implementation details are likely different enough that a re-implementation would be needed anyway.</div>
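<div>As a rough illustration of why an NA-in-the-dtype approach needs no extra storage, here is a minimal sketch in plain Python. It assumes NA is encoded as a quiet NaN carrying a marker payload in its low 32 bits. The payload value and the names <code>NA_BITS</code>, <code>NA</code> and <code>is_na</code> are hypothetical choices for this sketch; it is not taken from R's code or from any planned NumPy API.</div>

```python
import math
import struct

# Hypothetical encoding: a quiet NaN whose low 32 bits carry a marker payload.
# The payload value is an arbitrary choice for this sketch.
NA_PAYLOAD = 1954
NA_BITS = 0x7FF8000000000000 | NA_PAYLOAD


def bits_to_float(bits):
    """Reinterpret a 64-bit unsigned integer as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def float_to_bits(x):
    """Reinterpret an IEEE 754 double as a 64-bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


NA = bits_to_float(NA_BITS)


def is_na(x):
    """True only for NaNs that carry the marker payload, not for ordinary NaNs."""
    return math.isnan(x) and (float_to_bits(x) & 0xFFFFFFFF) == NA_PAYLOAD
```

<div>The NA value still occupies exactly eight bytes, the same as any other float64, which is where the "no memory overhead" point comes from. Whether the payload survives arithmetic is hardware-dependent and is not checked here.</div>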
<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
I guess we'd have to test directly whether the non-contiguous memory<br>
of the mask and data would cause enough cache-miss problems to<br>
outweigh the potential cycle-savings from single byte comparisons in<br>
array.mask.<br></blockquote><div><br></div><div>The different memory buffers are each contiguous, so the access patterns still have a lot of coherency. I intend to give the mask memory layouts matching those of the arrays.</div>
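<div>To make the two-buffer layout concrete, here is a minimal sketch in plain Python, with <code>array.array</code> and <code>bytearray</code> standing in for the data and mask buffers. The convention that a zero mask byte means "valid" and the name <code>masked_sum</code> are assumptions for this sketch, not a settled design.</div>

```python
from array import array


def masked_sum(data, mask):
    """Sum the elements of `data` whose mask byte is zero.

    Assumed convention for this sketch: 0 = valid, nonzero = masked.
    Both buffers are walked in lockstep, front to back.
    """
    total = 0.0
    for value, masked in zip(data, mask):
        if not masked:
            total += value
    return total


data = array("d", [1.0, 2.0, 3.0, 4.0])  # one contiguous buffer of float64 values
mask = bytearray([0, 1, 0, 0])           # a second contiguous buffer, one byte per element

total = masked_sum(data, mask)

# Storage cost of the mask relative to the data it covers.
overhead = len(mask) / (len(data) * data.itemsize)
```

<div>Each buffer is read sequentially, so the access pattern stays regular even though there are two allocations. For float64 data the one-byte-per-element mask adds one eighth to the storage; for smaller dtypes the relative cost is higher.</div>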
<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
I guess that one and only one of these will get written. I guess that<br>
one of these choices may be a lot more satisfying to the current and<br>
future masked array itch than the other.<br></blockquote><div><br></div><div>I'm only going to implement one solution, yes.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
I'm personally worried that the memory overhead of array.masks will<br>
make many of us tend to avoid them. I work with images that can<br>
easily get large enough that I would not want a byte array with one<br>
entry per array item added to my storage.<br></blockquote><div><br></div><div>May I ask what kind of dtypes and sizes you're working with? </div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
The reason I'm asking for more details about the implementation is<br>
because that is most of the argument for array.mask at the moment (1<br>
and 2 above).<br></blockquote><div><br></div><div>I'm first trying to nail down more of the higher-level requirements before digging really deep into the implementation details. They greatly affect how those details have to turn out.</div>
<div><br></div><div>-Mark</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<br>
See you,<br>
<font color="#888888"><br>
Matthew<br>
</font><div><div></div><div class="h5">
</div></div></blockquote></div><br>