<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jul 11, 2016 at 12:58 PM, Charles R Harris <span dir="ltr"><<a href="mailto:charlesr.harris@gmail.com" target="_blank">charlesr.harris@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span class="">On Mon, Jul 11, 2016 at 11:39 AM, Chris Barker <span dir="ltr"><<a href="mailto:chris.barker@noaa.gov" target="_blank">chris.barker@noaa.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span>On Sun, Jul 10, 2016 at 8:12 PM, Nathan Goldbaum <span dir="ltr"><<a href="mailto:nathan12343@gmail.com" target="_blank">nathan12343@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div><br></div><div>Maybe this can be an informal BOF session?</div></blockquote><div><br></div></span><div>or maybe a formal BoF? after all, how formal do they get?</div><div><br></div><div>Anyway, it was my understanding that we really needed to do some significant refactoring of how numpy deals with dtypes in order to do this kind of thing cleanly -- so where has that gone since last year? </div><div><br></div><div>Maybe this conversation should be about how to build a more flexible dtype system generally, rather than specifically about unit support. (though unit support is a great use-case to focus on)</div></div></div></div></blockquote><div><br></div></span><div>Note that Mark Wiebe will also be giving a talk Friday, so he may be around. 
As the last person to add a type to Numpy and the designer of DyND he might have some useful input. DyND development is pretty active and I'm always curious how we can somehow move in that direction.<br><br></div></div></div></div></blockquote><div><br></div><div>There has been a lot of work over the past 6 months on making DyND implement the "pluribus" concept that I have talked about briefly in the past. DyND now has a separate C++ ndt data-type library. The Python interface to that type library is still unified in the dynd module, but it is separable, and work is in progress to make a separate Python wrapper for this type library. The dynd type library is datashape, described at <a href="http://datashape.pydata.org">http://datashape.pydata.org</a></div><div><br></div><div>This type system is extensible and could be the foundation of a re-factored NumPy. My view (and the direction in which I am encouraging work) is that array computing in Python should be refactored into a "type-subsystem" (I think ndt is the right model there), a generic ufunc-system (I think dynd has a very promising approach there as well), and then a container (the memoryview already in Python might be enough). These modules could be separately installed and maintained, and eventually moved into Python itself. </div><div><br></div><div>Then, a potential future NumPy project could be ported to be a layer of calculations and connections to other C-libraries on top of this system. Many parts of the current code could be re-used in that effort --- or the new system could be part of a re-factoring of NumPy to make the innards of NumPy more accessible to a JIT compiler. </div><div><br></div><div>We are already far enough along that this could be pursued by a motivated person. It would take 18 months to complete the system, but first-light would be less than 6 months for a dedicated, motivated, and talented resource. 
DyND, as well as Cython and/or Numba, is far enough along to make this pretty straightforward. For this re-factored array-computing project to take the NumPy name, this community would have to decide that that is the right thing to do. But other projects like Pandas and/or xarray and/or numpy-py and/or NumPy on Jython could use this sub-system also. <br></div><div><br></div><div>It has taken me a long time to actually get to the point where I would recommend a specific way forward. I have thought about this for many years and don't make these recommendations lightly. The pluribus concept is my recommendation about what would be best now and in the future --- and I will be pursuing this concept and working to get to a point where this community will accept it, if possible, because it would be ideal if this new array library were still called NumPy. </div><div><br></div><div>My working view is that someone will have to build the new prototype NumPy for the community to evaluate whether it's the right approach and get consensus that it is the right way forward. There is enough there now with DyND, data-shape, and Numba/Cython to do this fairly quickly. It is not strictly necessary to use DyND or Numba or even data-shape to accomplish this general plan --- but these are already available and a great place to start, as they have been built explicitly with the intention of improving array-computing in Python. </div><div><br></div><div>This potential NumPy could be backwards compatible from an API perspective (including a C-API) --- though recompilation would be necessary, and there would be some semantic differences in corner-cases that could either be fixed where necessary or simply made part of the new version. 
</div><div><br></div><div>I will be at the Continuum Happy hour on Thursday at our offices and welcome anyone to come discuss things with me there --- I am also willing to meet with anyone on Thursday and Friday if I can --- but I don't have a ticket to SciPy itself. Please CC me directly if you have questions. I try to follow the numpy-discussion mailing list but I am not always successful at keeping up. <br></div><div><br></div><div>To be clear, as some have misinterpreted me in the past: while I originally wrote NumPy (borrowing heavily from Numeric, drawing inspiration from Numarray, and receiving a lot of help for specific modules from many of you), the community has continued to develop NumPy and now has a proper governance model. I am now simply an interested NumPy user and previous NumPy developer who finally has some concrete ideas to share based on work that I have been funding, leading, and encouraging for the past several years. </div><div><br></div><div>I am still very interested in helping NumPy progress, but we are also going to be taking these ideas to create a general concept of the "buffer protocol in Python" to enable cross-language code-sharing and more code re-use for data analytics among language communities. This is the concept of "data-fabric", which is pre-alpha vapor-ware at this point but with some ideas expressed at <a href="http://datashape.pydata.org">http://datashape.pydata.org</a> and here: <a href="https://github.com/blaze/datafabric">https://github.com/blaze/datafabric</a> and is something DyND is enabling. </div><div><br></div><div>NumPy itself has a clear governance model, and whether NumPy (the project) adopts any of the new array-computing concepts I am proposing will depend on this community's decisions as well as work done by motivated developers willing to work on prototypes. I will be willing to help get funding for someone motivated to work on this. 
</div><div><br></div><div>Best,</div><div><br></div><div>-Travis</div><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div></div><div>Chuck <br></div></div></div></div>
<br></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div dir="ltr"><div><div style="font-size:12.8px;color:rgb(136,136,136)"><b><br>Travis Oliphant, PhD</b></div><div style="font-size:12.8px;color:rgb(136,136,136)"><i>Co-founder and CEO</i></div><div style="font-size:12.8px;color:rgb(136,136,136)"><i><br></i></div><div style="font-size:12.8px;color:rgb(136,136,136)"><img src="https://docs.google.com/a/continuum.io/uc?id=0B8_D9l6ZUhNIaF9HbGZSV09TNHc&export=download" width="200" height="37"></div></div><div style="font-size:12.8px;color:rgb(136,136,136)"><br></div><div style="font-size:12.8px;color:rgb(136,136,136)">@teoliphant</div><div style="font-size:12.8px;color:rgb(136,136,136)">512-222-5440</div><div style="font-size:12.8px;color:rgb(136,136,136)"><a href="http://www.continuum.io" target="_blank">http://www.continuum.io</a></div></div></div></div></div></div></div></div>
</div></div>