[Numpy-discussion] Fortran order in recarray.

Alex Rogozhnikov alex.rogozhnikov at yandex.ru
Wed Feb 22 10:23:40 EST 2017


Hi Francesc, 
thanks a lot for your reply and for your impressive work on bcolz!

Bcolz seems to emphasize compression, which is not of much interest to me, but the ctable and the chunked operations look very appropriate for my use case. (Of course, I'll need to test it a lot more before I can say this for sure; this is just my current impression.)

My strongest concern with bcolz so far is that it seems to be far from trivial to install on Windows systems, while pip provides numpy binaries for most (or all?) operating systems.
I haven't built pip binary wheels myself, but is it hard or impossible to produce pip-installable binaries?

> You can change shapes of numpy arrays, but that usually involves copies of the whole container.
Sure, but this is OK for me, as I plan to organize column editing in 'batches', so copying should rarely be required.
It would be nice to see an example to understand how deep I need to go inside numpy.
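
For illustration, here is a minimal sketch of the 'batch' approach with a plain numpy structured array. It assumes that all new columns of a batch are known up front, so the container is copied once per batch rather than once per column. The helper name add_columns and the field names are hypothetical and do not come from numpy or bcolz.

```python
import numpy as np


def add_columns(table, new_columns):
    """Return a copy of structured array `table` extended with `new_columns`.

    `new_columns` maps a field name to an array with the same shape as
    `table`. All new fields are added in one pass, so the existing data
    is copied once per batch, not once per added column.
    """
    new_columns = {name: np.asarray(col) for name, col in new_columns.items()}
    for name, col in new_columns.items():
        if name in table.dtype.names:
            raise ValueError("field %r already exists" % name)
        if col.shape != table.shape:
            raise ValueError("field %r has shape %r, expected %r"
                             % (name, col.shape, table.shape))

    # Extended dtype: the existing fields followed by the new ones.
    descr = [(name, table.dtype[name]) for name in table.dtype.names]
    descr += [(name, col.dtype) for name, col in new_columns.items()]

    out = np.empty(table.shape, dtype=descr)
    for name in table.dtype.names:
        out[name] = table[name]   # this is the unavoidable copy of old data
    for name, col in new_columns.items():
        out[name] = col
    return out


table = np.zeros(3, dtype=[("x", "f8"), ("n", "i4")])
table["x"] = [0.5, 1.5, 2.5]
table["n"] = [1, 2, 3]

bigger = add_columns(table, {"y": table["x"] * 2,
                             "flag": np.array([True, False, True])})
print(bigger.dtype.names)   # ('x', 'n', 'y', 'flag')
```

The result can be viewed as a recarray with bigger.view(np.recarray) if attribute access is wanted. Note that the fields of a structured array are still interleaved row by row in memory, so this sketch does not give column-by-column storage; it only shows how much numpy machinery the copy-on-change route needs.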

Cheers, 
Alex. 
 



> On 22 Feb 2017, at 17:03, Francesc Alted <faltet at gmail.com> wrote:
> 
> Hi Alex,
> 
> 2017-02-22 12:45 GMT+01:00 Alex Rogozhnikov <alex.rogozhnikov at yandex.ru>:
> Hi Nathaniel, 
> 
> 
>> pandas
> 
> yup, the idea was to have a minimal pandas.DataFrame-like storage (I used pandas for a long time),
> but without the irritating problems with its row indexing and other issues such as its interaction with matplotlib.
> 
>> A dict of arrays?
> 
> 
> that's what I started from and implemented, but at some point I decided that I was reinventing the wheel and that numpy already has something for this. In principle, I can ignore this 'column-oriented' storage requirement, but it may turn out to be quite slow if the dtype's size is large.
> 
> Suggestions are welcome.
> 
> You may want to try bcolz:
> 
> https://github.com/Blosc/bcolz
> 
> bcolz is a columnar storage, basically what you require, but data is compressed by default even when stored in memory (although you can disable compression if you want to).
> 
>  
> 
> Another strange question:
> in general, it is assumed that once a numpy.array is created, its shape does not change.
> But if I want to keep the same recarray and change its dtype and/or shape, is there a way to do this?
> 
> You can change shapes of numpy arrays, but that usually involves copies of the whole container.  With bcolz you can change length and add/del columns without copies.  If your containers are large, it is better to inform bcolz of their final estimated size.  See:
> 
> http://bcolz.blosc.org/en/latest/opt-tips.html
> 
> Francesc
>  
> 
> Thanks, 
> Alex.
> 
> 
> 
>> On 22 Feb 2017, at 3:53, Nathaniel Smith <njs at pobox.com> wrote:
>> 
>> On Feb 21, 2017 3:24 PM, "Alex Rogozhnikov" <alex.rogozhnikov at yandex.ru> wrote:
>> Ah, got it. Thanks, Chris!
>> I thought a recarray could only be one-dimensional (like a table with named columns).
>> 
>> Maybe it's better to ask directly what I was looking for: 
>> something that works like a table with named columns (but no labelling for rows), and keeps data (of different dtypes) in a column-by-column way (and this is numpy, not pandas). 
>> 
>> Is there such a magic thing?
>> 
>> Well, that's what pandas is for...
>> 
>> A dict of arrays?
>> 
>> -n
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> https://mail.scipy.org/mailman/listinfo/numpy-discussion
> 
> 
> 
> 
> 
> -- 
> Francesc Alted