On Mon, Oct 5, 2009 at 5:22 PM, Elaine Angelino wrote:
Hi there,
We are writing to announce the release of "Tabular", a package of Python modules for working with tabular data.
The main object is the tabarray class, a data structure for holding and manipulating tabular data. By putting data into a tabarray object, you’ll get a representation of the data that is more flexible and powerful than a native Python representation. More specifically, tabarray provides:
-- ultra-fast filtering, selection, and numerical analysis methods, using convenient Matlab-style matrix operation syntax
-- spreadsheet-style operations, including row & column operations, 'sort', 'replace', 'aggregate', 'pivot', and 'join'
-- flexible load and save methods for a variety of file formats, including delimited text (CSV), binary, and HTML
-- helpful inference algorithms for determining formatting parameters and data types of input files
-- support for hierarchical groupings of columns, both as data structures and file formats
You can download Tabular from PyPI (http://pypi.python.org/pypi/tabular/) or clone our hg repository from bitbucket (http://bitbucket.org/elaine/tabular/). We have also posted tutorial-style Sphinx documentation (http://www.parsemydata.com/tabular/).
The tabarray object is based on the record array object from the Numerical Python package (NumPy), and Tabular is built to interface well with NumPy in general. Our intended audience is two-fold: (1) Python users who, though they may not be familiar with NumPy, are in need of a way to work with tabular data, and (2) NumPy users who would like to do spreadsheet-style operations on top of their more "numerical" work.
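To give a rough sense of the NumPy layer underneath, here is a minimal sketch that uses only plain NumPy structured arrays. It does not use the Tabular API, and the field names and values are made up for illustration.

```python
import numpy as np

# A small record-style table: one named, typed field per column.
# Field names and values are illustrative only.
table = np.array(
    [("apple", 3, 0.50), ("pear", 1, 0.75), ("fig", 7, 2.00)],
    dtype=[("name", "U10"), ("count", "i8"), ("price", "f8")],
)

# Matrix-style filtering with a boolean mask on one column.
cheap = table[table["price"] < 1.0]

# Spreadsheet-style sort by a named column.
by_count = np.sort(table, order="count")

# Numerical work across columns.
total_value = float((table["count"] * table["price"]).sum())

print(cheap["name"].tolist())
print(by_count["name"].tolist())
print(total_value)
```

Selecting and sorting by field name is what makes a structured array behave like a table. The spreadsheet-style operations listed above are described as building on this kind of object.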
We hope that some of you find Tabular useful!
Best,
Elaine and Dan
I briefly looked at the Sphinx docs and the code. Tabular looks pretty useful, and the code can partly be read as recipes for working with recarrays or structured arrays. Thanks for the choice of license (it makes looking at the code "legal"). I didn't see any explicit nan handling. Are missing values allowed, e.g. in the constructor?
I looked a bit closer at functions like tabular.fast.recarrayisin, since I always have problems with these row operations. Are these functions supposed to work with arbitrary structured arrays? The tests only cover 1d integer arrays. With floats, the default string representation doesn't sort correctly. Or am I misreading the function?
>>> arr = np.array([6,1,2,1e-13,0.5*1e-14,1,2e25,3,0,7]).view([('',float)]*2)
>>> arr
array([(6.0, 1.0), (2.0, 1e-013), (5e-015, 1.0),
       (2.0000000000000002e+025, 3.0), (0.0, 7.0)],
      dtype=[('f0', '
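The float-sorting concern can be reproduced without Tabular. This is a minimal sketch, assuming the comparison is made on the default text form of each float. It is not taken from the recarrayisin source, and the values are a subset of those above.

```python
import numpy as np

values = np.array([6.0, 1e-13, 5e-15, 2e25, 0.5])

# Sorting the text form compares character by character,
# so the exponent has no influence on the order.
as_text = sorted(repr(v) for v in values.tolist())

# Sorting the numbers themselves gives the numerical order.
as_numbers = np.sort(values).tolist()

print(as_text)
print(as_numbers)
```

The two orders disagree as soon as the values span different exponents, which is the situation described in the question.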
Being able to do a searchsorted on rows of an array would be a useful feature in numpy. Is there a sortable 1d representation of the rows of a 2d float or mixed type array?
Thanks,
Josef
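One possible approach to the row-search question, sketched with plain NumPy and the standard library rather than anything in Tabular: sort the rows with np.lexsort, then use tuples as the 1d sortable keys for a binary search. The helper name find_row is hypothetical, and the approach trades speed for simplicity.

```python
import bisect
import numpy as np

# Same values as the example above, as five rows of two floats.
rows = np.array(
    [6, 1, 2, 1e-13, 0.5 * 1e-14, 1, 2e25, 3, 0, 7], dtype=float
).reshape(-1, 2)

# np.lexsort treats the LAST key as the primary one,
# so column 0 goes last to sort by column 0, then column 1.
order = np.lexsort((rows[:, 1], rows[:, 0]))
sorted_rows = rows[order]

# Tuples compare element by element on the numeric values,
# which gives a sortable 1d representation of the rows.
keys = [tuple(r) for r in sorted_rows.tolist()]

def find_row(row):
    """Return the index of `row` in sorted_rows, or -1 if absent (hypothetical helper)."""
    target = tuple(row)
    i = bisect.bisect_left(keys, target)
    return i if i < len(keys) and keys[i] == target else -1

print(find_row((2.0, 1e-13)))
print(find_row((2.0, 1.0)))
```

Because the comparison is on numeric values rather than text, the exponent problem does not arise. A fully vectorized version would need a different key encoding.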
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion