[Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help]
Michael Droettboom
mdroe at stsci.edu
Tue Sep 29 13:55:24 EDT 2009
I now have a rather large patch ready which addresses the following
issues with chararrays. Would it be possible to get SVN commit
privileges, or would you prefer a patch file?
1) Fix bugs in Trac
http://projects.scipy.org/numpy/ticket/1199 (chararray.expandtabs broken)
http://projects.scipy.org/numpy/ticket/856 (chararray __mod__ error)
http://projects.scipy.org/numpy/ticket/855 (chararray __mul__ error)
http://projects.scipy.org/numpy/ticket/1231 (chararray methods ignore
all arguments following the first argument that evaluates to False)
http://projects.scipy.org/numpy/ticket/1235 (Coercing object arrays to
string arrays has surprising behaviour)
http://projects.scipy.org/numpy/ticket/1240 (Casting from Unicode to
String array ignores exception)
http://projects.scipy.org/numpy/ticket/1241 (Array constructed with
mixture of str and unicode objects fails length detection)
I can provide small individual patches for some of these if necessary,
but some are interrelated and can only be fixed by the "whole enchilada".
2) Improve documentation
Every method now has a docstring, and a new page of routines has been
added to the Sphinx tree.
3) Improve unit test coverage
Full line-by-line coverage of defchararray.py, as well as lots of hairy
Unicode side cases.
4a) Create C-based vectorized string operations
This is benchmarking about 5x faster than the old Python-based looping
on a large database of around 20k astronomical objects.
4b) Refactor chararray class in terms of those
4c) Design and create an interface to those methods that will be the
"right way" going forward
All vectorized string operations are now available as regular functions
in the numpy.char namespace. Usage of the chararray view class is only
recommended for numarray backward compatibility.
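To illustrate the free-function interface described in 4c, here is a minimal sketch. It assumes a NumPy release that provides the numpy.char namespace; the sample object names are made up for illustration and are not from the patch or its benchmark data.

```python
import numpy as np

# Hypothetical sample data: a plain string ndarray, not a chararray.
names = np.array(["ngc 1300", "m31", "ic 342"])

# The free functions in numpy.char apply a string operation to every
# element and return a new ordinary ndarray.
upper = np.char.upper(names)
padded = np.char.ljust(names, 10, "-")
starts_with_ngc = np.char.startswith(names, "ngc")

print(upper)
print(padded)
print(starts_with_ngc)
```

Because the inputs and outputs are regular arrays, no view onto the chararray class is needed for new code.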
A few side notes:
http://projects.scipy.org/numpy/ticket/1200 (chararray.rstrip inconsistency)
I believe this bug should be marked as "won't fix". The inconsistent
handling of trailing whitespace is an unfortunate "feature" of the
chararray class, and I am wary that fixing it may break backward
compatibility. However, the new free functions in numpy.char do not
have this inconsistency, so they should be recommended for new code.
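As a rough sketch of what this means for new code: with the free functions on a plain string array, trailing whitespace is kept until it is stripped explicitly. The sample strings below are hypothetical, and the snippet assumes a NumPy release that provides numpy.char.

```python
import numpy as np

# Hypothetical sample data with trailing spaces on some elements.
raw = np.array(["abc  ", "de ", "f"])

# On a plain ndarray the trailing spaces are part of each element.
lengths_before = np.char.str_len(raw)

# Stripping is an explicit, separate step.
stripped = np.char.rstrip(raw)
lengths_after = np.char.str_len(stripped)

print(lengths_before, lengths_after)
```

Code that relies on implicit stripping would need an explicit rstrip call like the one above when moving to the free functions.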
http://projects.scipy.org/numpy/ticket/1240 (Casting from Unicode to
String array ignores exception)
This bug probably needs review by someone deeply familiar with the
low-level internals, as it affects more than just string and unicode
arrays. It doesn't break any of the unit tests, for what it's worth ;)
Cheers,
Mike
David Goldsmith wrote:
> Great, thanks!
>
> DG
>
> On Fri, Sep 25, 2009 at 6:07 AM, Michael Droettboom
> <mdroe at stsci.edu> wrote:
>
> David Goldsmith wrote:
> > On Tue, Sep 22, 2009 at 4:02 PM, Ralf Gommers
> > <ralf.gommers at googlemail.com> wrote:
> >
> >
> > On Tue, Sep 22, 2009 at 1:58 PM, Michael Droettboom
> > <mdroe at stsci.edu> wrote:
> >
> > Trac has these bugs. Any others?
> >
> > http://projects.scipy.org/numpy/ticket/1199
> > http://projects.scipy.org/numpy/ticket/1200
> > http://projects.scipy.org/numpy/ticket/856
> > http://projects.scipy.org/numpy/ticket/855
> > http://projects.scipy.org/numpy/ticket/1231
> >
> >
> > This one:
> >
> http://article.gmane.org/gmane.comp.python.numeric.general/23638/match=chararray
> >
> > Cheers,
> > Ralf
> >
> >
> > That last one never got "promoted" to a ticket?
> It's a symptom of this bug, for which I filed a ticket and produced a
> patch yesterday:
>
> http://projects.scipy.org/numpy/ticket/1235
>
> Mike
>
--
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA