[Numpy-discussion] Moving lib.recfunctions?

Skipper Seabold jsseabold at gmail.com
Tue Jul 5 15:23:10 EDT 2011


On Tue, Jul 5, 2011 at 2:46 PM, Pierre GM <pgmdevlist at gmail.com> wrote:
>
> On Jul 5, 2011, at 8:33 PM, Skipper Seabold wrote:
>
>> On Fri, Jul 1, 2011 at 2:32 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
>>> On Fri, Jul 1, 2011 at 2:22 PM,  <josef.pktd at gmail.com> wrote:
>>>> On Fri, Jul 1, 2011 at 1:59 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
>>>>> lib.recfunctions has never been fully advertised. The two bugs I just
>>>>> discovered lead me to believe that it's not that well vetted, but it
>>>>> is useful. I can't be the only one using these?
>>>>>
>>>>> What do people think of either deprecating lib.recfunctions or at
>>>>> least importing them into the numpy.rec namespace?
>>>>>
>>>>> I'm sure this has come up before, but gmane search isn't working for me.
>>>>
>>>> about once a year
>>>>
>>>> http://old.nabble.com/Emulate-left-outer-join--td27522655.html#a27522655
>>>>
>>>> my guess is not much has changed since then
>>>>
>>>
>>> Ah, yes. I recall now.
>>>
>>> I agree that they're more general than rec, but also don't have a
>>> first best solution for this. So I think we should move them (in a
>>> correct way) to numpy.rec and add (some of?) them as methods to
>>> recarrays. The best we can do beyond that is put some docs on the
>>> structured array page and notes in the docstrings that they also work
>>> for ndarrays with structured dtype.
>>>
>>> I'll submit a pull request soon and maybe that'll generate some interest.
>>>
>>
>> Had a brief look at what getting lib.recfunctions into rec/core.rec
>> namespace would entail. It's not as simple as it could be, because
>> there are circular imports between core.records and recfunctions (and
>> its imports). It seems that it is possible to work around the circular
>> imports in some of the code except for the degree to which
>> recfunctions is wrapped up with the masked array code.
>
> Hello,
> The idea behind having a lib.recfunctions and not a rec.recfunctions or whatever was to illustrate that the functions of this package are more generic than they appear. They work with regular structured ndarrays and don't need recarrays. Methinks we're gonna lose this aspect if you try to rename it, but hey, your call.

I agree (even though 'rec' is already in the name). My goal was to
just have numpy.rec.join_by, numpy.rec.stack_arrays, etc, so they're
right there (rec seems more intuitive than lib to me). Do you think
that they may be better off in the main numpy namespace? This is far
from my call, just trying to reach some consensus and make an effort
to move the status quo.
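
For concreteness, a minimal sketch of one such call as it is spelled
today, on plain structured ndarrays (no recarrays involved). The field
names and data are made up for illustration, and the `nprf` alias is
just a convention:

```python
import numpy as np
import numpy.lib.recfunctions as nprf  # explicit submodule import

# Hypothetical data: two structured ndarrays sharing one dtype.
dt = [("key", int), ("x", float)]
a = np.array([(1, 10.0), (2, 20.0)], dtype=dt)
b = np.array([(3, 30.0)], dtype=dt)

# Stack the records end to end; usemask=False asks for a plain
# ndarray back instead of a masked array.
stacked = nprf.stack_arrays((a, b), usemask=False)
print(stacked.dtype.names, len(stacked))
```

Under the proposal, the same call would be reachable as
numpy.rec.stack_arrays; that spelling does not exist at the moment.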

> As to why they were never really advertised? Because I never received any feedback when I started developing them (developing is a big word here, I just took a lot of code that John D Hunter had developed in matplotlib and made it more consistent). I advertised them once or twice on the list, wrote the basic docstrings, but waited for other people to start using them.

As Josef pointed out before, it's a chicken and egg thing re:
advertisement and feedback. I think the best advertisement is by
namespace. I use them frequently, and I haven't offered any feedback
because I've never been left wanting (recent pull request is the only
exception). For the most part they do what I want and the docs are
good.

> Anyhow.
> So, yes, there might be some weird imports to polish. Note that if you decide to just rename the package and leave it where it is, it would probably be easier.
>

Imports are fine as long as they stay where they are and aren't
imported into numpy.core.

>
>> The path of least resistance is to just import lib.recfunctions.* into
>> the (already crowded) main numpy namespace and be done with it.
>
> Why? Why can't you leave it available through numpy.lib? Once again, if it's only a matter of PRing, you could start writing an entry page in the doc describing the functions, that would improve the visibility.

I'm fine with leaving the code where it is, but 1) typing
numpy.lib.recfunctions.<function> is painful (ditto `import
numpy.lib.recfunctions as nprf`). Every time. And I have to do this
often. Even importing them into the lib namespace (they aren't
currently) would be an improvement, but I still don't think it occurs
to people to hunt through lib when they want to join two structured
arrays. It looks like everything in the lib namespace is imported into
the main numpy namespace anyway. And 2) I found a little buglet
recently that made me think this code should be banged on more. The
best way to do this is to get it out there. If other users are
anything like me, they rely on tab-completion and docstrings, not
online docs, for working with projects that they don't need to be
intimately familiar with, the implication being that lib is intimate,
I guess.
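
To make the complaint concrete, here is a rough sketch of joining two
structured arrays with the current spelling. The arrays and field names
are invented for the example; the explicit submodule import and
usemask=False are the parts at issue:

```python
import numpy as np
import numpy.lib.recfunctions as nprf  # has to be imported explicitly

# Hypothetical data: two structured arrays sharing a "key" field.
left = np.array([(1, 1.5), (2, 2.5), (3, 3.5)],
                dtype=[("key", int), ("x", float)])
right = np.array([(2, 20.0), (3, 30.0), (4, 40.0)],
                 dtype=[("key", int), ("y", float)])

# Inner join on "key"; usemask=False returns a plain structured
# ndarray rather than a masked array.
joined = nprf.join_by("key", left, right, jointype="inner",
                      usemask=False)
print(joined.dtype.names)
print(joined["key"])
```

Only the keys present in both inputs survive an inner join, so the
result here has one record per shared key, with fields from both sides.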

Skipper
(standing astride this molehill)

>
>
>> Another option, though it's more work, is to remove all the internal
>> masked array support and let the user decide what to do with the
>> record/structured arrays after they're returned (I invariably have to
>> set usemask=False anyway).
>
> Or you just port the functions in numpy.ma (making a numpy.ma.recfunctions, for example).
>
>
>> The functions can then be wrapped by
>> higher-level ones in np.ma if the old usemask behavior is still
>> desirable for people. This should probably wait until the new masked
>> array changes are in and settled a bit though.
>
> Oh yes... I agree with that
> P.
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


