[Numpy-discussion] extract elements of an array that are contained in another array?
Robert Cimrman
cimrman3 at ntc.zcu.cz
Mon Jun 8 07:51:26 EDT 2009
Hi Josef,
thanks for the summary! I am responding below; later I will file an
enhancement ticket.
josef.pktd at gmail.com wrote:
> On Sat, Jun 6, 2009 at 4:42 AM, Neil Crighton <neilcrighton at gmail.com> wrote:
>> Robert Cimrman <cimrman3 <at> ntc.zcu.cz> writes:
>>
>>> Anne Archibald wrote:
>>>
>>>> 1. add a keyword argument to intersect1d "assume_unique"; if it is not
>>>> present, check for uniqueness and emit a warning if not unique
>>>> 2. change the warning to an exception
>>>> Optionally:
>>>> 3. change the meaning of the function to that of intersect1d_nu if the
>>>> keyword argument is not present
>>>>
>
> 1. merge _nu version into one function
> -------------------------------------------------------
>
>>> You mean something like:
>>>
>>> def intersect1d(ar1, ar2, assume_unique=False):
>>>     if not assume_unique:
>>>         return intersect1d_nu(ar1, ar2)
>>>     else:
>>>         ... # the current code
>>>
>>> intersect1d_nu could be still exported to numpy namespace, or not.
>>>
>> +1 - from the user's point of view there should just be intersect1d and
>> setmember1d (i.e. no '_nu' versions). The assume_unique keyword Robert suggests
>> can be used if speed is a problem.
>
> + 1 on rolling the _nu versions this way into the plain version, this
> would avoid a lot of the confusion.
> It would not be a code-breaking API change for existing correct usage
> (but there would be some speed regression unless the keyword is added)
+1
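To make the proposal concrete, here is a minimal sketch of a merged function. The name intersect1d_sketch is hypothetical, and the body is only one plausible way to do it (deduplicate, concatenate, sort, look for adjacent equal pairs); it is not meant as the actual numpy code.

```python
import numpy as np

def intersect1d_sketch(ar1, ar2, assume_unique=False):
    """Hypothetical merged intersect1d: safe for duplicates by default."""
    if not assume_unique:
        # Default branch: plays the role of the old intersect1d_nu.
        ar1 = np.unique(ar1)
        ar2 = np.unique(ar2)
    aux = np.concatenate((ar1, ar2))
    aux.sort()
    # With unique inputs, a value present in both arrays shows up
    # exactly twice in a row after sorting.
    return aux[:-1][aux[1:] == aux[:-1]]

# Duplicates are handled without the caller doing anything special.
print(intersect1d_sketch([1, 3, 3, 4], [3, 3, 4, 7]))
```

Callers who already know their inputs are unique can pass assume_unique=True to skip the two np.unique calls, which is where the speed difference would come from.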
> deprecate intersect1d_nu
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> intersect1d_nu could be still exported to numpy namespace, or not.
> I would say not, if they are the default branch of the non _nu version
>
> +1 on deprecation
+0
> 2. alias as "in"
> ---------------------
>> I really like in1d (no underscore) as a new name for setmember1d_nu. inarray is
>> another possibility. I don't like 'ain'; 'a' in front of 'in' detracts from
>> readability, unlike the extra a in arange.
> I don't like the extra "a"s either, once namespaces are commonly used
>
> alias setmember1d_nu as `in1d` or `isin1d`, because the function is an
> "in" test and not a set operation
> +1
+1
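For illustration, the "in" semantics can be sketched as an elementwise membership test that returns one boolean per element of the first array. The helper name in1d_sketch and the searchsorted-based approach are assumptions made for this example, not the setmember1d_nu implementation.

```python
import numpy as np

def in1d_sketch(ar1, ar2):
    """Hypothetical elementwise test: is each element of ar1 present in ar2?"""
    ar1 = np.asarray(ar1)
    ar2 = np.unique(ar2)  # sorted, duplicates removed
    if ar2.size == 0:
        return np.zeros(ar1.shape, dtype=bool)
    # Index at which each element of ar1 would be inserted into sorted ar2.
    idx = np.searchsorted(ar2, ar1)
    # Elements larger than max(ar2) get index ar2.size; point them at
    # slot 0, where the equality check below is guaranteed to fail.
    idx[idx == ar2.size] = 0
    return ar2[idx] == ar1

mask = in1d_sketch([0, 1, 5, 1, 9], [1, 1, 5])
print(mask)
```

The result has the shape of ar1 and keeps its order and duplicates, which is why it reads more like "in" than like a set operation.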
> 3. behavior of other set functions
> -----------------------------------------------
>
> guarantee that setdiff1d works for non-unique arrays (even when
> implementation changes), and change documentation
> +1
+1, it is useful for non-unique arrays.
> need to check other functions
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> union1d: works for non-unique arrays, obvious from source
Yes.
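A quick check of both functions on non-unique input, as a sketch only. The arrays are made up for the example, and the results are what I would expect from a recent NumPy.

```python
import numpy as np

# Made-up inputs with repeated values in both arrays.
a = np.array([1, 1, 2, 3, 3, 4])
b = np.array([3, 3, 5, 5])

diff = np.setdiff1d(a, b)   # values of a that do not occur in b
union = np.union1d(a, b)    # sorted unique values from either array

print(diff)
print(union)
```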
> setxor1d: requires unique arrays
> >>> np.setxor1d([1,2,3,3,4,5], [0,0,1,2,2,6])
> array([2, 4, 5, 6])
> >>> np.setxor1d(np.unique([1,2,3,3,4,5]), np.unique([0,0,1,2,2,6]))
> array([0, 3, 4, 5, 6])
>
> setxor1d: add keyword option and call unique by default
> +1 for symmetry
+1 - you mean np.setxor1d(np.unique(a), np.unique(b)) to become
np.setxor1d(a, b, assume_unique=False), right?
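Roughly what I have in mind, as a sketch: setxor1d_sketch is a hypothetical name, and the flag-based body is just one way to pick out the values that occur only once after concatenation. It is not necessarily the current numpy code.

```python
import numpy as np

def setxor1d_sketch(ar1, ar2, assume_unique=False):
    """Hypothetical setxor1d that calls unique unless told not to."""
    if not assume_unique:
        ar1 = np.unique(ar1)
        ar2 = np.unique(ar2)
    aux = np.concatenate((ar1, ar2))
    if aux.size == 0:
        return aux
    aux.sort()
    # flag[i] is True where aux[i] differs from its left neighbour;
    # the padding marks both ends of the array as boundaries.
    flag = np.concatenate(([True], aux[1:] != aux[:-1], [True]))
    # Keep values that differ from both neighbours, i.e. occur only once.
    return aux[flag[1:] & flag[:-1]]

# Same inputs as the quoted example, no manual np.unique needed.
print(setxor1d_sketch([1, 2, 3, 3, 4, 5], [0, 0, 1, 2, 2, 6]))
```

With the default assume_unique=False this gives the same answer as the explicit np.unique version quoted above.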
> ediff1d and unique1d are defined for non-unique arrays
yes
> 4. name of keyword
> ----------------------------
>
> intersect1d(ar1, ar2, assume_unique=False)
>
> alternative isunique=False or just unique=False
> +1 less to write
We should look at other functions in numpy (and/or scipy) to see whether
there is a common scheme here. -1e-1 to the proposed names, as isunique
is singular only, and unique=False does not clearly show the intent to
me. What about ar1_unique=False, ar2_unique=False - to address each
argument specifically?
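A sketch of how per-argument keywords could look. Both the function name intersect1d_flags and the ar1_unique/ar2_unique keywords are hypothetical here; nothing like this exists in numpy. The point is only that each input is deduplicated independently.

```python
import numpy as np

def intersect1d_flags(ar1, ar2, ar1_unique=False, ar2_unique=False):
    """Hypothetical variant with one uniqueness flag per argument."""
    if not ar1_unique:
        ar1 = np.unique(ar1)
    if not ar2_unique:
        ar2 = np.unique(ar2)
    aux = np.concatenate((ar1, ar2))
    aux.sort()
    # Adjacent equal pairs mark values present in both inputs.
    return aux[:-1][aux[1:] == aux[:-1]]

ids = np.arange(5)         # known to be unique, so skip np.unique
samples = [4, 4, 2, 9, 2]  # may contain repeats
common = intersect1d_flags(ids, samples, ar1_unique=True)
print(common)
```

This saves one np.unique call when only one of the arrays is known to be unique, at the cost of a longer signature.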
> 5. module name
> -----------------------
>
> rename arraysetops to something easier to read like setfun. I think it
> would only affect internal changes since all functions are exported to
> the main numpy name space
> +1e-4 (I got used to arrayse_tops)
+0 (internal change only). Other numpy/scipy submodules containing a
bunch of functions are called *pack (fftpack, arpack, lapack), *alg
(linalg), *utils. *fun is used commonly in the matlab world.
> 6. keep docs in sync with correct usage
> ---------------------------------------------------------
>
> obvious
+1
thanks,
r.