[Numpy-discussion] relational join

Robert Kern robert.kern at gmail.com
Wed Feb 2 16:46:36 EST 2011


On Wed, Feb 2, 2011 at 21:42, Ilya Shlyakhter <ilya_shl at alum.mit.edu> wrote:
> Does numpy have a relational join operation for joining recordarrays?

[~]
|1> from numpy.lib import recfunctions

[~]
|2> recfunctions.join_by?
Type:           function
Base Class:     <type 'function'>
String Form:    <function join_by at 0x17bba30>
Namespace:      Interactive
File:           /Library/Frameworks/Python.framework/Versions/6.3/lib/python2.6/site-packages/numpy/lib/recfunctions.py
Definition:     recfunctions.join_by(key, r1, r2, jointype='inner',
r1postfix='1', r2postfix='2', defaults=None, usemask=True,
asrecarray=False)
Docstring:
    Join arrays `r1` and `r2` on key `key`.

    The key should be either a string or a sequence of strings corresponding
    to the fields used to join the array.
    An exception is raised if the `key` field cannot be found in the two input
    arrays.
    Neither `r1` nor `r2` should have any duplicates along `key`: the presence
    of duplicates will make the output quite unreliable. Note that duplicates
    are not looked for by the algorithm.

    Parameters
    ----------
    key : {string, sequence}
        A string or a sequence of strings corresponding to the fields used
        for comparison.
    r1, r2 : arrays
        Structured arrays.
    jointype : {'inner', 'outer', 'leftouter'}, optional
        If 'inner', returns the elements common to both r1 and r2.
        If 'outer', returns the common elements as well as the elements of r1
        not in r2 and the elements of r2 not in r1.
        If 'leftouter', returns the common elements and the elements of r1 not
        in r2.
    r1postfix : string, optional
        String appended to the names of the fields of r1 that are present in r2
        but absent from the key.
    r2postfix : string, optional
        String appended to the names of the fields of r2 that are present in r1
        but absent from the key.
    defaults : {dictionary}, optional
        Dictionary mapping field names to the corresponding default values.
    usemask : {True, False}, optional
        Whether to return a MaskedArray (or MaskedRecords if `asrecarray==True`)
        or a ndarray.
    asrecarray : {False, True}, optional
        Whether to return a recarray (or MaskedRecords if `usemask==True`) or
        just a flexible-type ndarray.

    Notes
    -----
    * The output is sorted along the key.
    * A temporary array is formed by dropping the fields not in the key for the
      two arrays and concatenating the result. This array is then sorted, and
      the common entries selected. The output is constructed by filling the fields
      with the selected entries. Matching is not preserved if there are some
      duplicates...
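
For illustration, here is a minimal sketch of how join_by might be called,
based only on the docstring above. The two arrays and the field names
'id', 'x' and 'y' are made up for the example. The duplicate check is an
extra step added here because the docstring says the function does not
look for duplicates itself.

```python
import numpy as np
from numpy.lib import recfunctions

# Made-up sample data; the field names 'id', 'x' and 'y' are illustrative only.
r1 = np.array([(1, 10.0), (2, 20.0), (3, 30.0)],
              dtype=[('id', int), ('x', float)])
r2 = np.array([(2, 200.0), (3, 300.0), (4, 400.0)],
              dtype=[('id', int), ('y', float)])

# join_by does not look for duplicate keys itself, so check them up front.
for arr in (r1, r2):
    assert np.unique(arr['id']).size == arr.size, "duplicate keys in input"

# Inner join: only the rows whose 'id' appears in both arrays.
# usemask=False asks for a plain structured ndarray instead of a MaskedArray.
inner = recfunctions.join_by('id', r1, r2, jointype='inner', usemask=False)

# Left outer join: every row of r1; with the default usemask=True, fields
# that have no match in r2 come back masked.
left = recfunctions.join_by('id', r1, r2, jointype='leftouter')

print(inner)
print(left)
```

Since 'x' and 'y' do not collide, r1postfix and r2postfix play no role
here; they would only matter if both arrays had a non-key field with the
same name.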

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


