Flatten... or How to determine sequenceability?

Alex Martelli aleaxit at yahoo.com
Fri May 25 12:08:35 EDT 2001


"Noel Rappin" <noelrap at yahoo.com> wrote in message
news:mailman.990802119.12058.python-list at python.org...
> I'm writing code that needs to flatten a multi-dimension list or tuple into
> a single dimension list.
>
> [1, [2, 3], 4] => [1, 2, 3, 4]
>
> As part of the algorithm, I need to determine whether each object in the
> list is itself a sequence or whether it is an atom.  Using the types module
> would cause me to miss any sequence-like object that isn't actually the
> basic type.  So, what's the best (easiest, most foolproof) way to determine
> whether a Python object is a sequence?

Hmmm -- good question.  Checking whether x[:] raises an exception
is reasonable, but a mapping that accepts slice objects would fool
this test.  I _suspect_ the most-foolproof test today might be:

def is_sequence(x):
    try:
        for _ in x:
            return 1
        else: return 1    # empty sequences are sequences too
    except: return 0

However, if I understand correctly, this would break in 2.2,
since mappings then also become usable in a for statement,
not just sequences.  This would then incorrectly identify
_any_ mapping as "a sequence", too.  What *IS* there that
you can do ONLY to a sequence, ANY sequence, but NOT to
any mapping...?  zip(), which seems like it could serve,
apparently does NOT test its arguments for "sequencehood"
(well, depending how you define that, I guess...): it
does a PySequence_GetItem, and fails iff that fails with
an exception other than IndexError.

>>> class x:
...   def __getitem__(self, k): return k
...
>>> a=x()
>>> zip(a,'x')
[(0, 'x')]

map() and friends would be slow on long sequences, AND they
also require a sequence to have a length before accepting it.

I'm almost tempted to propose a tiny extension module, since
the C-API level *DOES* expose an "is-a-sequence" test...:

#include "Python.h"

/* is_seq(obj): return the result of PySequence_Check(obj) as an int */
static PyObject *is_seq(PyObject* self, PyObject* args) {
    PyObject *arg;
    if(!PyArg_ParseTuple(args, "O", &arg)) return 0;
    return Py_BuildValue("i",PySequence_Check(arg));
}
static PyMethodDef sq_module_functions[] =
{{ "is_seq", (PyCFunction)is_seq, METH_VARARGS },
 { 0, 0 }};
/* module init function; the name must be "init" + module name */
void initsq(void) {
    Py_InitModule("sq", sq_module_functions);
}

and of course the setup.py to go with it:

from distutils.core import setup, Extension
sq_ext=Extension('sq', sources=['sq.c'])
setup(name="sq", ext_modules=[sq_ext])

but that doesn't seem fair...

...plus, it is ALSO fooled by the object a of class x
shown above...!

So, maybe zip IS best after all, something like:

def is_sequence(x):
    try: zip(x,'')
    except: return 0
    else: return 1
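
For what it's worth, here is a sketch of how a zip-based test behaves on a
later interpreter where mappings are iterable (the 2.2 change discussed
earlier).  It assumes a Python in which zip() asks each argument for an
iterator up front, and it catches only TypeError rather than everything;
the class name X is just the __getitem__-only example respelled.  Under
those assumptions a dict passes the test too, so the check really answers
"can I loop over it?" rather than "is it a sequence?".

```python
class X:
    # Only __getitem__, no __len__: the same shape as class x above.
    def __getitem__(self, k):
        return k


def is_sequence(x):
    # Assumption: zip() raises TypeError immediately when an argument
    # cannot be iterated; nothing is actually consumed from x here.
    try:
        zip(x, '')
    except TypeError:
        return False
    return True


print(is_sequence([1, 2]))     # a list is accepted
print(is_sequence(5))          # an int is rejected
print(is_sequence({'k': 1}))   # a dict is accepted as well
print(is_sequence(X()))        # so is the __getitem__-only object
```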


Oh BTW -- all of these methods will see a string or
unicode object as a 'sequence', because it IS one by
Python's rules (you can index and slice it, loop on
it with for, &c).  Most often in such tasks one
wants to consider strings as atoms, but that will
require some further special-casing, anyway.
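
A minimal sketch of that special-casing, applied to the original flatten
problem.  It is not the only way to do it: the test used here is simply
"iter() accepts it", which is an assumption (it needs an interpreter that
has iter(), and it will also walk the keys of any mapping), and the atoms
parameter is a hypothetical knob for listing the types to leave whole.

```python
def is_sequence(x):
    # Assumption: "sequence" means "anything iter() accepts".
    # Mappings pass this test too, so their keys would be flattened.
    try:
        iter(x)
    except TypeError:
        return False
    return True


def flatten(seq, atoms=(str, bytes)):
    # atoms: types treated as indivisible even though they are iterable.
    # Checking atoms first also avoids endless recursion on one-character
    # strings, whose only item is again a one-character string.
    result = []
    for item in seq:
        if isinstance(item, atoms) or not is_sequence(item):
            result.append(item)
        else:
            result.extend(flatten(item, atoms))
    return result


print(flatten([1, [2, 3], 4]))
print(flatten([1, ('ab', [2, [3, 'cd']]), 4]))
```

Passing atoms=() would make strings get split into characters at the top
level of the call, which is rarely what one wants, hence the default.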


Alex





