[Python-Dev] Clean way in python to test for None, empty, scalar, and list/ndarray? A prayer to the gods of Python
maschu09 at gmail.com
Fri Jun 14 21:12:00 CEST 2013
As much as I love python, the following drives me crazy, and I would wish
that some future version would come up with a more consistent approach for
this. And please don't reply with "Too bad if you don't know what type your
data are..." - if I want to implement some generic functionality, I want to
avoid constrains on the user where not absolutely necessary, and I believe
this approach is truely pythonic.
OK - here comes the problem set. These are in fact several related issues.
Clearly, a solution exists for each of them, but you will have to admit
that they are all very different in style and therefore unnecessarily
complicate matters. If you want to write code which works under many
circumstances, it will become quite convoluted.
1. Testing for None:
From what I read elsewhere, the preferred solution is `if x is None` (or
`if not x is None` in the opposite case). This is fine and it works for
scalars, lists, sets, numpy ndarrays,...
2. Testing for empty lists or empty ndarrays:
In principle, `len(x) == 0` will do the trick. **BUT** there are several
- `len(scalar)` raises a TypeError, so you will have to use try and
except or find some other way of testing for a scalar value
- `len(numpy.array(0))` (i.e. a scalar coded as numpy array) also raises
a TypeError ("unsized object")
- `len([])` returns a length of 1, which is somehow understandable,
but - I would argue - perhaps not what one might expect initially
Alternatively, numpy arrays have a size attribute, and
`numpy.array().size`, `numpy.array(8.).size`, and
`numpy.array([8.]).size` all return what you would expect. And even
`numpy.array([]).size` gives you 0. Now, if I could convert everything to
a numpy array, this might work. But have you ever tried to assign a list of
mixed data types to a numpy array? `numpy.array(["a",1,[2,3],(888,9)])`
will fail, even though the list inside is perfectly fine as a list.
3. Testing for scalar:
Let's suppose we knew the number of non-empty elements, and this is 1.
Then there are occasions when you want to know if you have in fact `6` or
`` as an answer (or maybe even `[]`). Obviously, this question is
also relevant for numpy arrays. For the latter, a combination of size and
ndim can help. For other objects, I would be tempted to use something like
`isiterable()`, however, this function doesn't exist, and there are
numerous discussions how one could or should find out if an object is
iterable - none of them truly intuitive. (and is it true that **every**
iterable object is a descendant of collections.Iterable?)
4. Finding the number of elements in an object:
From the discussion above, it is already clear that `len(x)` is not very
robust for doing this. Just to mention another complication: `len("abcd")`
returns 4, even though this is only one string. Of course this is correct,
but it's a nuisance if you need to to find the number of elements of a list
of strings and if it can happen that you have a scalar string instead of a
1-element list. And, believe me, such situations do occur!
5. Forcing a scalar to become a 1-element list:
Unfortunately, `list(77)` throws an error, because 77 is not iterable.
`numpy.array(77)` works, but - as we saw above - there will be no len
defined for it. Simply writing `[x]` is dangerous, because if x is a list
already, it will create `[]`, which you generally don't want. Also,
`numpy.array([x])` would create a 2D array if x is already a 1D array or a
list. Often, it would be quite useful to know for sure that a function
result is provided as a list, regardless of how many elements it contains
(because then you can write `res` without risking the danger to throw an
exception). Does anyone have a good suggestion for this one?
6. Detecting None values in a list:
This is just for completeness. I have seen solutions using `all` which
solve this problem (see [question #1270920]). I haven't digged into them
extensively, but I fear that these will also suffer from the
above-mentioned issues if you don't know for sure if you are starting from
a list, a numpy array, or a scalar.
Enough complaining. Here comes my prayer to the python gods: **Please**
- add a good `isiterable` function
- add a `size` attribute to all objects (I wouldn't mind if this is None
in case you don't really know how to define the size of something, but it
would be good to have it, so that `anything.size` would never throw an error
- add an `isscalar` function which would at least try to test if something
is a scalar (meaning a single entity). Note that this might give different
results compared to `isiterable`, because one would consider a scalar
string as a scalar even though it is iterable. And if `isscalar` would
throw exceptions in cases where it doesn't know what to do: fine - this can
be easily captured.
- enable the `len()` function for scalar variables such as integers or
floats. I would tend to think that 1 is a natural answer to what the length
of a number is.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Python-Dev