
I like your idea with AtomicIterable, but I still don't think it fully solves the problem.

How can one categorize array.array? Consider an array of bytes. Sometimes you'll want to iterate over them as ints, sometimes as 1-byte objects. In one case the array is used to conserve space as a sequence of ints, in which case you'll want the items one by one; in the other case it is atomic, just like bytes.

Giving the register option to the end user, on the other hand, would pose a different problem: ABC.register is global. If you're developing a module and register array.array as an AtomicIterable, you might affect other parts of the program in unexpected ways.

I think it's a tough one to solve :-/

On Tue, Jul 26, 2016, 7:52 AM Nick Coghlan <ncoghlan@gmail.com> wrote:
On 26 July 2016 at 03:58, Guido van Rossum <guido@python.org> wrote:
On Mon, Jul 25, 2016 at 10:19 AM, Gregory P. Smith <greg@krypto.org> wrote:
Given how often the str problem comes up in this context, I'm thinking of a common special case in type annotation checkers that explicitly excludes str from consideration for iteration. Its behavior is rarely what the programmer intended and more likely to be a bug.
Should we have IterableButNotString, CollectionExceptString, NonStringSequence, or all three? I suppose the runtime check would be e.g. isinstance(x, Iterable) and not isinstance(x, (str, bytes, bytearray, memoryview))? And the type checker should special-case the snot out of it?
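The runtime check described above can be sketched as a small helper. The function name is_non_string_iterable and the _STRING_LIKE tuple are made up for illustration; nothing with these names exists in the standard library.

```python
from collections.abc import Iterable

# The string-like types named in the message above (hypothetical constant name).
_STRING_LIKE = (str, bytes, bytearray, memoryview)


def is_non_string_iterable(x):
    """Return True for iterables other than str, bytes, bytearray, memoryview."""
    return isinstance(x, Iterable) and not isinstance(x, _STRING_LIKE)


print(is_non_string_iterable([1, 2, 3]))  # a list is iterable and not string-like
print(is_non_string_iterable("abc"))      # str is excluded explicitly
print(is_non_string_iterable(42))         # an int is not iterable at all
```

Note that this only covers the runtime side; a static type checker would still need its own special-casing.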
For cases where str instances are handled differently from other iterables, but still within the same function, a convention of writing "Union[str, Iterable]" as the input parameter type may suffice - while technically the union is redundant, the implication would be that str instances are treated as str objects, rather than as an iterable of length-1 str objects.
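A minimal sketch of that convention, assuming a hypothetical helper named as_name_list: the redundant Union in the signature signals to readers that a str argument is handled as a single value, and the body backs that up with an explicit isinstance check.

```python
from typing import Iterable, List, Union


def as_name_list(names: Union[str, Iterable[str]]) -> List[str]:
    """Normalize one name or many names into a list of names."""
    # A lone str is one name, not an iterable of one-character names.
    if isinstance(names, str):
        return [names]
    return list(names)


print(as_name_list("guido"))          # ['guido']
print(as_name_list(["nick", "greg"]))  # ['nick', 'greg']
```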
The other case where this would come up is in signature overloading, where one overload may be typed as "str" and the other as "Iterable". Presumably in those cases the more specific type already wins.
Anything more than that would need to be defined in the context of a particular algorithm, such as the oft-requested generic "flatten" operation, where you typically want to consider types like str, bytes, bytearray and memoryview as atomic objects, rather than as containers, but then things get blurry around iterable non-builtin types like array.array, numpy.ndarray, Pandas data frames, image formats with pixel addressing, etc, as well as builtins with multiple iteration behaviours like dict.
One possible way to tackle that would be to declare a new collections.abc.AtomicIterable ABC, as well as a collections.flatten operation defined as something like:
    def flatten(iterable):
        for item in iterable:
            if isinstance(item, AtomicIterable):
                yield item
                continue
            if isinstance(item, Mapping):
                yield from item.items()
                continue
            try:
                subiter = iter(item)
            except TypeError:
                yield item
            else:
                yield from flatten(subiter)
Over time, different types would get explicitly registered with AtomicIterable based on what their developers considered the most appropriate behaviour to be when asked to flatten them.
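To make the proposal concrete, here is a runnable sketch. AtomicIterable is not part of collections.abc, so a stand-in ABC is defined locally, and the set of types registered up front (str, bytes, bytearray, memoryview) is an assumption based on the types named earlier in this message. The last few lines show what explicit registration does for array.array.

```python
from abc import ABC
from array import array
from collections.abc import Mapping


class AtomicIterable(ABC):
    """Hypothetical marker ABC; stands in for the proposed collections.abc name."""


# Assumed default registrations, taken from the types listed in the message.
for _atomic_type in (str, bytes, bytearray, memoryview):
    AtomicIterable.register(_atomic_type)


def flatten(iterable):
    for item in iterable:
        if isinstance(item, AtomicIterable):
            yield item                       # treated as a single object
        elif isinstance(item, Mapping):
            yield from item.items()          # mappings yield (key, value) pairs
        else:
            try:
                subiter = iter(item)
            except TypeError:
                yield item                   # not iterable at all
            else:
                yield from flatten(subiter)  # recurse into nested iterables


nested = [1, "ab", [2, [b"cd", 3]], {"k": 4}, array("B", [5, 6])]

before = list(flatten(nested))
# [1, 'ab', 2, b'cd', 3, ('k', 4), 5, 6]

AtomicIterable.register(array)  # takes effect for every caller in the process
after = list(flatten(nested))
# [1, 'ab', 2, b'cd', 3, ('k', 4), array('B', [5, 6])]
```

Before registration the array is walked element by element; afterwards it comes back as one object. Because ABC registration is not scoped to a module, the second behaviour applies to any code in the same process that calls flatten.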
Cheers, Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/