ANN: intervalset Was: Set type for datetime intervals

Tue Apr 5 04:11:04 EDT 2016

> Yes, my question is why it's useful to have a single Interval as a
> *distinct* type, separate from the interval set type, which supports a
> sharply limited number of set-like operations (such as the union of two
> overlapping intervals but NOT two non-overlapping ones). This doesn't
> appear to be the case in sympy based on your examples.
The main reason for this was to check begin / end values, and return the
EMPTY_INTERVAL singleton for empty intervals. If we want to support
zero-length intervals with begin == end, then there is no need for
EMPTY_INTERVAL. But there is still a need to check begin <= end. Maybe
you are right: if we assume that nobody will try to use an interval
where end < begin, then we can use plain tuples. But then the user *must*
make sure not to use invalid intervals. Both solutions have pros and
cons. Right now, I prefer to find the problem as soon as possible (e.g.
in the Interval constructor), but you can try to convince me.
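
To make the trade-off concrete, here is a minimal sketch of the "fail early"
variant. It is not the actual intervalset code: the dataclass layout, the
integer fields and the error message are assumptions made for illustration
only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical interval type that validates its bounds on creation."""
    begin: int
    end: int

    def __post_init__(self):
        # Zero-length intervals (begin == end) are allowed; reversed ones are not.
        if self.end < self.begin:
            raise ValueError(
                f"invalid interval: end {self.end!r} < begin {self.begin!r}"
            )


ok = Interval(0, 4)
zero_length = Interval(2, 2)

try:
    Interval(4, 0)
except ValueError as exc:
    error = str(exc)   # the mistake is reported right where it was made
else:
    error = None

# With plain tuples nothing stops the same mistake from going unnoticed.
bad_tuple = (4, 0)
```

With the class, the reversed interval is rejected at construction time; with
the tuple, the problem only shows up later, wherever the value is used.
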
> Having an interval as a distinct type may be useful (to iterate over the
> intervals of a set, for example), but this design blurs the line between
> intervals and sets (by supporting some set operations) without
> eliminating it as sympy seems to do.
It is blurred by design. There is an interpretation where the interval
[0..4] is equal to the set of intervals ([0..2],[2..4]). Actually,
you can ask: "is an element within the interval?" The same question can
be asked for an interval set. So the same type of value can be
"contained" within an interval and also within an interval set. In the
current implementation, if you try to create a set with these two
intervals [0..2] and [2..4], then they are unified into a single element
on purpose: to make sure that there is only a single official
representation of it. E.g. ([0..2],[2..4]) is not a valid set, only
([0..4]) is.
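
The unification can be sketched roughly as follows. This is only an
illustration, not the real implementation: the function name normalize and
the use of (begin, end) tuples are assumptions, and the input intervals are
assumed to be valid (begin <= end).

```python
def normalize(intervals):
    """Merge overlapping or touching closed intervals.

    `intervals` is an iterable of (begin, end) tuples; the result is a
    sorted tuple of disjoint, non-touching intervals.
    """
    merged = []
    for begin, end in sorted(intervals):
        if merged and begin <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            last_begin, last_end = merged[-1]
            merged[-1] = (last_begin, max(last_end, end))
        else:
            merged.append((begin, end))
    return tuple(merged)


a = normalize([(0, 2), (2, 4)])
b = normalize([(0, 4)])
c = normalize([(3, 4), (0, 1)])
```

Here a and b end up as the same value, because both inputs describe the same
elements, while c keeps its two separate intervals.
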

By definition, a set is given by its elements, i.e. for any possible
item, you can tell if it is part of the set or not. So if
([0..2],[2..4]) and ([0..4]) have exactly the same elements (by the
given definition), then they are not just equal: they are the same set.
The same reasoning is used in math when they say: there cannot be
multiple empty sets. There exists a single empty set. It happens that we
can only represent a finite number of elements in any set on a computer.
I wanted to have a well defined single way to represent any given set.

I can also see your point. From another point of view, a set and a set of
sets are not the same type. However, I do not share this view, because if
we start thinking this way, then operations like the one below do not
make sense:

[0..4] - (1..2) == [0..1] | [2..4]

If you make a strict distinction between intervals and interval sets,
then the above equation does not make sense, because you cannot remove
an element from a set that is not an "element" of it. The above equation
makes sense only if the "intervalset" is a general representation of
possible values, and the "interval" type is a special representation of
(the same) possible values. It is clear when you ask the question: "is 3
in the interval?" and you can also ask "is 3 in the interval set?" - so
the value of the same type can be contained in the interval and also in
the interval set. Maybe the naming is bad, and it should be named
"Intervals" instead of IntervalSet?
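
A rough sketch of what I mean, using plain (begin, end) tuples. The helper
names are made up for this example, and the subtracted interval is treated as
open (an assumption), so that its end points stay in the result:

```python
def contains_interval(interval, value):
    """Is `value` inside the closed interval (begin, end)?"""
    begin, end = interval
    return begin <= value <= end


def contains_set(interval_set, value):
    """Is `value` inside any interval of the interval set?"""
    return any(contains_interval(i, value) for i in interval_set)


def subtract_open(interval, hole):
    """Remove the open interval `hole` from the closed `interval`.

    Returns a tuple of closed intervals, i.e. an interval set.
    """
    begin, end = interval
    hole_begin, hole_end = hole
    # Empty hole, or no overlap at all: nothing is removed.
    if hole_begin >= hole_end or hole_end <= begin or hole_begin >= end:
        return (interval,)
    pieces = []
    if hole_begin >= begin:
        pieces.append((begin, hole_begin))   # part left of the hole
    if hole_end <= end:
        pieces.append((hole_end, end))       # part right of the hole
    return tuple(pieces)


result = subtract_open((0, 4), (1, 2))
in_interval = contains_interval((0, 4), 3)
in_set = contains_set(result, 3)
```

The same question (is 3 in it?) works on the interval and on the interval
set, and the difference of two intervals naturally comes back as an interval
set.
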