Re: [Python-Dev] PEP 495 Was: PEP 498: Literal String Interpolation is ready for pronouncement
On Tue, Sep 8, 2015 at 8:27 PM, Guido van Rossum <guido@python.org> wrote:
Now if only PEP 495 could be as easy... :-)
I think we nailed the hard issues there. The next update will have a restored hash invariant and == that satisfies all three axioms of equivalency. I am not making a better progress because I am debating with myself about the fate of < and > comparisons. Cross-zone comparisons strike in full force there as well because two times ordered in UTC may appear in the opposite order in the local timezone where the clock is moved back. Note that I saved the hash invariant and the transitivity of == at the expense of the loss of trichotomy in comparisons (we will have pairs of aware datetimes that are neither equal nor < nor >). I don't think we need to change anything with < and > comparisons, but I am trying to come up with the arguments that will at least be convincing to myself. (I suspect that if I am not the only one who worries about this, the other such people can be counted by the values of the fold flag. :-)
On 9/11/2015 2:36 PM, Alexander Belopolsky wrote:
On Tue, Sep 8, 2015 at 8:27 PM, Guido van Rossum <guido@python.org> wrote:
Now if only PEP 495 could be as easy... :-)
I think we nailed the hard issues there. The next update will have a restored hash invariant and == that satisfies all three axioms of equivalency.
You are trying to sanely deal with politically mandated insanity. I think it essential that you not introduce mathematical insanity, but whatever you do will be less than completely satisfactory.
I am not making a better progress because I am debating with myself about the fate of < and > comparisons.
Both should not be true for the same pair ;-)
Cross-zone comparisons strike in full force there as well because two times ordered in UTC may appear in the opposite order in the local timezone where the clock is moved back.
Comparison of absolute Newtonian time, represented by UTC, and local 'clock face' relative time with political hacks, are different concepts. If I get up at 8:00 AM (in Delaware, USA) and you get up at 8:01 wherever you are, which of us got up earlier? It depends on what 'earlier' means in the context and purpose of the question. Are we asking about wakeup discipline, or email exchange? Pick whichever you and whoever consider to be most useful. Presuming that one can convert to UTC before comparison, I suspect the local version.
Note that I saved the hash invariant and the transitivity of == at the expense of the loss of trichotomy in comparisons (we will have pairs of aware datetimes that are neither equal nor < nor >).
That is the nature of partial orders.
I don't think we need to change anything with < and > comparisons,
I am guessing that the comparisons are currently local.
but I am trying to come up with the arguments that will at least be convincing to myself. (I suspect that if I am not the only one who worries about this, the other such people can be counted by the values of the fold flag. :-)
Good luck ;-) -- Terry Jan Reedy
On Fri, Sep 11, 2015 at 6:45 PM, Terry Reedy <tjreedy@udel.edu> wrote:
I think we nailed the hard issues there. The next update will have a restored hash invariant and == that satisfies all three axioms of equivalency.
You are trying to sanely deal with politically mandated insanity. I think it essential that you not introduce mathematical insanity, but whatever you do will be less than completely satisfactory.
The insanity I am dealing with now is specific to Python datetime which wisely blocks any binary operation that involves naive and aware datetimes, but allows comparisons and subtractions of datetimes with different timezones. This is not an unusual situation: given a limited supply of binary operators, Python has to reuse them in less than ideal ways. For example,
>>> a = 'a'
>>> b = 'b'
>>> c = 5
>>> (a + b) * c == a * c + b * c
False
is less than ideal if you hold the distributive law of arithmetic sacred.
Similarly, '-' is reused in datetime for two different operations: if s and t are datetimes with the same tzinfo, s - t tells you how far to move the hands on the local clock to arrive at s when you start at t. This is clearly a very useful operation that forms the basis of everything else that we do in datetime. Note that for this operation, it does not matter what kind of time your clock is showing or whether it is running at all. We are not asking how long one needs to wait to see s after t was shown. We are just asking how far to turn the knob on the back of the clock.
This operation does not make sense when s and t have different tzinfos, so in this case '-' is reused for a different operation. This operation is much more involved. We require the existence of some universal time (UTC) and a rule to convert s and t to that time, and define s - t as s.astimezone(UTC) - t.astimezone(UTC). In the latter expression '-' is taken in the "turns of the knob" sense because the operands are in the same timezone (UTC).
Clearly, when we "abuse" the mathematical notation in this way, we cannot expect mathematical identities to hold, and it is easy to find, for example, aware datetimes u, t, and s such that (t - u) - (s - u) ≠ t - s.
Deciding when it is ok to sacrifice mathematical purity for practical convenience is an art, not a science. The case of interzone datetime arithmetic is a borderline case. I would much rather let users decide what they want: s.astimezone(t.tzinfo) - t, s - t.astimezone(s.tzinfo) or s.astimezone(UTC) - t.astimezone(UTC) and not assume that s - t "clearly" means the last of the three choices. But the decision to allow interzone t - s was made long time ago and it is a PEP 495 goal to change that.
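To make the two meanings of '-' concrete, here is a minimal sketch. ToyEastern is a hypothetical tzinfo invented for this illustration (one fall-back transition, not the real Eastern rules and not the class from the CPython test suite). Under that assumption it shows that same-tzinfo subtraction measures "turns of the knob", interzone subtraction measures elapsed UTC time, and the identity (t - u) - (s - u) = t - s therefore fails.

```python
from datetime import datetime, timedelta, timezone, tzinfo

UTC = timezone.utc


class ToyEastern(tzinfo):
    """Hypothetical zone: UTC-4 before 2015-11-01 02:00 wall time, UTC-5 after."""

    SWITCH = datetime(2015, 11, 1, 2)  # naive wall time of the fall-back

    def utcoffset(self, dt):
        before = dt.replace(tzinfo=None) < self.SWITCH
        return timedelta(hours=-4 if before else -5)

    def dst(self, dt):
        before = dt.replace(tzinfo=None) < self.SWITCH
        return timedelta(hours=1 if before else 0)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


east = ToyEastern()
t = datetime(2015, 10, 31, 12, tzinfo=east)  # noon before the switch
s = datetime(2015, 11, 1, 12, tzinfo=east)   # noon after the switch
u = datetime(2015, 10, 31, 16, tzinfo=UTC)   # the same instant as t

wall = s - t                                     # same tzinfo: offsets ignored
elapsed = s.astimezone(UTC) - t.astimezone(UTC)  # both converted to UTC first

print(wall)                        # 1 day, 0:00:00
print(elapsed)                     # 1 day, 1:00:00
print((t - u) - (s - u) == t - s)  # False
```

The two interzone differences on the left add up to 25 hours of elapsed time, while the same-zone difference on the right is 24 hours of wall-clock distance.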
I am not making a better progress because I am debating
with myself about the fate of < and > comparisons.
Both should not be true for the same pair ;-)
I'll give you that. No worries.
Cross-zone
comparisons strike in full force there as well because two times ordered in UTC may appear in the opposite order in the local timezone where the clock is moved back.
Comparison of absolute Newtonian time, represented by UTC, and local 'clock face' relative time with political hacks, are different concepts. If I get up at 8:00 AM (in Delaware, USA) and you get up at 8:01 wherever you are, which of us got up earlier? It depends on what 'earlier' means in the context and purpose of the question. Are we asking about wakeup discipline, or email exchange?
There is no "earlier" or "later". There are "lesser" and "greater" which are already defined for all pairs of aware datetimes. PEP 495 doubles the set of possible datetimes and they don't fit in one straight line anymore. The whole point of PEP 495 is to introduce a "fold" in the timeline.
Pick whichever you and whoever consider to be most useful. Presuming that one can convert to UTC before comparison, I suspect the local version.
It is more delicate than that. Are you willing to accept a possibility of an infinite loop if you run a binary search for a UTC time in an ordered list of local times? We tolerate that with numbers, but you only have this risk when you have a "nan" in your list. I don't think the new PEP 495 datetimes will be nearly as bad as floating point NaNs: at least datetime == will still be reflexive and transitive unlike floating point ==. Still I am not ready to say that we have solved all the puzzles yet. But we are close.
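For reference, the floating point behaviour alluded to here can be checked in a couple of lines. This is only a sketch of the NaN case; it says nothing about how PEP 495 datetimes will behave.

```python
nan = float("nan")

# == is not reflexive for NaN.
print(nan == nan)  # False

# Trichotomy fails too: NaN is neither less than, equal to, nor greater than 1.0.
print(nan < 1.0, nan == 1.0, nan > 1.0)  # False False False
```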
Note that I saved the hash invariant and the
transitivity of == at the expense of the loss of trichotomy in comparisons (we will have pairs of aware datetimes that are neither equal nor < nor >).
That is the nature of partial orders.
Yes, but are we willing to accept that datetimes have only partial order? And in Python 3 we consider comparison between unorderable objects to be an error instead of silently returning False. I am strongly against allowing exceptions in astimezone(), ==, or hash().
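The existing naive-versus-aware case already shows the Python 3 convention: == quietly answers False, while an ordering comparison raises. A small sketch using only the standard library:

```python
from datetime import datetime, timezone

naive = datetime(2015, 9, 11, 18, 45)
aware = datetime(2015, 9, 11, 18, 45, tzinfo=timezone.utc)

# Equality never raises; mixed naive/aware operands simply compare unequal.
print(naive == aware)  # False

# Ordering is refused outright.
ordering_error = None
try:
    naive < aware
except TypeError as exc:
    ordering_error = exc
    print("TypeError:", exc)
```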
I don't think we need to change anything with < and > comparisons,
I am guessing that the comparisons are currently local.
Yes, they are for naive datetime pairs and pairs with items sharing tzinfo. The problem is what to do with the interzone comparisons.
but I am trying to come up with the arguments that will
at least be convincing to myself. (I suspect that if I am not the only one who worries about this, the other such people can be counted by the values of the fold flag. :-)
Good luck ;-)
Thanks!
Alexander Belopolsky <alexander.belopolsky@gmail.com> writes:
There is no "earlier" or "later". There are "lesser" and "greater" which are already defined for all pairs of aware datetimes. PEP 495 doubles the set of possible datetimes
That depends on what you mean by "possible".
and they don't fit in one straight line anymore. The whole point of PEP 495 is to introduce a "fold" in the timeline.
That doesn't make sense. Within a given timezone, any given moment of UTC time (which is a straight line [shut up, no leap seconds here]) maps to only one local time. The point of PEP 495 seems to be to eliminate the cases where two UTC moments map to the same aware local time. Out of curiosity, can "fold" ever be any value other than 0 or 1? I don't know if there are any real-world examples (doubt it), but I could easily contrive a timezone definition that had three of a particular clock time.
Yes, but are we willing to accept that datetimes have only partial order?
I apparently haven't been following the discussion closely enough to understand how this can possibly be the case outside cases I assumed it already was (naive vs aware comparisons being invalid).
On 9/11/2015 5:40 PM, Alexander Belopolsky wrote:
The insanity I am dealing with now ... But the decision to allow interzone t - s was made long time ago and it is a PEP 495 goal to change that.
The first few paragraphs you wrote, which I elided, are a great explanation of why things work in ways that might be unexpected, and by including in the descriptions other things that might be unexpected, it helps people realize the need to understand what the operators really mean when applied to classes rather than numbers. Of course, even floating point number operations and integer division only approximate mathematical reality, if you are looking for more examples. But the beginning phrase about "insanity" should probably be elided in documentation, yet the body could very well be appropriate for tutorial documentation, even if not reference documentation, although I'd not object to finding it there. The last phrase, about it being a PEP 495 goal to change that, might be true, but if it changes it, then it would be a confusing and backward incompatible change.
Yes, but are we willing to accept that datetimes have only partial order?
That's what the politicians gave us. These are datetime objects, not mathematical numbers.
On Fri, Sep 11, 2015 at 8:56 PM, Random832 <random832@fastmail.com> wrote:
Alexander Belopolsky <alexander.belopolsky@gmail.com> writes:
There is no "earlier" or "later". There are "lesser" and "greater" which are already defined for all pairs of aware datetimes. PEP 495 doubles the set of possible datetimes
That depends on what you mean by "possible".
What exactly depends on the meaning of "possible"? In this context "possible" means "can appear in a Python program."
and they don't fit in one straight line anymore. The whole point of PEP 495 is to introduce a "fold" in the timeline.
That doesn't make sense. Within a given timezone, any given moment of UTC time (which is a straight line [shut up, no leap seconds here]) maps to only one local time. The point of PEP 495 seems to be to eliminate the cases where two UTC moments map to the same aware local time.
Yes, but it does that at the cost of introducing the second local "01:30" which is "later" than the first "01:40" while "obviously" (and according to the current datetime rules) "01:30" < "01:40".
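A sketch of the "second 01:30" versus "first 01:40" situation follows. It assumes the fold attribute as it was eventually shipped in Python 3.6, and ToyFoldZone is a hypothetical tzinfo written only for this illustration.

```python
from datetime import datetime, timedelta, timezone, tzinfo

UTC = timezone.utc


class ToyFoldZone(tzinfo):
    """Hypothetical zone: clocks go from 02:00 back to 01:00 on 2015-11-01."""

    FOLD_START = datetime(2015, 11, 1, 1)  # first wall time that occurs twice
    FOLD_END = datetime(2015, 11, 1, 2)    # first wall time after the fold

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        if wall < self.FOLD_START:
            return timedelta(hours=-4)
        if wall < self.FOLD_END:
            # Ambiguous hour: fold=0 is the first pass, fold=1 the second.
            return timedelta(hours=-5 if dt.fold else -4)
        return timedelta(hours=-5)

    def dst(self, dt):
        return self.utcoffset(dt) + timedelta(hours=5)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


zone = ToyFoldZone()
first_0140 = datetime(2015, 11, 1, 1, 40, tzinfo=zone)           # 05:40 UTC
second_0130 = datetime(2015, 11, 1, 1, 30, fold=1, tzinfo=zone)  # 06:30 UTC

# Same tzinfo: only the wall-clock fields are compared, fold is ignored.
print(second_0130 < first_0140)  # True

# In UTC the order is the opposite.
print(second_0130.astimezone(UTC) > first_0140.astimezone(UTC))  # True
```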
Out of curiosity, can "fold" ever be any value other than 0 or 1?
Thankfully, no.
Yes, but are we willing to accept that datetimes have only partial order?
I apparently haven't been following the discussion closely enough to understand how this can possibly be the case outside cases I assumed it already was (naive vs aware comparisons being invalid).
Local times that fall in the spring-forward gap cannot be ordered interzone even without PEP 495.
On Fri, Sep 11, 2015 at 9:12 PM, Glenn Linderman <v+python@g.nevcal.com> wrote: [Alexander Belopolsky]
But the decision to allow interzone t - s was made long time ago and it is a PEP 495 goal to change that.
The last phrase, about it being a PEP 495 goal to change that, might be true, but if it changes it, then it would be a confusing and backward incompatible change.
Oops, a Freudian slip. It is *not* a PEP 495 goal to change that.
On Fri, Sep 11, 2015 at 9:12 PM, Glenn Linderman <v+python@g.nevcal.com> wrote:
That's what the politicians gave us. These are datetime objects, not mathematical numbers.
That's an argument for not defining mathematical operations like <, > or - on them, but you cannot deny the convenience of having those. Besides, datetime objects are mathematical numbers as long as you only deal with one timezone or restrict yourself to naive instances. The problem with interzone subtraction, for example, is that we start with nice (not so little) integers and define an operation that is effectively op(x, y) = f(x) - g(y) where f and g are arbitrary functions under the control of the politicians. It is convenient to equate op with subtraction and if f and g are simple shifts, it is a subtraction, but in general it is not. This is the root of the problem, but datetime objects are still as close to mathematical numbers as Python ints.
On 9/11/2015 6:39 PM, Alexander Belopolsky wrote:
On Fri, Sep 11, 2015 at 9:12 PM, Glenn Linderman <v+python@g.nevcal.com> wrote:
That's what the politicians gave us. These are datetime objects, not mathematical numbers.
That's an argument for not defining mathematical operations like <, > or - on them, but you cannot deny the convenience of having those.
It wasn't intended to argue for not defining the operations, just intended to justify that it is partial ordering... if the associated timezone implements daylight saving.
Alexander Belopolsky <alexander.belopolsky@gmail.com> writes:
Yes, but it does that at the cost of introducing the second local "01:30" which is "later" than the first "01:40" while "obviously" (and according to the current datetime rules) "01:30" < "01:40".
The current datetime rules, such as they are, as far as I am aware, order all aware datetimes (except spring-forward) according to the UTC moment they map to. I'm not sure what the benefit of changing this invariant is.
Out of curiosity, can "fold" ever be any value other than 0 or 1?
Thankfully, no.
What happens, then, if I were to define a timezone with three local times from the same date? None may exist now, but the IANA data format can certainly represent this case. Should we be talking about adding an explicit offset member? (Ultimately, this "fold=1 means the second one" notion is a novel invention, and including the offset either explicitly a la ISO8601, or implicitly by writing EST/EDT, is not)
Yes, but are we willing to accept that datetimes have only partial order?
I apparently haven't been following the discussion closely enough to understand how this can possibly be the case outside cases I assumed it already was (naive vs aware comparisons being invalid).
Local times that fall in the spring-forward gap cannot be ordered interzone even without PEP 495.
Hmm. If these have to be allowed to exist, then... What about ordering times according to, notionally, a tuple of (UTC timestamp of transition, number of "fake" seconds "after" the transition) for a spring-forward time? Also, can someone explain why this:
>>> ET = pytz.timezone("America/New_York")
>>> datetime.strftime(datetime.now(ET) + timedelta(days=90),
...                   "%Y%m%d %H%M%S %Z %z")
returns '20151210 214526 EDT -0400'
I don't know if I expected 214526 or 204526, but I certainly expected the timezone info to say EST -0500. If EST and EDT are apparently two distinct tzinfo values, then what exactly would a value landing near the "fall back" transition have given for fold? fold=1 but EDT? And if EST and EDT are, against all rationality, distinct tzinfo values, then when exactly can fold ever actually be 1, and why is it needed?
I think we're getting into python-ideas territory here... --Guido (on mobile)
On Fri, Sep 11, 2015 at 9:51 PM, Glenn Linderman <v+python@g.nevcal.com> wrote:
It wasn't intended to argue for not defining the operations, just intended to justify that it is partial ordering...
It is not even that. Note that even partial ordering still requires transitivity of <=, but we don't have that in datetime:
>>> from datetime import *
>>> from datetimetester import Eastern
>>> UTC = timezone.utc
>>> a = datetime(2002, 4, 7, 1, 40, tzinfo=Eastern)
>>> b = datetime(2002, 4, 7, 2, tzinfo=Eastern)
>>> c = datetime(2002, 4, 7, 6, 20, tzinfo=UTC)
>>> a <= b <= c
True
>>> a <= c
False
The above session is run in the currently released Python. The Eastern timezone implementation is imported from the CPython test suite. The fact that transitivity of <= is already broken gives me little comfort because pretty much everything involving "problem times" is currently broken and users expect that. PEP 495, however, is expected to fix the issues with the problem times and not just replace one broken behavior with another.
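For readers without the test suite at hand, the same effect can be reproduced with a self-contained stand-in. ToySpringZone below is a hypothetical tzinfo with a single spring-forward transition, assumed here to treat the nonexistent 02:00 as daylight time; it is not the Eastern class from datetimetester.

```python
from datetime import datetime, timedelta, timezone, tzinfo

UTC = timezone.utc


class ToySpringZone(tzinfo):
    """Hypothetical zone: UTC-5 before 2002-04-07 02:00 wall time, UTC-4 after."""

    SWITCH = datetime(2002, 4, 7, 2)  # naive wall time of the spring-forward

    def utcoffset(self, dt):
        after = dt.replace(tzinfo=None) >= self.SWITCH
        return timedelta(hours=-4 if after else -5)

    def dst(self, dt):
        return self.utcoffset(dt) + timedelta(hours=5)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


zone = ToySpringZone()
a = datetime(2002, 4, 7, 1, 40, tzinfo=zone)  # maps to 06:40 UTC
b = datetime(2002, 4, 7, 2, tzinfo=zone)      # in the gap; maps to 06:00 UTC
c = datetime(2002, 4, 7, 6, 20, tzinfo=UTC)

print(a <= b)  # True: same tzinfo, wall clocks compared
print(b <= c)  # True: interzone, UTC instants compared
print(a <= c)  # False: interzone, 06:40 UTC is after 06:20 UTC
```

The first comparison ignores the offsets and the other two use them, which is exactly where transitivity is lost.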
On Fri, Sep 11, 2015 at 10:00 PM, Random832 <random832@fastmail.com> wrote:
The current datetime rules, such as they are, as far as I am aware, order all aware datetimes (except spring-forward) according to the UTC moment they map to.
No. See the library reference manual: "If both comparands are aware, and have the same tzinfo attribute, the common tzinfo attribute is ignored and the base datetimes are compared." <https://docs.python.org/3/library/datetime.html> The reasons for this have been explained at length in the recent Datetime-SIG discussions. Please check the archives if you are interested: <https://mail.python.org/pipermail/datetime-sig/2015-August/000169.html>.
On Fri, Sep 11, 2015 at 10:00 PM, Random832 <random832@fastmail.com> wrote:
And if EST and EDT are, against all rationality, distinct tzinfo values, then when exactly can fold ever actually be 1, and why is it needed?
No, fold is not needed in the case of fixed offset timezones. For an obvious reason: there are no folds or gaps in those.
On Fri, Sep 11, 2015 at 10:22 PM, Guido van Rossum <gvanrossum@gmail.com> wrote:
I think we're getting into python-ideas territory here...
Well, a violation of transitivity of <= in the current CPython implementation may be considered a bug by some. This makes this discussion appropriate for python-dev. We could discuss this on Datetime-SIG, but I think the question is more broad than just datetime. When is it appropriate for Python operators to violate various mathematical identities? We know that some violations are unavoidable when you try to represent real numbers in finite size objects, but when you effectively deal with a subset of integers, what identities are important and what can be ignored for practical reasons? Intuitively, I feel that the hash invariant and the reflexivity and transitivity of == are more important than, say, the distributive law for + and *, but where does it leave the transitivity of <=? Is it as important as == being an equivalence, or is it a nice-to-have like the distributive law?
[Random832 <random832@fastmail.com>]
...
Also, can someone explain why this:
>>> ET = pytz.timezone("America/New_York")
>>> datetime.strftime(datetime.now(ET) + timedelta(days=90),
...                   "%Y%m%d %H%M%S %Z %z")
returns '20151210 214526 EDT -0400'
pytz lives in its own world here, and only uses eternally-fixed-offset zones. It's a magnificent hack to get around the lack of an "is_dst bit" in datetime's design, and effectively records that bit via _which_ fixed-offset zone it attaches to the datetime. The tradeoff is that, to get results you expect, you _need_ to invoke pytz's .normalize() after doing any arithmetic (and pytz's docs are very clear about this - do read them). That's required for pytz to realize the tzinfo in the result is no longer appropriate for the result's date and time, so it can (if needed) replace it with a different fixed-offset tzinfo.
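The effect can be modelled without pytz. The sketch below is a stdlib-only approximation, not pytz itself: EDT and EST are plain fixed-offset timezone objects, the starting time is made up to match the output quoted above, and the explicit astimezone(EST) call merely stands in for the tzinfo swap described here for .normalize().

```python
from datetime import datetime, timedelta, timezone

# Two eternally-fixed-offset zones, standing in for pytz's per-offset tzinfos.
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

FMT = "%Y%m%d %H%M%S %Z %z"

start = datetime(2015, 9, 11, 21, 45, 26, tzinfo=EDT)  # hypothetical "now"
later = start + timedelta(days=90)  # arithmetic keeps the EDT tzinfo attached

print(later.strftime(FMT))  # 20151210 214526 EDT -0400

# Swap in the offset that is actually in force in December.
fixed = later.astimezone(EST)
print(fixed.strftime(FMT))  # 20151210 204526 EST -0500
```

Both values denote the same instant; only the attached fixed-offset tzinfo, and hence the displayed wall time, differs.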
I don't know if I expected 214526 or 204526, but I certainly expected the timezone info to say EST -0500. If EST and EDT are apparently two distinct tzinfo values,
In pytz, they are. This isn't how tzinfos were _intended_ to work in Python, but pytz does create an interesting set of tradeoffs.
then what exactly would a value landing near the "fall back" transition have given for fold? fold=1 but EDT?
As above, pytz is in its own world here. It doesn't need `fold` because it has its own hack for disambiguating local times in a fold.
And if EST and EDT are, against all rationality, distinct tzinfo values, then when exactly can fold ever actually be 1, and why is it needed?
pytz doesn't need it. "Hybrid" tzinfos do, because there is currently no way outside of pytz to disambiguate local times in a fold. So, short course: if you ask questions about pytz's behavior, you're asking questions about pytz, not really about Python's datetime.
On 2015-09-12 02:23, Alexander Belopolsky wrote:
On Fri, Sep 11, 2015 at 8:56 PM, Random832 <random832@fastmail.com> wrote:
Alexander Belopolsky <alexander.belopolsky@gmail.com> writes:
There is no "earlier" or "later". There are "lesser" and "greater" which are already defined for all pairs of aware datetimes. PEP 495 doubles the set of possible datetimes
That depends on what you mean by "possible".
What exactly depends on the meaning of "possible"? In this context "possible" means "can appear in a Python program."
and they don't fit in one straight line anymore. The whole point of PEP 495 is to introduce a "fold" in the timeline.
That doesn't make sense. Within a given timezone, any given moment of UTC time (which is a straight line [shut up, no leap seconds here]) maps to only one local time. The point of PEP 495 seems to be to eliminate the cases where two UTC moments map to the same aware local time.
Yes, but it does that at the cost of introducing the second local "01:30" which is "later" than the first "01:40" while "obviously" (and according to the current datetime rules) "01:30" < "01:40".
Out of curiosity, can "fold" ever be any value other than 0 or 1?
Thankfully, no.
[snip] What would happen if it's decided to stay on DST and then, later on, to reintroduce DST? Or what would happen in the case of "British Double Summer Time" (go forward twice in the spring and backward twice in the autumn)? https://en.wikipedia.org/wiki/British_Summer_Time
On Fri, Sep 11, 2015 at 11:03 PM, Tim Peters <tim.peters@gmail.com> wrote:
then what exactly would a value landing near the "fall back" transition have given for fold? fold=1 but EDT?
As above, pytz is in its own world here. It doesn't need `fold` because it has its own hack for disambiguating local times in a fold.
But I consider it as an imperative that pytz's hack continues to work in the post-PEP 495 world. Fortunately, this is not a hard requirement to satisfy.
On Fri, Sep 11, 2015 at 11:07 PM, MRAB <python@mrabarnett.plus.com> wrote:
What would happen if it's decided to stay on DST and then, later on, to reintroduce DST?
No problem as long as you don't move the clock back x minutes and then decide that you did not move it back enough and move it again before x minutes have passed. Fortunately, no government in the world can pass a new law in an hour. More so in an hour between 01:00 and 02:00 AM. :-) An interesting possibility is a fold straddling a leap second, but hopefully those who pass the timekeeping laws have learned about the leap seconds by now.
MRAB <python@mrabarnett.plus.com> writes:
What would happen if it's decided to stay on DST and then, later on, to reintroduce DST?
Or what would happen in the case of "British Double Summer Time" (go forward twice in the spring and backward twice in the autumn)?
"backward twice" could theoretically do it, if you literally went back an hour, waited an hour (or any nonzero amount less than two hours), and went back an hour again, rather than just going back two hours. I don't know if any real-life authorities have ever done such a thing; that's why I asked. You could also have them if you had a "timezone" representing the real-time local time of someone who traveled across timezone boundaries multiple times in close succession, rather than the time of a geographical place.
On 9/11/2015 8:40 PM, Alexander Belopolsky wrote:
The insanity I am dealing with now is specific to Python datetime which wisely blocks any binary operation that involves naive and aware datetimes, but allows comparisons and subtractions of datetimes with different timezones. This is not an unusual situation: given a limited supply of binary operators, Python has to reuse them in less than ideal ways. For example,
>>> a = 'a'
>>> b = 'b'
>>> c = 5
>>> (a + b) * c == a * c + b * c
False
is less than ideal if you hold the distributive law of arithmetic sacred.
In mathematics, algebra is, put simply, the study of binary operations. None of the laws for particular operations on particular sets is 'sacred'. They are simply facts. Or possibly axioms whose consequences are to be explored. A mathematician has no problem with 'a'+'b' != 'b'+'a'. After closure, associativity is the most 'basic' operation, but non-associative operations are studied. The equality relation, mapping pairs of members of a set to True or False is a different matter. Being an equivalence relation is fundamental to both normal logic, algebraic proofs, and the definition of sets. It is also required for the 'proper' operation of Python's sets. (Let's leave NaNs out of the discussion.)
Similarly, '-' is reused in datetime for two different operations: if s and t are datetimes with the same tzinfo, s - t tells you how far to move the hands on the local clock to arrive at s when you start at t. This is clearly a very useful operation that forms the basis of everything else that we do in datetime. Note that for this operation, it does not matter what kind of time your clock is showing or whether it is running at all. We are not asking how long one needs to wait to see s after t was shown. We are just asking how far to turn the knob on the back of the clock. This operation does not make sense when s and t have different tzinfos, so in this case '-' is reused for a different operation. This operation is much more involved. We require the existence of some universal time (UTC) and a rule to convert s and t to that time, and define s - t as s.astimezone(UTC) - t.astimezone(UTC). In the latter expression '-' is taken in the "turns of the knob" sense because the operands are in the same timezone (UTC).
Datetime members, are rather unusual beasts. They are triples consisting of a member of a discrete sequence (with some odd gaps), a tz tag, and a 0/1 fold tag. The tz tags divide datetimes into equivalence classes. The '-' operation is also unusual in being defined differently for pairs in the same or different equivalence classes.
Clearly, when we "abuse" the mathematical notation in this way,
Mathematicians wildly overload operator/relation symbols. '=' is the only one I can think of with more-or-less universal properties.
we cannot expect mathematical identities
which are context dependent
to hold
in a different context. Right.
and it is easy to find for example, aware datetimes u, t, and s such that (t - u) - (s - u) ≠ t - s.
Datetime '-' (subtraction) should be documented as an unusual overloaded use which does not have the normal properties that the naive might expect. Within the constraint on '=', there are two choices for designing an operation. 1. Define the operation the way you want, and live with the consequent properties (or absence of properties). 2. Decide the properties you require, and live with the consequent restriction on the definition. -- Terry Jan Reedy
On Sat, Sep 12, 2015 at 1:20 AM, Terry Reedy <tjreedy@udel.edu> wrote:
A mathematician has no problem with 'a'+'b' != 'b'+'a'.
I doubt it. A binary operation denoted + (and called addition) is almost universally a commutative operation. A non-commutative binary operation is usually denoted * (and called multiplication).
After closure,
Do you refer to "set closure" operation [1] here? I am not sure why it is relevant nor why it is "basic."
associativity is the most 'basic' operation, but non-associative operations are studied.
I think you have missed the words "property of" before "operation" above. "Closure", "commutativity", "associativity", etc. are properties of operations, not operations.
The equality relation, mapping pairs of members of a set to True or False is a different matter. Being an equivalence relation is fundamental to both normal logic, algebraic proofs, and the definition of sets.
Agree, and we have a solution for PEP 495 which preserves == as an equivalence (symmetric, reflexive and transitive) relationship.
Datetime members, are rather unusual beasts. They are triples consisting of a member of a discrete sequence (with some odd gaps),
I assume you are using the word "member" to refer to class instances. There are no gaps in datetimes: there are instances that don't correspond to any valid local time and (pre-PEP 495) there are local times that don't correspond to any instances with a given tzinfo. The unrepresentable times can still be represented using a different tzinfo. PEP 495 adds a way to represent all times using instances with any tzinfo, but on the flip side adds many more instances that are not "canonical" representations (e.g. fold=1 instances for regular times.)
a tz tag, and a 0/1 fold tag. The tz tags divide datetimes into equivalence classes.
That I don't understand. Local t and u = t.astimezone(UTC) are equal (t == u evaluates to True), so u and t belong to the same equivalence class.
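This is easy to check for an ordinary time. The sketch uses a fixed-offset zone (so no folds or gaps are involved) and also shows the hash invariant that goes with ==:

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
local = timezone(timedelta(hours=-4), "EDT")  # fixed offset, illustration only

t = datetime(2015, 9, 12, 13, 4, tzinfo=local)
u = t.astimezone(UTC)

print(t == u)              # True: same instant, different tzinfo
print(hash(t) == hash(u))  # True: equal objects must hash equal
print(len({t, u}))         # 1: a set treats them as one element
```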
The '-' operation is also unusual in being defined differently for pairs in the same or different equivalence classes.
I am not concerned about '-'. My main concern is about order operations. I am happy with the solution I have for ==, but I am still struggling with the non-transitivity of <. Comparison operations are special because they are used implicitly in other operations. The < operator is used implicitly in bisect. If it does not satisfy the (partial?) order properties, bisect may enter an infinite loop.
[1]: http://mathworld.wolfram.com/SetClosure.html
On 9/12/2015 1:04 PM, Alexander Belopolsky wrote:
On Sat, Sep 12, 2015 at 1:20 AM, Terry Reedy <tjreedy@udel.edu> wrote:
A mathematician has no problem with 'a'+'b' != 'b'+'a'.
I doubt it. A binary operation denoted + (and called addition) is almost universally a commutative operation. A non-commutative binary operation is usually denoted * (and called multiplication).
I am aware of the single-operation group theory convention, but the context was sequence concatenation and scalar multiplication, where '*' is repeated '+'. But these details are not directly relevant to DateTimes.
After closure,
I perhaps should have said 'completeness'. In any case, I was referring to 'a op b' existing for all a and b in the set.
Agree, and we have a solution for PEP 495 which preserves == as an equivalence (symmetric, reflexive and transitive) relationship.
Datetime members, are rather unusual beasts. They are triples consisting of a member of a discrete sequence (with some odd gaps),
Your correction, summarized, is that there are no gaps, so the set is simpler than I thought. Skipping on to the heart of the matter.
I am not concerned about '-'. My main concern is about order operations.
Is not '<' defined, in the obvious way, in terms of '-' and the sign of the result?
I am happy with the solution I have for ==, but I am still struggling with the non-transitivity of <.
I am guessing that the 'struggle' is at least partly this: "Is non-transitivity of < necessary given other constraints, including back-compatibility, or is there another solution possible that would also be <-transitive?"
Comparison operations are special because they are used implicitly in other operations. The < operator is used implicitly in bisect. If it does not satisfy the (partial?) order properties, bisect may enter an infinite loop.
and, if we are stuck with <-intransitivity, what do we do? If back-compatibility allowed, I might suggest defining 'lt' or 'less' rather than '__lt__' so that sort and bisect don't work with DateTimes. Then document that the function is not transitive. -- Terry Jan Reedy
In a message of Sat, 12 Sep 2015 20:49:12 -0400, Terry Reedy writes:
and, if we are stuck with <-intransitivity, what do we do? If back-compatibility allowed, I might suggest defining 'lt' or 'less' rather than '__lt__' so that sort and bisect don't work with DateTimes. Then document that the function is not transitive.
I think it would be better to document what you are supposed to do if you have a list of DateTimes and want to sort them, as a way to get a list of times sorted from the earliest to the latest. Laura
On Sun, Sep 13, 2015 at 7:03 PM, Laura Creighton <lac@openend.se> wrote:
In a message of Sat, 12 Sep 2015 20:49:12 -0400, Terry Reedy writes:
and, if we are stuck with <-intransitivity, what do we do? If back-compatibility allowed, I might suggest defining 'lt' or 'less' rather than '__lt__' so that sort and bisect don't work with DateTimes. Then document that the function is not transitive.
I think it would be better to document what you are supposed to do if you have a list of DateTimes and want to sort them, as a way to get a list of times sorted from the earliest to the latest.
What I'd like to hear (but maybe this won't be possible) would be "less-than is transitive if and only if <X>", where <X> might be something like "all of the datetimes are in the same timezone" or "none of the datetimes fall within a fold" or something. That would at least make sorting possible, but maybe with a first-pass check to ensure transitivity. Vain hope or plausible restriction? ChrisA
[Chris Angelico <rosuav@gmail.com>]
What I'd like to hear (but maybe this won't be possible) would be "less-than is transitive if and only if <X>", where <X> might be something like "all of the datetimes are in the same timezone" or "none of the datetimes fall within a fold" or something. That would at least make sorting possible, but maybe with a first-pass check to ensure transitivity.
Vain hope or plausible restriction?
Pragmatically, if someone needs to care about sorting aware datetimes that may include times in folds, the obvious way is to convert them to UTC first (which can be done with sort's `key=` argument). Times in UTC are totally ordered (essentially the same as working with integers). That's a sane & easy sufficient condition. It's a waste of time to worry about minimal necessary conditions. "Convert to UTC" is the obvious way to do darned near everything. Converting to any other fixed-offset zone would do just as well, but _that_ observation is also a waste of time, since "convert to UTC" is just as easy ;-)
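A sketch of the "convert to UTC via `key=`" advice, assuming Python 3.6 or later (where the `fold` attribute exists). `ToyZone` is a hypothetical tzinfo invented for this illustration: it is UTC-4 until 02:00 local time on 2015-11-01, when the clock is set back one hour and the offset becomes UTC-5. It is not a real zone and not code from the thread.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class ToyZone(tzinfo):
    """Hypothetical zone with a single fold on 2015-11-01, 01:00-02:00."""

    FOLD_START = datetime(2015, 11, 1, 1)  # first repeated wall-clock time
    FOLD_END = datetime(2015, 11, 1, 2)    # end of the repeated hour

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        before_fold = wall < self.FOLD_START
        first_pass = wall < self.FOLD_END and dt.fold == 0
        return timedelta(hours=-4 if (before_fold or first_pass) else -5)

    def dst(self, dt):
        return self.utcoffset(dt) + timedelta(hours=5)

    def tzname(self, dt):
        return "TOY"


zone = ToyZone()
first = datetime(2015, 11, 1, 1, 30, tzinfo=zone)           # 05:30 UTC
second = datetime(2015, 11, 1, 1, 10, tzinfo=zone, fold=1)  # 06:10 UTC

# Same-tzinfo comparison looks only at the wall-clock fields (fold is
# ignored), so the default sort puts 01:10 before 01:30.
by_wall = sorted([first, second])

# Sorting on the UTC instant gives earliest-to-latest order.
by_utc = sorted([second, first], key=lambda d: d.astimezone(timezone.utc))
```

Here `by_wall` starts with `second` while `by_utc` starts with `first`, which is the case the UTC key is meant to handle.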
On Mon, Sep 14, 2015 at 1:44 AM, Tim Peters <tim.peters@gmail.com> wrote:
That's a sane & easy sufficient condition. It's a waste of time to worry about minimal necessary conditions. "Convert to UTC" is the obvious way to do darned near everything. Converting to any other fixed-offset zone would do just as well, but _that_ observation is also a waste of time, since "convert to UTC" is just as easy ;-)
So it isn't sufficient for them all to be in, say, Australia/Melbourne, which observes DST. Fair enough. And yeah, converting to UTC is straightforward enough. ChrisA
participants (9): Alexander Belopolsky, Chris Angelico, Glenn Linderman, Guido van Rossum, Laura Creighton, MRAB, Random832, Terry Reedy, Tim Peters