Representing ambiguity in datetime?

John Machin sjmachin at lexicon.net
Tue May 17 19:59:45 EDT 2005


On Tue, 17 May 2005 17:38:30 -0500, Terry Hancock
<hancock at anansispaceworks.com> wrote:

>What do you do when a date or time is
>incompletely specified?  ISTM, that as it is, there is no
>formal way to store this --- you have to guess, and there's
>no way to indicate that the guess is different from solid
>information.  As a result, I have sometimes had to abandon
>datetime, even though it seemed like the logical choice for
>representing data.
>
>E.g. I might have information like "this paper was published
>in May 1997".  There's no way to write that with datetime,
>is there?  Even if I just use the "date" object instead of 
>datetime, I still have to actually specify something like 
>May 1, 1997 --- fabricating data, which is frequently
>undesirable (later on, I might find information saying that
>it was actually published May 23, 1997 and I might want
>to update the earlier one, or simply evaluate them as 
>"equal" since they are, to within the precision given --- 
>for example, I might be trying to decide that two database
>entries are really duplicate references to the same paper).
>
>I know that this is somewhat theoretically stated, but I 
>have run into concrete problems along the lines of
>the above.
>
>I'd say this is analogous to how you might use "None"
>rather than "0" to represent an integer if you don't know
>its value (rather than knowing that it is zero).  ISTM, you
>ought to be able to specify a date as, e.g.:
>
>d = datetime.date(2005, 5, None)
>
>I realize there might be some complexity with deciding
>how to handle datestamp math, but as this situation
>occurs frequently in real life, it seems like it shouldn't
>be avoided.
>
>How do other people deal with this kind of problem?

Mostly, badly :-(

Real-life example: due to wartime disruption etc., in some countries
it is common enough to find that the date of birth of someone born in
the 1940s is not known precisely. E.g. on the Hong Kong identity card,
it is possible to find only the year and month of birth, and sometimes
even only the year. Depending on the purpose, legislation and
convention will take the first day of the vague period or the last day
when a calculation is required. Badly == entering into a database the
"exact" date that was used for the purpose du jour, with no indication
that the source was vague. Consequently a person can have DOB recorded
as 1945-01-01 on one database and 1945-12-31 on another.

Suggested approach in Python (sketch): Don't try to get the datetime
module to solve the problem. Define a fuzzydate class. Internal
representation: I'd suggest earliest possible date and latest possible
date. That way you have valid date instances for doing date
arithmetic. The class may need different constructors depending on how
the incoming vagueness is specified.
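
A minimal sketch of that idea, assuming a modern Python (3.7+ for
dataclasses). The names FuzzyDate, from_parts, is_exact and may_equal
are made up for illustration; none of them exist in the datetime
module. "Could be the same day" is taken here to mean that the two
ranges overlap, which is only one possible reading of equality within
the given precision.

```python
import calendar
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FuzzyDate:
    """A date known only to lie within [earliest, latest], both inclusive."""
    earliest: date
    latest: date

    def __post_init__(self):
        if self.earliest > self.latest:
            raise ValueError("earliest must not be after latest")

    @classmethod
    def from_parts(cls, year, month=None, day=None):
        """Hypothetical constructor: None means 'this part is unknown'."""
        if month is None:
            if day is not None:
                raise ValueError("day given without month")
            return cls(date(year, 1, 1), date(year, 12, 31))
        if day is None:
            # monthrange returns (weekday of the 1st, number of days in month)
            last_day = calendar.monthrange(year, month)[1]
            return cls(date(year, month, 1), date(year, month, last_day))
        exact = date(year, month, day)
        return cls(exact, exact)

    @property
    def is_exact(self):
        return self.earliest == self.latest

    def may_equal(self, other):
        """True if the two ranges overlap, i.e. both could be the same day."""
        return self.earliest <= other.latest and other.earliest <= self.latest


# "Published in May 1997" versus a later, more precise report.
published = FuzzyDate.from_parts(1997, 5)
found_later = FuzzyDate.from_parts(1997, 5, 23)
print(published.may_equal(found_later))   # True
print(published.earliest, published.latest)
```

Both ends are ordinary date instances, so legislation or convention
that wants "first day" or "last day" can pick the end it needs without
the vagueness of the source being lost.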

HTH,
John


