[MATRIX-SIG] Much ado about nothingness.

Aaron Watters arw@dante.mh.lucent.com
Thu, 10 Jul 1997 07:42:37 -0400



----------
> From: Konrad Hinsen <hinsen@ibs.ibs.fr>
...
> Umath [null] support is easy: just implement methods for each function. For any
> unknown type, umath tries to translate the function into a method call, i.e....

Nothing about nulls is easy, as CJ Date will confirm (although
beyond that he's usually wrong regarding Nulls imho :) ).
Basically, the standard database theory breaks down and
becomes totally useless when nulls are introduced and treated
seriously.  CJ Date and the like tend to conclude that "nulls
are bad" and just throw them out (a convenience only a
theoretician has).

E.g., consider the table

name   age
john   5
fred   6
lisa   5
wally  null
lola   null

select those names with age 5.
IMHO the correct answer is *two* sets

inner = {john, lisa},  outer = {john, lisa, wally, lola}

The "correct" answer is somewhere in between
(if we had better information).
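
A minimal sketch of the two-set answer, assuming Python's None stands in
for an "unknown" null. The name select_bounds and the dict layout are
hypothetical, chosen only for illustration; nothing here is part of
Numeric or umath.

```python
# None stands in for an "unknown" null (an assumption of this sketch).
ages = {"john": 5, "fred": 6, "lisa": 5, "wally": None, "lola": None}

def select_bounds(table, value):
    """Return (inner, outer): names that certainly match, and names that might."""
    # Rows with a known, equal value are certain matches.
    inner = {name for name, v in table.items() if v == value}
    # Rows with an unknown value cannot be ruled out, so they join the outer set.
    outer = inner | {name for name, v in table.items() if v is None}
    return inner, outer

inner, outer = select_bounds(ages, 5)
```

Rows with a known, different age (fred) are excluded from both sets; the
"correct" answer lies somewhere between inner and outer.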

what is the average age?  null, I think.
rephrased, what is the average among
the known ages? (5+6+5)/3 (wally, lola thrown out).

Thus for accumulation calculations (including
len) null values sometimes should be thrown
out,  ie
   known_sum([1,2,3,Null,3]) = 9
   known_min_len([1,3,3,Null,3]) = 4 (not 5!)
here I say "known min len" because more generally
the Null might be a place holder that stands for
*several* missing values.
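
Roughly, the accumulations above could be sketched as follows. Null is
modelled as None, and known_mean is a hypothetical helper added to cover
the "average among the known ages" case; this is one possible reading,
not a proposal for how Numeric should behave.

```python
Null = None  # hypothetical stand-in for an unknown value

def known_sum(values):
    """Sum only the values that are actually known."""
    return sum(v for v in values if v is not Null)

def known_min_len(values):
    """Count the known values. This is only a lower bound on the true
    length, since a single Null may stand for several missing values."""
    return sum(1 for v in values if v is not Null)

def known_mean(values):
    """Average of the known values, or Null when nothing is known."""
    n = known_min_len(values)
    return known_sum(values) / n if n else Null
```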

On the other hand there are "not applicable" nulls
too

item   pages
ipwp   300
pp     900
win95  NA

what is the average number of pages? (900+300)/2
what has 900 pages?  {pp}
(win95 doesn't enter the calculation, it has no pages)
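
One way to keep the two kinds of null apart is to give each its own
sentinel. The sketch below assumes hypothetical UNKNOWN and NA markers
and hypothetical helper names; the point is only that NA rows drop out
with certainty, while an UNKNOWN row makes the plain average unknown.

```python
class _Sentinel:
    """Tiny marker object; each instance is equal only to itself."""
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return self.name

UNKNOWN = _Sentinel("UNKNOWN")  # a value exists but we do not know it
NA = _Sentinel("NA")            # no value exists for this row

pages = {"ipwp": 300, "pp": 900, "win95": NA}

def mean_pages(table):
    """Average over applicable rows; UNKNOWN if any applicable value is unknown."""
    values = [v for v in table.values() if v is not NA]
    if any(v is UNKNOWN for v in values):
        return UNKNOWN
    return sum(values) / len(values) if values else NA

def certainly_equal(table, value):
    """Names whose value is known to equal `value` (NA and UNKNOWN never match)."""
    return {k for k, v in table.items()
            if v is not NA and v is not UNKNOWN and v == value}
```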

"unknown" Nulls are also a special case of more general
"partial information" values, such as
numeric intervals, where

   sum([interval(5,6), interval(7,9)]) = interval(12,15)
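
A small sketch of the interval case, assuming closed numeric intervals
where adding two intervals adds their lower and upper bounds. The
Interval class and interval_sum are hypothetical names for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float  # smallest value the quantity could take
    hi: float  # largest value the quantity could take

    def __add__(self, other):
        # The sum lies between the sum of the lows and the sum of the highs.
        return Interval(self.lo + other.lo, self.hi + other.hi)

def interval_sum(intervals):
    """Add up a sequence of intervals, starting from the exact value 0."""
    return sum(intervals, Interval(0, 0))

result = interval_sum([Interval(5, 6), Interval(7, 9)])
```

An exactly known number x is then just Interval(x, x), and a fully
unknown one could be modelled as an interval with infinite bounds.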

Gosh, you could write a whole dissertation
on the topic (but don't, I already did :) ).

In sum, I suspect there is no quick answer for
treating nulls -- there might be uniform approaches,
but I'd like to understand the required semantics
very clearly before modifying numeric to support
them.  I suspect differing apps require radically
differing semantics...
   -- Aaron Watters

_______________
MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
_______________