[Tutor] SENTINEL, & more

Sat May 29 20:24:22 CEST 2010

On 05/29/10 18:29, spir ☣ wrote:
> Hello,
> 
> 
> from the thread: "class methods: using class vars as args?"
> 
> On Sat, 29 May 2010 11:01:10 +1000 Steven D'Aprano
> <steve at pearwood.info> wrote:
> 
>> On Fri, 28 May 2010 07:42:30 am Alex Hall wrote:
>>> Thanks for all the explanations, everyone. This does make sense,
>>> and I am now using the if(arg==None): arg=self.arg idea. It only
>>> adds a couple lines, and is, if anything, more explicit than what
>>> I was doing before.
>> 
>> You should use "if arg is None" rather than an equality test.
>> 
>> In this case, you are using None as a sentinel value. That is, you
>> want your test to pass only if you actually receive None as an
>> argument, not merely something that is equal to None.
>> 
>> Using "arg is None" as the test clearly indicates your intention:
>> 
>> The value None, and no other value, is the sentinel triggering
>> special behaviour
>> 
>> while the equality test is potentially subject to false positives,
>> e.g. if somebody calls your code but passes it something like
>> this:
>> 
>>     class EqualsEverything:
>>         def __eq__(self, other):
>>             return True
>> 
>> instead of None.
> 
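
Steven's point can be checked directly. The following is only a minimal
sketch: the two pick_* functions and the "fallback" default are
hypothetical names made up for this illustration, not code from the
thread.

```python
class EqualsEverything:
    """An object that claims to be equal to anything (Steven's example)."""
    def __eq__(self, other):
        return True


def pick_by_equality(arg=None, default="fallback"):
    # Fragile: any object whose __eq__ returns True triggers the fallback.
    if arg == None:
        return default
    return arg


def pick_by_identity(arg=None, default="fallback"):
    # Robust: only the None object itself triggers the fallback.
    if arg is None:
        return default
    return arg


weird = EqualsEverything()
print(pick_by_equality(weird))           # fallback (a false positive)
print(pick_by_identity(weird) is weird)  # True
print(pick_by_identity())                # fallback
```
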
> I'll try to clarify the purpose and use of sentinels with an example.
> Please, advanced programmers correct me. A point is that, in
> languages like python, sentinels are under-used, because everybody
> tends to use None instead, or as an all-purpose sentinel.

Sentinels are underused not because everyone uses None, but because in
many cases sentinels can be dangerous if not explicitly checked. In many
cases, python prefers Exceptions (e.g. for-loop iteration) to sentinels.
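
As a small illustration of that danger (a sketch using only built-in
string and iterator behaviour; the variable names are mine): str.find
reports "not found" with the sentinel -1, while str.index and the
iterator protocol report failure with an exception.

```python
text = "sentinel"

# str.find signals "not found" with the sentinel value -1 ...
pos = text.find("z")
# ... but -1 is also a valid index, so a forgotten check fails silently.
unchecked = text[pos]   # quietly picks the last character

# str.index signals the same condition with an exception instead,
# so the failure cannot be ignored by accident.
try:
    text.index("z")
    raised = False
except ValueError:
    raised = True

# Iteration ends the same way: the iterator raises StopIteration
# rather than returning a special "end" value.
it = iter([1, 2])
first, second = next(it), next(it)
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

print(pos, unchecked, raised, exhausted)   # -1 l True True
```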

> Imagine you're designing a kind of database of books; with a user
> interface to enter new data. What happens when an author is unknown?
> A proper way, I guess, to cope with this case, is to define a
> sentinel object, eg:
>     UNKNOWN_AUTHOR = object()
> There are many ways to define a sentinel; one could have defined
> "=0" or "=False" or whatever. But this choice is simple, clear, and
> secure because a custom object in python will only compare equal to
> itself -- by default. Sentinels are commonly written uppercase
> because they are constant, predefined, elements.

In this case, I would prefer an unknown author to be an empty string
(i.e. ""), because an object() sentinel does not survive serialization
to the database. It also has to be special-cased everywhere, whereas an
empty string needs special handling only where the distinction actually
matters.
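
A rough sketch of the persistence problem, with pickle standing in for
"store in the database and read back" (that substitution, and the name
EMPTY_AUTHOR, are assumptions made for the example):

```python
import pickle

UNKNOWN_AUTHOR = object()   # in-memory sentinel
EMPTY_AUTHOR = ""           # empty-string convention

# Round-trip both values through serialization.
restored_obj = pickle.loads(pickle.dumps(UNKNOWN_AUTHOR))
restored_str = pickle.loads(pickle.dumps(EMPTY_AUTHOR))

# The restored object is a brand-new instance, so both the identity test
# and the default equality test against the sentinel now fail.
print(restored_obj is UNKNOWN_AUTHOR)   # False
print(restored_obj == UNKNOWN_AUTHOR)   # False

# The empty string still compares equal after the round trip.
print(restored_str == EMPTY_AUTHOR)     # True
```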

> Hope I'm clear. In the very case of UNKNOWN_AUTHOR, it would hardly
> have any consequence to use "==", instead of "is", as relational
> operator for comparison. Because, as said above, by default, custom
> objects only compare equal to themselves in python. But
> * This default behaviour can be overridden, as shown by Steven above.
> * Using "is" clarifies your intent to the reader, including yourself.
> * Not all languages make a difference between "==" and "is".
>   (Actually, very few do it.)
> Good habits...
> 
> 
> 
> === additional stuff -- more personal reflection -- criticism welcome
> ===
> 
> Sentinels belong to a wider category of programming elements, or
> objects, I call "marks". (Conventional term for this notion welcome.)
> Marks are elements that play a role in a programmer's model, but have
> no value. What is the value of NOVICE_MODE for a game? of the SPADE
> card suit? of the character 'ø'? These are notions, meaning semantic
> values, that must exist in an application but have no "natural" value
> -- since they are not values semantically, unlike a position or a
> color. 

What *is* "value"? Is there any difference between "semantic value" and
"natural value"? IMHO, there is no difference; "numerical values" are
only a subset of all "values".

> In C, one could use a preprocessor flag for this:
>     #define NOVICE_MODE
>     ...
>     #ifdef NOVICE_MODE
>     ...
>     #endif
> NOVICE_MODE is here like a value-less symbol in the program:
> precisely what we mean. But not all languages have such features.
> (Indeed, there is a value behind the scene, but it is not accessible
> to the programmer; so, the semantics is correct.)
> 
> Thus, we need to _arbitrarily_ assign marks values. Commonly, natural
> numbers are used for that: they are called "nominals" (-->
> http://en.wikipedia.org/wiki/Nominal_number) precisely because they
> act like symbol names for things that have no value. The case of
> characters is typical: that 'ø' is represented by 248 is just
> arbitrary; we just need something, and software can only deal with
> values; 

Digital computers can only deal with "natural numbers" (i.e. {0, 1, 2,
3, ...}); that's why we need to encode all values as natural numbers.
Integers map nicely to natural numbers (0:0, 1:1, -1:2, 2:3, -2:4, 3:5,
-3:6, 4:7, -4:8, ...).
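
The mapping listed above can be written out as a pair of functions. This
is just a sketch; the names encode and decode are mine.

```python
def encode(n):
    """Map an integer to a natural number: 0, 1, -1, 2, -2 -> 0, 1, 2, 3, 4."""
    # Positive integers take the odd slots, zero and negatives the even ones.
    return 2 * n - 1 if n > 0 else -2 * n


def decode(k):
    """Inverse of encode: recover the integer from its natural number."""
    return (k + 1) // 2 if k % 2 else -(k // 2)


samples = (0, 1, -1, 2, -2, 3, -3, 4, -4)
print([encode(n) for n in samples])   # [0, 1, 2, 3, 4, 5, 6, 7, 8]
print(all(decode(encode(n)) == n for n in range(-1000, 1001)))   # True
```

Because every integer gets exactly one natural number and vice versa,
nothing is lost by the encoding.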

Everything has a value, but the question of whether such value is
representable in a computer is equivalent to asking whether the value is
representable as integers, or in other words, whether the "cardinality"
of the set of all such possible values is less than or equal to the
"cardinality" of the set of all integers.

In cases where the value is not representable in integers (such as the
case of real numbers), then in many practical situations, we make do
with a subset of the possible values and approximate the rest (e.g. the
'float' type is an encoding of a subset of the real numbers into
integers, 'string' is an encoding of text/stream into a list of
integers).
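
Both encodings can be inspected from Python itself. A small sketch,
assuming the usual 64-bit IEEE 754 'float' that CPython uses on common
platforms; the variable names are mine.

```python
import struct

# A float is stored as 64 bits; reading those same bits back as an
# unsigned integer shows the natural number that encodes it.
bits_of_one = struct.unpack("<Q", struct.pack("<d", 1.0))[0]
print(hex(bits_of_one))    # 0x3ff0000000000000

# Only a subset of the reals is representable; the rest is approximated,
# which is why this comparison fails.
print(0.1 + 0.2 == 0.3)    # False

# Text is a sequence of integers (code points), and back again.
code_points = [ord(ch) for ch in "spir"]
print(code_points)
print("".join(chr(n) for n in code_points))   # spir

# The character mentioned earlier in the thread:
print(ord("ø"))            # 248
```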

Whether a problem is solvable by a digital computer depends on whether
the problem can be encoded into a countable set:
http://en.wikipedia.org/wiki/Countable_set

> An interesting exercise is to define, in and for python, practical
> types for isolated marks (sentinels), mark sequences (enumerations),
> and mark sets.

See: http://code.activestate.com/recipes/413486/
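
For comparison, here is one possible sketch of the three kinds of marks
using the enum module that the standard library gained later (Python 3.4
and up). It is not what the recipe above does, and the names Suit and
BLACK are hypothetical; only SPADE and UNKNOWN_AUTHOR come from the
thread.

```python
from enum import Enum


class Suit(Enum):
    # The numbers are arbitrary "nominals"; only the member identity matters.
    CLUB = 1
    DIAMOND = 2
    HEART = 3
    SPADE = 4


# Isolated mark: a unique object, to be compared with "is".
UNKNOWN_AUTHOR = object()

# Mark sequence: members iterate in definition order.
print([suit.name for suit in Suit])   # ['CLUB', 'DIAMOND', 'HEART', 'SPADE']

# Mark set: a frozenset of members.
BLACK = frozenset({Suit.CLUB, Suit.SPADE})
print(Suit.SPADE in BLACK)            # True

# A mark is not equal to the number standing behind it.
print(Suit.SPADE == 4)                # False
```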