New language

Fri Jun 1 03:15:37 EDT 2001

In comp.object Topmind <topmind at technologist.com> wrote:
[snip]
>> >> Another good example is for
>> >> instance x, y coordinates you want to pass around, without going to the
>> >> trouble of  dealing with anything more complicated.
>> 
>> > What keeps a dictionary from satisfying this role?
>> 
>> They're too heavy, again. :)

> Please clarify "heavy".

I think I did elsewhere. Too heavy in syntax, and there are also 
implementation memory concerns. A hash table including keys is
going to be significantly larger in memory usage than a tuple with
two references in it. If you're going to have a lot of them, this
may be a drag.
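
A rough way to check the memory point in a present-day CPython is
sys.getsizeof. This is only a sketch: the byte counts depend on the
interpreter version and platform, and getsizeof reports the container
alone, not the objects it refers to. The names as_tuple and as_dict are
made up for the illustration.

```python
import sys

# The same x, y coordinate stored two ways (illustrative names).
as_tuple = (3, 4)
as_dict = {"x": 3, "y": 4}

# getsizeof measures the container only, not the integers or key strings.
tuple_size = sys.getsizeof(as_tuple)
dict_size = sys.getsizeof(as_dict)

print("tuple:", tuple_size, "bytes")
print("dict: ", dict_size, "bytes")
```

The difference per object is small, so it only starts to matter when
there are very many of them.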

However, as I said before, it's mostly a syntax issue to me. There is
a case to be made for a kind of tuple-like dictionary, though the
advantages of that are minimal in my opinion; for anything more complex
I'd be inclined to use a full-fledged object anyway. Dictionaries are
more for storage of many homogenous objects, while tuples and instances are
more for collecting heterogenous objects into a bundle/record/object.
I know instances are dictionaries with some sugar, but I mean in 
programming idiom, which is important too.

[snip]
>> > I think that was my case, more or less. Touples, dictionaries, and
>> > classes have too much *overlap* in Python. It just seems to me that
>> > they could have factored the 3 into *one* thing. It keeps the
>> > language cleaner and the learning curve shorter that way.
>> 
>> I disagree; I think a distinction like this can help sometimes. Look at
>> Perl and their scalars, which merges things like integers, floats and
>> strings into 'one thing'. 

> I like that approach. It makes the code leaner and cleaner IMO.
> Less casting, converting, and declaration clutter. It allows
> you to look at raw business logic instead of diddling with
> conversions, casting, and bloated declarations.

You're confusing things here; you're confusing the effect of static
type checking (declarations and casting) with that of having different
datatypes. In Python, you don't need to declare or cast integers, only
convert when necessary. You need to convert an integer to a string if you
want to use integers in a string, for instance.

Because of this, the program stops when you do something silly, instead of 
going on blindly and making a mishmash of your data.
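
A small sketch of what I mean, in present-day Python syntax. The two
helper functions are hypothetical names for the illustration; the only
point is that mixing a string and an integer without converting raises
a TypeError rather than producing a guess.

```python
def describe(count):
    # Explicit conversion: the integer is turned into a string on purpose.
    return "count: " + str(count)


def describe_carelessly(count):
    # No conversion: Python refuses to guess what str + int should mean.
    return "count: " + count


print(describe(3))

try:
    describe_carelessly(3)
except TypeError as error:
    stopped = True
    print("stopped with TypeError:", error)
else:
    stopped = False
```

No declaration or cast appears anywhere, yet the mistake is still caught
at the moment it happens.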

This is *not* the same argument as that for static type checking however;
it is important to see the distinction. It's an argument for a light-weight
dynamically checked type (or interface/protocol) system.

What you seem to be describing as the benefits of the Perl scalar may
instead be the benefits of the absence of statically checked types.

Of course there are also type inference systems, which may
combine the best of both worlds; I haven't had much experience
with these, so I can't tell you whether that's true. OCaml has
such a system, for instance.

>> They also seem to merge lists and dictionaries
>> (hashes) into one thing. I think that's bad; you want your programming
>> language to complain if you're treating something as an integer when
>> it's really a string (can't add "one" and "two").

> If this is degenerating into the age-old strong versus weak
> typing battle, then I will leave it here.

See above for what I think is the subtle but crucial distinction. I'll
remain quiet about it now, though. :)

> A jillion messages
> are already devoted to that topic, with no "killer proof"
> on either side. It may be subjective which is the "best".
> I grew up on strong typing, but have gravitated toward
> prefering dynamic typing over the years.

Me too.

>> It's funny you should compare tuples with dictionaries and say they 
>> should be conflated; most people complaining about tuples say they're
>> too much like *lists* (arrays). They're right that they're very much
>> like lists, 

> That too. Roll 'em all up. Requirements change. I hate recoding
> from lists to touples to dictionaries to tables, etc.
> Make the interfaces the *same*, and only swap the engine, NOT
> the interface.

Oh, I'd say make the interfaces different, use the same engine where
possible. I changed my mind a little about dictionaries; in practice
they're often used to store lots of homogenous values, not as a kind
of datatype (in Python, class instances (objects) are used for that).

I think it's a myth that having a universal rolled-into-one collection
type helps your program deal with change more easily. In my Python
programs, I use lists and tuples and dictionaries and tables in 
rather different places in different idioms. While it is possible I
change one into the other on occasion, this is the exception, not the rule.
When such changes do happen, so many other changes tend to happen that it
doesn't really matter anymore anyway; the change in collection type is
probably caused by such a larger change.

With a universal collection type you lose some of the benefit of these
separate idioms (which can help with the readability of the program). You
may also see more errors: without different interfaces and idioms, you run
a higher risk that the program will continue after an error and mangle
your data in unpredictable and hard to track down ways.
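
A contrived sketch of the kind of thing I have in mind. The function
names are hypothetical, and I picked the forgiving .get lookup on
purpose to stand in for a 'one interface fits all' style; a plain
point["y"] would raise a KeyError instead.

```python
def unpack_point(point):
    # Tuple unpacking insists on exactly two items.
    x, y = point
    return x, y


def lookup_point(point):
    # A generic mapping lookup with a default carries on regardless.
    return point.get("x"), point.get("y")


try:
    unpack_point((1, 2, 3))
except ValueError:
    caught = True
else:
    caught = False

# The key is misspelled as "z", but nothing complains; y is quietly None.
silently_wrong = lookup_point({"x": 1, "z": 3})
```

The first mistake stops the program at once; the second travels on as a
None until something else trips over it.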

Anyway, as I said before, you're in the minimalist camp here, along with
Smalltalkers (everything's an object with messages) and Lispers (everything's
a list). I take the position that syntactic sugar can help with idioms,
which can help with clarity, readability and error detection.

>> except that they're immutable (like integers and strings
>> in Python, but unlike lists and dictionaries and instances). Your
>> desire to conflate them with dictionaries is in my opinion wrong as well,
>> but you're more right than those who want to merge them with lists; 

> Show me "wrong".

Wrong as in "I think there are arguments against this which you are missing
and I disagree with your evaluation of the tradeoffs". This is a
subjective issue. I imagine you can do empirical research about programming
language effectiveness and these issues, but I'm not going to do it.
Are you? If not, you'll have to accept that my efforts in trying to show
you 'wrong' are as valid as your efforts to show me wrong and yourself right,
here. The alternative is saying you're doing no such thing, in which
case I wonder what we're doing. :)

>> tuples are generally used as 'records' (heterogenous objects) and not
>> as lists of homogenous objects.

> Doesn't matter. Needs change. See above. Homo today, hetero tomorrow.
> Micheal Jackson Collections, you could say.

Heterogenous collections are not going to change into homogenous collections,
or vice versa, in the vast majority of circumstances. If you disagree, you
should name some cases; I can't think of any.

By 'homogenous collection' I mean a collection of 'like' objects
(English words, files, animals, records with address data, etc). 
By 'heterogenous collection' I mean a collection of significantly
different objects ("an integer, a string and a list", "a first name,
a middle name and a last name", "a word and the frequency of its 
occurrence in a text", "an x coordinate and a y coordinate").
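
To make the two definitions concrete, here is a sketch using the word
frequency case. The sample text and variable names are invented for the
illustration.

```python
text = "the cat saw the dog and the dog saw the cat"

# Homogenous collection: every entry plays the same role (a count for a word),
# and there can be any number of them.
frequencies = {}
for word in text.split():
    frequencies[word] = frequencies.get(word, 0) + 1

# Heterogenous bundle: a word and its frequency, two different kinds of
# thing, each with a fixed position.
most_common = max(frequencies.items(), key=lambda pair: pair[1])
word, count = most_common
```

The dictionary can grow to thousands of entries without changing shape;
the (word, count) pair is always exactly two things.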

>> Anyway, you're in the LISP and Smalltalk camp here; do a lot with just 
>> a few syntactic (surface semantic) concepts. 

> As far as collections, yes you can say that.  (Although Smalltalk's
> collection API's are still too taxonomy-happy for my tastes.)

>> A language like Python
>> adds more syntactic sugar, and my theory is that this syntactic 
>> sugar *helps* programmers write and read programs. 

> Perhaps it depends on the programmer. Also, there is maintainability.
> Having dedicated syntax for certain (false) categorizations of
> collections may make *some* code easier to read, but still makes
> it harder to change when collection needs grow, morph, or change.

Yes, there are definitely tradeoffs there and I recognize those 
tradeoffs. I think the tradeoffs for collections weigh in a different
direction, however. That's not to say I want the huge forest of collections
that you can see in some statically typed languages, where they have
arrays for integers, arrays for strings, arrays for floats, and so on
ad infinitum. While arrays are usually for homogenous collections I think
the strict specification and checking of such can bog down the programmer
too much. It's also fine with me if collections share an underlying
implementation in some cases, if this is easier or more efficient.
But for me, the balance of the tradeoffs still leans towards more
collection interfaces than just a single one.

>> Too much syntactic
>> sugar can result in messes (Perl is another good example here), but
>> too little also has some disadvantages. People are sometimes a bit too
>> focused on the conceptual purity and the ability to manipulate program
>> code; in a pragmatic language the tradeoffs may sometimes favor more
>> syntactic sugar, not less.

> Having to re-code collections *is* a programatic concern.

But as I tried to show before, such collections are often used in
entirely different circumstances, so the necessity to recode a collection
is minimal. And also as said before, even when a recode is necessary,
this is usually a) not a huge deal to do and b) part of a larger change
already and minimal work compared to that.

> [snip]
>> # okay, make a 'child dictionary'; the instance of the class
>> a = A() # uses __init__

> Case sensitivity, YUK!

In idiomatic Python code, classes start with a capital, and instances
don't. It makes examples pretty clear. Let's not go into case
sensitivity issues here; it seems to be almost entirely based on
personal preference, like C indenting styles. :)

Anyway, is that the only response you had to my example? I showed you how
Python was already doing more or less what you said it should be doing. 
An 'oh, cool' or 'huh?' or 'that's not what I mean' would've been worth
my troubles.

> [snip]
>> >> > Besides, what is wrong
>> >> > with regular by-reference parameters? 
>> >> 
>> >> Nothing at all, except that returning multiple values is far more clear
>> >> by just about any measure you can come up with. :)
>> 
>> > Which would be?
>> 
>> > I suppose you could argue that under the old approach
>> > one could not tell what was being changed and what was
>> > not by looking at the caller. However, you might have to check
>> > the bottom or middle instead of the top of a routine to
>> > figure out the result parameter interface in Python.
>> 
>> Usually the bottom, yes, unless you document it at the top in 
>> a docstring. Looking for 'return' statements isn't terribly
>> difficult, either.

> But harder than looking at the top.

Yes, but the problem already exists in any dynamically typed language
where any kind of heterogenous collection can be returned, and you said
you prefer dynamic typing. It doesn't add to the problem therefore;
it's just as hard if you're returning a record or dictionary. The
advantage of tuples is that they can be instantly unpacked after the
function call. 
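
A sketch of the difference at the call site, with hypothetical function
names. Both versions require a look at the return statement to learn
what comes back; only the tuple version unpacks in one step.

```python
def min_max(values):
    # Return two results as a tuple.
    return min(values), max(values)


def min_max_record(values):
    # The same results as a dictionary 'record'.
    return {"low": min(values), "high": max(values)}


# Tuple: unpacked immediately after the call.
low, high = min_max([3, 1, 4, 1, 5])

# Dictionary: the caller has to pick the pieces out by key.
record = min_max_record([3, 1, 4, 1, 5])
low2, high2 = record["low"], record["high"]
```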

>> > IOW, it might trade caller readability for callee
>> > readability. At the most it is a wash IMO.
>> 
>> I disagree; caller readability is not significantly affected and
>> callee readability (in multiple places) is improved. A clear win,
>> therefore.

> I am not sure how you are doing your math here. I figure one
> always has to go the the function definition and parameter
> list *anyhow* to understand the function's interface. 
> Thus, having it defined in the heading is a one-stop deal.

Without declarations, that's just a name in an argument, and the 
'changeable reference' indicator. It's true that is a bit more
explicit.

> Having to also check return statements is a two-stop deal.
> (I don't end up looking at return statements very often.)

The other deal is that I don't have to go look up the function definition
each time I see a function call I don't know about, just in case this may
involve reference parameters! That's a huge deal in my opinion. :)

>> > Having the entire interface defined at the top is
>> > a good thing IMO. (Although "return" is rarely
>> > at the top, but it is a single item if it
>> > exists.)
>> 
>> A single item of any kind of complexity, anyway, and a serious tradeoff
>> in readability for the callee as there are now two different ways you can
>> return values, one of which (reference parameters) is a hack.   

> Define "hack".

Mathematical functions, which inspired functions in programming languages,
don't have 'reference parameters'. They just have inputs. I therefore
presume that functions in programming languages originally didn't have them
either, and that someone added them in an early hack in order to support
multiple output values. The hack makes sense if your language is statically
typed, as you can then define the types of all the output values in the
same way as you already defined the types of the input values. You don't have
to think about extra syntax. It doesn't make a lot of sense in a 
dynamically typed language, though.

(And of course in Python you can mutate mutable objects passed to a function,
and *any* variable in Python is a reference. But it's better style to avoid
mutating input if possible, in my opinion. It encourages more independent
functions, which makes for code that is easier to maintain and debug.)
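
A sketch of the two styles side by side; scale_in_place and scaled are
hypothetical names chosen for the illustration.

```python
def scale_in_place(values, factor):
    # Mutates the caller's list; nothing at the call site shows this.
    for index, value in enumerate(values):
        values[index] = value * factor


def scaled(values, factor):
    # Leaves the input alone and returns a new list.
    return [value * factor for value in values]


original = [1, 2, 3]
result = scaled(original, 10)   # original is untouched

shared = [1, 2, 3]
scale_in_place(shared, 10)      # shared itself has changed
```

With the second style, everything a function produces is visible on the
left-hand side of the assignment.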

[snip]
>> Come *on*, man! Harder at the callee *and* caller side, even in a language
>> that constructs dictionaries as easily as Python.

> You are assuming that you want to *split* them out when returned. What
> if you want to keep them together as a record (dictionary)?

Keep your tuple as it is.

def foo():
    return 1, 2

coord = foo()

bar(coord)

And unpack it sometime later:

x, y = coord

Note that the idiom for 'heterogenously used' dictionaries in Python is
the class instance. For more complicated things than just a coordinate
pair I'd generally use these. 
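
As a sketch of that idiom, here is a made-up Customer class. The field
names and values are invented; the point is only that the instance
bundles unlike things under names and can carry operations too, while
still being a dictionary underneath.

```python
class Customer:
    def __init__(self, name, city, balance):
        # A name, a place and an amount: three different kinds of thing.
        self.name = name
        self.city = city
        self.balance = balance

    def owes_money(self):
        # Behaviour lives next to the data it works on.
        return self.balance < 0


customer = Customer("Jansen", "Amsterdam", -25)

# The attributes are kept in an ordinary dictionary.
as_dictionary = vars(customer)
```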

[snip]
>> > Because I am not convinced it is significantly better. As a rule of
>> > thumb, I say something has to be at least 15 to 30 percent better to
>> > deviate from tradition. Perhaps if I saw more actual uses for 
>> > it besides foo-bar examples, but I have not.
>> 
>> '15 to 30 percent better': failure to grok error.

> Perhaps you grok differently than me.

>> If you mean the amount
>> of typing, I can see it's far more than 30 percent better. But you
>> probably don't mean that, and it's fairly meaningless beyond that.

So, how *do* you arrive at these 15 to 30 percent better figures?
They imply some kind of objective measurement; did you make one?

>> > Most "data structures" I deal with are more than 2 positions.
>> > Thus, I use tables, and perhaps a dictionary-like thing to
>> > interface to such records. (I prefer to use tables to store
>> > data instead of dictionaries themselves, other than an interface
>> > mechanism to specific records.) Perhaps some niches have lots of
>> > "skinney collections" where touples may help, but not mine.
>> 
>> Well, I tried to describe such a niche; returning multiple things from
>> a function. Another niche is indeed the very light weight record 
>> niche; x, y coordinates for instance. Yet another niche, harder to
>> describe is the 'make a new immutable object from other immutable
>> objects' niche. 

> I meant industry domains, like business versus embedded systems versus 
> scientific computing, etc. I don't do a lot of X, Y coordinate work, BTW.

A European example from an industry domain is the 'year/weeknumber' tuple.
In Europe, industry often works with (ISO) weeknumbers. To calculate
a weeknumber back to a date (the beginning of the week), you need the year as
well, so it can make sense to pass these around as pairs in one's application.
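
A minimal sketch of passing such a pair around, using the standard
datetime module of present-day Python. The function names week_start
and year_week_of are hypothetical; the calculation relies on the ISO
rule that January 4th always falls in week 1.

```python
import datetime


def week_start(year_week):
    # Unpack the (year, weeknumber) pair that is passed around as a tuple.
    year, week = year_week
    # By the ISO rule, January 4th always falls in week 1.
    jan4 = datetime.date(year, 1, 4)
    week1_monday = jan4 - datetime.timedelta(days=jan4.weekday())
    return week1_monday + datetime.timedelta(weeks=week - 1)


def year_week_of(day):
    # isocalendar() gives (ISO year, ISO weeknumber, ISO weekday).
    iso_year, iso_week, _weekday = day.isocalendar()
    return iso_year, iso_week


posted = datetime.date(2001, 6, 1)
this_week = year_week_of(posted)    # a (year, weeknumber) pair
monday = week_start(this_week)      # the Monday that week begins on
```

The weeknumber alone would be ambiguous, which is why the two values
travel together.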

[swapping two values with tuples]
>> But it *is* obvious what is going on as you already understand both
>> tuple unpacking and tuple construction. 

> But it is just Yet Another Silly Trick To Understand.

It's not a 'silly trick' like many Perl 'silly tricks' where the trick
is merely in syntax and not the *consequence* of an orthogonal syntax.
If your syntax is orthogonal you can reason about it, so it's not 
a silly trick you need to remember in isolation. That's a very different
thing when you're learning a language.   
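
Spelled out as a sketch, the swap is nothing more than the two things
you already know, done one after the other:

```python
a, b = 1, 2

# Step by step: build a tuple on the right, then unpack it on the left.
pair = (b, a)
a, b = pair
swapped = (a, b)

# The same two steps written in one line swap them back again.
a, b = b, a
```

Nothing in the one-line form needs to be remembered separately from
tuple construction and tuple unpacking.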

>> We're just doing both in a
>> single line. There's nothing special case about this. It's not *hard*
>> to understand tuple construction and unpacking. You're clinging to
>> your traditions here just for argument's sake. :)

> Nope! I am weighing utility versus complexity, and it flunks
> in my book. Save the syntax complexity for the *common*
> stuff.

I tried to show you how this *is* common stuff. Not swapping variables,
but collecting a bunch of things together and passing them around as
a whole, and returning them as a whole, and easily separating them into
pieces again. It happens frequently in software.

When the number of heterogenous objects you're packing together into
a bundle is large or the situation is complicated (need to do many
operations on them), it makes sense to use a record or a class.

In many circumstances it is however not a complicated bunch and it's
easier not to use such a thing and use the syntactically and semantically
minimal tuple instead.

Yes, a small record can grow into a large one, and you will have to adapt
some code when it does (in a dynamically typed language, not a lot). 
There are many cases when this just doesn't happen, though; x, y coordinates
are an example, so are year/weeknumbers, or 'year/month/day' triples, or
'amount/currency_type' pairs, and so on.

>> If this were the *main* reason for having tuples in a language, then of
>> course I'd agree with you. But this is just a consequence of their
>> presence.

> Same issue. They don't add anything that a dictionary couldn't do.
> They make save a few keystrokes here or there, but not worth
> Yet Another Collection Type IMO.

Yes, you're a syntactic minimalist and don't agree on the tradeoffs. :)

[snip]
>> and I haven't talked about using tuples to catch 
>> an arbitrary amount of arguments to a function yet, something which 
>> can be useful sometimes. For me, tuples are an optimisation of
>> common stuff.  

> Well, I just like to limit collection types. Collection needs scale
> all over the map, and too many interfaces makes for too many
> code overhauls when collection needs change.

As I tried to show, I disagree that many collections need to scale
across the map. Heterogenous collections generally don't, and many small
and/or throwaway homogenous collections don't either (let's say a
registry of object types; you know you're not going to have millions
but at most just dozens, or a list of the fields in your form).

>> > Don't get me wrong, there are languages a lot worse than Python,
>> > but the poor consolidation of the similar things I mentioned
>> > kind of bug me.
>> 
>> I see these syntactic issues in a somewhat different philosophical light,
>> something which I tried to describe above. While I'm all in favor of
>> semantic minimalism, I'm not a syntactic minimalist. If you're a
>> syntactic minimalist these subtle differences make no sense, indeed.
>> 
>> Anyway, if I were designing a new language I would indeed attempt to
>> bring dictionaries and tuples closer together, so we're in agreement
>> in that sense as well. I'm just defending the special syntax for tuples, though
>> I also wonder about performance (but we'd just have to profile it) if
>> all tuples were dictionaries.

> Performance often only becomes an issue when stupid programmers play with
> too many features. Thus, reduce the syntax features and you have less 
> playing around with wasteful things and cryptic tricks.

The tradeoff here is that one thing often doesn't fit all. You sometimes
need multiple things optimized for different situations. If you don't,
you may end up with exactly the situation you describe here; cryptic
tricks to make the 'one-size-fits-all' thing behave in a way it wasn't
designed for.

Regards,

Martijn
-- 
History of the 20th Century: WW1, WW2, WW3?
No, WWW -- Could we be going in the right direction?