
Hello,

I have recently been thinking about lexical distinctions around the notion of data (--> e.g. for a starting point http://c2.com/cgi/wiki?WhatIsData), not only but especially in Python. I ended up with the following questions:

Can one state "in Python value=data=object"?
Can one state "in Python speak value=data=object"?
What useful distinctions are or may be made, for instance in documentation?
What kind of difference in actual language semantics may such distinctions mirror?

Denis

PS: side-question on English: I am annoyed by the fact that in English "data" is mainly used & understood as a collective (uncountable) noun. "datum" (singular) & "datas" (plural) seem to be considered weird. How to denote a single unit of data without using the phrase "piece of data"? Can one still use "datum" or "datas" (or "data" as plural) and be trivially understood by *anybody* (not only scientists)? Or else what word should one use?
--> Compare: German Datum/Daten http://de.wiktionary.org/wiki/Datum, French donnée/données http://fr.wiktionary.org/wiki/donn%C3%A9e, etc...
________________________________
vit esse estrany ☣ spir.wikidot.com

On 04/15/2010 12:37 PM, spir ☣ wrote:
Uh.. you are trying to have a discussion about detailed semantics without defining what you mean by any of your terminology. As I see it (which is undoubtedly misleading in many respects):

- the set of all data is countably infinite and unordered (perhaps segments of memory);
- the set of all values is countably infinite and partially ordered;
- an object is a member of the set of (datum, value) pairs (perhaps a memory address coupled with a type).

In Python, each object with a different value has a different datum (and for the most part, objects with the same values have different data too). The only operation that concerns the datum of an object is the "is" operator. The value of an object is more useful, and this is what all the other comparison operators (and most other functions) deal with. It shouldn't be necessary to make the distinction between an object and its value in documentation (though it is (I presume) occasionally useful in actual code to distinguish objects with the same datum, perhaps in cycle detection).

Conrad
You can use datum/data as countable but it sounds forced (as above).
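To make the cycle-detection remark above concrete, here is a minimal sketch of a check that relies on identity rather than value. The helper name `contains_cycle` and the sample variables are hypothetical, chosen for illustration only; the sketch handles just lists, tuples and dicts.

```python
def contains_cycle(obj, _active=None):
    """Return True if a nested list/tuple/dict structure contains itself.

    Hypothetical helper: it tracks the id() of every container currently
    being traversed, so it compares identities, never values.
    """
    if _active is None:
        _active = set()
    if not isinstance(obj, (list, tuple, dict)):
        return False
    key = id(obj)
    if key in _active:
        # We reached a container that is still being traversed: a cycle.
        return True
    _active.add(key)
    children = obj.values() if isinstance(obj, dict) else obj
    try:
        return any(contains_cycle(child, _active) for child in children)
    finally:
        _active.discard(key)


a = [1, 2]
b = [1, 2]
nested = [a, b]          # equal values, distinct objects: no cycle
looped = [1, 2]
looped.append(looped)    # this list really does contain itself
```

Comparing by value would not work here: `a == b` is true although neither contains the other, and comparing a self-containing list with `==` is not a reliable way to find the loop.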

On Thu, 15 Apr 2010 14:31:49 +0100 Conrad Irwin <conrad.irwin@googlemail.com> wrote:
This is precisely the point: how to properly use given terms -- giving them semantic distinctions or not.
[...]
(and for the most part, objects with the same values have different data too).
Right, this makes sense to me. And do you mean data are different as soon as located at different places in memory even if bit-per-bit equal? Or only that value holds a notion of interpretation (due to the type)?
[...]
Thank you. Let us take the case of a simple assignment:
    name = expression
Once the expression is evaluated, what we get is commonly called a value, right? But in numerous places the result of data lookup is called object instead. While conceptually, for me, it's exactly the same thing.
    x.a = 1
results in an attribute 'a' with "value" 1, in x.
    b = x.a
looks up the "object" denoted by the attr name 'a' in x.
(I'm not trying to annoy people, just to clarify common notions.)
Conrad
________________________________ vit esse estrany ☣ spir.wikidot.com

2010/4/15 spir ☣ <denis.spir@gmail.com>:
Data is a buzzword. Object is, well, an object. It has a distinct id/address/location/whatever. (Though these may be reused after an object is destroyed.) Value is semantically defined by the type/class -- e.g. for numbers, two objects with the value 1 have the same value (even if the type is different, e.g. 1 and 1.0); and similar for strings. But for other object types the value is just the object identity. And for tuples it actually depends on the items. IOW values are defined by __eq__.
There is no notion of bit-per-bit equal. Only object identity (same address) and value equality defined by __eq__.
No, an object. (Which has a value but the meaning of the value depends on the type.)
But in numerous places the result of data lookup is called object instead.
Please stop using 'data'. It has no meaning in this context.
No, attribute lookup returns an object, not a value.
b = x.a looks up the "object" denoted by the attr name 'a' in x.
All this is defined by object semantics. Value semantics apply at a later level.
(I'm not trying to annoy people, just to clarify common notions.)
-- --Guido van Rossum (python.org/~guido)
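The identity/value distinction described in the message above can be checked directly at the interpreter. The following is only an illustrative sketch; the class `Plain` and all variable names are hypothetical.

```python
class Plain:
    """Hypothetical class that does not define __eq__."""


one_int, one_float = 1, 1.0
same_value = one_int == one_float      # numeric __eq__ compares values across types
same_object = one_int is one_float     # an int and a float are distinct objects

p, q = Plain(), Plain()
default_eq = (p == p, p == q)          # without __eq__, equality falls back to identity

first, second = [1, 2], [1, 2]
lists_equal = first == second          # list __eq__ compares the items
lists_identical = first is second      # two separately built lists

# Tuple equality depends on the items, which are compared with their own __eq__.
tuples_equal = (1, first) == (1.0, second)

x = Plain()
x.a = first
b = x.a                                # attribute lookup returns the stored object itself
lookup_is_same_object = b is first
```

The last three lines show why "attribute lookup returns an object": `b` is not a copy of some value, it is the very list that was assigned to `x.a`.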

On 04/16/10 04:36, Guido van Rossum wrote:
IMHO, the fact that the return value of id(...) can be reused by a different object is a bug (though a bug that has no practical impact for most real-life programs, and the alternative implementations I can think of are no good for practical purposes).
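A small sketch of what the id() guarantee does and does not cover. The class `Token` is hypothetical, and whether an id is actually reused after an object is destroyed depends on the implementation, so the last comparison is left without an expected result.

```python
class Token:
    """Hypothetical placeholder class."""


# Objects that are alive at the same time always have distinct ids.
live = [Token() for _ in range(1000)]
live_ids = {id(t) for t in live}
all_distinct = len(live_ids) == len(live)

# Once an object is gone, its id may be handed to a later object.
old_id = id(Token())     # the temporary Token may be freed right after this call
newcomer = Token()
maybe_reused = id(newcomer) == old_id   # may be True or False; no guarantee either way
```

Code that stores ids (for example in a set of visited objects) therefore needs to keep the objects themselves alive for as long as the ids are in use.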

data = plural of datum

Both words are commonly misused as to singular/plural usage. "piece/unit/item of data" is a common substitution for "datum" by people who don't know the word. [Likewise, "media" is the plural of "medium" but people get that confused too.]

With numbers, a single number is a datum. More than one number are data. You would never refer to "the value X" (where X is a Python variable) but you would refer to "the value 3" or "the value pi". Generally, in English when people say "data" they're referring to the numbers in a more abstract sense than 3 and pi.

I would say that roughly:

value : datum :: instance : object

But this is hardly precise. People frequently refer to A as an object when we really mean it's an instance of the object [e.g., class] Alpha.

--- Bruce
http://www.vroospeak.com

2010/4/15 spir ☣ <denis.spir@gmail.com>

2010/4/15 Bruce Leban <bruce@leapyear.org>:
data = plural of datum both words are commonly misused as to singular/plural usage.
Too pedantic.
No. In Python, object always refers to an instance. "An instance of an object" is nonsense (unless the object happens to be a class object, in which case the object's class is also called a metaclass). We do use type and class interchangeably in Python. (Except in very old Python versions where type refers to a built-in type or a type defined by a built-in or extension module, and class refers to a user-defined class -- but in modern Python there is no longer a difference.) -- --Guido van Rossum (python.org/~guido)
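The terminology in the reply above can be verified with a few introspection calls. This is a sketch for Python 3; the names `Alpha`, `Meta` and `Beta` are hypothetical.

```python
class Alpha:
    """Hypothetical user-defined class."""


a = Alpha()

instance_of_class = type(a) is Alpha              # a is an instance of the class Alpha
class_is_object = isinstance(Alpha, object)       # the class is itself an object
default_metaclass = type(Alpha) is type           # the class of a class is its metaclass

# Built-in types and user-defined classes behave alike in this respect.
builtin_same = (type(3) is int) and (type(int) is type)


class Meta(type):
    """Hypothetical custom metaclass."""


class Beta(metaclass=Meta):
    pass


custom_metaclass = type(Beta) is Meta
```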

Sorry -- Denis asked not just about Python but about English and I was speaking in that context and it's hard to not be pedantic in that case. :-) Even if Python defines precise meanings of 'object', 'class', 'metaclass' most of us use other languages too which use those words differently so there will invariably be some cross-leakage. I know a programming system that uses the word 'model' where others use table and 'entity' where others use row. :-) --- Bruce http://www.vroospeak.com On Thu, Apr 15, 2010 at 12:00 PM, Guido van Rossum <guido@python.org> wrote:

Bruce Leban writes:
I would say that roughly:
value : datum :: instance : object
You can say that if you like, but unfortunately you cannot expect others to take that meaning without an explicit gloss to explain what you mean. Data in the broadest sense (as the plural of datum) is just tiny pieces of unstructured information, but it is often used in other senses, such as coextensive with "information", or connoting an array of information pieces of some type, or a stream of information pieces conforming to some syntax. Because of current common usage, it is most useful as a collective noun for "information" when you don't want to be precise. Since in English it is rare that we are precise about the "sense" in which we use "data", I agree with Guido that the word should be avoided in this kind of discussion.

spir ☣ wrote:
What useful distinctions are or may be done, for instance in documentation?
Python mainly considers:

- object identity (x is y, determined by id())
- object value (x == y, determined by the implementations of relevant __eq__ methods)

For many objects, their value degenerates to being the same as their identity (i.e. they compare equal only with themselves), but others have more useful comparisons defined (e.g. containers, numbers).

In addition to objects with their identities and values, there are references to objects. These 'references' are the only things in Python which aren't first class objects, since they reflect lookup entries in some namespace or container, such as the mapping from names to pointers in a class or module __dict__ or a function's locals, or the items stored in a list or set.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
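A short sketch of the point about references: names and container slots hold references to objects, so several of them can lead to one and the same object. All variable names here are illustrative.

```python
items = [1, 2]
alias = items                  # a second name bound to the same list object
container = [items, items]     # two list slots referring to that same object

alias.append(3)                # mutate the object through one reference

seen_everywhere = items == container[0] == container[1] == [1, 2, 3]
one_object = (alias is items) and (container[0] is container[1])

duplicate = list(items)        # a new object with an equal value
equal_but_distinct = (duplicate == items) and (duplicate is not items)

# A module namespace is itself a mapping from names to objects.
namespace_entry = globals()["items"] is items
```

There is no way to obtain "the reference" itself as a value; evaluating `alias` or `container[0]` always yields the object that the reference leads to.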

"spir ☣" <denis.spir@gmail.com> wrote
Lots of points here and I suspect the English usage will vary greatly between the different English communities.
I'm not sure I understand your distinction between the two statements. I think in Python it's true to say that value=object but the rule is not commutative: object != value in every case. (A function is an object but is not really a value, although it will return a value if called, and of course has an id() and is "not None", so in that sense is a value - but that is I think a special case.)

Data is a concept and is wider than mere values or objects. Values can be data. But so can rules. Now a rule can be expressed as a function or as a mapping, and the mapping will contain values, but the mapping - the relationships - are not explicit values, they are rules inherent to the mapping. But the mapping is data.

Objects differ from pure values in most languages in that they usually (always?) have operations (at least a constructor, and in modern Python much more). But since I said that all values in Python are objects their difference here is moot.
My dictionary defines data as "facts or figures from which conclusions can be inferred; information". Facts are not always the same as values (defined as "measures", so by definition relative - you can compare values against a known datum of similar type (which is another concept again!) but you cannot really compare facts other than by their truthfulness, but then, an untruthful fact is not a fact!)
Speaking for the English English: "datas" is weird, in fact it is a non-word so far as I know. "datum" is perfectly acceptable and most well-read people will recognise it although it is not a commonly used word. But it is by no means restricted to the scientific community. Common parlance uses data as both plural and singular, much as sheep is used both ways. "data set" is often used when plurality is being emphasised. (I have a friend (who works in MIS systems) who is very particular about his use of data/datum.)
How to denote a single unit of data without using the phrase "piece of data"?
datum does just fine. Or sometimes "data point" is used.
datum and data are fine when the plurality is obvious. Otherwise I tend to use data set to emphasise plurality. Never use datas! HTH, -- Alan Gauld Author of the Learn to Program web site http://www.alan-g.me.uk/

On 04/16/10 07:37, Alan Gauld wrote:
I'm sorry, but I'm not sure how you define your equality; in my textbook equality is a relationship that is "reflexive, symmetric, and transitive" (commutative is just another way to say symmetric). I'm not sure what you meant when you say value == object but object != value. Did you mean value *is an* object but object *is not (necessarily) a* value?
You don't need to define a mapping separately, since the definition of a mapping can be derived from a function:
- the mapping is the set of all (x, f(x))
but a function can likewise be defined from a mapping:
- f(x) = y iff (x, y) ∈ {set of 2-tuples}
so we can say that a mapping and a function are equivalent.
I guess the problem here is that the distinction between data, value and object depends on how you define them. IOW, trying to make a concrete distinction from imprecise axioms/definitions is bound to be futile. Many Information Systems people like to differentiate between data and information; most English dictionaries do not.
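The equivalence between a function and a mapping sketched above can be illustrated in Python, with the caveat that a dict can only tabulate a finite domain. The names `square`, `table` and `from_table` are hypothetical.

```python
def square(x):
    """Hypothetical rule expressed as a function."""
    return x * x


domain = range(5)

# Function -> mapping: tabulate the set of all (x, f(x)) pairs over the domain.
table = {x: square(x) for x in domain}


# Mapping -> function: f(x) = y iff (x, y) is in the table.
def from_table(x):
    return table[x]


agree_on_domain = all(from_table(x) == square(x) for x in domain)
```

Outside the tabulated domain the two differ: `square(10)` works, while `from_table(10)` raises `KeyError`.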

"Lie Ryan" <lie.1296@gmail.com> wrote
I'm sorry, but I'm not sure how you define your equality; ... Did you mean value *is an* object but object *is not (necessarily) a* value?
Yes, exactly. On reflection "=" was probably the wrong symbol to use.
True but that's the point I was making. The mapping whether implemented as a table/dictionary or as a function is still a form of data but it is not a value or even a set of values.
Indeed, and I'm very aware that Denis is coming from another language and even within English there are geographical differences of idiom and definition for the same words. It is therefore very difficult to be absolutely precise in any such discussion.
Many Information Systems people like to differentiate between data and information; most English dictionaries do not.
Ooh I was deliberately steering clear of the distinctions between data, information and knowledge! There lie religious wars! :-) Likewise the distinctions between type and class can take many twists. -- Alan Gauld Author of the Learn to Program web site http://www.alan-g.me.uk/

Alan Gauld writes:
Ooh I was deliberately steering clear of the distinctions between data, information and knowledge! There lie religious wars! :-)
That's precisely why Guido says "don't say 'data' in this context." It's not your choice to steer clear of those differences of definition, it's your reader's, and she is not going to be aware that it's necessary unless you explain. You can't win here.

participants (8)
- Alan Gauld
- Bruce Leban
- Conrad Irwin
- Guido van Rossum
- Lie Ryan
- Nick Coghlan
- spir ☣
- Stephen J. Turnbull