Some time ago I posted the following musings to my work colleagues and a few of us chased it around for a while. I came across it recently and thought the ideas might be of some interest to the members of these lists. If anyone knows of resources please let me know...

----------- Included text ---------------

I read a collection of papers on Software Engineering recently published by the IEEE and edited by Professor Richard Thayer (and his friend). One recurring theme in these "State of the Practice" papers was the lack of a fundamental theoretical basis for computing, i.e. there's nothing comparable to the laws of physics in software engineering.

I started thinking and doodling about what the fundamentals are and came up with several notions (ideas would imply something far too well formed!). These are based around the concept that software manipulates data, or more correctly "information" (no surprises there! :). However, most information theory (Shannon et al) relates to bits, which is far too low level to be useful.

That started me thinking about levels of information and I came up with 3 layers of information - rather like an OSI comms model:

1/ Physical - bits/bytes, defined by the machine architecture. Operations are CPU specific and include bitwise OR/AND/NOT and binary arithmetic...

2/ Implementation/Environment - data types defined by the programming environment - object in Smalltalk; int, float, char in C etc... Operations include built-in operators for arithmetic, boolean logic and I/O. [Question: Where do collections: arrays, lists etc fit into the layer proposal?]

3/ Application - User defined data types - records, files, RDBMS Tables etc. Operations are user defined functions/procedures etc.

Other candidate layers include "Standard libraries" etc, but I rejected these as a subset of either the Implementation or Application layers.

To be useful, any fundamental basis of software would have to express concepts which applied with equal validity across all layers - i.e. not be dependent on data format or semantics, but simply relate to *relative* information content. Operations would need to be expressible in terms of data transforms across and within layers.

I could go on (onto the nature of operations!) but that's probably enough for now. Now the big question is:

Since I am sure this isn't original, who has done this stuff before? - Where can I get papers or books on fundamental information representation/transformation theory? I assume there must be something? somewhere?

[ Note: I am not talking about Knowledge Engineering, which has more to do with how information is stored and processed than what information is, its empirical qualities etc... ]

Alan Gauld
BT computing partners
Tel : 0141 220 8795
Fax : 0141 248 1284
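As a rough illustration of the three layers proposed above, here is a minimal Python sketch that carries one value through each of them. The `Reading` type, its field names and the sample value are hypothetical, and mapping "Physical" onto an IEEE 754 byte string is just one possible reading of the proposal, not something stated in the post.

```python
import struct
from dataclasses import dataclass

# Layer 2 (Implementation/Environment): a built-in language type.
temperature = 21.5

# Layer 1 (Physical): the same value as raw bytes
# (assumed encoding: IEEE 754 double, big-endian).
raw = struct.pack(">d", temperature)
bits = "".join(f"{byte:08b}" for byte in raw)


# Layer 3 (Application): a user-defined type that gives the value meaning.
# "Reading" and its fields are made up for this illustration.
@dataclass
class Reading:
    sensor: str
    celsius: float


reading = Reading(sensor="lab-1", celsius=temperature)

# A transform across layers: down to bytes and back up again.
restored = struct.unpack(">d", raw)[0]

print(len(raw), "bytes,", len(bits), "bits")
print(restored == reading.celsius)
```

The round trip through `struct.pack` and `struct.unpack` is the kind of "data transform across layers" the post mentions: the representation changes at each layer while the information content stays the same.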
On Sun, 19 Aug 2001 alan.gauld@bt.com wrote:
One recurring theme in these "State of the Practice" papers was the lack of a fundamental theoretical basis for computing. i.e. there's nothing comparable to the laws of physics in software engineering.
[some text cut]
Since I am sure this isn't original, who has done this stuff before? - Where can I get papers or books on fundamental information representation/transformation theory? I assume there must be something? somewhere?
David Gries has written a book called "The Science of Programming" which lays out a framework for writing mathematically correct programs. I bought it after seeing Jon Bentley's recommendation in "Programming Pearls". As is typical with me, I haven't really gotten past Chapter One yet. *grin* Still, from what I can glean, Gries uses the power of logic and assertions to show how people can be confident in a program's correctness. There's also a chapter on "inverting" programs which looks like a lot of fun.
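To give a flavour of what "logic and assertions" can look like in practice, here is a small Python sketch of integer division by repeated subtraction, annotated with a precondition, a loop invariant and a postcondition. It is only in the spirit of the approach described above; the function name `divide` and the choice of example are my own assumptions, not taken from the book.

```python
def divide(x, y):
    """Return (q, r) such that x == q * y + r and 0 <= r < y."""
    # Precondition: what the caller must guarantee.
    assert x >= 0 and y > 0

    q, r = 0, x
    while r >= y:
        # Loop invariant: holds before every iteration.
        assert x == q * y + r and r >= 0
        q, r = q + 1, r - y

    # Postcondition: the invariant plus the negated loop guard.
    assert x == q * y + r and 0 <= r < y
    return q, r


print(divide(17, 5))
```

The point is that the postcondition follows from the invariant together with the fact that the loop has stopped, so the assertions document why the result is correct and do more than check it at run time.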
For the OSI-like model you might want to break it up a bit further:

1 / Physical - bits and bytes defined by machine architecture.

2 / Machine-specific - this is the actual instruction set for a given machine.

3 / Implementation - this includes data primitives of the given language.

4 / Implementation Groupings - collections, arrays, lists and structured types which are built into the language.

5 / Application - Developer-defined data types and tie-ins with other systems (RDBMS, etc).

6 / Extensions - Plugins or enhancements which are not part of the original program, but operate within its context and add additional information structure. (Would XML go here?)

7 / User-defined - Some programs allow the user to extend the data set (by embedding Python, say).

Hmmm. I probably should have numbered from zero, my bad %-)

--Dethe

1/ Physical - bits/bytes, defined by the machine architecture operations are CPU specific, include bitwise OR/AND/NOT and binary arithmetic...

2/ Implementation/Environment - data types defined by the programming environment - object in Smalltalk; int, float, char in C etc... Operations include built in operators for arithmetic, boolean logic and I/O. [Question: Where do collections: arrays, lists etc fit into the layer proposal?]

3/ Application - User defined data types - records, files, RDBMS Tables etc Operations are user defined functions/procedures etc.

(Alan, sorry about the duplicate, I keep forgetting that reply doesn't go to the group on this list).

--
Dethe Elza (delza@burningtiger.com)
Chief Mad Scientist
Burning Tiger Technologies (http://burningtiger.com)
Related to the discussion of the "levels of information", there are also "levels of abstraction":

Class - a group of related data and functionality

Property - data belonging to a class/object which may be a variable or the result of a computation

Pattern - a generalization of a recurring problem and its solution set

Idiom - a pattern within the context of a given language

Framework - a collection of related idioms packaged into a standalone unit

Aspect - A cross-section of one consideration in a program (say, Security), factored out to make it modular

Module - Some amount of functionality which is packaged as a standalone unit

Component - A class or collection of classes which are packaged as a standalone unit and can be swapped in and out of a system.

Library - A collection of functionality packaged for re-use

Distributed Component - A component which spreads its functionality across multiple computers.

Guideline - A recommendation for use, more specific than a pattern.

Style Guideline - Standards for presenting code for maintenance.

Documentation - Details about the documented system, high-level programs for human metacomputers.

Pattern Language - A group of patterns which are mutually supporting or related

Don't know how useful this is to anyone. Most of these relate to the tenets of pattern design and OO:

* Separate what changes from what stays the same (or things which change with different frequencies).
* Solve problems by adding a layer of abstraction.
* Work at as high a level as possible to promote clarity.

Clarity is the key for maintainable and extensible systems.

--
Dethe Elza (delza@burningtiger.com)
Chief Mad Scientist
Burning Tiger Technologies (http://burningtiger.com)
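For the first two entries in the list above (Class and Property), a minimal Python sketch may help. It shows a property that is a plain variable next to one that is the result of a computation; the `Rectangle` class and its attribute names are hypothetical, chosen only for illustration.

```python
class Rectangle:
    """A class: related data and functionality grouped together."""

    def __init__(self, width, height):
        # Properties stored as plain variables.
        self.width = width
        self.height = height

    @property
    def area(self):
        # A property that is the result of a computation.
        return self.width * self.height


r = Rectangle(3, 4)
print(r.area)

# Changing a stored property changes the computed one as well.
r.width = 5
print(r.area)
```

From the caller's side `r.width` and `r.area` look the same, which is one small case of hiding what changes (how the value is obtained) behind what stays the same (how it is accessed).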
participants (3)
-
alan.gauld@bt.com
-
Danny Yoo
-
Dethe Elza