Pointrel20030812.1 Data Repository System released

Paul D. Fernhout pdfernhout@kurtz-fernhout.com
Wed, 20 Aug 2003 19:25:31 -0400


Pointrel20030812.1 for Python 2.3 is now available at SourceForge:
   http://sourceforge.net/projects/pointrel/

* Highlights for this version?

+ Uses 64 bit file positions.
+ Log file format is XML.
+ Specification document included.
+ CRC32 now used for string hashes.
+ Locks no longer deleted automatically when old by default.
+ Now uses WILD constant instead of "*" for wildcards in searches.
+ Data strings can be Unicode (UTF-8), binary, or Python objects.
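
As a rough illustration of the CRC32 highlight above: Python's standard zlib.crc32 function computes a 32 bit checksum that can serve as a string hash. This announcement does not describe how the release applies the hash internally, so the helper name string_hash below is hypothetical and the code is only a sketch for a current Python, not the Pointrel implementation.

```python
import zlib


def string_hash(data):
    """Hypothetical helper: CRC32 of a string as an unsigned 32 bit integer."""
    # Text is encoded as UTF-8 first; byte strings are hashed as they are.
    if not isinstance(data, bytes):
        data = data.encode("utf-8")
    # Masking keeps the result unsigned on every Python version.
    return zlib.crc32(data) & 0xFFFFFFFF


# Equal strings always hash to the same value, whether given as text or bytes.
same = string_hash("Fluffy") == string_hash(b"Fluffy")
```

A 32 bit hash can collide, so a lookup built on it would still need to compare the full strings after the hashes match.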

* Release notes?

This version is pure-Python and has been tested under Python 2.3. It may 
work with earlier 2.x Python versions or require only minor changes for 
them. It was tested only under Debian GNU/Linux, but earlier versions 
ran under Windows. The biggest change from the previous version is 
support for very large files (>2GB) by using 64 bit file positions. It 
still needs more testing though, so don't put all your terabytes of 
vital enterprise data in it just yet without some kind of parallel 
system going, just in case. ;-) But seriously, the software comes with 
NO WARRANTY (see the included file license.txt for details).

* In what situations is the Pointrel Data Repository System intended to 
be useful?

The Pointrel System is intended to provide a flexible framework for 
storing information that can co-evolve with your data storage needs.

The best match is a situation where you have potentially a lot of ad hoc 
data which you or others may want to use in a variety of ways now and in 
the future, and you want a solution that lets you get started right now 
and then keep making changes on the fly, while preserving all data 
contributed permanently as an archive. An example of this would be an 
extensible peer-to-peer groupware application supporting continual 
frequent revision of everything in the repository. I intend to use it in 
this way, as part of the infrastructure of a project related to 
collecting manufacturing knowledge.

Or, you might have well defined data storage needs and just want to 
support some sort of complex persistent transactional data storage by 
including only one or two extra files in your application which don't 
need special installation or configuration, and you are willing to live 
with potentially slower performance and some other limitations. There 
are other potential mini-database candidates for such a role as well 
(including Python's DBM module), so the Pointrel System might only be 
selected over them in this case with an eye to future expansion in more 
ad hoc ways, such as supporting third party tools that will use the same 
store.

Or, perhaps, you just might want to stretch your mind after too much SQL 
coding. :-)

* What is the underlying design of the Pointrel Data Repository System?

The Pointrel Data Repository System is a variant of an 
Entity-Relationship (E-R) model database. The Pointrel system provides a 
way to easily handle loosely structured data stored on disk, of the kind 
found in INI files, version control systems, bug tracking systems, or simple 
AI-type applications. It takes an approach to data storage which emphasizes 
flexibility over speed. It also emphasizes storing new information for 
the long term over modifying or deleting old information. It hopefully 
makes it easier to build new layers of abstraction and indexing over old 
data. The Pointrel Data Repository System bears some resemblance to the 
ROSE/STAR system described by William Kent in his book "Data & Reality".
It also bears some resemblance to RDF. To an extent, the Pointrel System 
is mainly a mindset about how to build extensible applications using 
persistent E-R data, and this release is one example of a tool to make 
such applications easier to build.

In a nutshell, the Pointrel Data Repository System helps you build 
associations which define relationships between entities. These 
associations are essentially triadic links between things, indicating that 
one thing is linked to a second thing in a way defined by a third thing. The 
simplest way to use such links is to make the equivalent of object 
properties or a dictionary entry, such as "Fluffy weight 20kg", which as a 
dictionary would be Fluffy["weight"] = "20kg". However, Pointrel differs 
from a dictionary in that it supports queries such as one for all 
dictionaries which define a weight of 20kg, or one for all relationships 
between "Fluffy" and "20kg". Triads are all defined within a specific context 
space that gives meaning to the associations (making triads actually 
have four fields). Contexts can act as filters, which lets triads be 
handled within an archive in a somewhat more modular fashion, since 
you can easily ignore triads not in the context of interest. All fields 
of a triad are indefinite length strings (binary, Unicode, or a pickled 
Python object) -- so they could be anything from the text "foo" to the 
contents of a binary file to a Python dictionary.
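
To make the triad idea concrete, here is a minimal in-memory sketch in Python. It is not the Pointrel code and stores nothing on disk. The class name TriadStore, its method names, and the WILD sentinel defined here are hypothetical, chosen only to echo the calls shown later in this announcement. It assumes exactly one field is a wildcard per query.

```python
# Hypothetical in-memory illustration of contexts, triads, and wildcards.
# This is NOT the Pointrel implementation; all names are made up for the sketch.

WILD = object()  # sentinel meaning "match anything in this field"


class TriadStore:
    def __init__(self):
        # Append-only list of (context, a, b, c) tuples, oldest first.
        self.triads = []

    def add(self, context, a, b, c):
        self.triads.append((context, a, b, c))

    def _matching(self, pattern):
        # Yield every stored tuple that agrees with the non-wildcard fields.
        for triad in self.triads:
            if all(p is WILD or p == t for p, t in zip(pattern, triad)):
                yield triad

    def all_matches(self, context, a, b, c):
        """Return the values found in the single WILD field, oldest first."""
        pattern = (context, a, b, c)
        wild = [i for i, p in enumerate(pattern) if p is WILD]
        if len(wild) != 1:
            raise ValueError("this sketch expects exactly one WILD field")
        return [t[wild[0]] for t in self._matching(pattern)]

    def last_match(self, context, a, b, c):
        """Return the most recently added match, or None if there is none."""
        found = self.all_matches(context, a, b, c)
        return found[-1] if found else None


store = TriadStore()
store.add("examplecontext", "Fluffy", "weight", "20kg")
store.add("examplecontext", "Fluffy", "teeth", "pointy")
store.add("examplecontext", "Fluffy", "teeth", "nasty")
store.add("othercontext", "Rex", "weight", "20kg")

# Which things in this context weigh 20kg? (a plain dictionary cannot ask this)
who = store.all_matches("examplecontext", WILD, "weight", "20kg")
# How are "Fluffy" and "20kg" related?
rel = store.all_matches("examplecontext", "Fluffy", WILD, "20kg")
```

Because the context is just another field in the match, the "othercontext" entry for Rex is ignored by both queries. The linear scan keeps the sketch short; it says nothing about how the real system indexes its data.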

* What is the Pointrel Data Repository System not?

At this point the Pointrel system is not a polished product. This 
release is best enjoyed by someone who is more experimentally minded who 
just wants to have some fun while learning about the entity-relationship 
model of data storage as another tool for their toolbox. This release is 
also for those who have a desperate need to make sense of a deluge of 
ad hoc data and are willing to take risks on something new in practice 
(but old in theory). If you are looking for tried and proven methods and software 
for storing data right now, relational databases like MySQL or 
PostgreSQL are probably safer bets. Obviously, having said that, I hope 
you will try this system anyway and supply constructive criticism and 
positive feedback.

* What is a simple example of the Pointrel API being used?

See "pointrelTestFluffyExample.py" for an example of using the 
simplified global function interface.

Here is an excerpt from that file:

   from pointrel20030812 import *

   Pointrel_initialize("archive_fluffyExample")

   Pointrel_startTransaction()
   Pointrel_add("examplecontext", "Fluffy", "weight", "20kg")
   Pointrel_add("examplecontext", "Fluffy", "color", "beige")
   Pointrel_add("examplecontext", "Fluffy", "teeth", "pointy")
   Pointrel_add("examplecontext", "Fluffy", "teeth", "nasty")
   Pointrel_add("examplecontext", "Fluffy", "preferred food", "Knights who say 'Nie!'")
   Pointrel_finishTransaction()

   string = Pointrel_lastMatch("examplecontext", WILD, "weight", "20kg")
   print string
   # string would be --> "Fluffy"

   string = Pointrel_lastMatch("examplecontext", "Fluffy", "weight", WILD)
   print string
   # string would be  --> "20kg"

   string = Pointrel_lastMatch("examplecontext", "Fluffy", "teeth", WILD)
   print string
   # string would be --> "nasty"

   list = Pointrel_allMatches("examplecontext", "Fluffy", "teeth", WILD)
   print list
   # list would be  --> ["pointy", "nasty"]

* What is a more complex example of using the Pointrel system?

See the file "tkPointrelMemex.py". It implements a version of the Memex 
archiving system proposed by Vannevar Bush in the 1940s. To my 
knowledge, it is the first software implementation of something this 
close to his concept (technically, depending on how you read his work, 
his trails may be more like trees than the linear trails here). Feel 
free to send me pointers to earlier software implementations of Memex -- 
I will be happy to retract this claim in exchange for learning more of 
the history of his great idea.

Here is typical usage of the more complex OO API:

   repository = PointrelDataRepositorySystem(archiveName)
   repository.startTransaction()
   repository.add(context, a, b, c)
   repository.finishTransaction()
   string = repository.lastMatch(context, WILD, b, c)
   string = repository.lastMatch(context, a, b, WILD)
   list = repository.allMatches(context, a, b, WILD)
   string = repository.generateUniqueID()

* What is the license?

BSDish. See license.txt for details.

--Paul Fernhout
http://www.pointrel.org


