Pointrel20030812.1 Data Repository System released
Paul D. Fernhout
pdfernhout@kurtz-fernhout.com
Wed, 20 Aug 2003 19:25:31 -0400
Pointrel20030812.1 for Python 2.3 is now available at SourceForge:
http://sourceforge.net/projects/pointrel/
* Highlights for this version?
+ Uses 64 bit file positions.
+ Log file format is XML.
+ Specification document included.
+ CRC32 now used for string hashes.
+ Locks no longer deleted automatically when old by default.
+ Now uses WILD constant instead of "*" for wildcards in searches.
+ Data strings can be Unicode (UTF-8), binary, or Python objects.
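As a rough illustration of the CRC32 highlight above, here is a minimal sketch of hashing a data string with the CRC32 routine from Python's standard zlib module. The helper name string_hash is hypothetical and is not taken from the Pointrel source, and the sketch uses modern Python syntax rather than Python 2.3; how the release itself computes and stores hashes is covered by the included specification document, not by this example.

```python
import zlib


def string_hash(data):
    """Return an unsigned 32-bit CRC32 for a text or byte string.

    Hypothetical helper for illustration only; not the Pointrel API.
    """
    if isinstance(data, str):
        # Text is encoded as UTF-8 before hashing (an assumption).
        data = data.encode("utf-8")
    # Mask to 32 bits so the result is always a non-negative integer.
    return zlib.crc32(data) & 0xFFFFFFFF


print(hex(string_hash("Fluffy")))
```

A 32-bit hash can collide, so a store that indexes strings this way would still need to compare the full strings when two hashes are equal.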
* Release notes?
This version is pure-Python and has been tested under Python 2.3. It may
work with earlier 2.x Python versions or require only minor changes for
them. It was tested only under Debian GNU/Linux, but earlier versions
ran under Windows. The biggest change from the previous version is
support for very large files (>2GB) by using 64 bit file positions. It
still needs more testing though, so don't put all your terabytes of
vital enterprise data in it just yet without some kind of parallel
system going, just in case. ;-) But seriously, the software comes with
NO WARRANTY (see the included file license.txt for details).
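To show why 64 bit file positions matter for files over 2GB, here is a small sketch using Python's standard struct module. The big-endian signed layout and the helper names are assumptions made for illustration; the actual on-disk format is defined by the specification document in the release, not by this code, and the sketch uses modern Python syntax.

```python
import struct

# Assumed layout for illustration: big-endian signed 64-bit integer.
POSITION_FORMAT = ">q"


def pack_position(position):
    """Encode a file position as 8 bytes (hypothetical helper)."""
    return struct.pack(POSITION_FORMAT, position)


def unpack_position(raw):
    """Decode 8 bytes back into a file position (hypothetical helper)."""
    return struct.unpack(POSITION_FORMAT, raw)[0]


large_offset = 3 * 1024 ** 3  # 3 GiB, beyond the signed 32-bit limit
packed = pack_position(large_offset)

# The same offset does not fit in a signed 32-bit field.
try:
    struct.pack(">i", large_offset)
    fits_in_32_bits = True
except struct.error:
    fits_in_32_bits = False

print(len(packed), unpack_position(packed), fits_in_32_bits)
```

A signed 32 bit position tops out just under 2GB, which is where the old limit comes from.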
* In what situations is the Pointrel Data Repository System intended to
be useful?
The Pointrel System is intended to provide a flexible framework for
storing information that can co-evolve with your data storage needs.
The best match is a situation where you have potentially a lot of ad hoc
data which you or others may want to use in a variety of ways now and in
the future, and you want a solution that lets you get started right now
and then keep making changes on the fly, while preserving all data
contributed permanently as an archive. An example of this would be an
extensible peer-to-peer groupware application supporting continual
frequent revision of everything in the repository. I intend to use it in
this way, as part of the infrastructure of a project related to
collecting manufacturing knowledge.
Or, you might have well defined data storage needs and just want to
support some sort of complex persistent transactional data storage by
including only one or two extra files in your application which don't
need special installation or configuration, and you are willing to live
with potentially slower performance and some other limitations. There
are other potential mini-database candidates for such a role as well
(including Python's DBM module), so the Pointrel System might only be
selected over them in this case with an eye to future expansion in more
ad hoc ways, such as supporting third party tools that will use the same
store.
Or, perhaps, you just might want to stretch your mind after too much SQL
coding. :-)
* What is the underlying design of the Pointrel Data Repository System?
The Pointrel Data Repository System is a variant of an
Entity-Relationship (E-R) model database. The Pointrel system provides a
way to easily handle loosely structured data stored on disk, as found in
INI files, version control systems, bug tracking systems, or simple AI
type applications. It takes an approach to data storage which emphasizes
flexibility over speed. It also emphasizes storing new information for
the long term over modifying or deleting old information. It hopefully
makes it easier to build new layers of abstraction and indexing over old
data. The Pointrel Data Repository System bears some resemblance to the
ROSE/STAR system described by William Kent in his book "Data & Reality".
It also bears some resemblance to RDF. The Pointrel System is, to an
extent, a mindset about how to build extensible applications using
persistent E-R data, and this release is one example of a tool to make
such applications easier to build.
In a nutshell, the Pointrel Data Repository System helps you build
associations which define relationships between entities. These
associations are essentially triadal links between things indicating one
thing is linked to a second thing in a way defined by a third thing. The
simplest way to use such links is to make the equivalent of object
properties or a dictionary, such as "Fluffy weight 20kg" which, as a
dictionary, would be Fluffy["weight"] = "20kg". However, Pointrel differs
from a dictionary in that it supports queries like one for all
dictionaries which define a weight of 20kg or all relationships between
"Fluffy" and "20kg". Triads are all defined within a specific context
space that gives meaning to the associations (making triads actually
have four fields). Contexts act as filters, allowing triads to be handled
within an archive in a somewhat more modular fashion, since you can
easily ignore triads not in the context of interest. All fields
of a triad are indefinite length strings (binary, Unicode, or a pickled
Python object) -- so they could be anything from the text "foo" to the
contents of a binary file to a Python dictionary.
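The four-field triads and wildcard queries described above can be sketched with a small in-memory store. This is only an illustration of the idea, not the Pointrel implementation: the TriadStore class, its method names, and the stand-in WILD marker are all hypothetical, the real system stores its data on disk, and the sketch uses modern Python syntax. It also assumes that a query with exactly one wildcard returns the value found in the wildcard position.

```python
# Hypothetical in-memory sketch; not the Pointrel implementation or API.
WILD = object()  # stand-in wildcard marker for this sketch


class TriadStore:
    def __init__(self):
        # Append-only list of (context, a, b, c) tuples; nothing is
        # ever modified or deleted, mirroring the archival emphasis.
        self.triads = []

    def add(self, context, a, b, c):
        self.triads.append((context, a, b, c))

    def all_matches(self, context, a, b, c):
        pattern = (a, b, c)
        results = []
        for stored_context, *fields in self.triads:
            if stored_context != context:
                continue  # the context acts as a filter
            if all(p is WILD or p == f for p, f in zip(pattern, fields)):
                wild = [f for p, f in zip(pattern, fields) if p is WILD]
                # Assumption: one wildcard yields that field's value;
                # otherwise the whole (a, b, c) tuple is returned.
                results.append(wild[0] if len(wild) == 1 else tuple(fields))
        return results

    def last_match(self, context, a, b, c):
        matches = self.all_matches(context, a, b, c)
        return matches[-1] if matches else None


store = TriadStore()
store.add("examplecontext", "Fluffy", "weight", "20kg")
store.add("examplecontext", "Rex", "weight", "20kg")
store.add("othercontext", "Fluffy", "weight", "3kg")

# Dictionary-style lookup: what is Fluffy's weight?
print(store.last_match("examplecontext", "Fluffy", "weight", WILD))
# Reverse lookup a dictionary cannot do directly: who weighs 20kg?
print(store.all_matches("examplecontext", WILD, "weight", "20kg"))
# How are "Fluffy" and "20kg" related?
print(store.all_matches("examplecontext", "Fluffy", WILD, "20kg"))
```

The linear scan keeps the sketch short; a real store would need indexes to answer such queries without reading every triad.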
* What is the Pointrel Data Repository System not?
At this point the Pointrel system is not a polished product. This
release is best enjoyed by someone who is more experimentally minded who
just wants to have some fun while learning about the entity-relationship
model of data storage as another tool for their toolbox. This release is
also for those who have a desperate need to make sense of a deluge of ad hoc
data and are willing to take risks on something new in practice (but old
in theory). If you are looking for tried and proven methods and software
for storing data right now, relational databases like MySQL or
PostgreSQL are probably safer bets. Obviously, having said that, I hope
you will try this system anyway and supply constructive criticism and
positive feedback.
* What is a simple example of the Pointrel API being used?
See "pointrelTestFluffyExample.py" for an example of using the
simplified global function interface.
Here is an excerpt from that file:
from pointrel20030812 import *
Pointrel_initialize("archive_fluffyExample")
Pointrel_startTransaction()
Pointrel_add("examplecontext", "Fluffy", "weight", "20kg")
Pointrel_add("examplecontext", "Fluffy", "color", "beige")
Pointrel_add("examplecontext", "Fluffy", "teeth", "pointy")
Pointrel_add("examplecontext", "Fluffy", "teeth", "nasty")
Pointrel_add("examplecontext", "Fluffy", "preferred food", "Knights who say 'Nie!'")
Pointrel_finishTransaction()
string = Pointrel_lastMatch("examplecontext", WILD, "weight", "20kg")
print string
# string would be --> "Fluffy"
string = Pointrel_lastMatch("examplecontext", "Fluffy", "weight", WILD)
print string
# string would be --> "20kg"
string = Pointrel_lastMatch("examplecontext", "Fluffy", "teeth", WILD)
print string
# string would be --> "nasty"
list = Pointrel_allMatches("examplecontext", "Fluffy", "teeth", WILD)
print list
# list would be --> ["pointy", "nasty"]
* What is a more complex example of using the Pointrel system?
See the file "tkPointrelMemex.py". It implements a version of the Memex
archiving system proposed by Vannevar Bush in the 1940s. To my
knowledge, it is the first software implementation of something this
close to his concept (technically, depending on how you read his work,
his trails may be more like trees than the linear trails here). Feel
free to send me pointers to earlier software implementations of MEMEX --
I will be happy to retract this claim in exchange for learning more of
the history of his great idea.
Here is a typical API usage for the more complex OO API:
repository = PointrelDataRepositorySystem(archiveName)
repository.startTransaction()
repository.add(context, a, b, c)
repository.finishTransaction()
string = repository.lastMatch(context, WILD, b, c)
string = repository.lastMatch(context, a, b, WILD)
list = repository.allMatches(context, a, b, WILD)
string = repository.generateUniqueID()
* What is the license?
BSDish. See license.txt for details.
--Paul Fernhout
http://www.pointrel.org