Pointrel20030812.1 Data Repository System released

Pointrel20030812.1 for Python 2.3 is now available at SourceForge:
http://sourceforge.net/projects/pointrel/

* Highlights for this version?

+ Uses 64 bit file positions.
+ Log file format is XML.
+ Specification document included.
+ CRC32 now used for string hashes.
+ By default, old locks are no longer deleted automatically.
+ Now uses WILD constant instead of "*" for wildcards in searches.
+ Data strings can be Unicode (UTF-8), binary, or Python objects.

* Release notes?

This version is pure-Python and has been tested under Python 2.3. It may work with earlier 2.x Python versions or require only minor changes for them. It was tested only under Debian GNU/Linux, but earlier versions ran under Windows.

The biggest change from the previous version is support for very large files (>2GB) by using 64 bit file positions. It still needs more testing though, so don't put all your terabytes of vital enterprise data in it just yet without some kind of parallel system going, just in case. ;-) But seriously, the software comes with NO WARRANTY (see the included file license.txt for details).

* In what situations is the Pointrel Data Repository System intended to be useful?

The Pointrel System is intended to provide a flexible framework for storing information that can co-evolve with your data storage needs. The best match is a situation where you have potentially a lot of ad hoc data which you or others may want to use in a variety of ways now and in the future, and you want a solution that lets you get started right now and then keep making changes on the fly, while permanently preserving all contributed data as an archive. An example of this would be an extensible peer-to-peer groupware application supporting continual, frequent revision of everything in the repository. I intend to use it in this way, as part of the infrastructure of a project related to collecting manufacturing knowledge.

Or, you might have well-defined data storage needs and just want to support some sort of complex persistent transactional data storage by including only one or two extra files in your application which don't need special installation or configuration, and you are willing to live with potentially slower performance and some other limitations. There are other potential mini-database candidates for such a role as well (including Python's DBM module), so the Pointrel System might only be selected over them in this case with an eye to future expansion in more ad hoc ways, such as supporting third party tools that will use the same store.

Or, perhaps, you just might want to stretch your mind after too much SQL coding. :-)

* What is the underlying design of the Pointrel Data Repository System?

The Pointrel Data Repository System is a variant of an Entity-Relationship (E-R) model database. It provides a way to easily handle loosely structured data stored on disk, such as the data behind INI files, version control systems, bug tracking systems, or simple AI-type applications. It takes an approach to data storage which emphasizes flexibility over speed. It also emphasizes storing new information for the long term over modifying or deleting old information. It hopefully makes it easier to build new layers of abstraction and indexing over old data.

The Pointrel Data Repository System bears some resemblance to the ROSE/STAR system described by William Kent in his book "Data & Reality". It also bears some resemblance to RDF. The Pointrel System is, to an extent, mainly a mindset about how to build extensible applications using persistent E-R data, and this release is one example of a tool to make such applications easier to build.

In a nutshell, the Pointrel Data Repository System helps you build associations which define relationships between entities.
These associations are essentially triadic links between things, indicating that one thing is linked to a second thing in a way defined by a third thing. The simplest way to use such links is to make the equivalent of object properties or a dictionary, such as "Fluffy weight 20kg", which if a dictionary would be Fluffy["weight"] = "20kg". However, Pointrel differs from a dictionary in that it supports queries such as one for all dictionaries which define a weight of 20kg, or one for all relationships between "Fluffy" and "20kg".

Triads are all defined within a specific context space that gives meaning to the associations (so triads actually have four fields). Contexts allow triads to be handled within an archive in a somewhat more modular fashion by acting as filters, since you can easily ignore triads not in the context of interest. All fields of a triad are indefinite length strings (binary, Unicode, or a pickled Python object) -- so they could be anything from the text "foo" to the contents of a binary file to a Python dictionary.

* What is the Pointrel Data Repository System not?

At this point the Pointrel system is not a polished product. This release is best enjoyed by someone who is experimentally minded and just wants to have some fun while learning about the entity-relationship model of data storage as another tool for their toolbox. This release is also for those who have a desperate need to make sense of a deluge of ad hoc data and are willing to take risks on something new in practice (but old in theory). If you are looking for tried and proven methods and software for storing data right now, relational databases like MySQL or PostgreSQL are probably safer bets. Obviously, having said that, I hope you will try this system anyway and supply constructive criticism and positive feedback.

* What is a simple example of the Pointrel API being used?

See "pointrelTestFluffyExample.py" for an example of using the simplified global function interface.
Here is an excerpt from that file:

    from pointrel20030812 import *

    Pointrel_initialize("archive_fluffyExample")

    Pointrel_startTransaction()
    Pointrel_add("examplecontext", "Fluffy", "weight", "20kg")
    Pointrel_add("examplecontext", "Fluffy", "color", "beige")
    Pointrel_add("examplecontext", "Fluffy", "teeth", "pointy")
    Pointrel_add("examplecontext", "Fluffy", "teeth", "nasty")
    Pointrel_add("examplecontext", "Fluffy", "preferred food", "Knights who say 'Nie!'")
    Pointrel_finishTransaction()

    string = Pointrel_lastMatch("examplecontext", WILD, "weight", "20kg")
    print string
    # string would be --> "Fluffy"

    string = Pointrel_lastMatch("examplecontext", "Fluffy", "weight", WILD)
    print string
    # string would be --> "20kg"

    string = Pointrel_lastMatch("examplecontext", "Fluffy", "teeth", WILD)
    print string
    # string would be --> "nasty"

    list = Pointrel_allMatches("examplecontext", "Fluffy", "teeth", WILD)
    print list
    # list would be --> ["pointy", "nasty"]

* What is a more complex example of using the Pointrel system?

See the file "tkPointrelMemex.py". It implements a version of the Memex archiving system proposed by Vannevar Bush in the 1940s. To my knowledge, it is the first software implementation of something this close to his concept (technically, depending on how you read his work, his trails may be more like trees than the linear trails here). Feel free to send me pointers to earlier software implementations of the Memex -- I will be happy to retract this claim in exchange for learning more of the history of his great idea.

Here is typical usage of the more complex OO API:

    repository = PointrelDataRepositorySystem(archiveName)
    repository.startTransaction()
    repository.add(context, a, b, c)
    repository.finishTransaction()
    string = repository.lastMatch(context, WILD, b, c)
    string = repository.lastMatch(context, a, b, WILD)
    list = repository.allMatches(context, a, b, WILD)
    string = repository.generateUniqueID()

* What is the license?

BSDish. See license.txt for details.
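
For readers who want to see the triad-plus-context idea from the design section in miniature, the following is a toy in-memory sketch of how add, lastMatch, and allMatches behave in the Fluffy excerpt. It is not the Pointrel implementation and says nothing about its file format, locking, or transactions. The class ToyTripleStore, its method names, the local WILD sentinel, and the choice to return None when nothing matches are all assumptions made for illustration, and the sketch is written for a current Python 3 rather than Python 2.3.

```python
# Toy imitation of Pointrel-style triads; hypothetical names, not the real library.

WILD = object()  # local stand-in for Pointrel's WILD constant (an assumption)


class ToyTripleStore:
    """Append-only list of (context, a, b, c) tuples with wildcard lookup."""

    def __init__(self):
        self._triads = []  # insertion order is kept; nothing is ever deleted

    def add(self, context, a, b, c):
        self._triads.append((context, a, b, c))

    def all_matches(self, context, a, b, c):
        """Return the values found in the WILD position, oldest first.

        This sketch assumes exactly one of a, b, c is WILD.
        """
        pattern = (a, b, c)
        wild_positions = [i for i, field in enumerate(pattern) if field is WILD]
        if len(wild_positions) != 1:
            raise ValueError("this sketch supports exactly one WILD field")
        wild_index = wild_positions[0]
        results = []
        for stored_context, *fields in self._triads:
            if stored_context != context:
                continue  # triads in other contexts are simply ignored
            if all(p is WILD or p == f for p, f in zip(pattern, fields)):
                results.append(fields[wild_index])
        return results

    def last_match(self, context, a, b, c):
        """Return the most recently added match, or None (assumed) if none."""
        matches = self.all_matches(context, a, b, c)
        return matches[-1] if matches else None


store = ToyTripleStore()
store.add("examplecontext", "Fluffy", "weight", "20kg")
store.add("examplecontext", "Fluffy", "teeth", "pointy")
store.add("examplecontext", "Fluffy", "teeth", "nasty")
store.add("othercontext", "Fluffy", "teeth", "blunt")  # filtered out below

print(store.last_match("examplecontext", WILD, "weight", "20kg"))    # Fluffy
print(store.all_matches("examplecontext", "Fluffy", "teeth", WILD))  # ['pointy', 'nasty']
print(store.last_match("examplecontext", "Fluffy", "teeth", WILD))   # nasty
```

Because the toy store only ever appends, "last match" is simply the newest triad that fits the pattern, which mirrors the emphasis on adding new information rather than modifying old information. A linear scan like this would be far too slow for a large archive; it is only meant to show the query semantics.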

--Paul Fernhout
http://www.pointrel.org