On Wed, 29 Oct 2003 16:12:50 -0800 Chuq Von Rospach <chuqui@plaidworks.com> wrote:
On Oct 29, 2003, at 2:28 PM, Peter C. Norton wrote:
I may not have made it clear, but I'm focusing on the metadata. Once you've parsed rfc822/2822, then it may become easier to have things in the database that can manipulate those types, i.e. to be able to do simple searches for a property of given arbitrary headers (w/o having to have a database schema that consists of a few known headers and "others" which you then have to treat as a blob or as text).
my only real worry is that from what I've seen, 99.99% of the time, the user is going to want content searches. header stuff is fine, but of really low priority in the scheme of things (necessary to put useful things together, meaningless if you can't content/context search in fulltext).
I see two needs, for significantly different populations. The first wants a browsing interface keyed and indexed by date, thread, and author. The second wants full text search with rapid location and retrieval of matching messages. Often a single user will move between the access methods: reading by thread, bouncing over to a search, then reading everything an author has written that matches, then searching again, etc. As such, two distinct sets of indexes seem called for: full text and message meta-data.
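To make the meta-data half concrete, here is a minimal sketch of indexes keyed by author, thread, and date, using Python's standard `email` package. The function name `build_metadata_indexes` and the choice of the first `References` entry (falling back to `In-Reply-To`) as the thread root are assumptions for illustration, not anything proposed in this thread, and it assumes every message carries a parseable `Date` header.

```python
from collections import defaultdict
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime


def build_metadata_indexes(raw_messages):
    """Index raw RFC 2822 messages by author address, thread root, and date."""
    by_author = defaultdict(list)   # address -> [message ids]
    by_thread = defaultdict(list)   # thread root id -> [message ids]
    by_date = []                    # [(datetime, message id)], sorted below

    for raw in raw_messages:
        msg = message_from_string(raw)
        msg_id = msg["Message-ID"]

        # Assumed threading rule: the first id in References (or In-Reply-To)
        # is the thread root; a message with neither starts its own thread.
        refs = (msg.get("References") or msg.get("In-Reply-To") or "").split()
        root = refs[0] if refs else msg_id

        author = parseaddr(msg.get("From", ""))[1].lower()
        when = parsedate_to_datetime(msg["Date"])

        by_author[author].append(msg_id)
        by_thread[root].append(msg_id)
        by_date.append((when, msg_id))

    by_date.sort()
    return by_author, by_thread, by_date
```

A browsing interface would walk `by_thread` or `by_author` directly, with no need to touch message bodies at all, which is the sense in which these indexes are separate from the full text ones.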
that's why I'm leaning, blob issues or no, towards full-text storage in MySQL 4. Because if you can't easily chop up the message body content and find the messages you want to deal with, elegant storage of the headers is irrelevant...
True, but this seems to conflate two distinct problems. If you're going to do unindexed searches then this makes sense; however, except for minimal cases, that's an uninteresting space. It scales like crap and has an even worse feature set. It is more interesting to split storage and indexing into distinct solution designs, and to build or pick something tailored for that smaller problem. That way you don't do full text searching, you do full text indexing and then search the indexes.
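As a rough illustration of "index first, then search the indexes", here is a toy inverted index in Python. It is only a sketch: the names `tokenize`, `build_index`, and `search` are made up for this example, the tokenizer is a naive ASCII one with no stemming or language awareness, and everything lives in memory rather than in any particular storage engine.

```python
import re
from collections import defaultdict

# Naive tokenizer: lowercase runs of ASCII letters and digits.
TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text):
    return TOKEN.findall(text.lower())


def build_index(bodies):
    """bodies: dict of message id -> body text. Returns term -> set of ids."""
    index = defaultdict(set)
    for msg_id, body in bodies.items():
        for term in tokenize(body):
            index[term].add(msg_id)
    return index


def search(index, query):
    """Return ids of messages containing every query term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)
```

The point of the split is that `search` never reads a message body: the cost of a query depends on the size of the posting sets, not on the size of the archive, and the bodies can be stored wherever is convenient.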
I think you need that, too. But until you get a reasonable context search for the message body, designing the rest is silly.
Is searching message bodies really interesting, or is building indexes of message bodies such that you can later search those indexes the actually interesting point?
And it seems to me there are few better methods than dumping the text into MySQL and letting it do the work. Compromises, tradeoffs, etc. notwithstanding...
How does MySQL help you in building language-sensitive rapid response indexes of large text blobs?
--
J C Lawrence
---------(*) Satan, oscillate my metallic sonatas.
claw@kanga.nu He lived as a devil, eh?
http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live.