Optimizing for binary storage/bit-fiddling

John Hall wweexxsseessssaa at telusplanet.net
Sat May 4 09:03:31 EDT 2002


On Sat, 04 May 2002 12:37:05 GMT, "Thomas Weholt" <2002 at weholt.org>
wrote:

(Much good stuff snipped for brevity...)
>PS! Any good references on very, very simple bit-fiddling ( tutorials etc. )
>would also be highly appreciated.

How apposite - I was just pondering such a thing myself.
I'm new to Python, and am considering a small app for keeping chunks
of text, and being able to search for "word1 AND word2 NOT (word3 OR
word4)" etc. Years ago I did this in PL/I on IBM's MVS mainframe, when
a 5MB HD was about the size of a laundry spin-dryer, and cost more
than a decent car. We needed to use storage economically.

It had a word list, where each entry had the word as the key, and a
bit-string represented numbers where that word occurred:-
Each text item had a sequence number (used as a key) assigned as the
item was added, and for each (indexable) word, that bit# was set on,
i.e. if "lunch" occurs in textitems 3, 17, 87 & 142, those numbered
bits were ON in lunch's bitstring.

For searching, the relevant word bit-strings were assembled, and the
appropriate AND, OR, NOT etc operations applied. The resulting
bit-string represented the textitem keys of the hitlist.

PL/I made this bit-twiddling simple and efficient.

NB such an app is overkill for my intended use, but I'm considering it
as a learning exercise. I suspect that adressing individual bits in
Python will not be worth the effort, but I eagerly await responses to
your query.

>a novice in this area :)
Another novice (but manyvirtues)

-- 
John W Hall <wweexxsseessssaa at telusplanet.net>
Calgary, Alberta, Canada.
"Helping People Prosper in the Information Age"



More information about the Python-list mailing list