How would you design scalable solution?
bryanvick at gmail.com
Tue Oct 27 18:10:08 CET 2009
I'm designing a system and wanted to get some feedback on a potential
performance problem down the road while it is still cheap to fix.
The system is similar to an accounting system where a system tracks
which move between different "Buckets". The system answers these
- How many Things are in this Bucket today?
- Move Things from Bucket1 to Bucket2...
- Now how many Things are in each Bucket?
So each time a Thing is moved between Buckets, I imagine a DB row like
| id | thingId | toBucket | fromBucket | qty |
Then to find how many Things are in a certain Bucket:
1. Start with initial qty in the bucket at the beginning of time
2. *Add* all qty moved *to* the bucket since beginning of time
3. *Subtract* all qty moved *from* the bucket since beginning of time
Simple system to "account" for Things.
My problem is this. This design will inherantly get slower as time
As the number of rows recording a transfer between buckets increases,
to see how many Things are in a Bucket will get slower. I experience
I use gnucash (which I love). I don't do "closing entries" at the end
year, so each account has every transaction I have every made. I see
slower. It is nothing I am going to do anything about, because it is
enough for me. But I have to wonder how big companies with thousands
transactions a day do this?
One solution would be to do a "closing entry" at certain periods in
time, so old
info would be archived. Each bucket would start the new time period
balance of Things equal to what it was at the point in time we
How else to keep a record of every transaction, but not have the speed
question "How many Things in Bucket x" depend on looking @ every
record ever made?
More information about the Python-list