bsddb3 database file, are there any unexpected file size limits occurring in practice?
Claudio Grondi
claudio.grondi at freenet.de
Tue Feb 28 06:05:37 EST 2006
Klaas wrote:
> Claudio writes:
>
>I am on Windows using the NTFS file system, so I don't expect problems
>with too large a file size.
>
>
> how large can files grow on NTFS? I know little about it.
No practical limit on current hard drives, i.e.:
Maximum file size
Theory: 16 exabytes minus 1 KB (2**64 bytes minus 1 KB)
Implementation: 16 terabytes minus 64 KB (2**44 bytes minus 64 KB)
Maximum volume size
Theory: 2**64 clusters minus 1
Implementation: 256 terabytes minus 64 KB (2**32 clusters minus 1)
Files per volume
4,294,967,295 (2**32 minus 1 file)
>
>
>(I suppose the reason was having only 256 MB of RAM available at that
>time), as it is known that MySQL databases larger than 2 GByte exist
>and are in daily use :-( .
>
>
> Do you have more ram now?
I now have 3 GByte of RAM on my best machine, but Windows limits a
single process to 2 GByte of address space, so in practice a little
less than 2 GByte is the actual upper limit.
> I've used berkeley dbs up to around 5 gigs
> in size and they performed fine. However, it is quite important that
> the working set of the database (its internal index pages) can fit
> into available ram. If they are swapping in and out, there will be
> problems.
Thank you very much for your reply.
In my current project I expect the data to have much less volume than
the indexes. In my failed MySQL project the size of the indexes was
approximately the same as the size of the indexed data (1 GByte); this
time I expect the total size of the indexes to exceed the size of the
indexed data by far. Because Berkeley DB does not support multiple
indexed columns (i.e. there is only one key column per database acting
as the index), I will keep each index in its own database file. If I
access these database files one after another (not simultaneously), it
should work without RAM problems, right?
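To make this concrete, here is a minimal, untested sketch of what I
have in mind with bsddb3 (the file names, the 256 MB cache size and
the fill() routine are only placeholders I made up): each index lives
in its own btree file with an explicit cache and is closed before the
next one is opened, so that only one index competes for RAM at a time.

from bsddb3 import db

def process_index(filename, work):
    # one btree database per index, opened on its own so that only
    # this index's pages occupy the cache while it is being used
    d = db.DB()
    d.set_cachesize(0, 256 * 1024 * 1024)    # 256 MB cache (a guess)
    d.open(filename, dbtype=db.DB_BTREE, flags=db.DB_CREATE)
    try:
        work(d)                              # e.g. bulk put()/get() calls
    finally:
        d.close()

def fill(d):
    d.put("some key", "some value")          # keys/values are byte strings

for name in ("index_words.bdb", "index_dates.bdb"):   # hypothetical files
    process_index(name, fill)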
Does the data volume required to store the key values have an impact
on the size of the index pages, or does the size of the index pages
depend only on the number of records and the kind of index (btree,
hash)?
In the latter case I would be free to use larger data columns as key
values without running into RAM problems for the index itself;
otherwise I would be forced to use key columns storing a kind of hash
of the values to keep their size down (and two dictionaries instead of
one).
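In case the key size does matter, the fallback I am thinking of would
look roughly like this (again only a sketch; the file names are made
up, and hashlib may need a newer Python than the older sha/md5
modules): the btree key is a fixed-size digest of the real key value,
and a second database maps the digest back to the original key.

import hashlib                     # the older sha module would do as well
from bsddb3 import db

def open_btree(filename):
    d = db.DB()
    d.open(filename, dbtype=db.DB_BTREE, flags=db.DB_CREATE)
    return d

index_db = open_btree("index_by_digest.bdb")   # digest -> record data
key_db   = open_btree("digest_to_key.bdb")     # digest -> original long key

def put_record(long_key, value):
    digest = hashlib.sha1(long_key).hexdigest()   # fixed 40-character key
    index_db.put(digest, value)
    key_db.put(digest, long_key)   # the second "dictionary", to recover keys

def get_record(long_key):
    return index_db.get(hashlib.sha1(long_key).hexdigest())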
What is the upper limit on the number of records in practice?
Theoretically, as stated in the tutorial, Berkeley DB is capable of
holding billions of records, with each single record up to 4 GB in
size and tables up to a total storage size of 256 TB of data.
By the way: are the billions in this context multiples of
1,000,000,000 or of 1,000,000,000,000, i.e. in the US or the British
sense?
I expect the number of records in my project to be on the order of
tens of millions (multiples of 10,000,000).
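Just to get a feeling for the numbers, here is a rough back-of-envelope
estimate of the internal (non-leaf) btree pages for such a record count;
the page size, fill factor and bytes per key entry are pure assumptions
on my part:

def internal_pages_ram(n_records, page_size=8192, entry_bytes=40, fill=0.5):
    # entries that fit on one roughly half-full page
    fanout = int(page_size * fill / entry_bytes)
    # leaf pages holding the records, then the levels above them
    level = n_records // fanout + 1
    total_internal = 0
    while level > 1:
        level = level // fanout + 1
        total_internal += level
    return total_internal * page_size        # bytes of internal pages

# 50 million records -> roughly 40 MB of internal pages under these assumptions
print internal_pages_ram(50 * 10**6) / (1024 * 1024), "MB"

If an estimate like that is anywhere near reality, the internal pages
of a single index should fit into RAM comfortably; it is the leaf
pages that make up the bulk of the file.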
I would be glad to hear whether someone has already successfully run
Berkeley DB with this or a larger number of records, and how much RAM
and which OS the machine used for it had (I am on Windows XP with
3 GByte of RAM).
Claudio
>
> -Mike
>