why is my hash being weird??
Steven D'Aprano
steve at REMOVEMEcyber.com.au
Thu Jan 19 03:41:07 EST 2006
pycraze wrote:
> I do the following Steps to the .c file to create the .py file
>
> 1 cc -o s s.c
> 2 ./s (input) >>test.py
> 3 python test.py
You are appending to the test file. How many times have
you appended to it? Once? Twice? A dozen times? Just
what is in the file test.py after all this time?
> When i run python to this .py file , i find that this process eats
> lots of virtual memory of my machine. I can give
> detailed examples
> to what heights can the virtual memory can go ,
> when i do a top ,
> with the inputs given to the c file
>
> 1. input is 100000 VIRT is 119m
The dictionary you create is going to be quite small:
at least 780KB. Call it a megabyte. (What's a dozen or
two K between friends?) Heck, call it 2MB.
That still leaves a mysterious 117MB unaccounted for.
Some of that will be the Python virtual machine and
various other overhead. What else is there?
Simple: you created a function summa with 100000 lines
of code. That's a LOT of code to go into one object.
Normally, 100,000 lines of code will be split between
dozens, hundreds of functions and multiple modules. But
you've created one giant lump of code that needs to be
paged in and out of memory in one piece. Ouch!
>>> def summa():
...     global hash
...     hash[0] = 0
...     hash[1] = 1
...
>>> import dis  # get the byte-code disassembler
>>> dis.dis(summa)  # and disassemble the function
  3           0 LOAD_CONST               1 (0)
              3 LOAD_GLOBAL              0 (hash)
              6 LOAD_CONST               1 (0)
              9 STORE_SUBSCR

  4          10 LOAD_CONST               2 (1)
             13 LOAD_GLOBAL              0 (hash)
             16 LOAD_CONST               2 (1)
             19 STORE_SUBSCR
             20 LOAD_CONST               0 (None)
             23 RETURN_VALUE
That's how much bytecode you get for two keys. Now
imagine how much you'll need for 100,000 keys.
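
If you would rather measure than imagine, here is a rough sketch that generates the same kind of function for a few sizes and reports the length of the compiled bytecode. The helper names generated_source and bytecode_size are made up for this illustration, and it assumes a recent Python 3, where the exact byte counts will differ from the disassembly shown above.

```python
def generated_source(n):
    """Build source text for a summa() holding n literal assignments."""
    lines = ["def summa():", "    global hash"]
    lines += ["    hash[%d] = %d" % (i, i) for i in range(n)]
    return "\n".join(lines) + "\n"


def bytecode_size(n):
    """Compile the generated function and return its bytecode length in bytes."""
    namespace = {}
    exec(compile(generated_source(n), "<generated>", "exec"), namespace)
    return len(namespace["summa"].__code__.co_code)


# The function is only compiled here, never called, so no dict is needed.
for n in (2, 100, 1000):
    print(n, bytecode_size(n))
```

The bytecode length is only part of the cost: every distinct key also lands in the code object's constants, and the compiler has to hold the whole parsed function in memory while it works.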
You don't need to write the code from C, just do it all
in Python:
import sys

hash = {}

def summa(i):
    global hash
    for j in range(i):
        hash[j] = j

# Command-line arguments arrive as strings, so convert before calling range().
summa(int(sys.argv[1]))
Now run the script:
python test.py 100000
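
A variation you might prefer, sketched under the assumption of a reasonably modern Python: return the dictionary instead of filling a global, and avoid the name hash, which shadows the built-in hash() function. The names build_table and table are hypothetical, chosen only for this example.

```python
def build_table(n):
    """Return a dict mapping each integer in range(n) to itself."""
    return {j: j for j in range(n)}


# The loop costs a handful of bytecodes no matter how large n is.
table = build_table(100000)
print(len(table))
```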
--
Steven.