why is my hash being weird??

Steven D'Aprano steve at REMOVEMEcyber.com.au
Thu Jan 19 03:41:07 EST 2006


pycraze wrote:

>   I do the following Steps to the .c file to create the  .py file
> 
>     1 cc -o s s.c
>     2  ./s (input)  >>test.py
>     3  python test.py

You are appending to the test file. How many times have 
you appended to it? Once? Twice? A dozen times? Just 
what is in the file test.py after all this time?
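
To make the difference concrete, here is a minimal sketch in Python rather than shell. The helper run_generator and its two-line output are invented for illustration; they stand in for ./s, whose real output I haven't seen. Opening a file in mode "a" behaves like >>, while mode "w" behaves like >.

```python
import os
import tempfile

def run_generator(path, mode):
    # Hypothetical stand-in for "./s (input)": writes two generated lines.
    with open(path, mode) as f:
        f.write("hash[0] = 0\nhash[1] = 1\n")

def count_lines(path):
    with open(path) as f:
        return sum(1 for _ in f)

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "test.py")

    for _ in range(3):
        run_generator(path, "a")   # like ">>": each run adds to the file
    appended = count_lines(path)

    for _ in range(3):
        run_generator(path, "w")   # like ">": each run replaces the file
    overwritten = count_lines(path)

print(appended, overwritten)
```

After three runs the appended file holds three copies of the output, while the overwritten file holds only one.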


> When i run python to this .py file, i find that this process eats
> lots of virtual memory of my machine. I can give detailed examples
> to what heights can the virtual memory can go, when i do a top,
> with the inputs given to the c file
>
>   1.   input is 100000                 VIRT is 119m

The dictionary you create is going to be quite small: 
about 780KB at a minimum. Call it a megabyte. (What's a 
couple of hundred K between friends?) Heck, call it 2MB.
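
If you want to check an estimate like that yourself, here is a rough sketch. It assumes CPython 2.6 or later, where sys.getsizeof exists, and the function name dict_footprint is mine. The figures it reports depend on the Python version and platform, so don't expect them to match my number exactly.

```python
import sys

def dict_footprint(n):
    """Return (bytes for the dict itself, bytes for its int objects)."""
    table = {j: j for j in range(n)}
    container = sys.getsizeof(table)   # hash table only, not its contents
    # Each key and its value are the same int object here, so count it once.
    ints = sum(sys.getsizeof(j) for j in table)
    return container, ints

container, ints = dict_footprint(100000)
print("dict: %d bytes, ints: %d bytes" % (container, ints))
```

Either way, the total should come out at a few megabytes at most, nowhere near 119MB.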

That still leaves a mysterious 117MB unaccounted for. 
Some of that will be the Python virtual machine and 
various other overhead. What else is there?

Simple: you created a function summa with 100,000 lines 
of code. That's a LOT of code to go into one object. 
Normally, 100,000 lines of code will be split between 
dozens or hundreds of functions and multiple modules. But 
you've created one giant lump of code that needs to be 
paged in and out of memory in one piece. Ouch!


>>> def summa():
...     global hash
...     hash[0] = 0
...     hash[1] = 1
...
>>> import dis  # get the byte-code disassembler
>>> dis.dis(summa)  # and disassemble the function
   3           0 LOAD_CONST               1 (0)
               3 LOAD_GLOBAL              0 (hash)
               6 LOAD_CONST               1 (0)
               9 STORE_SUBSCR

   4          10 LOAD_CONST               2 (1)
              13 LOAD_GLOBAL              0 (hash)
              16 LOAD_CONST               2 (1)
              19 STORE_SUBSCR
              20 LOAD_CONST               0 (None)
              23 RETURN_VALUE

That's how much bytecode you get for two keys. Now 
imagine how much you'll need for 100,000 keys.
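
You don't have to imagine it. The sketch below generates the same kind of function for a given number of keys and measures the length of its compiled bytecode. The names build_summa_source and bytecode_size are mine, and it assumes Python 3. The exact byte counts vary between Python versions, so I'm not quoting any.

```python
def build_summa_source(n):
    """Return source text for a summa() with one assignment per key."""
    lines = ["def summa():", "    global hash"]
    lines += ["    hash[%d] = %d" % (i, i) for i in range(n)]
    return "\n".join(lines) + "\n"

def bytecode_size(n):
    """Compile the generated function and return its bytecode length."""
    namespace = {"hash": {}}
    exec(compile(build_summa_source(n), "<generated>", "exec"), namespace)
    return len(namespace["summa"].__code__.co_code)

for n in (2, 200, 2000):
    print(n, bytecode_size(n))
```

The bytecode grows with the number of keys, and that is before counting the constants the code object has to carry around.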

You don't need to write the code from C, just do it all 
in Python:

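import sys

hash = {}

def summa(i):
    # fill the global dict with i keys, each mapped to itself
    global hash
    for j in range(i):
        hash[j] = j

# command-line arguments arrive as strings, so convert to int first
summa(int(sys.argv[1]))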


Now run the script:


python test.py 100000



-- 
Steven.



