[Tutor] Dictionaries and aggregation

paul.churchill at cyanfusion.com
Tue Apr 25 20:00:26 CEST 2006


Kent Johnson writes: 

>> However here's what I'm now trying to do:
>>  
>> 1)       Not have to rely on using awk at all.
>>  
>>  
>> 2)       Create a dictionary with server names for keys e.g. server001,
>> server002 etc and the aggregate of the requests for that server as the value
>> part of the pairing.
>>  
>>  
>> I got this far with part 1)
>>  
>> lbstat = commands.getoutput("sudo ~/ZLBbalctl --action=cells")
>> tmpLst = lbstat.split('\n')
>>  
>> rLst = []
>> for i in tmpLst:
>>     m = re.search(' server[0-9]+', i)
>>     if m:
>>         rLst.append(i)
>>  
>> for i in rLst:
>>         print i, type(i)
>>  
>>   server001      alive 22.3%     6 requests/s 14527762 total <type 'str'>
>>   server002      alive 23.5%     7 requests/s 14833265 total <type 'str'>
>>   server003      alive 38.2%    14 requests/s 14872750 total <type 'str'>
>>   server004      alive 15.6%     4 requests/s 15083443 total <type 'str'>
>>   server001      alive 24.1%     8 requests/s 14473672 total <type 'str'>
>>   server002      alive 23.2%     7 requests/s 14810866 total <type 'str'>
>>   server003      alive 30.2%     8 requests/s 14918322 total <type 'str'>
>>   server004      alive 22.1%     6 requests/s 15137847 total <type 'str'>
>>  
>> At this point I ran out of ideas and began to think that there must be
>> something fundamentally wrong with my approach. Not least of my concerns was
>> the fact that I needed integers and these were strings.
> 
> Don't get discouraged, you are on the right track! You had one big 
> string that included some data you are interested in and some you don't 
> want, you have converted that to a list of strings containing only the 
> lines of interest. That is a good first step. Now you have to extract 
> the data you want out of each line. 
> 
> Use line.split() to split the text into fields by whitespace:
> In [1]: line = '  server001      alive 22.3%     6 requests/s 14527762 total'
> 
> In [2]: line.split()
> Out[2]: ['server001', 'alive', '22.3%', '6', 'requests/s', '14527762', 'total']
> 
> Indexing will pull out the field you want:
> In [3]: line.split()[5]
> Out[3]: '14527762' 
> 
> It's still a string:
> In [4]: type(line.split()[5])
> Out[4]: <type 'str'> 
> 
> Use int() to convert a string to an integer:
> In [5]: int(line.split()[5])
> Out[5]: 14527762 
> 
> Then you have to figure out how to accumulate the values in a dictionary 
> but get this much working first. 
> 
> Kent 
> 
 


Thanks very much for the steer. I've made a fair bit of progress and look to 
be within touching distance of getting the problem cracked. 

 

Here's the list I'm starting with: 

>>> for i in rLst:
>>>     print i, type(i)

server001      alive 17.1%     2 requests/s 14805416 total <type 'str'>
server001      alive 27.2%     7 requests/s 14851125 total <type 'str'>
server002      alive 22.9%     6 requests/s 15173311 total <type 'str'>
server002      alive 42.0%     8 requests/s 15147869 total <type 'str'>
server003      alive 17.9%     4 requests/s 15220280 total <type 'str'>
server003      alive 22.0%     4 requests/s 15260951 total <type 'str'>
server004      alive 18.5%     3 requests/s 15484524 total <type 'str'>
server004      alive 31.6%     9 requests/s 15429303 total <type 'str'> 

I've split each string in the list, extracted the server name and the 
requests/s figure, and fed them into an empty dictionary. 

>>> rDict = {}
>>> for line in rLst:
>>>     fields = line.split()
>>>     x, y = fields[0], int(fields[3])   # server name, requests/s
>>>     rDict[x] = y                       # plain assignment
>>>     print x, y, type(x), type(y)

server001 4 <type 'str'> <type 'int'>
server001 5 <type 'str'> <type 'int'>
server002 5 <type 'str'> <type 'int'>
server002 9 <type 'str'> <type 'int'>
server003 4 <type 'str'> <type 'int'>
server003 6 <type 'str'> <type 'int'>
server004 8 <type 'str'> <type 'int'>
server004 12 <type 'str'> <type 'int'> 

I end up with this. 

>>> for key, value in rDict.items():
>>>     print key, value

server001 5
server003 6
server002 9
server004 12 


As I understand it, this is because dictionary keys must be unique, so each 
assignment replaces the value already stored under that key, and only the 
last value fed in from the loop for each server survives. 
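
The overwriting can be seen in isolation with a minimal sketch. The names 
`pairs` and `r_dict` are made up for illustration, and the two values are the 
server001 figures from the loop output above. It is written to run on current 
Python as well as the interpreter used in this thread. 

```python
# Hypothetical sample: the two server001 readings printed by the loop above.
pairs = [("server001", 4), ("server001", 5)]

r_dict = {}
for name, rate in pairs:
    # Plain assignment stores the new value under the key and
    # discards whatever was stored there before.
    r_dict[name] = rate

print(r_dict)  # only the second reading for server001 is left
```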

What I'm hoping to be able to do is update the value, rather than replace 
it, so that it gives me the total, i.e. 

server001 9
server003 10
server002 14
server004 20 
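
One possible way to get those totals, sketched under assumptions rather than 
offered as the definitive answer: add each new figure to whatever is already 
stored, using `dict.get` with a default of 0 for a server not seen yet. The 
names `pairs` and `totals` are made up for illustration, and the values are the 
(server, requests/s) pairs from the loop output above. 

```python
# (server, requests/s) pairs as printed by the loop output above.
pairs = [
    ("server001", 4), ("server001", 5),
    ("server002", 5), ("server002", 9),
    ("server003", 4), ("server003", 6),
    ("server004", 8), ("server004", 12),
]

totals = {}
for name, rate in pairs:
    # totals.get(name, 0) returns the running total for this server,
    # or 0 if the server has not been seen yet.
    totals[name] = totals.get(name, 0) + rate

# Sorted only to make the printed order predictable.
for name in sorted(totals):
    print("%s %d" % (name, totals[name]))
```

In the loop above, the equivalent change would be to replace `rDict[x] = y` 
with `rDict[x] = rDict.get(x, 0) + y`. 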

 

Regards, 

Paul 

 

 


