[Tutor] Sorting/filtering data, dictionary question & Re:OT

Kent Johnson kent37 at tds.net
Mon Jan 3 12:03:28 CET 2005


See comments inline...

Bill Burns wrote:
> One problem I ran into was sorting my lists. The dictionary I initially came
> up with contained a pipe size designation in this format: 1/2", 3/4", 1", etc.
> This format is the standard way to write pipe sizes (at least in the US
> anyway). When I sorted the data (using this format), I found that the inches
> didn't sort the way I wanted them to.
>            For example:
>                >>>sizes = ['1/2"', '3/4"', '1"', '1-1/4"', '1-1/2"']
>                >>>sizes.sort()
>                >>>sizes
>                ['1"', '1-1/2"', '1-1/4"', '1/2"', '3/4"']
>                >>>
> Although Python is sorting exactly as it is supposed to, I needed the inches
> to be in sequential order. Not knowing what to do, I added another size to the
> dictionary before each of the "inch sizes". This new "size" has this type of
> format: .5, .75, 1, 1.25, 1.5, etc. And when the data is sorted, it works
> perfectly. But now I was left with both a "size" and an "inch size" and I only
> wanted the "inch size" in the pipe reports (the csv file).
> 
> Below is a small section of the dictionary the program uses (pipeDict). I'm
> also providing two functions to give you a general idea of how I'm:
> 	1). Creating and sorting the pipe data and
> 	2). How I'm filtering the "size" out of the list.
> 
> <code>
> pipeDict = \
> {('lineEdit1','Steel(Std)',.5,'1/2"',.85): .622, 
> ('lineEdit2','Steel(Std)',.75,'3/4"',1.13): .824,
>       
> def somePipeReport():
>     report = []
>     for name, typ, size, inchSize, weight in pipeDict:
>         report.append((typ,size,inchSize,weight))
>         report.sort()
>         newReport = filterPipeData(report)
>     print newReport
> 
> def filterPipeData(data):
>     filteredData = []
>     for typ, size, inchSize, weight in data:
>         filteredData.append((typ,inchSize,weight))  
>     return filteredData   
> </code>
> 
> Question #1:
> Am I going about this sorting and filtering thing correctly or have I just
> gone insane? My gut feeling is, there's probably an easier/smarter way to do
> this.

This is actually a common approach to sorting a list in Python: add enough fields to the list so it 
sorts the way you want, then filter out the fields you don't need any more. It even has a name: 
Decorate - Sort - Undecorate. In your case the 'decorate' step is built into the dictionary, so 
you are just doing the sort and undecorate.
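
As a rough illustration of all three steps, here is a minimal sketch that decorates the inch-size 
strings with a numeric value, sorts, and then strips the numbers off again. The helper names 
inch_value() and sort_inch_sizes() are made up for this example, and the parsing assumes labels 
shaped like 1/2", 1" or 1-1/4" only:

```python
def inch_value(label):
    """Convert a label such as '1-1/4"' to a number (1.25).

    Hypothetical helper; assumes 'whole', 'num/den' or 'whole-num/den'.
    """
    text = label.rstrip('"')
    total = 0.0
    for part in text.split('-'):
        if '/' in part:
            num, den = part.split('/')
            total += float(num) / float(den)
        else:
            total += float(part)
    return total

def sort_inch_sizes(labels):
    # Decorate: pair each label with the number it should sort by.
    decorated = [(inch_value(label), label) for label in labels]
    # Sort: tuples compare on the numeric value first.
    decorated.sort()
    # Undecorate: keep only the original labels.
    return [label for value, label in decorated]

sizes = ['1"', '1-1/2"', '1-1/4"', '1/2"', '3/4"']
print(sort_inch_sizes(sizes))
# ['1/2"', '3/4"', '1"', '1-1/4"', '1-1/2"']
```

With something like this the numeric size would not need to be stored in the dictionary keys, 
though keeping it there as you do now works just as well.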

There is a shorter way to write filterPipeData() using a list comprehension, which is just a 
shortcut for what you wrote:

def filterPipeData(data):
    return [(typ, inchSize, weight) for typ, size, inchSize, weight in data]

When you get used to it, this form is more readable; it will also be a little faster, though I don't 
think you will notice the difference.
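
Putting the pieces together, the whole report could be sketched like this. The two dictionary 
entries are the ones from your message; pipeReport() is a made-up name, and I have it return the 
list instead of printing it. Note that it sorts and filters once, after the list is complete; as 
quoted, your sort() and filterPipeData() calls are indented inside the for loop, which repeats 
them for every pipe:

```python
# Two sample entries copied from the quoted pipeDict; the real one is larger.
pipeDict = {
    ('lineEdit1', 'Steel(Std)', .5, '1/2"', .85): .622,
    ('lineEdit2', 'Steel(Std)', .75, '3/4"', 1.13): .824,
}

def filterPipeData(data):
    # Undecorate: drop the numeric size that was only there for sorting.
    return [(typ, inchSize, weight) for typ, size, inchSize, weight in data]

def pipeReport(pipes):
    # Iterating over a dict yields its keys, which are 5-tuples here.
    report = [(typ, size, inchSize, weight)
              for name, typ, size, inchSize, weight in pipes]
    report.sort()                  # sort once, after the list is built
    return filterPipeData(report)  # filter once, on the sorted list

print(pipeReport(pipeDict))
# [('Steel(Std)', '1/2"', 0.85), ('Steel(Std)', '3/4"', 1.13)]
```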

> 
> Question #2:
> As I mentioned above, this program calculates the water volume inside of the
> pipe. I do that using this function:
> 
>         def volCalc(ID, length):
>                 from math import pi
>                 gal = ((ID*.5)**2)*pi*(12*length)/(230.9429931) 
>                 return gal
> 
> The ID (inside diameter) is pulled from pipeDict (it's the value in the
> dictionary) and the length comes from user input. What I'm wondering is,
> would it be a better idea to store in the dictionary a "gallon per foot value"
> for each pipe? For example, looking at 10" Steel(Std) pipe in the above
> dictionary we find that this type & size of pipe has an ID of 10.02 (inches).
> When we plug this ID and a length of 1 foot into the volCalc() function it
> returns a total of 4.10 gallons (rounded). Would it be better to store this
> "gallon per foot value" (4.10) in the dictionary (as calculated for each pipe)
> vs. calculating the gallons the way I'm currently doing it? I guess the real
> question is, which method will return the most accurate results? I'm not
> looking for a speed improvement.

If you calculate gallons per foot from your current ID values and put those numbers in the dict, the 
new numbers will not be any more accurate than your current calculations, because they come from the 
same IDs and the same formula. If you have another source for gal/foot that you think is more 
accurate, then you could use those numbers. But I think what you have is probably good enough, and 
if it is working there is no need to change it.
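
If you want to convince yourself, a quick check along these lines compares the two approaches. 
It uses your volCalc() formula unchanged; the 37.5 foot length is just an arbitrary test value:

```python
from math import pi

def volCalc(ID, length):
    """Gallons in `length` feet of pipe with inside diameter `ID` inches."""
    return ((ID * .5) ** 2) * pi * (12 * length) / 230.9429931

ID = 10.02                      # 10" Steel(Std), from the original message
galPerFoot = volCalc(ID, 1)     # the value that would be stored in the dict
print('%.2f' % galPerFoot)      # 4.10

length = 37.5                   # arbitrary length in feet, for illustration
direct = volCalc(ID, length)    # current method
viaTable = galPerFoot * length  # stored gallons-per-foot method
# The two agree apart from floating-point rounding in the last digits.
print(abs(direct - viaTable) < 1e-9)   # True
```

Storing a rounded figure such as 4.10 would actually lose a little precision compared with 
calculating from the ID each time.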

Kent

