Fwd: Sorting Countries by Region

patrick.waldo at gmail.com patrick.waldo at gmail.com
Sat Nov 17 12:34:43 CET 2007


This is how I solved it last night, in my inefficient sort of way,
after re-reading some of my Python books on dictionaries.  So far this
gets the job done.  However, I'd like to test whether there are any
countries in the Excel input that are not represented, i.e. the input is
all the information I have and the dictionary functions as the
information I expect.  What I did worked yesterday, but doesn't work
anymore; see the comment in the code.

Otherwise I tried doing this:

for i, country in countries_list:
    if country in REGIONS_COUNTRIES['European Union']:
        matrix.write(i+2, 1, country)
but I got "ValueError: too many values to unpack"
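
A likely cause of that ValueError: countries_list holds plain strings, so
"for i, country in ..." tries to unpack each country name into two
variables. A minimal sketch, assuming the aim is a running index next to
each country, uses enumerate(). The sample data and the "rows" list that
stands in for matrix.write() are made up for illustration.

```python
# Hypothetical sample data standing in for the spreadsheet column.
REGIONS_COUNTRIES = {'European Union': ["Austria", "Belgium", "France"],
                     'North America': ["Canada", "United States"]}
countries_list = ["Austria", "Canada", "France"]

rows = []  # stands in for matrix.write(row, col, value)

# enumerate() yields (index, item) pairs, so unpacking into two names works.
for i, country in enumerate(countries_list):
    if country in REGIONS_COUNTRIES['European Union']:
        rows.append((i + 2, 1, country))
```

Note that i counts positions in the whole list, so skipped countries leave
empty rows; a separate counter such as the n used in the code further down
avoids the gaps.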

Again, this has been a great help.  Any ideas of how I can make this a
bit more efficient, as I'm dealing with 5 regions and numerous
countries, would be greatly appreciated.  Here's the code:


import xlrd
import pyExcelerator

#keeping all the countries short
REGIONS_COUNTRIES = {'European Union':["Austria", "Belgium", "France",
                                       "Germany", "Greece"],
                     'North America':["Canada", "United States"]}

path_file = "c:\\1\\build\\data\\matrix2\\Update_oct07a.xls"

book = xlrd.open_workbook(path_file)
Counts = book.sheet_by_index(1)
wb=pyExcelerator.Workbook()
matrix = wb.add_sheet("matrix")

countries = Counts.col_values(0,start_rowx=1, end_rowx=None)
countries_list = list(set(countries))
countries_list.sort()

#This seems to not work today and I don't know why
#for country in countries_list:
#    if country not in REGIONS_COUNTRIES['European Union'] or not in REGIONS_COUNTRIES['North America']:
#        print "%s is not in the expected list", country

#This sorts well
n=2
for country in countries_list:
    if country in REGIONS_COUNTRIES['European Union']:
        matrix.write(n, 1, country)
        n=n+1
for country in countries_list:
    if country in REGIONS_COUNTRIES['North America']:
        matrix.write(n, 1, country)
        n=n+1

wb.save('c:\\1\\matrix.xls')
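
On the commented-out check: "x not in A or not in B" is not valid Python,
because each "not in" needs its own left-hand operand, and the test for
"in neither list" wants "and" rather than "or". The print line also needs
"%" instead of a comma to fill in the "%s". One possible sketch, which
collects every expected country into a set and writes the regions in a
fixed order with one loop; REGION_ORDER, the sample input and the "rows"
list standing in for matrix.write() are my own assumptions.

```python
REGIONS_COUNTRIES = {'European Union': ["Austria", "Belgium", "France",
                                        "Germany", "Greece"],
                     'North America': ["Canada", "United States"]}
# Assumed output order; not taken from the original spreadsheet.
REGION_ORDER = ['European Union', 'North America']

# Made-up input standing in for the sorted, de-duplicated spreadsheet column.
countries_list = sorted(set(["Canada", "Austria", "Belize", "France", "Canada"]))

# Every country the dictionary knows about, gathered into one set.
expected = set()
for names in REGIONS_COUNTRIES.values():
    expected.update(names)

# Countries present in the input but missing from every region.
unexpected = [c for c in countries_list if c not in expected]
for country in unexpected:
    print("%s is not in the expected list" % country)

rows = []  # stands in for matrix.write(n, 1, country)
n = 2
for region in REGION_ORDER:
    members = set(REGIONS_COUNTRIES[region])
    for country in countries_list:
        if country in members:
            rows.append((n, 1, country))
            n += 1
```

Adding a region then only means adding a dictionary entry and a name in
REGION_ORDER, rather than another copy of the loop.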



On Nov 17, 1:12 am, "Sergio Correia" <sergio.corr... at gmail.com> wrote:
> About the sort:
>
> Check this (also on http://pastebin.com/f12b5b6ca)
>
> def make_regions():
>
>     # Values you provided
>     EU = ["Austria","Belgium", "Cyprus","Czech Republic",
>     "Denmark","Estonia", "Finland"]
>     NA = ["Canada", "United States"]
>     AP = ["Australia", "China", "Hong Kong", "India", "Indonesia",
>     "Japan"]
>     regions = {'European Union':EU, 'North America':NA, 'Asia Pacific':AP}
>
>     ans = {}
>     for reg_name, reg in regions.items():
>         for cou in reg:
>             ans[cou] = reg_name
>     return ans
>
> def cmp_region(cou1, cou2):
>     ans = cmp(regions[cou1], regions[cou2])
>     if ans == 0: # If the region is the same, sort by country
>         return cmp(cou1, cou2)
>     else:
>         return ans
>
> regions = make_regions()
> some_countries = ['Austria', 'Canada', 'China', 'India']
>
> print 'Old:', some_countries
> some_countries.sort(cmp_region)
> print 'New:', some_countries
>
> Why that code?
> Because the first thing I want is a dictionary where the key is the
> name of the country and the value is the region. Then, I just make a
> quick function that compares considering the region and country.
> Finally, I sort.
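
The region-then-country ordering described here can also be sketched with
a key function instead of a cmp function, assuming a Python version whose
sort() accepts key (2.4 or later). The region lists are shortened, and the
names regions_by_name and region_of are only illustrative.

```python
# Shortened, illustrative region lists.
regions_by_name = {
    'European Union': ["Austria", "Belgium"],
    'North America': ["Canada", "United States"],
    'Asia Pacific': ["Australia", "China", "India"],
}

# Invert to country -> region, the same idea as make_regions().
region_of = {}
for reg_name, members in regions_by_name.items():
    for cou in members:
        region_of[cou] = reg_name

some_countries = ['Austria', 'Canada', 'China', 'India']

# Tuples compare element by element: region name first, then country name.
some_countries.sort(key=lambda cou: (region_of[cou], cou))
```

With this sample the regions come out alphabetically (Asia Pacific,
European Union, North America), with countries alphabetical inside each.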
>
> Btw, the code is just a quick hack, as it can be improved -a lot-.
>
> About the rest of your code:
> - martyw's example is much more useful than you think. Why? Because
> you can just iterate across your document, adding the values you get
> to the adequate object property. That is, instead of using size or
> pop, use the variables you are interested in.
>
> Best, and good luck with python,
> Sergio
>
> On Nov 16, 2007 5:15 PM,  <patrick.wa... at gmail.com> wrote:
>
> > Great, this is very helpful.  I'm new to Python, so hence the
> > inefficient or nonsensical code!
>
> > > 2) I would suggest using countries.sort(...) or sorted(countries,...),
> > > specifying cmp or key options to sort by region instead.
>
> > I don't understand how to do this.  The countries.sort() lists
> > alphabetically and I tried to do a lambda x,y: cmp() type function,
> > but it doesn't sort correctly.  Help with that?
>
> > For martyw's example, I don't need to get any sort of population
> > info.  I'm actually getting the number of various types of documents.
> > So the entry is like this:
>
> > Argentina  Food and Consumer Products  Food Additives  Color Additives       1
> > Argentina  Food and Consumer Products  Food Additives  Flavors               1
> > Argentina  Food and Consumer Products  Food Additives  General               6
> > Argentina  Food and Consumer Products  Food Additives  labeling              1
> > Argentina  Food and Consumer Products  Food Additives  Prohibited Additives  1
> > Argentina  Food and Consumer Products  Food Contact    Cellulose             1
> > Argentina  Food and Consumer Products  Food Contact    Food Packaging        1
> > Argentina  Food and Consumer Products  Food Contact    Plastics              4
> > Argentina  Food and Consumer Products  Food Contact    Waxes                 1
> > Belize
> > etc...
>
> > So I'll need to add up all the entries for Food Additives and Food
> > contacts, the other info like Color Additives isn't important.
>
> > So I will have an output like this
> >           Food Additives    Food Contact
> > Argentina     10                7
> > Belize
> > etc...
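
The tally described above could be sketched with a dictionary of per-country
totals. This assumes each spreadsheet row has already been read into a
five-item tuple in the column order of the listing; the "rows" list simply
retypes the Argentina sample, and the name "totals" is illustrative.

```python
# The Argentina sample rows, retyped as (country, sector, category, type, count).
rows = [
    ("Argentina", "Food and Consumer Products", "Food Additives", "Color Additives", 1),
    ("Argentina", "Food and Consumer Products", "Food Additives", "Flavors", 1),
    ("Argentina", "Food and Consumer Products", "Food Additives", "General", 6),
    ("Argentina", "Food and Consumer Products", "Food Additives", "labeling", 1),
    ("Argentina", "Food and Consumer Products", "Food Additives", "Prohibited Additives", 1),
    ("Argentina", "Food and Consumer Products", "Food Contact", "Cellulose", 1),
    ("Argentina", "Food and Consumer Products", "Food Contact", "Food Packaging", 1),
    ("Argentina", "Food and Consumer Products", "Food Contact", "Plastics", 4),
    ("Argentina", "Food and Consumer Products", "Food Contact", "Waxes", 1),
]

# totals[country][category] accumulates the document counts.
totals = {}
for country, _sector, category, _doc_type, count in rows:
    per_country = totals.setdefault(country, {})
    per_country[category] = per_country.get(category, 0) + count
```

Each country's inner dictionary can then be written out as one row of the
output matrix, one column per category.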
>
> > Thanks so much for the help!
> > --
>
> >http://mail.python.org/mailman/listinfo/python-list



