From gdavidzon at gmail.com  Wed Jul 29 23:11:01 2009
From: gdavidzon at gmail.com (Guido Davidzon)
Date: Wed, 29 Jul 2009 17:11:01 -0400
Subject: [Csv] CSV module. DictReader uses string values instead of int
Message-ID: <81D62875-D54D-482F-9BAF-9F8219716D38@gmail.com>

Hi,

I am new to Python and I am trying to use the csv module to import a  
dataset as a dictionary data structure. The reason I am doing this is  
to find similarities between instances of my dataset using similarity  
algorithms (like Pearson). These algorithms take data in dictionary  
format and then compute similarities between any given instances.
The problem I encountered is that when importing a CSV file using the  
csv module, the value for each key is a string instead of an  
integer. Hence, when running the algorithm using the  
imported dataset in the dictionary, I get this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "readcsv.py", line 99, in sim_pearson
    for item in prefs[p1]:
TypeError: string indices must be integers, not str

My question is: Is there a way of importing the values of the  
dictionary as integers instead of strings? I know that strings are  
immutable in Python, but then what can I do?

Using DictReader from the CSV module, my dictionary looks like this:

mydict={'variable1': '0', 'variable2': '1', 'variable3': '0',  
'variable4': '1'}

and I want it to look like this:

mydict={'variable1': 0, 'variable2': 1, 'variable3': 0, 'variable4': 1}

Thanks in advance,

- Guido

From skip at pobox.com  Thu Jul 30 00:39:41 2009
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 29 Jul 2009 17:39:41 -0500
Subject: [Csv] CSV module. DictReader uses string values instead of int
In-Reply-To: <81D62875-D54D-482F-9BAF-9F8219716D38@gmail.com>
References: <81D62875-D54D-482F-9BAF-9F8219716D38@gmail.com>
Message-ID: <19056.53165.222232.87624@montanaro.dyndns.org>

    Guido> Using DictReader from the CSV module, my dictionary looks like this:

    Guido> mydict={'variable1': '0', 'variable2': '1', 'variable3': '0',  
    Guido> 'variable4': '1'}

    Guido> and I want it to look like this:

    Guido> mydict={'variable1': 0, 'variable2': 1, 'variable3': 0, 'variable4': 1}

Sure.  Just convert your dictionary's values:

    for key in mydict:
        try:
            mydict[key] = int(mydict[key])
        except ValueError:
            # not an int
            pass
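
For a whole file, the same conversion can be applied to every row as it is
read.  The following is only a sketch for Python 3: the helper name
convert_ints and the in-memory sample data are made up for illustration and
stand in for the real CSV file.  It also catches TypeError, on the
assumption that short rows may leave some values as None.

```python
import csv
import io


def convert_ints(row):
    """Return a copy of row with int-like string values converted to int."""
    converted = {}
    for key, value in row.items():
        try:
            converted[key] = int(value)
        except (ValueError, TypeError):
            # not an int (or a missing field): keep the original value
            converted[key] = value
    return converted


# Hypothetical sample data standing in for the real CSV file.
sample = io.StringIO("variable1,variable2,name\n0,1,alice\n1,0,bob\n")
rows = [convert_ints(row) for row in csv.DictReader(sample)]
print(rows)
```

Values that do not parse as integers, such as the name column here, are
left as strings.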

You can hide this from your application code by subclassing csv.DictReader
and overriding its next method (or __next__ method for Python 3.x):

    import csv

    class MyDictReader(csv.DictReader):
        # Python 2.x spelling; name this method __next__ under Python 3.x
        def next(self):
            d = csv.DictReader.next(self)
            for key in d:
                try:
                    d[key] = int(d[key])
                except ValueError:
                    # not an int
                    pass
            return d
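
For Python 3.x the subclass might look roughly like the sketch below.  The
class name IntDictReader and the sample data are illustrative only, and
catching TypeError as well is an assumption made so that missing fields
(which DictReader fills with None by default) are left untouched.

```python
import csv
import io


class IntDictReader(csv.DictReader):
    """DictReader variant that converts int-like values to int (Python 3)."""

    def __next__(self):
        row = super().__next__()
        # Iterate over a snapshot so the row can be updated in place.
        for key, value in list(row.items()):
            try:
                row[key] = int(value)
            except (ValueError, TypeError):
                # not an int (or a missing field): leave it as it is
                pass
        return row


# Hypothetical sample data standing in for the real CSV file.
sample = io.StringIO("variable1,variable2,variable3,variable4\n0,1,0,1\n")
result = [dict(mydict) for mydict in IntDictReader(sample)]
print(result)
```

Application code can then loop over IntDictReader exactly as it would over
csv.DictReader.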

-- 
Skip Montanaro - skip at pobox.com - http://www.smontanaro.net/
    That's more than a dress. That's an Audrey Hepburn movie. -- Jerry Maguire