TypeError: unhashable type: 'dict' when attempting to hash list - advice sought
kbtyo
ahlusar.ahluwalia at gmail.com
Sun Aug 30 13:21:29 EDT 2015
On Sunday, August 30, 2015 at 1:16:12 PM UTC-4, MRAB wrote:
> On 2015-08-30 17:31, kbtyo wrote:
> > On Saturday, August 29, 2015 at 10:50:18 PM UTC-4, MRAB wrote:
> >> On 2015-08-30 03:05, kbtyo wrote:
> >> > I am using Jupyter Notebook and Python 3.4. I have a data structure in the format, (type list):
> >> >
> >> > [{'AccountNumber': N,
> >> > 'Amount': '0',
> >> > 'Answer': '12:00:00 PM',
> >> > 'ID': None,
> >> > 'Type': 'WriteLetters',
> >> > 'Amount': '10',
> >> > {'AccountNumber': Y,
> >> > 'Amount': '0',
> >> > 'Answer': ' 12:00:00 PM',
> >> > 'ID': None,
> >> > 'Type': 'Transfer',
> >> > 'Amount': '2'}]
> >> >
> >> > The end goal is to write this out to CSV.
> >> >
> >> > For the above example the output would look like:
> >> >
> >> > AccountNumber, Amount, Answer, ID, Type, Amount
> >> > N,0,12:00:00 PM,None,WriteLetters,10
> >> > Y,2,12:00:00 PM,None,Transfer,2
> >> >
> >> > Below is the function that I am using to write out this data structure. Please excuse any indentation formatting issues. The data structure is returned through the function "construct_results(get_just_xml_data)".
> >> >
> >> > The data that is returned is in the format as above. "construct_headers(get_just_xml_data)" returns a list of headers. Writing out the row for "headers_list" works.
> >> >
> >> > The list comprehension "data" is to maintain the integrity of the column headers and the values for each new instance of the data structure (where the keys in the dictionary are the headers and values - row instances). The keys in this specific data structure are meant to check if there is a value instance, and if there is not - place an ''.
> >> >
> >> > def write_to_csv(results, headers):
> >> >
> >> > headers = construct_headers(get_just_xml_data)
> >> > results = construct_results(get_just_xml_data)
> >> > headers_list = list(headers)
> >> >
> >> > with open('real_csv_output.csv', 'wt') as f:
> >> > writer = csv.writer(f)
> >> > writer.writerow(headers_list)
> >> > for row in results:
> >> > data = [row.get(index, '') for index in results]
> >> > writer.writerow(data)
> >> >
> >> >
> >> >
> >> > However, when I run this, I receive this error:
> >> >
> >> > ---------------------------------------------------------------------------
> >> > TypeError Traceback (most recent call last)
> >> > <ipython-input-747-7746797fc9a5> in <module>()
> >> > ----> 1 write_to_csv(results, headers)
> >> >
> >> > <ipython-input-746-c822437eeaf0> in write_to_csv(results, headers)
> >> > 9 writer.writerow(headers_list)
> >> > 10 for item in results:
> >> > ---> 11 data = [item.get(index, '') for index in results]
> >> > 12 writer.writerow(data)
> >> >
> >> > <ipython-input-746-c822437eeaf0> in <listcomp>(.0)
> >> > 9 writer.writerow(headers_list)
> >> > 10 for item in results:
> >> > ---> 11 data = [item.get(index, '') for index in results]
> >> > 12 writer.writerow(data)
> >> >
> >> > TypeError: unhashable type: 'dict'
> >> >
> >> >
> >> > I have done some research, namely, the following:
> >> >
> >> > https://mail.python.org/pipermail//tutor/2011-November/086761.html
> >> >
> >> > http://stackoverflow.com/questions/27435798/unhashable-type-dict-type-error
> >> >
> >> > http://stackoverflow.com/questions/1957396/why-dict-objects-are-unhashable-in-python
> >> >
> >> > However, I am still perplexed by this error. Any feedback is welcomed. Thank you.
> >> >
> >> You're taking the index values from 'results' instead of 'headers'.
> >
> > Would you be able to elaborate on this? I partially understand what you mean. However, each dictionary (of results) has the same keys to map to (aka, headers when written out to CSV). I am wondering if you would be able to explain how the index is being used in this case?
> >
> In the list comprehension on line 11, you have "item.get(index, '')".
>
> What is 'index'?
>
> You have "for index in results" in the list comprehension, and 'results'
> is a list of dicts, therefore 'index' is a _dict_.
>
> That means that you're trying to look up an entry in the 'item' dict
> using a _dict_ as the key.
>
> Oh, and incidentally, line 12 should be indented to the same level as
> line 11.
Yes, as mentioned in my OP, please forgive the indentation formatting issues.
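To check that I follow the point about 'index', here is a minimal sketch of the loop iterating over the header names instead of over 'results'. The function name "write_rows", the "out" parameter and the sample data are hypothetical and only there to make the example self-contained; sorting the headers is my own assumption to get a stable column order.

```python
import csv
import io


def write_rows(results, headers, out):
    """Write a list of dicts to 'out' as CSV, one column per header."""
    headers_list = sorted(headers)  # fix the column order once
    writer = csv.writer(out)
    writer.writerow(headers_list)
    for row in results:
        # Each 'name' is a string (hashable), so it is a valid dict key.
        # Keys missing from a row become ''.
        data = [row.get(name, '') for name in headers_list]
        writer.writerow(data)


# Hypothetical sample data, shaped like the list of dicts in my OP.
sample = [
    {'AccountNumber': 'N', 'Amount': '10', 'Type': 'WriteLetters'},
    {'AccountNumber': 'Y', 'Amount': '2'},
]
buf = io.StringIO()
write_rows(sample, {'Type', 'Amount', 'AccountNumber'}, buf)
print(buf.getvalue())
```

With "for name in headers_list" the key passed to get() is a string, so the "unhashable type: 'dict'" error should not occur.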
I feel that I need to provide some context to avoid any confusion over my motivations for the choices I made.
My original task was to parse an XML data structure stored in a CSV file alongside other data types, and then add the elements back as headers and the text as row values. I went back to the drawing board and created a "results" list of dictionaries, where the keys have lists as values, using this:
def convert_list_to_dict(get_just_xml_data):
    d = {}
    for item in get_just_xml_data(get_all_data):
        for k, v in item.items():
            try:
                d[k].append(v)
            except KeyError:
                d[k] = [v]
    return d
This creates a dictionary that maps each XML tag to a list of its values, for example:
{
'Number1': ['0'],
'Number2': ['0'],
'Number3': ['0'],
'Number4': ['0'],
'Number5': ['0'],
'RepgenName': [None],
'RTpes': ['Execution', 'Letters'],
'RTID': ['3', '5']}
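For anyone who wants to reproduce the grouping step without my XML input, here is a minimal self-contained sketch. It assumes the parser yields one dict per XML element group; "group_values" and the sample input are hypothetical names, and dict.setdefault is used in place of the try/except above (it should behave the same way).

```python
def group_values(items):
    """Collect the values of every key across a sequence of dicts."""
    d = {}
    for item in items:
        for k, v in item.items():
            # Create the list on first sight of a key, then append.
            d.setdefault(k, []).append(v)
    return d


# Hypothetical input: one dict per parsed XML element group.
sample = [
    {'RTypes': 'Execution', 'RTID': '3', 'Number1': '0'},
    {'RTypes': 'Letters', 'RTID': '5'},
]
grouped = group_values(sample)
print(grouped)
```

Note that keys which appear in only some of the input dicts end up with shorter lists than the others.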
I then used this to create a "headers" set (to prevent duplicates from being added) and the list of dictionaries that I mentioned in my OP.
I achieve this via:
# just headers
def construct_headers(convert_list_to_dict):
    header = set()
    with open('real.csv', 'rU') as infile:
        reader = csv.DictReader(infile)
        for row in reader:
            xml_data = convert_list_to_dict(get_just_xml_data)  # get_just_xml_data(get_all_data)
            row.update(xml_data)
            header.update(row.keys())
    return header

# get all of the results
def construct_results(convert_list_to_dict):
    header = set()
    results = []
    with open('real.csv', 'rU') as infile:
        reader = csv.DictReader(infile)
        for row in reader:
            xml_data = convert_list_to_dict(get_just_xml_data)  # get_just_xml_data(get_all_data)
            # print(row)
            row.update(xml_data)
            # print(row)
            results.append(row)
            # print(results)
            header.update(row.keys())
    # print(type(results))
    return results
I guess I am using the headers list as originally written out. My initial thought is to write out only the values corresponding to each transaction. For example, given this data structure:
{
'Number1': ['0'],
'Number2': ['0'],
'Number3': ['0'],
'Number4': ['0'],
'Number5': ['0'],
'RPN': [None],
'RTypes': ['Execution', 'Letters'],
'RTID': ['3', '5']}
I would get a CSV like this:
Number1, Number2, Number3, Number4, Number5, RPN, RTypes, RTID
0, 0, 0, 0, 0, None, Execution, 3
None, None, None, None, None, None, Letters, 5
I am wondering how I would achieve this when the headers set is not sorted (should I sort it before writing this out?). Also, since I have millions of transactions, I want to make sure that the values for each header are placed in the correct sequence. Any guidance would be very helpful. Thanks.
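One approach I am considering, as a rough sketch only: sort the headers once into a list, then transpose the dictionary of lists into rows with itertools.zip_longest so that shorter lists are padded. "dict_of_lists_to_rows" is a hypothetical helper name, and padding with None is my assumption about the desired output.

```python
import csv
import io
from itertools import zip_longest


def dict_of_lists_to_rows(d, headers):
    """Turn {header: [values...]} into rows, padding short columns with None."""
    # One column per header, always in the order given by 'headers'.
    columns = [d.get(name, []) for name in headers]
    return [list(row) for row in zip_longest(*columns, fillvalue=None)]


xml_data = {
    'Number1': ['0'],
    'Number2': ['0'],
    'Number3': ['0'],
    'Number4': ['0'],
    'Number5': ['0'],
    'RPN': [None],
    'RTypes': ['Execution', 'Letters'],
    'RTID': ['3', '5'],
}

headers = sorted(xml_data)  # a fixed, repeatable column order
rows = dict_of_lists_to_rows(xml_data, headers)

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(headers)
writer.writerows(rows)
print(buf.getvalue())
```

Because the same sorted list drives both the header row and every data row, each value stays under its own header regardless of the set's iteration order. Sorting alphabetically puts RTID before RTypes, and csv.writer writes None as an empty field rather than the text "None".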