[Tutor] Debugging a sort error.

Sun Jan 13 04:55:06 EST 2019

Everyone,

I did find the issue. When looking at the output in a spreadsheet, I saw that the code I was using to extract the data was inserting floats into the description dictionary. So I have gone back to the drawing board. Extracting by column made it difficult to achieve what I want, so I am now extracting by row, and the code is greatly simplified.

Sorry about the data example. As I am dealing with personal information, I don't want to share the original data. I will look at my editor to see if it is able to change the indent.

Note: I am a Vision Impaired (blind) person learning to code in Python. Thus indents don't really matter to me. 😊 But I will change the indents to make it easier to read.

Thanks for the test, this will help greatly.

-----Original Message-----
From: Cameron Simpson <cs at cskk.id.au> 
Sent: Sunday, 13 January 2019 8:12 PM
To: mhysnm1964 at gmail.com
Cc: Tutor at python.org
Subject: Re: [Tutor] Debugging a sort error.

Discussion inline below.

On 13Jan2019 13:16, mhysnm1964 at gmail.com <mhysnm1964 at gmail.com> wrote:
>I am hoping someone can help with the below error using Python3.5 in 
>the Windows 10 bash environment. I found the below link which I am not 
>sure if this is related to the issue or not. As I don't fully understand the answer.
>
>https://github.com/SethMMorton/natsort/issues/7

I'm not sure that URL is very relevant, except to the extent that it points out that Python 3 issues an error when comparing incomparable types. In Python 2 this problem could go unnoticed, and that just leads to problems later, much farther from the source of the issue.
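A minimal sketch of that behaviour, using made-up values rather than your data: sorting a list that mixes a float with strings raises TypeError in Python 3. The exact wording of the message varies between Python versions.

```python
# Hypothetical sample values; the float stands in for a cell read as a number.
mixed = ["EFTPOS WOOLWORTHS", 1294.0, "withdrawal Hudson street"]

try:
    mixed.sort()
    error = None
except TypeError as exc:
    # Python 3 refuses to order a float against a str.
    error = exc

print("sort failed:", error)
```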

>Issue, following error is generated after trying to sort a list of strings.
>
>description.sort()
>TypeError: unorderable types: float() < str()
>
>Each list items (elements) contain a mixture of alpha chars, numbers, 
>punctuation chars like / and are in a string type. Below is an example 
>extract of the data from the list.
>
>['Description', 'EFTPOS WOOLWORTHS      1294     ", "withdrawal Hudson
>street 3219"]

The error message says that some of these values are not strings. One at least is a float.

My expectation is that the openpyxl module is reading a floating point value into your description array. This might be openpyxl being too clever, or it might (more likely IMO) be Excel turning something that looked like a float into a float. Spreadsheets can be ... helpful like that.

>There is over 2000 such entries. This used to work and now doesn't.  

You'll need to examine the values. But I see that you're trying to do this. I've snipped the data loading phase. Here:

>description = data['Description']
>for i in description:
>  if not str(i):
>    print("not a string")

This is not a valid check that "i" is a string. That expression:

  str(i)

tries to convert "i" into a string (via its __str__ method). Most objects have such a method, and str() of a float is the textual representation of the float. So the if statement doesn't test what you want to test. Try this:

  if not isinstance(i, str):
    print("not a string:", i, type(i))
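To see the difference between the two checks, here is a small sketch run over invented sample values (not your data). The list contents and the variable names flagged_by_str and non_strings are assumptions made up for illustration.

```python
# Hypothetical sample values standing in for the description column.
description = ["EFTPOS WOOLWORTHS 1294", 3219.0, "withdrawal Hudson street 3219"]

# str() of a float is a non-empty string, so "not str(i)" is False for every item here.
flagged_by_str = [i for i in description if not str(i)]

# isinstance() checks the actual type; enumerate() supplies the position in the list.
non_strings = [(n, i, type(i).__name__)
               for n, i in enumerate(description)
               if not isinstance(i, str)]

for n, i, type_name in non_strings:
    print("not a string at index", n, ":", repr(i), type_name)
```

The str() based check flags nothing, while the isinstance() check reports the float and where it sits in the list.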

>description.sort()
>I am suspecting it is something to do with the data but cannot track 
>down the cause. Any suggestions on how to debug this?

Your exception is raised here, but as you suspect, you want to inspect the types of the description values first.

If the description column does contain a float in the original data then you could convert it to a string first! Note that this may not match visually what was in the spreadsheet. (BTW, your cited code never fills out the description list, so it cannot be current.)
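A rough sketch of that conversion, again with invented values and an assumed name as_strings. It is one possible approach, not necessarily the right one for your data.

```python
# Hypothetical sample values; 1294.0 stands in for a cell that arrived as a float.
description = ["withdrawal Hudson street 3219", 1294.0, "EFTPOS WOOLWORTHS"]

# Convert every value to its string form so the list holds only str, then sort.
as_strings = [str(d) for d in description]
as_strings.sort()

print(as_strings)
```

Here the float becomes "1294.0", which is the kind of visual mismatch to expect if the spreadsheet displayed the cell as 1294.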

But first, find out what's wrong. Try the type test I suggest and see how far you get.

Cheers,
Cameron Simpson <cs at cskk.id.au>