[Tutor] The Python way and two dimensional lists

Tue Dec 7 23:43:33 EST 2021

On 08/12/2021 13.31, Phil wrote:
> On 7/12/21 20:34, Alan Gauld via Tutor wrote:
> 
> I previously mentioned that this current sudoku solver project is one
> that I've translated from a C++ project that I wrote many years ago. It
> does, as it stands, solve all but the more difficult puzzles. I've added
> some extra solving strategies over the past couple of months and I'm
> attempting to make the code, even though it works, more understandable.

Which indicates I have 'fallen-down on the job' because the code I
threw-together only assures a 'move' - it doesn't 'design' the next
move/step towards solution!
(further comments, to follow)

> Visualising and manipulating objects in space is one of my many failings
> and one that cost me a job at an IBM interview.

This is a surprise, because back in the ?good, old, days (when our Stone
Age computers ran on dino-ary code), we spent a lot of time drawing
diagrams. In the beginning we were banned from writing code unless we
had first flowchart-ed a solution. Later we moved to DFDs (Data Flow
Diagrams) and DSDs (Data Structure Diagrams) etc.

Yes, the world has moved-on, and many don't feel they are 'working'
unless they're writing code. (wrong!) That doesn't mean that we 'silver
surfers' have moved with it! However, before some young-buck jumps-in to
prove superiority, it actually gives us an advantage, because we can
pick-and-choose between approaches and use the one which works best for
us - instead of 'the one' that 'everyone' uses! Suitability beats
'trending', all day long!

I laugh when people ask me to demonstrate problem-solving
("whiteboarding" a problem/solution), usually with the expectation that
this will present some sort of 'challenge' (like the fear of
public-speaking?); because that's exactly what I do, always, even
something 'small' like this, and even if I have to use the proverbial
paper-napkin or 'back of a post-card' ("no correspondence will be
entered into").

Cognitive Psychology, my research area (yes, that's two counts of
'weird' - at least), has shown that people who write notes, or even
doodle, during a lecture, will remember 'more'. An irony is that many
write notes and then don't need to refer to them later - which becomes
an error when 'logic' says: don't bother making the notes then!

Perhaps you might try sketching a few rows and columns to see if the
visual (and labelling!) helps, particularly when it comes to the problem
of extracting columnar information from what appears to be horizontal
data-structures.

>> I believe you are building a model of a Sudoku game?
>> But from your description it's not clear what data
>> structure you are using. Please just show it!
>>
> The first list of lists represents the board.
> 
>         self.solution = [[None] * self.num_cols for _ in
> range(self.num_rows)]
> 
> The second list of lists represents the Tkinter GUI where the user can
> enter a puzzle to be solved. The programme solves the puzzle, not the user.
> 
>         self.entry_grid = [[None] * self.num_cols for _ in
> range(self.num_rows)]

+1

(refer back to @Alan's advice about Separation of Concerns, and mine
about different ways of 'seeing'/handling the same information)

>> If it works it is "correct" for some definition of correct.
> Both working correctly and being understandable is the issue here.

You've had such a fire-hose of information thrown at you, it'll take
time - and much re-reading and 'reading around', to absorb it all.
Perhaps some slowing down to 'walk before we can run' is in-order?

Much of the advice is related to simplification and solving a series of
'lower-level' problems rather than trying to do 'too much' in one go.
(this echoed and echoing...)

>> But personally here is how I would tackle your problem.
>>
>> I'd just use a single list of 81 cells.
> I'll think about this. As dn pointed out, knowing where the line endings
> are could be a complication. I'll think about this and helper functions.
> I'm a little reluctant to move from the model that I have because I
> understand how it represents the board, I'm just having a problem with
> manipulating the board grid to extract the columns. I don't have a
> problem with the rows. I will experiment with the class methods provided
> by Dennis, it may persuade me to change from a list of lists to a snake
> list as dn calls it.
>> I'd then write helper functions to extract the sublists I needed:
>>
>> def get_cell(table, row, col)
>>
>> def get_row(table, row_num)
>>
>> def get_col(table, col_num)
>>
>> def get_box(table, row_num, col_num)  # or just box_num?
> 
> I don't have a function that just gets a column. Column extraction is
> performed as part of other functions. I know that a function should do
> just one job, so I'll work on that. I have started a function that gets
> the box number but I cannot see, at the moment, how to integrate it with
> the rest of my programme.

Neither could I, but then 'solving' wasn't in the spec.

With regard to these 'helper functions' (or methods), three of us have
now suggested the exact same thing. (I'll be billing both of them for
"stealing" my idea...)

Because selecting a column of data will be required at various stages
within the program[me], doesn't that SHOUT: "make me a function"? If
it's only a few lines of code (see below), that's not a problem. A
function (once tested and proven) is one less 'problem' to consider!

(conversely, a function with only one line of code is possibly NOT a
valid candidate) Thinking 'small' is probably a virtue, at this stage of
learning.

With the two-list idea, finding columnar-data is trivial - which column,
ie which list, do you want? EOJ!

With the 2D dicts (that is a dict of dicts, not a dict of lists, or
v-v!), what is required is to 'hold' the row-index fixed, and iterate
the column-index. I'd think this very familiar from the approaches
required/imposed/implemented in earlier languages, eg:

    def check_col( self, col: int, value: int, ) -> None:
        """Assure no duplicate values."""
        column = [ self.board[ row ][ col ]
                   for row in self.board_dimensions
                   ]
        if value in column:
            raise ValueError( "No duplicate values." )

Notice that because my partial-solution/illustration can enjoy the
relative-luxury of not needing a transition between algorithm
data-structures and GUI-structures, the only time column-data is needed
is during the duplicate-check.

In your situation, it would be better to split the above function into
halves, perhaps something like (untested):

    def get_column( self, col: int, ) -> list[ int ]:
        """Extract nominated column-data from the board."""
        return [ self.board[ row ][ col ]
                   for row in self.board_dimensions
                   ]

    def check_col( self, col: int, value: int, ) -> None:
        """Assure no duplicate values."""
        column = self.get_column( col )
        if value in column:
            raise ValueError( "No duplicate values." )

If the list-comprehension (in the return) is more complex than
preferred, 'unwind' it to a multi-line for-loop, appending one item per
loop, with preparatory list declaration, and use that list as the
return-value.

Now, get_column() can be used wherever needed, from multiple points in
the code, etc.

The solution for the 'snake' is to use slices (and 'striding') - by
definition all of the values which constitute one 'column' must be nine
cells apart from each-other in the 1D 'snake'. Thus, row = 0, column = 0
is the zero-th cell, and row = 1, column = 0 will have a snake-index of
9, ie 0 + 9. Proceeding: row = 2, column = 0 is nine cells further
'along', ie 0 + 9 + 9. Pretty soon, you'll notice that instead of adding
9 each time, we could multiply by the number of rows 'down'!

    def get_col( self, col_index: int, ) -> None:
        """Return single column from board."""
        first_cell_index = col_index % 9
        return [ self.board[ cell_index ]
                 for cell_index in range( first_cell_index, 81, 9 )
                 ]

The reasons for the "remainder", is the presumption that the column
could be defined as that above-and-below any cell on the board, cf a
column-number. Once the nominated-cell's column is ascertained, the
column-number becomes the start of the 'stride'.
(81 being the length of the snaking tableau, and 9 being the width of
each row)

As you can see, in both 'solutions', a 'get-column' "helper function"
furnishes utility!

Also, here's-hoping that one-or-the-other will help to make the
technique 'click' in your mind.
(no 'click' is perhaps not a snooty, cognitive-psych term - but who will
understand these days if I say "when the penny drops"? Maybe an "ahah
moment"!)

>> The next problem (based on previous posts) is that your
>> cells are sets which are immutable, so you need to:
>> 1) create another set of functions for writing back changes
>> to a cell - tricky...
> 
> I'm not sure that I understand why this is difficult. This is how I
> update a cell:
> 
> self.solution[row][col] = set(num)

Why a set?

How to be sure that the cell is not already 'occupied' by part of the
puzzle-definition?

>> OR
>> 2) use lists instead of sets because they are mutable.
> I had originally used lists instead of sets as a result of my
> translation from arrays to lists. The use of sets was an experiment that
> didn't cause any harm so it's use stayed.

Be careful! A set has no (guarantee of) sequence.

Sequence is important when it comes to the tableau.

Sets are useful when it comes to ensuring that a digit isn't used twice
in one row/column/region...

(as below)
= tools for the job

>> OR
>> 3) Use OOP and make your cells objects with methods
>> to add/remove values and using a set internally to
>> check uniqueness etc. This would be my choice.
> I'll give this some thought as well.
>> Personally, I'd use OOP for the table too, and make the
>> functions above methods of the table. So you would have
>> a table object with 81 mutable cells.
>>
>> But you haven't mentioned OOP up to now so that may be
>> an extra learning curve you don't want to deal with.
> 
> A grid class would probably be a good idea, more to think about.

> I pride myself in not being a rigid thinker but I suppose I am, other wise I would have completed this project by now.

You have been considering so many alternatives, started and re-started,
are learning Python, and trying to 'remember' old-code. Confusion reigns!

> I'm not sure about the idea of two tables, more complications perhaps?

"Two" sounds more complicated than "one'. However, it will have the
advantage of forcing you to concentrate on one (little) thing at a time.

If you take-to-heart the suggestion of 'helper functions', you can do
everything (easier) that needs to be done with rows first. Thereafter,
you can put such thoughts out of your mind, and concentrate on similar
for columns. Because the two function independently at all times, except
when data is (twice) added to the tableau - only one is necessary to
assess completeness/success/transfer to output, etc; it does not make
'the work' more complicated.

However, success at this is going to require the process of sitting-down
and planning/sketching/charting. When you know how to put individual
data-points into the row/column/snake/tableau, how to pull-out data from
specific cells, or entire row/column constructs, and such-like; then you
have the 'little problems' which can be individually solved, coded
one-at-a-time AND you can test each as you go, eg if column-n is
requested the value-returned will be of x-type and contain the following
values... This enables you to ensure that each helper function works
(properly!) BEFORE incorporating it/the columnar-data retrieved, into a
wider part of the algorithm!

eg if the final 'solution' is presented with whole rows of 1s and rows
of 2s, there is plainly a fault - but where? It is unlikely to be in the
GUI, or the solver. Almost certainly it will 'start' at a lower-level,
in - (wait for it) - a helper-function. So, testing each 'little bit'
first, is worthwhile, because it adds confidence as the snow-ball starts
to grow larger, and at the same time as that happens, reduces the number
of things to be kept in-mind...

> This is a rather easy puzzle to solve, my programme solved it in 3 passes. I'm not skiting, just pointing out that I have achieved something.

...and well done! Nothing succeeds like success!

Yes, I think it was described as a 'medium' or maybe 'hard' puzzle. No
matter, it was only needed to test those (dare I say it again?) 'helper
functions'.

> Are the several descriptions sufficient to start you off?
> 
> Yes, thank you and keep your code safe, I may contact you.

Because you are treating this as a learning exercise (and after
reviewing it again), I'm still comfortable recommending the two-list
approach. Once solved, you won't keep it (shouldn't keep it?) but it
might help with solving the non-Python problems you have outlined.

Once you have a 'model' straightened-out in your mind (confidence that
you know where to go and how to get there), then I'd suggest
using/expanding/amending that set of helper-functions to implement the
2D dictionary approach.

Once you're really feeling like you've 'earned your stripes', then have
a go at the 'snake'.

However, if you prefer the more mathematical type of problem/algorithm,
if the first step and/or diagramming/charting gives you confidence, then
maybe snake-second.

An algorithmic thinker will prefer the 'snake', because (s)he has
greater confidence in 'playing with numbers' and an ability to 'see the
patterns'.

The 2D approach has the advantage that one is more easily able to 'see'
what is what, and which is where.

That code is sitting in my GitLab instance - but remember, its purpose
is to illustrate 'the basics' and compare approaches. It does not 'do'
half of what your system apparently intends!

-- 
Regards =dn

-- 
Regards,
=dn