[Tutor] The Python way and two dimensional lists

dn PyTutor at DancesWithMice.info
Tue Dec 7 16:47:12 EST 2021

```On 07/12/2021 22.13, Phil wrote:
>
> On 7/12/21 19:39, David wrote:
>>
>> I don't know what it is that you aren't understanding.
>> But you seem to be unclear so I have written a quick
>> demo of 2D list indexing below for you.
>
> Thank you David. I will study and run your code trough a debugger.
>
> While I was typing my previous e-mail I thought of a solution.
>
> col = [row[c] for row in self.solution] gives me the entire column. So
> what I need is to extract the first three rows for the top three boxes.
> Then for the next row of boxes I need the middle section of the column
> (rows 3, 4 and 5) and for the last row of boxes I need the last three
> rows of the column. At the moment I don't know how I might do this but
> that's a project for tomorrow.

This may represent a significant break-through! You've started thinking
about how the data-structure and the algorithm must combine, in order to
produce an effective solution.

A random selection from a "Sudoku rules" web-search yields the
following: "The object of Sudoku is to fill the ... empty cells with
numbers between 1 and 9 (1 number only in each cell) according the
following guidelines: ... that a number should appear only once on each
row, column and a region." A "region" is what has been called a "box"

Thinking about it, the first two "guidelines" obviate the third. When we
print/display a Sudoku tableau, it is easier on the eyes if the nine
regions are highlighted. However, do they add anything to the need to
check 'the rules' - indeed, is it an easy check?

So, when it comes to evaluating a Sudoku 'solution', each column in the
tableau must feature all of the nine digits, and each row of the
tableau, the same. QED(?) These rules are 1D in nature - taken in-turn.

However, shifting our view slightly, when we look at a Sudoku puzzle we
see the tableau as a matrix, and in two dimensions.

Notice how the evaluations (above) are 1D, but the visual 'picture' is
2D! Thus, if we can dispense with the 'region rule', the only 'use' we
have for regions is for back-end display formatting.

When you separate the two ways to look at a Sudoku, life becomes a lot
easier. We can look at it one 'way' when running the algorithm, and any
"stepwise decomposition" - break large problem into multiple smaller ones!)

@Alan has made the suggestion of using a single list to represent the
tableau (I refer to this a "snake" - rather than a separate construct,
the second row becomes (something like) an extension of the first). The
2D complexity is reduced to 1D - however, keeping track of where rows
begin/end adds (some/different) complexity back in.

The algorithm has been partially-developed by @wulfraed. This is likely
to be a very storage-efficient method, but some may find the 'snake' a
little awkward to visualise, and having to manipulate the math
(especially zero-based) will challenge others.

Contrarily, your own first foray utilised a 2D approach, which is much
easier to visualise. Probably because that's the way Sudoku puzzles are
presented!

Because lists are (typically) built by appending, the techniques which
we may have applied in other languages where arrays are 'dimensioned' at
compile-time, can't be carried-across directly. This is why folk will
talk about writing Python 'pythonically', ie using a Python idiom,
rather than trying to literally translate from a previous-language to
Python, 'word for word'.

NB this is a little different from the confusion of which dimension
represents rows, and which 'holds' the columns (an issue of
"consistency") - from which apparently, you've been suffering/dug
yourself out.

For this reason, I would use 2D/nested dict[ionarie]s. This has the
For whatever reason, these could run 1~9, rather than Python's 0~8 (even
'Row1', 'Row2', and thus distinct from 'Col1' - which may further dispel
earlier confusion/promote consistent use. Such may also assist
comprehension and/or building the logic.

I guess that if we really want to go 'mathematically-nuts' we could
consider the opening Sudoku tableau to be a "sparse matrix". Such
approaches typically store the data as a triplet - (row, column, value).
Thus, each time a row must be evaluated (eg to see if it obeys that
rule), the entire list of triplets must be scanned - similarly for each
column. Added to that, a completed tableau is the exact opposite of a
'sparse' matrix. So, I'll say no more about that idea, and claim its
consideration as 'making progress' - if only because we can concentrate
our investigations elsewhere!

There is another idea that you may like to consider - and perhaps as a
next step from yesterday's report mental-breakthrough/learning. There
are two 'rules'. Why not have two data-structures: one to facilitate the
checking of one rule, and the other, um, the other?

In the case of a Sudoku tableau, our 'data lake' is exactly 81 integers
in size. [pauses to wipe sweat from eyes] So, if we were to duplicate
the data, it's not particularly "expensive".

The idea here, is to maintain a list of nine lists which represent the
nine rows of the tableau. The second pile of nine lists represents the
columns. Yes, when you boil-it-down, the data is repeated - each 'cell'
appears once in a row list and again in a column list - and when a new
value is added to the tableau, it must be added in two places. Ouch!

Computer people will tell you that this 'repetition' is a "code smell" -
a recipe for future-faults (and that assessment is completely correct -
see "DRY = Do not Repeat Yourself Principle").

However, this 'separation' is another way to 'decompose' the problem of
dealing with the tableau into a single 'view' - think horizontally first
(your choice) and prove that, then switch... (after all, this is a
method many use to solve a puzzle!) It is an inefficient solution, but
offers a learning-path and an opportunity to figure-out a simpler
version of 2D lists, and builds upon your 'consistency' determination.
(maybe - YMMV!)

So, after theorising, it was time to 'have some fun'. To compare the
'snake' with a 2D dict solution, I prepared some code and sufficient
tests which progress from:

An initial tableau:

-------------------------------------
|   | 2 | 7 |   | 8 | 9 | 6 |   |   |
-------------------------------------
|   |   |   |   |   | 1 | 7 |   | 2 |
-------------------------------------
| 9 |   |   |   |   |   |   |   |   |
-------------------------------------
|   |   |   |   | 5 |   |   |   | 7 |
-------------------------------------
| 1 | 7 |   |   |   |   |   | 2 | 6 |
-------------------------------------
| 8 |   |   |   | 4 |   |   |   |   |
-------------------------------------
|   |   |   |   |   |   |   |   | 4 |
-------------------------------------
| 2 |   | 8 | 5 |   |   |   |   |   |
-------------------------------------
|   |   | 1 | 8 | 6 |   | 2 | 3 |   |
-------------------------------------

To 'proving' the first row and the first column, and assuring the
handling of data-entry errors:

-------------------------------------
| 3 | 2 | 7 | 4 | 8 | 9 | 6 | 5 | 1 |
-------------------------------------
| 5 |   |   |   |   | 1 | 7 |   | 2 |
-------------------------------------
| 9 |   |   |   |   |   |   |   |   |
-------------------------------------
| 4 |   |   |   | 5 |   |   |   | 7 |
-------------------------------------
| 1 | 7 |   |   |   |   |   | 2 | 6 |
-------------------------------------
| 8 |   |   |   | 4 |   |   |   |   |
-------------------------------------
| 6 |   |   |   |   |   |   |   | 4 |
-------------------------------------
| 2 |   | 8 | 5 |   |   |   |   |   |
-------------------------------------
| 7 |   | 1 | 8 | 6 |   | 2 | 3 |   |
-------------------------------------

NB due apologies - text-only email will not carry the visual information
that the puzzle-values, as-provided, are emboldened; whereas the
user-entries are not.
(which rather ruins the effect)

This emphasises another important question, which IIRC you have already
struck: the realisation that a cell must be in one of three 'states': no
value, an initial-tableau value (as forms the puzzle, and which may not
be changed), and a user-entered value (which may be changed). Thus the
question: how to represent the three 'states'?

Once again, I counsel that the needs/best way to implement the methods
and data-structures implementing the algorithm, may be quite different
from the way the puzzle is 'output'. That's OK - you can perform
necessary translation as part of the output-routine (eg at the same time
as the row and column delimiting lines are added - emboldened around
each box/region (which I didn't bother to do)).

After developing those two 'sample-solutions', I experimented with the
two-lists method, to see if (a) it would work, and (b) to attempt to
assess if it might be an easier learning-approach. So, there are now
three classes, identical in function but different in data-structure and
algorithm.

Each initialises the board/tableau, accepts input (either to set-up the
puzzle, or from the user attempting solution), checks that
entered-values are legal, checks that the board is legal, and displays a
rather Q&D board (as above).

It does not actually play a game (there's no front-end, at all), nor
will it start ringing the bells or flashing lights, once the tableau is
(correctly and completely) filled. Purely the mechanics/the algorithms!

The total weighs-in at about 400-lines. So, I wouldn't be thanked for
posting it to the list! If you (or you, gentle (and patient) reader)