Problems with string and lists (searching and replaceing)

Thu Sep 25 06:24:03 EDT 2003

jblazi <jblazi at hotmail.com> wrote:

>I shall have a computer science class this year and I decided to use
>Python. It is my favourite language and I have used it for many years. Now
>I thought, one of our first "real" programs, in our pre-graphical and
>pre-class state, would be to write simple program that works like this:
>
>One of the pupils enters a word. It should be a valid German word
>consisting of five letters, for example 'abcde' (which is not a German
>word by the way).

That may come as a surprise to non-German speakers ;-)

>The the other player may enter a guess which must be a five letter word as
>well, for example 'xbxxx'. Then the system answers with the string '-*---'
>as the 'b' in 'xbxxx' was correct and at the right place as well.
>
>Had the second player entered 'xxxbx', the system had responded with
>'---.-', as the 'b' is correct but not correctly positioned.
>
>The second player must find out the original word.
>
>Now I did not see how to do it as simply as posible, that was the problem.

Part of the confusion arose because it was thought to be a simple
problem, but instead it proved to be harder than it seemed. I have
experienced this while playing with it and loosing track of what was
going on inside my code. Thanks for posting your algorithm, without
that as a reference I probably would have had wrong code, thinking the
problem was easily solved. (I'm not completely sure I got it right
*this time* even)

<tangent>

Another matter is whether Python makes programming simpler for newbies
or if it only complicates things for them by introducing a lot of high
level language constructs. This is reminiscent of discussions about
whether computer processors with reduced instruction set (RISC) are
better than processors with more -and more complex- instructions.

The argument of the RISC philosophers is that higher functions could
be build using the few basic instructions and since higher functions
are subject to market fluctuations (sometimes they are used a lot and
sometimes not) it is preferable to only keep the basic set and to
build from that. This way of thinking is not limited to processor
developers but can also be observed in higher languages that pride
themselves in being flexible and fast (e.g. Lisp).

Of course the complex instruction set advocates remind us that we need
to cater to the needs of the programmers and that it is better to
provide batteries in order to prevent people making their own
sublanguage each time a commonly needed procedure has to be coded.
This way we get a coherent community and readable sourcecode.

The problem gets to the rockbottom of decision making if one is
presented with the need to teach newbies how to program. Of course
newbies are better off learning only a few basic functions and to
develop their own solutions as they go. It is even not uncommon for
teachers to have to face the raw uninterestedness that newbies tend to
express for complex instructions that they do not see the advantages
of. Sometimes later it is seen that it was good to learn all that
stuff, but more often than not it proves to be useless since history
has proven that higher functions are subject to popular demand anyway.

Now being a member of a community of programmers that all via one way
or another have come to understand different coding requirements and
seeing that all in all Python delivers a lot of the things one will
reach anyway after hard years of reinventing the wheel each time, the
question can be asked whether this knowledge can be found only by
having done it the hard way, and -only after that- recognizing the
elegance of the solutions offered, or if it is also possible to
transfer these insights directly to untrained programmers.

One problem that arises is that for newbies things make no difference
that for experienced Pythoneers are considered different. For example
are zip and enumerate really more difficult than using range and len
for iterating ? A newbie couldn't care less since they're both new.
The same goes for using dict.get versus dict[i]. So it could well be
that *experienced* Pythoneers are the real newbies here.

At the eve of introducing new language features like the ternary
operator, iterator tools, metaclasses, backward iterators, generators
etcetera, this discussion becomes a bit heated sometimes, because
every new function that is accepted adds to the constant threat of
total redesign of all Python functions using only a few basic
instructions. Projects like Pypy that try to rebuild Python using the
higher level constructs only are like someone having found a last
crate of beer at a very late hour of a garden party. It's good for
some fun but everybody knows that the end is near.

There is just no way around rebuilding the basic structure from the
ground up sometimes and IMO it would be advisable to rebuild Pythons
functionality not using Pypy -although it's a nice project- but by
making very small basic building blocks -like bitfields- and instead
of having all functions inside one mega-dll (pythonxx.dll) it would be
better to have each function inside its own dll and to access these
dlls using Ctypes or something like it. 

This way it would be possible to construct a Python from the ground
up, and instead of having a new release each time it would be possible
to just update a single dll. For an analogy of what I mean see the way
Cygwin distributes updates. It would create huge compatibility
problems for sure, but the gains would be even greater because new
higher order functions could be created on the fly, and newbies and
Pythoneers would be working side by side, with the newbies having the
advantage.

</tangent>

Ok, back to reality, here's some code I produced about your specific
problem, you can decide for yourself whether newbies would like it
more than the script you provided (I changed your script minimally in
order to be able to run it from my interpreter). It depends a lot on
dict.get returning "None" when an item is not found and on having the
value "0" in a dictionary being evaluated as "False" just as well as
"None" and on being able to specify a default value for dict.get that
it can return if there is no corresponding key in the dictionary. It
can handle strings of different size.

Anton

def vergleiche_woerter(wort,eingabe):
    eing = list(eingabe)
    wo   = list(wort)
    ergebnis=list('_'*len(wort))

    # Suche zuerst nach Bullen
    for i in range(len(eing)):
        if eing[i] == wo[i]:
            ergebnis[i] = '*'
            eing[i]=wo[i] = None

    for i in range(len(eing)):
        if eing[i] == None: continue
        if eing[i] in wo:
            j = wo.index(eing[i])
            ergebnis[i] = '.'
            wo[j] = None

    return "".join(ergebnis)

def hint(word,guess):
    res,cnt = [],{}
    for w,g in zip(word,guess):
        if w == g:
            res.append('*')
        else: 
            res.append('_')
            cnt[w] = cnt.get(w,0)+1
    for i,g in enumerate(guess):
        if cnt.get(g):
            res[i] = '.'
            cnt[g] -= 1
    return "".join(res)

def test(): 
    a,b = 'aabbab','aaaaaa'
    print a,b
    print vergleiche_woerter(a,b)
    print hint(a,b)

if __name__=='__main__':
    test()