Problem with list.insert
sjmachin at lexicon.net
Sat Aug 30 00:39:02 CEST 2008
On Aug 29, 5:10 pm, SUBHABRATA <subhabrata.i... at hotmail.com> wrote:
> Dear group,
> Thanx for your idea to use dictionary instead of a list. Your code is
> more or less, OK, some problems are there, I'll debug them. Well, I
> feel the insert problem is coming because of the Hindi thing.
It's nothing to do with the Hindi thing. Quite simply, you are
inserting into the list over which you are iterating; this is the
"a16" in the first and last lines in the following snippet from your
code. The result of doing such a thing (in general, mutating a
container that is being iterated over) is not defined and can cause
all sorts of problems. It can be avoided by iterating over a copy of
the container that you want to change. However, I suggest that you
seriously look at what you are actually trying to achieve, and rewrite
your code accordingly.
    for word in a16:
        #MATCHING WITH GIVEN STRING
        print "The word is found in the Source String"
        #INSERTING IN THE LIST OF TARGET STRING
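To illustrate the "iterate over a copy" workaround mentioned above, here is a minimal sketch. The list contents and the name "words" are made up for the example (it stands in for a16); it is not your actual data or logic.

```python
# Hypothetical data; "words" stands in for a16 in the original code.
words = ["a", "b", "c"]

# words[:] makes a shallow copy, so the loop walks the copy while
# the insert changes the original list.
for word in words[:]:
    if word == "b":
        # Insert an upper-cased version just before the matched word.
        words.insert(words.index(word), word.upper())

print(words)
```

Looping over "words" itself here would keep hitting "b" again after every insert, because the insert shifts it one position to the right each time.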
This code has several problems:
    if a8 in a5:
    elif a8 not in a5:
(1) If you ever execute that print statement, it means that the end of
the universe is nigh -- throw away the else part and replace "elif a8
not in a5" with "else".
(2) The statement "found.extend(not_found)" is emitting a very foul
aroma. Your "found" list ends up with the translated words followed by
the untranslated words -- this is not very useful and you then have to
write some weird code to try to untangle it; just build your desired
output as you step through the words to be translated.
(3) Your "dictionary" is implemented as a string of the whole
dictionary contents -- you are linearly searching a long string for
each input word. You should load your dictionary file into a Python
dictionary, and load it *once* at the start of your program, not once
per input sentence.
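Points (2) and (3) together might look something like the sketch below. The function names, the sample entries and the "source-word target-word" line format are all assumptions for illustration; I don't know what your dictionary file actually looks like.

```python
def load_dictionary(lines):
    """Build a dict from lines of the (assumed) form 'source target'."""
    table = {}
    for line in lines:
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            table[parts[0]] = parts[1].strip()
    return table


def translate(sentence, table):
    """Translate word by word, keeping the original word order."""
    output = []
    for word in sentence.split():
        # Untranslated words are kept in place instead of being
        # collected in a separate list and appended at the end.
        output.append(table.get(word, word))
    return " ".join(output)


# Load once at the start; a real program would pass an open file here.
lexicon = load_dictionary(["cat billi\n", "water pani\n"])
result = translate("the cat drinks water", lexicon)
print(result)
```

Each lookup is then a single hash-table probe rather than a scan of one long string, and the output is built in the right order as you go.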
> And Python2.5 is supporting Hindi quite fluently.
Python supports any 8-bit encoding to the extent that the platform's
console can display the characters correctly. What is the '\xe0'? The
PC-ISCII ATR character?