[Tutor] Absolute newbie - Transliteration

David Rogers davidrogers@telus.net
Thu May 22 23:47:01 2003


Thank you, Mr. Lyckå, for your very helpful and detailed answer.  I'll
chew on it for a while and see what happens.


> I have a feeling, it might not be completely trivial to do this at all.
> But that depends...

I see how much complication one could get into.  My goal here is to
make it faster for myself to transcribe the text of songs, which are
usually not too long - that means (I think) that I can have the script
do the obvious stuff, and leave the "special" things to do myself.  Not
a perfect solution, but a drudgery-remover nonetheless.

> Your main problem is that little softing symbol (that looks a bit
> like 'b'). Somehow, you need to look ahead, to see if that's coming
> after the current consonant, or perhaps it's easier to handle that
> when it comes, and make a correction after the fact.

Can I put some letter-combinations that include the 'soft' symbol at
the beginning of my dictionary, and have them evaluated first, thus
bypassing the single-letter entries that come later?  Or does a
dictionary work in non-sequential order?
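
On the ordering question: a Python dictionary is looked up by key, not by
position, so where an entry sits in the dictionary does not decide which
one "wins".  One possible way to get the effect described above is to have
the script try the longer letter-combination first at each position and
fall back to single letters.  The following is only a minimal sketch,
assuming a current Python 3; the table entries and the name transliterate
are made up for illustration and are not a real transliteration scheme.

```python
# Hypothetical mapping, for illustration only (not a real standard).
TABLE = {
    "ль": "l'",   # consonant + soft sign, treated as one unit
    "ть": "t'",
    "л": "l",
    "т": "t",
    "е": "e",
    "с": "s",
    "о": "o",
}

def transliterate(text, table=TABLE):
    """Greedy longest-match: try the longest keys first at each position."""
    longest = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(text):
        for size in range(longest, 0, -1):
            chunk = text[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += len(chunk)
                break
        else:
            # Unknown character: pass it through unchanged for manual fixing.
            out.append(text[i])
            i += 1
    return "".join(out)

print(transliterate("соль"))
```

Anything the table does not know about is left as-is, which fits the plan
of fixing the "special" cases by hand afterward.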

> I think most other languages are much, much harder than Russian. :(
>
> English is hopeless. Laugh, Garage, Women... Swedish is fairly hopeless
> as well.

Clearly, I'm only going to be able to use this on languages I already
know at least a little, so that I can correct the results afterward.
(btw, I think Laugh Garage Women would make an excellent name for a
band...)

> I think you realize by now (if not before) that the amount of shared
> code for a thing like this is fairly small. From Russian seems to be
> truly trivial compared to transliteration from most western European
> languages. For English, you would need to build in a major understanding
> of the language. I don't know if the information you need to include can
> be described in a much shorter format than the output you would generate
> from a really big word list. And if that's the case, it's obviously rather
> futile... I assume there is linguistic research done in that sector though.
> Danny Yoo usually knows these things...

I think you're right about the general futility of a project like this,
for use by real translators or anything like that.  For my little
project of transcribing song texts to make them easier for non-native
speakers of those languages, I think it will save me some time, since I
only have to do the simple stuff once, in the dictionary, and can then
concentrate on fixing the exceptions.

For me, not knowing how to use Python yet, the "shared code" amounts to
(1) seeing examples of the possible dictionary formats, and (2) samples
of the incantations required to get stuff back out of them.    :-)
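
For reference, the two items above might look roughly like this.  It is a
small sketch, assuming a current Python 3; the entries and the variable
names (letters, pairs, word) are invented for illustration.

```python
# Hypothetical sample entries, for illustration only.
letters = {"д": "d", "а": "a", "н": "n"}

# The same kind of table built from two parallel strings.
pairs = dict(zip("дан", "dan"))

print(letters["д"])            # direct lookup; raises KeyError if the key is missing
print(letters.get("ю", "?"))   # lookup with a fallback value for missing keys
print("а" in letters)          # membership test

# Replace each known character, leaving unknown ones untouched.
word = "".join(letters.get(ch, ch) for ch in "да, да")
print(word)
```

Using get with the character itself as the fallback keeps punctuation and
unmapped letters in place, so the exceptions stay visible for hand editing.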

You've given me that and much more besides, and now it's time for me to
experiment and see if I can get it to work.  I'll post again with
details if I get a half-decent result.


Again, thank you very much.
David