How to guess the language of a given text string?

Roman roman.bischoff at googlemail.com
Tue May 16 00:16:52 CEST 2006


Does anybody know an easy way (or tool) to guess the language of a
given text string?

e.g.
Feeding in "This is an example." --> should return "english" or its ISO code
Feeding in "Das ist ein Beispiel." --> should return "german" or its ISO code
Feeding in "Esto es un ejemplo." --> should return "spanish" or its ISO code

I would prefer something more lightweight than using nltk/corpus/...

It's OK if the success rate is only about 90% or so.
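
For illustration only, here is a minimal sketch of one lightweight approach
that might fit these constraints: count how many words of the input appear
in a small stopword list for each language and return the best-scoring
code. The stopword lists, the guess_language name, and the two-letter codes
used as return values are all assumptions made up for this sketch, not an
existing tool, and very short or mixed-language inputs may well be
misclassified.

```python
import re

# Hypothetical, hand-picked stopword lists (assumption for this sketch);
# a usable version would need much larger lists per language.
STOPWORDS = {
    "en": {"the", "is", "an", "a", "this", "and", "of", "to", "in", "it"},
    "de": {"das", "ist", "ein", "eine", "der", "die", "und", "nicht", "ich", "zu"},
    "es": {"esto", "es", "un", "una", "el", "la", "los", "y", "de", "que"},
}


def guess_language(text):
    """Return the code whose stopword list matches the most words, or None."""
    # Split into lowercase runs of letters (digits, underscores and
    # punctuation act as separators).
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Score each language by the number of input words found in its list.
    scores = {
        lang: sum(1 for word in words if word in stops)
        for lang, stops in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    # No stopword matched at all: refuse to guess.
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    for sample in ("This is an example.",
                   "Das ist ein Beispiel.",
                   "Esto es un ejemplo."):
        print(sample, "-->", guess_language(sample))
```

With these lists the three example sentences above come out as "en", "de"
and "es". Character n-gram statistics would probably be more robust than
word lists, at the cost of needing some training text per language.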

Roman




More information about the Python-list mailing list