Japanese (speaking) developer needed for a bit of regex magic

Sebastian basti at redtoad.de
Wed Apr 21 05:31:11 EDT 2010


> General advice with character sets in Python apply: always explicitly
> declare the encoding of input, then decode to Unicode interally as early
> as possible, and process all text that way. Only fix into an encoding
> when it's time to output.

Maybe I was too vague when describing my problem. As Chris correctly
guessed, I have a literal language problem.

> > All locales return error messages in English. Only the Japanese uses
> > Japanese which my regular expressions cannot handle at the moment.
>
> What exactly are you expecting to happen, and what exactly happens
> instead?

My regular expressions turn the Amazon error messages into Python
exceptions.

This works fine as long as they are in English: "??? is not a valid
value for BrowseNodeId. Please change this value and retry your
request.", for instance, will raise an InvalidParameterValue
exception. However, the Japanese version returns the error message "???
は、BrowseNodeIdの値として無効です。値を変更してから、再度リクエストを実行してください。" which will not be
successfully handled.

This renders the my module pretty much useless for Japanese users.

I'm was therefore wondering if someone with more knowledge of Japanese
than me can have a look at my expressions. Maybe the Japanese messages
are completely different...

I have a collection of sample messages here (all files *-jp-*.xml):
http://bitbucket.org/basti/python-amazon-product-api/src/tip/tests/2009-11-01/

Any help is appreciated!

Cheers,
Sebastian



More information about the Python-list mailing list