Unicode (Japanese) fonts in Images

Wed Mar 10 11:26:54 EST 2004

On Tue, 2004-03-09 at 18:38, Jeff Epler wrote:
> > > On Mon, 2004-03-08 at 23:12, Rich wrote:
> > >> kanji = u'\ufffe\ua295\uc07b
> On Tue, Mar 09, 2004 at 05:21:11PM +0000, Rich wrote:
> > I thought that saving them in notepad as unicode, then opening that in a 
> > hex editor and using the values was the way forward but obviously not :(
> 
> The file which looked like
>     FF FE A2 95 C0 7B
> is probably UTF-16 encoded, with a BOM (byte-order mark) at the beginning.
> 
> You can convert this into a Python unicode string like so:
>     >>> "\xff\xfe\xa2\x95\xc0\x7b".decode('utf-16')
>     u'\u95a2\u7bc0'
> On my system, printing this to my terminal shows a pair of Japanese-looking
> symbols.

When Windows talks about Unicode, it always means UTF-16. When Linux
talks about Unicode, it usually means UTF-8.

You have to be very careful about which encodings you use.
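
To make that concrete, here is a rough sketch using the bytes from the hex
dump quoted above. It assumes Python 3 (the quoted session uses Python 2
syntax), and the variable names are only for illustration. It compares the
correct UTF-16 decode, the result of reading each byte pair as a big-endian
code point, and the UTF-8 bytes for the same two characters.

```python
# Bytes from the hex dump quoted above: FF FE A2 95 C0 7B
raw = bytes.fromhex("FF FE A2 95 C0 7B")

# Correct: the 'utf-16' codec reads the BOM (FF FE = little-endian) and drops it.
text = raw.decode("utf-16")

# Mistake: treating every byte pair, BOM included, as a big-endian code point.
wrong = raw.decode("utf-16-be")

# The same two characters encoded as UTF-8 give a different byte sequence.
as_utf8 = text.encode("utf-8")

print(ascii(text))     # '\u95a2\u7bc0'
print(ascii(wrong))    # '\ufffe\ua295\uc07b'
print(as_utf8.hex())   # e996a2e7af80
```

The second result matches the escapes in the original kanji string, which
suggests the hex values were copied pairwise without swapping the bytes or
dropping the BOM.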

For example, if you save the text in Notepad as 'Unicode', as far as I
know it will be saved as UTF-16.
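
If you are not sure what an editor wrote, one option is to look at the
first bytes of the file before decoding. The sketch below assumes Python 3.
sniff_bom is a made-up helper, not a library function, and the temporary
file only stands in for a file saved from Notepad as 'Unicode'. It does not
handle UTF-32, and a file without a BOM returns None, so treat it as a
starting point only.

```python
import codecs
import os
import tempfile


def sniff_bom(data):
    """Hypothetical helper: guess a codec name from a leading BOM, or None."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"   # decodes UTF-8 and strips the BOM
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"      # this codec picks the byte order from the BOM
    return None


# Stand-in for a Notepad 'Unicode' file (assumed: UTF-16 with a BOM).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b"\xff\xfe\xa2\x95\xc0\x7b")

# Peek at the first bytes, then reopen as text with the guessed codec.
with open(path, "rb") as f:
    encoding = sniff_bom(f.read(4))

with open(path, encoding=encoding) as f:
    kanji = f.read()

os.remove(path)
print(encoding, ascii(kanji))
```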

gabor