Unicode drives me crazy...

Tim Golden tim.golden at viacom-outdoor.co.uk
Mon Jul 4 14:02:30 CEST 2005

[fowlertrainer at citromail.hu]

| I want to get the WMI infos from Windows machines.
| I use Py from HU (iso-8859-2) charset.

OK, there are people better placed than I to explain
about Unicode. Check out the following article, for


but in short, you need to understand that WMI hands
you back a unicode object -- not a string, a unicode
object. If you want to write that out to a file, you
must write out a string representation of it. And
that string representation must use one of the
standard encodings (or you could invent your own, I
suppose, but why bother?).
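To make the unicode-object-versus-string distinction concrete, here is a minimal sketch of the encoding step on its own. The caption is a made-up value standing in for whatever WMI might hand back, not real WMI output, and the variable names are only for illustration.

```python
# Hypothetical caption, standing in for a value WMI might hand back.
caption = u"B\u0151v\u00edtett port (COM1)"

# Encoding turns the unicode object into a string of bytes.
encoded = caption.encode("iso-8859-2")

# iso-8859-2 is a single-byte encoding: one byte per character.
same_length = len(encoded) == len(caption)

# Plain ascii has no slot for the accented letters, so it refuses.
try:
    caption.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(repr(encoded))
```

The point is only that the encoding has to be chosen explicitly, and that it must be one which can represent the characters you actually have.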

One way of doing this would be:

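import codecs
import wmi

# codecs.open encodes every unicode object written to f as iso-8859-2
f = codecs.open ("c:/temp/info.txt", "w", encoding="iso-8859-2")

c = wmi.WMI ()
for port in c.Win32_SerialPort ():
  # Caption and DeviceID come back from WMI as unicode objects
  f.write ("Caption = %s\n" % port.Caption)
  f.write ("DeviceID = %s\n" % port.DeviceID)
f.close ()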

You now have a text file -- info.txt -- which holds
string representations of several unicode objects. If
the characters all fall within the common A-Z/0-9
range, it will look just the same as an old-fashioned
ascii file. If the original data included (presumably)
Hungarian characters, each one is stored as a single
byte above 0x7f; look at the raw string in the
interpreter and you'll see those bytes as \x sequences.
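To see what actually lands in the file, something like the following should do. It is a sketch only: the caption is a hypothetical value in place of a real port.Caption, and it writes to a temporary directory rather than c:/temp so that it runs without WMI or Windows.

```python
import codecs
import os
import tempfile

# Hypothetical caption in place of a real port.Caption from WMI.
caption = u"B\u0151v\u00edtett port (COM1)"

# A temp file instead of c:/temp/info.txt so the sketch runs anywhere.
path = os.path.join(tempfile.mkdtemp(), "info.txt")

f = codecs.open(path, "w", encoding="iso-8859-2")
f.write(u"Caption = %s\n" % caption)
f.close()

# Read the file back as raw bytes, with no decoding at all.
with open(path, "rb") as fh:
    raw = fh.read()

print(repr(raw))
```

The ascii characters come through unchanged, while the two accented letters each appear as one byte, shown as \xf5 and \xed in the repr.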

This file can be read by any other program which knows
that it's an iso-8859-2 encoding of unicode. (Knows, because
you write the program or because you've told the programmer).
Obviously, if this were an XML file, you would put an
encoding tag or whatever it's called (I'm not up on XML).
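As a rough illustration of why the reader has to know the encoding, this sketch writes a line as iso-8859-2 and reads it back twice, once with the right encoding and once with a wrong guess. The text and the temporary path are made up for the example.

```python
import codecs
import os
import tempfile

# Hypothetical line of output and a throwaway path for the demo.
original = u"Caption = B\u0151v\u00edtett port (COM1)\n"
path = os.path.join(tempfile.mkdtemp(), "info.txt")

f = codecs.open(path, "w", encoding="iso-8859-2")
f.write(original)
f.close()

# A reader that knows the encoding gets the original text back.
f = codecs.open(path, "r", encoding="iso-8859-2")
right = f.read()
f.close()

# A reader that guesses iso-8859-1 gets no error, just a wrong letter.
f = codecs.open(path, "r", encoding="iso-8859-1")
wrong = f.read()
f.close()

print(right == original)
print(wrong == original)
```

Nothing fails in the second read, which is the awkward part: the byte 0xf5 is a perfectly good character in both encodings, just not the same one.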

