[Tutor] bogus characters in a windows file

Garry Willgoose garry.willgoose at newcastle.edu.au
Thu Feb 9 02:46:45 CET 2012


I'm reading a file output by the system utility WMIC in Windows (so I can track CPU usage by process ID), and the text file WMIC outputs seems to contain extra characters I've not seen before.

I use os.system(r'WMIC /OUTPUT:c:\cpu.txt PROCESS GET ProcessId') to output the file, and then parse the file c:\cpu.txt.

The first few lines of the file look like this in Notepad:

ProcessId  
0         
4          
568        
624        
648        


I input the data with the lines

infile = open(r'c:\cpu.txt', 'r')
infile.readline()
infile.readline()
infile.readline()

The readline() calls yield the following output:

'\xff\xfeP\x00r\x00o\x00c\x00e\x00s\x00s\x00I\x00d\x00 \x00 \x00\r\x00\n'
'\x000\x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00\r\x00\n'
'\x004\x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00\r\x00\n'

Now for the first line the title 'ProcessId' is in this string, but the individual characters are separated by '\x00', and at least for the first line of the file there is an extra '\xff\xfe' at the start. For subsequent lines it's just '\x00'. I could just replace the '\x**' with '', but that seems a bit inelegant. I've tried various modes on the open() call ('rU' and 'rb'), but they had no effect.

Does anybody know what the rubbish characters are and what has caused them? I'm using the latest Enthought Python, if that matters.
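
One thing worth checking: the leading '\xff\xfe' matches the UTF-16 little-endian byte order mark, and a '\x00' after every ASCII character is what UTF-16 text looks like when read as raw bytes. The stray '\x00' at the start of the later lines would then be the second byte of the previous '\n', since readline() splits on the single '\n' byte. Below is a minimal sketch, assuming the file really is UTF-16. The names SAMPLE and read_process_ids are hypothetical, and a temporary file built from the bytes shown above stands in for c:\cpu.txt.

```python
import codecs
import io
import os
import tempfile

# The first three lines of the file, rebuilt from the readline() output above.
# Assumption: every character, including '\r' and '\n', occupies two bytes.
SAMPLE = (b'\xff\xfeP\x00r\x00o\x00c\x00e\x00s\x00s\x00I\x00d\x00 \x00 \x00\r\x00\n\x00'
          b'0\x00 \x00 \x00\r\x00\n\x00'
          b'4\x00 \x00 \x00\r\x00\n\x00')

def read_process_ids(path):
    """Hypothetical helper: decode the file as UTF-16 and return the IDs as ints."""
    # The 'utf-16' codec uses the byte order mark to pick the byte order
    # and then drops it from the decoded text.
    with io.open(path, 'r', encoding='utf-16') as infile:
        lines = [line.strip() for line in infile]
    # Skip the 'ProcessId' header line and any blank lines.
    return [int(s) for s in lines[1:] if s]

# The file starts with the UTF-16 little-endian byte order mark.
print(SAMPLE.startswith(codecs.BOM_UTF16_LE))

# Write the sample bytes to a temporary file and read them back decoded.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, 'wb') as f:
        f.write(SAMPLE)
    print(read_process_ids(path))
finally:
    os.remove(path)
```

If the assumption holds, the same io.open(..., encoding='utf-16') call on c:\cpu.txt should give clean text without any manual replacing of '\x00'.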






