Unicode string handling problem
Richard Schulman
raschulmanxx at verizon.net
Tue Sep 5 20:22:12 EDT 2006
The following program fragment works correctly with an ASCII input
file.
But the file I actually want to process is Unicode (UTF-16 encoding).
The file must be Unicode rather than ASCII or Latin-1 because it
contains a mix of Chinese and English characters.
When I run the program below on the UTF-16 file, I get an
attribute_count of zero. That is incorrect: for this input file the
value should be fifteen or sixteen. In other words, the count method
isn't recognizing the '",' sequences (a double quote followed by a
comma) in the line being read. Here's the program:
in_file = open("c:\\pythonapps\\in-graf1.my","rU")
try:
    # Skip the first line; make the second available for processing
    in_file.readline()
    in_line = in_file.readline()
    attribute_count = in_line.count('",')
    print attribute_count
finally:
    in_file.close()
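For what it's worth, here is a minimal sketch of what may be going on, not a confirmed diagnosis. It assumes the zero count comes from reading the UTF-16 file as raw bytes: each ASCII character is then stored as two bytes, so the two-byte sequence '",' never appears back to back. The sample text and temporary file are hypothetical stand-ins for the real input, and io.open is an assumption too (it needs Python 2.6 or later; codecs.open with an encoding argument is the older equivalent).

```python
import io
import os
import tempfile

# Hypothetical sample standing in for the real input file:
# a header line, then a line of quoted, comma-separated attributes.
sample = u'header line\n"name","\u4e2d\u6587","value"\n'

# Write the sample as UTF-16 (newline="" keeps "\n" untranslated).
path = os.path.join(tempfile.mkdtemp(), "sample-utf16.txt")
with io.open(path, "w", encoding="utf-16", newline="") as f:
    f.write(sample)

# Reading raw bytes: '"' and ',' are separated by a zero byte,
# so the adjacent byte pair is never found.
with io.open(path, "rb") as f:
    raw = f.read()
raw_count = raw.count(b'",')

# Reading with an explicit encoding: lines come back as Unicode text.
with io.open(path, "r", encoding="utf-16") as f:
    f.readline()            # skip the first line
    in_line = f.readline()  # second line, decoded
attribute_count = in_line.count(u'",')

print(raw_count)        # 0
print(attribute_count)  # 2
```

If this assumption holds, opening the real file with an explicit utf-16 encoding instead of plain open(..., "rU") should let count see the '",' sequences.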
Any suggestions?
Richard Schulman
(For email reply, delete the 'xx' characters)