[Tutor] Read in text file containing non-English characters

Martin A. Brown martin at linux-ip.net
Fri Jan 13 17:59:17 CET 2012


Greetings Francis,

 : >You don't show even a snippet of code.  If you are asking
 : >for help here, it is good form to show us your code.  Since
 : >you don't state how you are reading the data and how you are
 : >printing the data, we can't help much.  Here are some tips:
 : >
 : >  * Consider learning how to use the csv module, particularly in
 : >    your case, csv.reader (as Ramit Prasad has already suggested).
 : >
 : >  * Consider checking the bytestream to see if the bytes produced
 : >    on output are the same as on input (also, read the text that
 : >    Mark Tompkins indicated and learn to distinguish Unicode from
 : >    UTF-8).
 : >
 : >  * Report back to the list the version of Python you are using.
 : >    [Different versions of Python handle non-ASCII character data
 : >    in subtly different ways, but that is probably not the cause of
 : >    the more obvious problem you are showing above.]
 : >
 : >We can have no idea what your ultimate goal is with the data, but
 : >can help you much more if you show us the code.
 : >
 : >Here's a sample of what I would/could do (Python 2.6.5):
 : >
 : >    import csv
 : >    reader = csv.reader(open('input-data.txt'),delimiter=',')
 : >    for row in reader:
 : >        print 'The capital of %s is %s' % (row[0], row[1],)
 : >
 : >The above is trivial, but if you would like some more substantive
 : >assistance, you should describe your problem in a bit more detail.
 : 
 : 
 : I apologize for not including any code, but that's because I 
 : didn't have any. I had no idea where to even begin. I have a 450 
 : page book on beginner Python programming and nothing like the 
 : above is in there. Incidentally, when I try the above code in 
 : Python 3.2 I get an "invalid syntax" message.

One of the famous differences between Python 2.x and Python 3.x [0] 
is 'print':  in 3.x it is a function, so its arguments must go inside 
parentheses.  Try this:

    import csv
    reader = csv.reader(open('input-data.txt'),delimiter=',')
    for row in reader:
        print('The capital of %s is %s' % (row[0], row[1],))
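
Since the file contains non-English characters, it may also help to tell 
Python how the bytes in the file are encoded.  What follows is only a 
sketch:  I am assuming the file was saved as UTF-8 (if it was saved as 
something else, say latin-1, pass that name instead), and the function 
name read_capitals and the two sample rows are made up for illustration.

```python
import csv
import os
import tempfile

def read_capitals(path, encoding='utf-8'):
    """Return a list of (country, capital) pairs from a comma-separated file."""
    # newline='' is what the csv module documentation recommends for files
    # handed to csv.reader; encoding says how to decode the bytes on disk.
    # ASSUMPTION: every non-empty row has at least two fields.
    with open(path, newline='', encoding=encoding) as handle:
        return [(row[0], row[1]) for row in csv.reader(handle) if row]

# Build a small, hypothetical sample file so the sketch runs on its own.
sample_path = os.path.join(tempfile.mkdtemp(), 'input-data.txt')
with open(sample_path, 'w', newline='', encoding='utf-8') as handle:
    handle.write("Côte d'Ivoire,Yamoussoukro\nPerú,Lima\n")

for country, capital in read_capitals(sample_path):
    print('The capital of %s is %s' % (country, capital))
```

If the accented characters still come out wrong, the encoding guess is 
the first thing to revisit.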

 : My ultimate goal is to be able to do what I've done for years in 
 : SAS, where I consider myself an expert: read in some raw data, 
 : perform some mathematical operations on the data, then output it.  
 : Later this spring I will be teaching an audience that does not 
 : have access to SAS (community college students) and Python was 
 : suggested as an alternative.

OK, so this list is a good place to be for such initial 
explorations.  There are a number of libraries that can help with 
the mathematical operations and you will probably get many good 
suggestions.
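
To give a rough idea of the read/compute/output cycle you describe, here 
is a minimal sketch using only the standard library.  The data, the 
column layout (a header line, then name,score rows) and the function 
name summarize are all invented for the example;  your real data will 
surely look different.

```python
import csv
import io

# Made-up raw data standing in for a file: a header line, then name,score.
raw = io.StringIO('name,score\nAna,90\nBjörn,75\nChloé,84\n')

def summarize(handle):
    """Read name,score rows and return (count, total, mean) of the scores."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    values = [float(row[1]) for row in reader if row]
    total = sum(values)
    return len(values), total, total / len(values)

count, total, mean = summarize(raw)
print('%d rows, total %.0f, mean %.1f' % (count, total, mean))
```

The same function accepts an open file in place of the io.StringIO 
object, so the arithmetic does not care where the rows come from.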

Welcome to the list,

-Martin

 [0] http://wiki.python.org/moin/Python2orPython3

--
Martin A. Brown
http://linux-ip.net/

