[Tutor] read in text file containing non-English characters

Prasad, Ramit ramit.prasad at jpmorgan.com
Fri Jan 13 00:16:31 CET 2012


>I would like to know how to read in the file and then access arbitrary rows in the file, so that I can print a line such as: 
> Cabañas,Sensuntepeque,-88.6300,13.8800
>The capital of Cabañas is Sensuntepeque

>while preserving the non-English characters

>now, for example, I get

>Cabañas

Make sure to open the file with the correct encoding, or convert the text in
your program. An incorrect encoding is a very likely reason why you get funky
characters (another is not having the appropriate codepages/languages
installed). With codecs.open you can open the file and specify the encoding
all at once.
http://docs.python.org/library/codecs.html#codecs.open 
>>> import codecs
>>> f = codecs.open("test", "r", "utf-8")

The list of standard codecs is on that page at 7.8.3.

You can also convert a string manually, but that is more of a pain, and you 
would still need to know the codec.
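
A minimal sketch of the manual conversion, assuming the file is UTF-8 (the
original post does not say which encoding it uses). The byte string stands in
for what a binary read of the file would return. Decoding it with the wrong
codec is shown only to illustrate how the funky characters arise.

```python
# Bytes as they might come from a UTF-8 file opened in binary mode
# (sample data taken from the question; UTF-8 is an assumption).
raw = b"Caba\xc3\xb1as,Sensuntepeque,-88.6300,13.8800"

# Decoding with the matching codec recovers the original text.
text = raw.decode("utf-8")

# Decoding with a mismatched codec does not fail for latin-1;
# it silently maps each byte to the wrong character instead.
garbled = raw.decode("latin-1")

print(text)
print(garbled)
```

If the real file is not UTF-8, the same pattern applies with the correct
codec name substituted.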

As for the rest of the problem, I would suggest you take a quick peek at the csv
module, which will parse your data for you. 
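
Here is a rough sketch of how the csv module could be used for this, assuming
Python 3 (where the built-in open accepts an encoding argument) and assuming
the first two columns are the region and its capital. The helper name
read_rows and the file name in the comment are made up for illustration, and
an in-memory stream stands in for the real file.

```python
import csv
import io

def read_rows(stream):
    """Parse a text stream of comma-separated lines into a list of rows."""
    return list(csv.reader(stream))

# Stand-in for the real file. With an actual file, one would pass something
# like open("capitals.csv", encoding="utf-8", newline="") instead
# (hypothetical file name, assumed encoding).
sample = io.StringIO("Cabañas,Sensuntepeque,-88.6300,13.8800\n")

rows = read_rows(sample)

# Once the rows are in a list, any row can be accessed by index.
row = rows[0]
message = "The capital of {} is {}".format(row[0], row[1])
print(message)
```

Reading everything into a list keeps arbitrary row access simple; for a very
large file, iterating over the reader directly would use less memory.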

Ramit


Ramit Prasad | JPMorgan Chase Investment Bank | Currencies Technology
712 Main Street | Houston, TX 77002
work phone: 713 - 216 - 5423
