Encoding for Devanagari Script.
atulskulkarni at gmail.com
Mon Jul 28 17:48:38 CEST 2008
Hi Fredrik and Terry,
Well, I got this in IDLE; I think I have done something wrong.
>>> import codecs
>>> f = open("C:\Documents and Settings\admin\My Documents\corpus\dainaikAikya collected by sushant.txt","r", "utf_8")
Traceback (most recent call last):
File "<pyshell#1>", line 1, in <module>
f = open("C:\Documents and Settings\admin\My Documents\corpus
\dainaikAikya collected by sushant.txt","r", "utf_8")
TypeError: an integer is required
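A note on the traceback above: in the Python 2 interpreter used here, the third positional argument of the built-in open() is the buffer size, which is why passing "utf_8" raises "an integer is required". The encoding argument belongs to codecs.open() instead. Separately, "\a" inside an ordinary string literal is the BEL escape, so a Windows path needs doubled backslashes or a raw string. The following is a minimal sketch of that fix, not the poster's actual code: the temporary file and its contents are hypothetical stand-ins for the corpus file, and the utf_8_sig codec is assumed because the file appears to start with a byte order mark.

```python
import codecs
import os
import tempfile

# Hypothetical stand-in for the corpus file; the real path is on the poster's machine.
# For a real Windows path, a raw string avoids escape problems, e.g. r"C:\some\dir\file.txt".
path = os.path.join(tempfile.mkdtemp(), "sample_devanagari.txt")

# Write a small UTF-8 file with a BOM so the example is self-contained.
with codecs.open(path, "w", encoding="utf_8_sig") as out:
    out.write(u"\u0928\u092e\u0938\u094d\u0924\u0947")  # "namaste" in Devanagari

# codecs.open() accepts the encoding; the built-in open() of Python 2 does not.
with codecs.open(path, "r", encoding="utf_8_sig") as f:
    text = f.read()  # a unicode string, with the BOM stripped by utf_8_sig
```

In Python 3 the built-in open() accepts the same information as a keyword, open(path, "r", encoding="utf_8_sig"), so codecs.open() is no longer required there.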
After that I tried binary read mode and tried reading the first 32
bytes, and this is what I got.
>>> f = open("C:\Documents and Settings\\admin\\My Documents\\corpus\\dainaikAikya collected by sushant.txt","rb")
Now based on my knowledge of Unicode I think this is a utf-8 file (the
first 3 bytes \xef\xbb\xbf), please correct me if I am wrong. How do I
PS: I wrote the above code using the information from the Library
Reference PDF, section 4.8 "Codecs". Am I doing something wrong? Please
do let me know.
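On the question of the first three bytes: \xef\xbb\xbf is the UTF-8 byte order mark, which is a strong hint (though not proof) that the file is UTF-8. A minimal sketch of checking for it and decoding follows. The helper name sniff_utf8_bom and the sample bytes are made up for illustration; codecs.BOM_UTF8 and the utf_8_sig codec are part of the standard library.

```python
import codecs

def sniff_utf8_bom(raw):
    """Return True if the byte string starts with the UTF-8 byte order mark."""
    return raw.startswith(codecs.BOM_UTF8)

# Sample bytes: the BOM followed by the UTF-8 encoding of U+0905
# (DEVANAGARI LETTER A). Real data would come from f.read(32) in "rb" mode.
head = b"\xef\xbb\xbf\xe0\xa4\x85"

has_bom = sniff_utf8_bom(head)

# Plain utf_8 keeps the BOM as a leading U+FEFF character;
# utf_8_sig removes it while decoding.
with_bom = head.decode("utf_8")
without_bom = head.decode("utf_8_sig")
```

Decoding with utf_8_sig is usually the more convenient choice here, since it also works on UTF-8 data that has no BOM.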
On Jul 25, 6:21 am, Terry Reedy <tjre... at udel.edu> wrote:
> Atul. wrote:
> > Hello All,
> > I wanted to know what encoding I should use to open files with
> > Devanagari characters. I was thinking of UTF-8 but was not sure. Any
> > leads on this? Has anyone used it earlier?
> You cannot hurt your machine by giving that a try.
> This is a general comment for all beginners. Before posting, open the
> interactive interpreter (or IDLE) and try something(s). If the result
> puzzles you, copy and paste into a post. Or if more appropriate, open
> the Python manuals and search a bit, or try a search engine.