a simple unicode question
Mark Tolonen
metolone+gmane at gmail.com
Tue Oct 20 00:51:02 EDT 2009
"George Trojan" <george.trojan at noaa.gov> wrote in message
news:hbidd7$i9o$1 at news.nems.noaa.gov...
> A trivial one, this is the first time I have to deal with Unicode. I am
> trying to parse a string s='''48° 13' 16.80" N'''. I know the charset is
> "iso-8859-1". To get the degrees I did
> >>> encoding='iso-8859-1'
> >>> q=s.decode(encoding)
> >>> q.split()
> [u'48\xc2\xb0', u"13'", u'16.80"', u'N']
> >>> r=q.split()[0]
> >>> int(r[:r.find(unichr(ord('\xc2')))])
> 48
>
> Is there a better way of getting the degrees?
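It seems your string is actually UTF-8, not ISO-8859-1. \xc2\xb0 is the
UTF-8 encoding of DEGREE SIGN (U+00B0); in ISO-8859-1 it would be the single
byte \xb0. Decoding the UTF-8 bytes as ISO-8859-1 turns them into two
characters, which is why your split shows u'48\xc2\xb0'. If you type
non-ASCII characters in source code, make sure to declare the encoding the
file is *actually* saved in: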
# coding: utf-8
s = '''48° 13' 16.80" N'''
q = s.decode('utf-8')
# next line equivalent to previous two
q = u'''48° 13' 16.80" N'''
# a couple of ways to find the degrees
print int(q[:q.find(u'°')])
import re
print re.search(ur'(\d+)°',q).group(1)
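
The code above is Python 2. On Python 3, where str is already Unicode and the
ur'' prefix no longer exists, the same idea might look roughly like the sketch
below. The byte string raw and the names wrong, right and degrees are made up
for illustration; it assumes the input really arrives as UTF-8 bytes.

```python
import re

DEGREE = "\u00b0"  # DEGREE SIGN

# Hypothetical input: the coordinate as UTF-8 encoded bytes.
raw = b"48\xc2\xb0 13' 16.80\" N"

# Decoding with the wrong charset maps every byte to one character,
# so the degree sign comes out as two characters.
wrong = raw.decode("iso-8859-1")
print(len(wrong.split()[0]))  # 4

# Decoding as UTF-8 yields a single DEGREE SIGN character.
right = raw.decode("utf-8")
print(len(right.split()[0]))  # 3

# Pull out the digits that immediately precede the degree sign.
match = re.match(r"(\d+)" + DEGREE, right)
degrees = int(match.group(1))
print(degrees)  # 48
```

If the data really were ISO-8859-1, the degree sign would be the single byte
\xb0 and decoding with 'iso-8859-1' would give the same one-character result.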
-Mark