Unicode 7
wxjmfauth at gmail.com
Wed Apr 30 03:06:41 EDT 2014
@ Tim Chase
I'm perfectly aware of what I'm doing.
@ MRAB
"...Although the third example is the fastest, it's also the wrong
way to handle Unicode: ..."
Maybe it's exactly the opposite. It illustrates very well
the quality of the coding schemes endorsed by Unicode.org.
I deliberately chose utf-8.
>>> import sys
>>> sys.getsizeof('\u0fce')
40
>>> sys.getsizeof('\u0fce'.encode('utf-8'))
20
>>> sys.getsizeof('\u0fce'.encode('utf-16-be'))
19
>>> sys.getsizeof('\u0fce'.encode('utf-32-be'))
21
>>>
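The getsizeof() figures above include the interpreter's fixed per-object overhead, which depends on the build. As a rough sketch, assuming any CPython 3, the encoded payload alone can be measured with len(). The helper name payload_sizes is hypothetical and not part of the original session.

```python
import sys

def payload_sizes(text, encodings=("utf-8", "utf-16-be", "utf-32-be")):
    """Return {encoding: number of encoded bytes} for text (hypothetical helper)."""
    return {enc: len(text.encode(enc)) for enc in encodings}

sizes = payload_sizes("\u0fce")

for enc, nbytes in sizes.items():
    # getsizeof() minus the payload is the bytes-object overhead on this build.
    overhead = sys.getsizeof("\u0fce".encode(enc)) - nbytes
    print("{:10s} payload={} overhead={}".format(enc, nbytes, overhead))
```

The payload counts are fixed by the encodings themselves; only the overhead column varies from one build to another.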
Q. How can one save memory without wasting time on encoding?
By using products that use the Unicode coding schemes natively?
Do you understand Unicode? Or do you understand
Unicode via Python?
---
A Tibetan monk [*] using Py32:
>>> import timeit
>>> timeit.repeat("(x*1000 + y)[:-1]", setup="x = 'abc'; y = 'z'")
[2.3394840182882186, 2.3145832750782653, 2.3207231951529685]
>>> timeit.repeat("(x*1000 + y)[:-1]", setup="x = 'abc'; y = '\u0fce'")
[2.328517624800078, 2.3169403900011076, 2.317586282812048]
>>>
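To repeat this comparison on another interpreter, something like the following may do. It is only a sketch: the helper name time_case and the reduced iteration count are assumptions chosen to keep the run short, and the timings will differ by machine and Python version, so none are shown.

```python
import timeit

STMT = "(x*1000 + y)[:-1]"

def time_case(y, number=10000, repeat=3):
    """Time STMT with the given one-character y (hypothetical helper)."""
    # repr() embeds y as a valid string literal in the setup code.
    setup = "x = 'abc'; y = {!r}".format(y)
    return timeit.repeat(STMT, setup=setup, number=number, repeat=repeat)

ascii_times = time_case("z")          # pure ASCII operands
tibetan_times = time_case("\u0fce")   # one non-ASCII BMP code point appended

print("ascii  :", ascii_times)
print("U+0FCE :", tibetan_times)
```

Comparing the two lists on one's own build shows whether appending a single non-ASCII character changes the cost of the concatenation and slice.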
[*] Your curiosity has certainly shown you what this code point means.
For the others:
U+0FCE TIBETAN SIGN RDEL NAG RDEL DKAR
signifies good luck earlier, bad luck later
(My comment: Good luck with Python or bad luck with Python)
jmf