[Tutor] re Python built-in methods, functions, libraries, etc. [unichr()]

Sat Dec 7 03:47:02 2002

On Fri, 6 Dec 2002, gmlloyd wrote:

> Can you suggest some online lists/discussions of Python's built-in
> methods & functions, Python libraries, etc.

Hi Geoff,

We occasionally talk about some of the builtin functions here on Tutor;
you might be able to find some archives of our discussions here:

    http://mail.python.org/pipermail/tutor/

Is there a particular builtin that you're interested in?  Maybe we could
have some sort of "builtin-of-the-week" feature that someone can write
up...

I'll kick it off.  *grin*

Let's see... let's pick something out of the hat at random.

###
>>> import random
>>> random.choice(dir(__builtins__))
'unichr'
###

What is 'unichr'?

###
>>> type(unichr)
<type 'builtin_function_or_method'>
>>> callable(unichr)
1
###

I've used the builtin 'type()' function, which tells us what kind of thing
we're looking at.  'unichr' looks like a function, and callable() returning
1 (that is, true) confirms that we can call it.
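
If you happen to be on a Python 3 interpreter (that's an assumption on my
part; the session above is Python 2), here's a rough sketch of the same
poking-around.  It uses the 'builtins' module rather than '__builtins__',
and looks at chr(), which is the Python 3 name for what unichr() does.  The
helper name 'describe_builtin' is just something I made up for
illustration.

```python
import builtins
import random


def describe_builtin(name):
    """Return (type name, is it callable?) for the builtin called `name`."""
    obj = getattr(builtins, name)
    return type(obj).__name__, callable(obj)


# Pick a builtin out of the hat, the same way as in the session above.
picked = random.choice(dir(builtins))
print(picked, describe_builtin(picked))

# chr is the Python 3 counterpart of unichr.
print(describe_builtin("chr"))
```

Note that callable() gives back True or False there, rather than 1 or 0.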

We can find out more about unichr() by either looking at the documentation
at:

    http://www.python.org/doc/lib/built-in-funcs.html

... or, we can also take advantage of the builtin help() that lets us
query for more information:

###
>>> help(unichr)

Help on built-in function unichr:

unichr(...)
    unichr(i) -> Unicode character

    Return a Unicode string of one character with ordinal i; 0 <= i <= 0x10ffff.
###

'unichr()', then, is a function that takes an integer between 0 and
0x10ffff, and returns the one-character Unicode string for that code.  For
example, I was curious, so I visited:

    http://www.unicode.org/charts/

and picked a random character out of the Hangul Syllables:

###
>>> my_unicode_char = unichr(0xc720)
>>> my_unicode_char
u'\uc720'
###

C720, according to the Unicode code charts, is a single Hangul character.
In ASCII-art, it sorta looks like this:

      +-+
      +-+
    -------
      | |
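
If the console can't draw the character, we can still ask Python about it.
Here's a minimal sketch, assuming a Python 3 interpreter, where chr() plays
the role of unichr().  It checks that ord() takes us back to the number we
started with, looks up the character's official name in the standard
'unicodedata' module, and tries a code just past the 0x10ffff limit.

```python
import unicodedata

code_point = 0xC720
ch = chr(code_point)              # Python 3 spelling of unichr(0xc720)

# ord() is the inverse: it turns the character back into its integer code.
round_trip_ok = (ord(ch) == code_point)
print("round trip ok:", round_trip_ok)

# The Unicode database knows the official name of the character.
char_name = unicodedata.name(ch)
print(char_name)

# Anything above 0x10ffff is outside the Unicode range and gets rejected.
try:
    chr(0x110000)
    too_big_rejected = False
except ValueError:
    too_big_rejected = True
print("0x110000 rejected:", too_big_rejected)
```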

Unicode is meant to represent the vast majority of the world's writing
systems, and most web browsers support it.  So even if our console can't
display the character, we might have some success by writing it into a Web
page and then looking at it from our favorite browser:

###
>>> my_unicode_char = unichr(0xC720)
>>> f = open('test-unicode.html', 'w')
>>> f.write('''<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
... <html>
... <head>
... <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
... </head>
... <body>
... <h1>My first Unicode character: %s</h1>
... </body>
... </html>''' % my_unicode_char.encode('utf-8'))
>>> f.close()
###
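
On Python 3 the same idea might look something like the sketch below.  I'm
assuming Python 3 here, so chr() replaces unichr(), and passing
encoding='utf-8' to open() takes over the job that .encode('utf-8') did by
hand above.  The function name 'write_unicode_page' and the temp-directory
output path are just placeholders for illustration.

```python
import os
import tempfile

PAGE_TEMPLATE = '''<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<h1>My first Unicode character: %s</h1>
</body>
</html>'''


def write_unicode_page(path, code_point):
    """Write an HTML page that shows the character with the given code."""
    # The file object encodes the text as UTF-8 for us when we write.
    with open(path, 'w', encoding='utf-8') as f:
        f.write(PAGE_TEMPLATE % chr(code_point))


# Hypothetical output location; any writable path would do.
out_path = os.path.join(tempfile.gettempdir(), 'test-unicode.html')
write_unicode_page(out_path, 0xC720)
print("wrote", out_path)
```

The 'with' block closes the file for us, so there's no separate f.close().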

For those who don't want to type all that, but still want to see the
results, I've put up the HTML generated by that program here:

    http://hkn.eecs.berkeley.edu/~dyoo/test-unicode.html

So, in summary, unichr() lets us write out a single unicode character if
we know its numeric code.

I hope this was somewhat interesting!