From JoyceUlysses.txt -- words occurring exactly once

Grant Edwards grant.b.edwards at gmail.com
Mon Jun 3 14:58:26 EDT 2024


On 2024-06-03, Edward Teach via Python-list <python-list at python.org> wrote:

> The Gutenberg Project publishes "plain text".  That's another
> problem, because "plain text" means UTF-8....and that means
> unicode...and that means running some sort of unicode-to-ascii
> conversion in order to get something like "words".  A couple of
> hours....a couple of hundred lines of C....problem solved!

I'm curious.  Why does it need to be converted from Unicode to ASCII?

When you read it into Python, it gets converted right back to Unicode...
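
To illustrate, here is a minimal sketch (my own, not anything from the
original poster's code) of counting words that occur exactly once while
staying in Unicode the whole way. It assumes the file is UTF-8, and the
function names and the regex's definition of a "word" (runs of letters,
optionally joined by an apostrophe) are just assumptions for the example:

```python
import re
from collections import Counter

# Letters only (Unicode-aware by default for str patterns in Python 3),
# allowing an internal apostrophe as in "don't". This is one possible
# definition of a "word", chosen for illustration.
WORD_RE = re.compile(r"[^\W\d_]+(?:['\u2019][^\W\d_]+)*")

def words_occurring_once(text):
    """Return the sorted list of words that appear exactly once in text."""
    counts = Counter(WORD_RE.findall(text.lower()))
    return sorted(word for word, n in counts.items() if n == 1)

def words_occurring_once_in_file(path):
    """Read a UTF-8 text file and return its once-only words."""
    # Decoding happens here; no ASCII conversion step is needed.
    with open(path, encoding="utf-8") as f:
        return words_occurring_once(f.read())

if __name__ == "__main__":
    sample = "Caf\u00e9 caf\u00e9 na\u00efve Se\u00f1or se\u00f1or"
    print(words_occurring_once(sample))
```

Something like words_occurring_once_in_file("JoyceUlysses.txt") would then
handle accented words such as "café" without any lossy conversion.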




