.title() - annoying mistake

Robert Latest boblatest at yahoo.com
Mon Mar 22 09:58:28 EDT 2021


Chris Angelico wrote:
> There are a small number of characters which, when case folded, become
> more than one character. The sharp S from German behaves thusly:
>
>>>> "ß".upper(), "ß".lower(), "ß".casefold(), "ß".title()
> ('SS', 'ß', 'ss', 'Ss')

Now we're getting somewhere. I'm a native German speaker and I can tell you
that this doesn't happen in the real world, simply because 'ß' never appears at
the beginning of a word and thus is never "title cased". The only occurrence of
uppercase 'ß' is in all-caps text, which Python handles properly:

>>> 'bißchen'.upper()
'BISSCHEN'

...that is, properly until 2008, when the capital 'ß' was encoded in Unicode
(it was officially adopted into German orthography in 2017):

https://en.wikipedia.org/wiki/%C3%9F#Capital_form
"Traditionally, ß did not have a capital form, although some type designers
introduced de facto capitalized variants of ß. In 2017, the Council for German
Orthography ultimately adopted capital ß, ẞ, into German orthography, ending a
long orthographic debate.[3] [...] The capital variant (U+1E9E ẞ LATIN CAPITAL
LETTER SHARP S) was encoded by ISO 10646 in 2008." So Python 3.6.8 is about 12
years behind.
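
To make the difference concrete, here is a minimal sketch of what the built-in
string methods do with the two characters, next to an all-caps conversion that
uses the capital form. The helper name upper_with_capital_eszett is
hypothetical (it is not part of Python), and whether 'ẞ' or 'SS' is wanted
depends on the text at hand.

```python
import unicodedata

SMALL_ESZETT = "\u00df"    # ß
CAPITAL_ESZETT = "\u1e9e"  # ẞ, LATIN CAPITAL LETTER SHARP S


def upper_with_capital_eszett(text):
    """Hypothetical helper: uppercase text, mapping ß to ẞ instead of SS."""
    # Swap in the capital form first; str.upper() leaves ẞ unchanged.
    return text.replace(SMALL_ESZETT, CAPITAL_ESZETT).upper()


print(unicodedata.name(CAPITAL_ESZETT))        # official Unicode name
print("bißchen".upper())                       # built-in behaviour: BISSCHEN
print(upper_with_capital_eszett("bißchen"))    # capital-ẞ variant
print(CAPITAL_ESZETT.lower() == SMALL_ESZETT)  # ẞ lowercases back to ß
```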

As a German I also appreciate the reduced occurrence of the letter combination
'SS'.

That said, the concept of "title casing" doesn't even exist in German. Titles
are spelt just like any regular sentence. I know only two definitions of the
concept "title case":

1) From Wikipedia
"Title case or headline case is a style of capitalization used for rendering
the titles of published works or works of art in English. [...]"

2) From Python (paraphrased):
"Perform an arbitrary (but defined) operation on the characters of a string."

I don't mind .title() being in Python. I would very much mind being the person
in charge of maintaining it and having to port it into new versions of Python,
always keeping an eye on the evolution of Unicode or other standards (see
above).

It probably just comes down to me not being able to conjure up a single
sensible use case for .title() as well as the whole concept of "title casing"
in the context of a programming language.
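
As an illustration of why a use case is hard to find, here is a small sketch
comparing .title() with string.capwords() from the standard library on a
phrase containing apostrophes. The function name compare_title_styles is made
up for this example; neither result is real English title case.

```python
import string


def compare_title_styles(text):
    """Return (str.title() result, string.capwords() result) for text."""
    # .title() starts a new "word" after every non-letter, apostrophes included;
    # capwords() only splits on whitespace and capitalizes each piece.
    return text.title(), string.capwords(text)


titled, capworded = compare_title_styles("they're bill's friends")
print(titled)     # They'Re Bill'S Friends
print(capworded)  # They're Bill's Friends
```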

> The neat thing about Unicode is 

[many things]

> The documentation sometimes shorthands things with terms like "upper
> case" and "lower case", but that's partly because being pedantically
> correct in a docstring doesn't actually help anything, and the code
> itself IS correct.

...but hard to maintain and useless. I just love to hate .title() ;-)

robert

