Python Unicode handling wins again -- mostly
ned at nedbatchelder.com
Sun Dec 1 00:07:36 CET 2013
On 11/30/13 5:37 PM, Gregory Ewing wrote:
> wxjmfauth at gmail.com wrote:
>> And do you know the origin of this typographical feature?
>> Because, mechanically, the dot of the "i" broke too often.
>> In my opinion, a very plausible explanation.
> It doesn't sound very plausible to me, because there
> are a lot more stand-alone 'i's in English text than
> there are ones following an f. What is there to stop
> them from breaking?
> It's more likely to be simply a kerning issue. You
> want to get the stems of the f and the i close together,
> and the only practical way to do that with mechanical
> type is to merge them into one piece of metal.
> Which makes it even sillier to have an 'ffi' character
> in this day and age, when you can simply space the
> characters so that they overlap.
The fi ligature was created because visually, an f and i wouldn't work
well together: the crossbar of the f was near, but not connected to the
serif of the i, and the terminal bulb of the f was close to, but not
coincident with, the dot of the i.
This article goes into great detail, and has a good illustration of how
an f and i can clash, and how an fi ligature can fix the problem:
http://opentype.info/blog/2012/11/20/whats-a-ligature/ . Note the second
fi illustration, which demonstrates using a ligature to make the letters
appear *less* connected than they would individually!
This is also why "simply spacing the characters" isn't a solution: a
specially designed ligature looks better than a separate f and i, no
matter how minutely kerned.
It's unfortunate that Unicode includes presentation alternatives like
the fi (and ff, fl, ffi, and ffl) ligatures. They were included so that
Unicode would be a superset of existing encodings.
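
For anyone who wants to see these presentation forms from Python, here is
a minimal sketch using the standard unicodedata module. The helper name
expand_ligatures is made up for illustration. It relies on the fact that
the five Latin f-ligatures (U+FB00 through U+FB04) carry compatibility
decompositions, so NFKC normalization folds them back to plain letters.
Note that NFKC also folds many other compatibility characters, so this is
not a ligature-only filter.

```python
import unicodedata

# The five Latin f-ligatures in the Alphabetic Presentation Forms block:
# ff, fi, fl, ffi, ffl.
LIGATURES = ["\ufb00", "\ufb01", "\ufb02", "\ufb03", "\ufb04"]


def expand_ligatures(text):
    """Hypothetical helper: fold compatibility characters, ligatures
    included, into their NFKC equivalents."""
    return unicodedata.normalize("NFKC", text)


for ch in LIGATURES:
    # Show each code point, its official name, and its expansion.
    print("U+%04X %s -> %r" % (ord(ch), unicodedata.name(ch),
                               expand_ligatures(ch)))
```

Searching or comparing text after a pass like this treats "o\ufb03ce" and
"office" as the same string, which is usually what you want for matching
and not what you want for typesetting.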
Many typefaces have other non-encoded ligatures as well, especially
display faces, which also have alternate glyphs. Unicode is a funny mix
in that it includes some forms of alternates, but can't include all of
them, so we have to put up with both an ad-hoc Unicode that includes
some presentational variants and a separate mechanism for specifying
the rest.