People's names (was Re: sqlite3 error)

John Machin sjmachin at lexicon.net
Sun Oct 8 02:08:58 CEST 2006


Lawrence D'Oliveiro wrote:
> In message <mailman.912.1159494600.10491.python-list at python.org>, Steve
> Holden wrote:
>
> > John Machin wrote:
> >
> > [lots of explanation about peculiarities of people's names]
> >
> > While I don't dispute any of this erudite display of esoteric
> > nomenclature wisdom the fact remains that many (predominantly Western)
> > databases do tend to use first and last name (in America often with the
> > addition of a one- or two-character "middle initial" field).
>
> Just because most Western designers of databases do it wrong doesn't mean
> that a) you should do it wrong, or b) they will continue to do it wrong
> into the future, as increasing numbers of those designers come from Asian
> and other non-Western backgrounds.

Unfortunately, lack of appreciation that different rules and customs
may apply on the other side of the county/state/national boundary is a
universal trait, not one restricted to Westerners.

>
> > So, having distilled your knowledge to its essence could you please give
> > me some prescriptive advice about what I *should* do? :-)
>
> Has anyone come up with a proper universal table design for storing people's
> names?
>
> Certainly "first name" and "last name" are the wrong column names to use. I
> think "family name" and "given names" would be a good start.

So far so good.

>  For the
> Icelanders, Somalians and the Muslims, their father's name goes in
> the "family name" field, which makes sense because all their siblings (of
> the same sex, at least) would have the same value in this field.

Two problems so far:
(1) If you then assume that you should print the phone directory in
order of family name, that's not appropriate in some places, e.g.
Iceland; neither is addressing Jon Jonsson as "Mr Jonsson". BTW, the
name in that field can come from their mother's name, e.g. if she has
more fame or recognition than their father.
(2) Arabic names: you may or may not have their father's name. You
might not even have the [usually only one] given name. For example: the
person who was known as Abu Musab al-Zarqawi: this means "father of
Musab, the man from Zarqa [a city in Jordan]". You may have the family
name as well as the father's and grandfather's given name. You can have
the occupation, honorifics, nicknames. For a brief overview, read this:
http://en.wikipedia.org/wiki/Arabic_names

>
> I wonder if we need another "middle" field for holding the "bin/binte" part
> (could also hold, e.g. "Van" for those names that use this).

Not a good idea, IMHO. Consider "Nguyen Van Tran" vs "Rembrandt van
Rijn". Would you peel the Da off Da Costa but not the D' off
D'Oliveiro? What do you do with the bod who fills in a form as Dermot
O'Sullivan one month and Diarmaid Ó Súilleabháin the next?
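
To make the objection concrete, here is a minimal sketch of what a
"middle field" splitter might look like. Everything in it is
hypothetical: the PARTICLES set, the split_middle name, and the given
name "Maria" are made up for illustration and come from no real library
or data set.

```python
# Hypothetical list of tokens to peel into a "middle" field.
PARTICLES = {"van", "da", "bin", "binte"}

def split_middle(name):
    """Split a name into (before, middle, after) on the first known particle.

    Returns (name, "", "") when no particle is found between two other tokens.
    """
    tokens = name.split()
    for i, tok in enumerate(tokens):
        # Only treat a token as a particle if it sits between other tokens.
        if 0 < i < len(tokens) - 1 and tok.lower() in PARTICLES:
            return " ".join(tokens[:i]), tok, " ".join(tokens[i + 1:])
    return name, "", ""

# The splitter cannot tell these two "van" tokens apart.
print(split_middle("Nguyen Van Tran"))      # ('Nguyen', 'Van', 'Tran')
print(split_middle("Rembrandt van Rijn"))   # ('Rembrandt', 'van', 'Rijn')

# "Da" is peeled off, but the D' glued to D'Oliveiro is not.
print(split_middle("Maria Da Costa"))       # ('Maria', 'Da', 'Costa')
print(split_middle("Lawrence D'Oliveiro"))  # ("Lawrence D'Oliveiro", '', '')
```

The rule treats superficially identical tokens the same way and
superficially different ones differently, regardless of what they mean
in the naming system they came from.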

>
> There would also need to be a flag field to indicate the canonical ordering
> for writing out the full name: e.g. family-name-first, given-names-first.
> Do we need something else for the Vietnamese case?

As I said before, it depends on the application. In some applications,
it will be preferable to capture, in one field, the whole name as
supplied by the person, together with clues like nationality and place
of birth that will help in parsing it later. However if all you want to
do is post out the electricity bill to an address that your
meter-reader has verified, then you can afford to be a little casual
with the name.
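
As a rough illustration of the "whole name in one field, plus clues"
approach, here is a minimal sketch using the standard sqlite3 module.
The table and column names (person, name_as_supplied, nationality,
place_of_birth) and the sample rows are assumptions made up for this
example, not a recommended universal design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One field holds the name exactly as the person supplied it; the other
# columns are optional clues that may help parse the name later.
conn.execute("""
    CREATE TABLE person (
        id               INTEGER PRIMARY KEY,
        name_as_supplied TEXT NOT NULL,
        nationality      TEXT,
        place_of_birth   TEXT
    )
""")

# Illustrative rows only; clues are left NULL when unknown.
rows = [
    ("Jon Jonsson", "Icelandic", None),
    ("Diarmaid Ó Súilleabháin", None, None),
]
conn.executemany(
    "INSERT INTO person (name_as_supplied, nationality, place_of_birth) "
    "VALUES (?, ?, ?)",
    rows,
)

# The name comes back exactly as it went in, with no parsing applied.
stored = [r[0] for r in conn.execute(
    "SELECT name_as_supplied FROM person ORDER BY id")]
print(stored)
```

Any splitting into family or given names is deferred to whichever
application actually needs it, and can use the clue columns when it
does.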

This is all a bit OT.  Before we close the thread down, let me leave
you with one warning:
Beware of enthusiastic maintenance programmers on a mission to clean up
the dirty names in your database:
E.g. (1) "Karim bin Md" may not appreciate getting a letter addressed
to "Dr Karim Bin" (Md is an abbreviation of Muhammad).
E.g. (2) Billing job barfs on a customer who has no given names and no
family name. Inspection reveals that he is over-endowed in the title
department: "Mr Earl King".
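
For the curious, here is a minimal sketch of the kind of "clean-up"
routine that could produce both of those results. It is a
reconstruction for illustration only: the TITLES and SUFFIX_TO_TITLE
tables and the naive_clean function are hypothetical, not the code from
either incident.

```python
# Hypothetical lookup tables a keen maintainer might introduce.
TITLES = {"mr", "mrs", "ms", "dr", "sir", "earl", "king"}
SUFFIX_TO_TITLE = {"md": "Dr", "phd": "Dr"}

def naive_clean(raw):
    """Return (titles, names) after over-eager title handling."""
    tokens = raw.split()
    titles = []
    # Peel off every leading token that looks like a title.
    while tokens and tokens[0].lower().rstrip(".") in TITLES:
        titles.append(tokens.pop(0))
    # Treat a trailing "degree" as grounds for awarding a title.
    if tokens and tokens[-1].lower().rstrip(".") in SUFFIX_TO_TITLE:
        suffix = tokens.pop().lower().rstrip(".")
        titles.insert(0, SUFFIX_TO_TITLE[suffix])
    # "Tidy" the capitalisation of whatever is left.
    names = [t.capitalize() for t in tokens]
    return titles, names

print(naive_clean("Karim bin Md"))  # (['Dr'], ['Karim', 'Bin'])
print(naive_clean("Mr Earl King"))  # (['Mr', 'Earl', 'King'], [])
```

Each rule looks reasonable in isolation; the damage comes from applying
them without knowing which naming convention the record follows.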

Cheers,
John



