[portland] Script to Change UPPER CASE to Mixed Case

Dylan Reinhardt python at dylanreinhardt.com
Tue Sep 22 17:35:05 CEST 2009


On Mon, Sep 21, 2009 at 5:59 PM, Rich Shepard <rshepard at appl-ecosys.com> wrote:

> I'm sure it will help. Some strings are single words (such as the city
> name), others are multiple words (such as the facility description).
> Probably only the first word should be capitalized, not all of them. But,
> tomorrow I'll look again at the data and see what I really want.
>
>
This task belongs in the category of "things that seem like they should be
really simple but aren't."

You're attempting to discover information in data that doesn't contain it.
Unless you have a very narrow and specific data set, you may not be able to do
much better than applying a simple and consistent *format* such as .upper()
to all data.

Consider the following lines that you might find in a "street address" line:

 - Dept. of Motor Vehicles
 - Attn: Guido van Rossum
 - Attn: Henry Higgins III
 - San Francisco Chapter
 - PO Box 1234
 - Mail Stop C5A
 - Attn: A/R

It's going to be *really* difficult to develop rules that handle those
examples correctly, and that's even before you get to military addresses and
non-US addresses.  The more sophisticated you attempt to be, the more glaring
and difficult the exceptions will become.
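
To make that concrete, here is a rough sketch that upper-cases the example
lines above (as they might arrive in an all-caps data set) and then tries to
recover the mixed-case originals with the two obvious built-ins, str.title()
and str.capitalize().  The names EXPECTED and failures are just made up for
this illustration, and the "wanted" spellings are my own assumption about
what a human would consider correct.

```python
# The example lines from above, spelled the way a person would want them.
# (Assumed "correct" forms, for illustration only.)
EXPECTED = [
    "Dept. of Motor Vehicles",
    "Attn: Guido van Rossum",
    "Attn: Henry Higgins III",
    "San Francisco Chapter",
    "PO Box 1234",
    "Mail Stop C5A",
    "Attn: A/R",
]


def failures(convert):
    """Upper-case each line, run it through convert, and return the
    (wanted, got) pairs where the original spelling was not recovered."""
    bad = []
    for wanted in EXPECTED:
        got = convert(wanted.upper())
        if got != wanted:
            bad.append((wanted, got))
    return bad


if __name__ == "__main__":
    for name, convert in (("title", str.title),
                          ("capitalize", str.capitalize)):
        bad = failures(convert)
        print("%s: %d of %d lines wrong" % (name, len(bad), len(EXPECTED)))
        for wanted, got in bad:
            print("    wanted %r, got %r" % (wanted, got))
```

str.title() gets "Of", "Van", "Iii" and "Po" wrong, and str.capitalize()
(first word only) misses every line here.  Each special case you add to fix
one of these ("of" stays lower, "III" stays upper) needs information that
simply isn't in the upper-cased string.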

FWIW,

Dylan