case sensitivity and XML

Michal Wallace (sabren) sabren at
Sun May 21 21:34:34 CEST 2000

On Sun, 21 May 2000, Neil Hodgson wrote:

> > 11. Given the above, a case-insensitive python cannot parse all
> >     XML documents.
>    This is wrong. Python can already parse all XML documents. It just can't
> present the results in the form you like. The dash is a valid character in
> XML attributes but not in Python identifiers. Therefore you must have a fall
> back strategy to deal with some cases.

Not so. In xmllib, you define handler methods like:

   def start_sometag(self, attrs):

which get called when the parser sees "<sometag/>".
Suppose you have this: '<sometag a="lowercase" A="uppercase" a-b="a dash b"/>'
Then, case sensitive python would yield this dictionary:

   attrs == {"a":"lowercase", "A":"uppercase", "a-b":"a dash b"}

No problem whatsoever. Case INsensitive python yields the same dictionary,
but how do you tell "a" apart from "A"? For example, what is the truth
value of: attrs["a"] == attrs["A"] ?
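
To make the dictionary above concrete, here is a minimal sketch. It assumes
a current Python, where xmllib is no longer in the standard library, so it
swaps in xml.etree.ElementTree to parse the same element. The names
element, distinct and dash_value are only for illustration.

```python
import xml.etree.ElementTree as ET

# The same element as in the example above.
element = ET.fromstring('<sometag a="lowercase" A="uppercase" a-b="a dash b"/>')

# Copy the attributes into a plain dict, like the attrs argument
# that a start_sometag handler would receive.
attrs = dict(element.attrib)

# Keys that differ only in case are separate entries.
distinct = attrs["a"] != attrs["A"]

# "a-b" is not a valid Python identifier, but it is a fine string key.
dash_value = attrs["a-b"]

print(attrs)
print(distinct, dash_value)
```

In a case sensitive python the comparison is unambiguous, because "a" and
"A" are simply two different keys.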

If we assume that attrs["a"] and attrs["A"] are different, then dictionary
keys are case sensitive. A module's namespace is just such a dictionary, so

   someModule.__dict__["A"] != someModule.__dict__["a"]

Which means someModule.A != someModule.a, which means
python is case sensitive again.
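
The step from __dict__ to attribute access can be checked with a small
sketch. It assumes a current Python and builds a throwaway module with
types.ModuleType; the name some_module and the stored values are
hypothetical, chosen to mirror the XML example.

```python
import types

# A throwaway module object standing in for someModule.
some_module = types.ModuleType("some_module")

# Write two names that differ only in case straight into the namespace dict.
some_module.__dict__["a"] = "lowercase"
some_module.__dict__["A"] = "uppercase"

# Attribute access reads from that same dictionary.
via_attr = (some_module.a, some_module.A)
same_lookup = some_module.a is some_module.__dict__["a"]

print(via_attr, same_lookup)
```

As long as attribute lookup goes through the same dictionary, whatever rule
applies to string keys also applies to names.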


One option is to say that python just won't be able to parse some
XML documents. I think that would be suicide for the language, as
more and more people and companies start using XML.

Another option is to introduce partial case insensitivity.

Another option is to get rid of the whole notion of __dict__.
I think this would break a lot more code.

Another option is to build it into the editor, with the caveat that
strings are not case sensitive, so that by the time someone figures
out what __dict__ is, they turn off the case insensitive training


- Michal
