Encode HTML CDATA name token

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Mon Sep 20 05:02:40 EDT 2010


On Mon, 20 Sep 2010 10:45:46 +0200, Gregor Horvath wrote:

> Hi,
> 
> ID and NAME tokens must begin with a letter ([A-Za-z]) and may be
> followed by any number of letters, digits ([0-9]), hyphens ("-"),
> underscores ("_"), colons (":"), and periods ("."). [1]
> 
> Is there an encoder / decoder in Python that can convert arbitrary text
> to and from this encoding in a readable manner?

What is "this encoding" called? The article you link to describes a 
specification, not an encoding.

What makes you think that you should be able to convert arbitrary text to 
strings suitable for use as ID or NAME tokens?

What would you expect this encoding to do with these strings?

"1234"
"?*(#@!+{})"
"          "

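For reference, here is a minimal sketch of a check against the rule quoted
above, to show that none of these three strings is a valid token as it
stands. The names NAME_TOKEN and is_name_token are made up for illustration,
and the sketch assumes Python 3.4 or later for re.fullmatch. It only
validates; it does not attempt any encoding.

```python
import re

# Pattern taken from the quoted rule: one letter, followed by any number of
# letters, digits, hyphens, underscores, colons, and periods.
# NAME_TOKEN and is_name_token are hypothetical names, not a library API.
NAME_TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9_:.-]*")


def is_name_token(text):
    """Return True if the whole string satisfies the quoted ID/NAME rule."""
    return NAME_TOKEN.fullmatch(text) is not None


if __name__ == "__main__":
    # The three strings from the message, plus one that does satisfy the rule.
    for sample in ["1234", "?*(#@!+{})", "          ", "section-1.2"]:
        print(repr(sample), is_name_token(sample))
```

Any encoder would have to decide what to do with a leading digit, with
punctuation outside the allowed set, and with whitespace, and would have to
do so reversibly if a decoder is wanted as well.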


-- 
Steven


