Xml Special Characters Escaping issue
Hello,

I am working with a localization tool, Wordforge Editor. I am trying to keep the tags inside the text of an XLIFF file without changing the XML special characters. For example, here are a source string and a target string from an XLIFF file:

<source xml:lang="en-us">Texts before inline tag<bpt id="1">texts inside inline tag</bpt> Texts after inline tag</source>
<target>Texts before inline tag<bpt id="1">texts inside inline tag</bpt> Texts after inline tag</target>

The source and target strings are shown in the localization tool, and the user translates the English target into another language. In the target, the text "<bpt id="1">texts inside inline tag</bpt>" has to be kept as it is; it must not be changed or distorted. But when I save the text to the XLIFF file, the original tag is not kept. Instead, the XML special characters "<" and ">" are escaped to "&lt;" and "&gt;" respectively. As a result, the string is saved in the file as:

<target>Texts before inline tag&lt;bpt id="1"&gt;texts inside inline tag&lt;/bpt&gt; Texts after inline tag</target>

It is supposed to be saved in the file like this:

<target>Texts before inline tag<bpt id="1">texts inside inline tag</bpt> Texts after inline tag</target>

Can you please tell me how I can prevent the XML special characters ("<" and ">") from being replaced with character entities ("&lt;" and "&gt;") when saving the file? Is there any function or technique in the lxml library to handle this?

Thank you,
Aditi.
Aditi Barua, 13.04.2011 11:43:
I am working with a localization tool, Wordforge Editor. I am trying to keep the tags inside the text of an XLIFF file without changing the XML special characters. For example, here are a source string and a target string from an XLIFF file.
<source xml:lang="en-us">Texts before inline tag<bpt id="1">texts inside inline tag</bpt> Texts after inline tag</source>
<target>Texts before inline tag<bpt id="1">texts inside inline tag</bpt> Texts after inline tag</target>
The source and target strings are shown in the localization tool, and the user translates the English target into another language. In the target, the text "<bpt id="1">texts inside inline tag</bpt>" has to be kept as it is; it must not be changed or distorted. But when I save the text to the XLIFF file, the original tag is not kept. Instead, the XML special characters "<" and ">" are escaped to "&lt;" and "&gt;" respectively.
As a result, the string is saved in the file as
<target>Texts before inline tag&lt;bpt id="1"&gt;texts inside inline tag&lt;/bpt&gt; Texts after inline tag</target>
It is supposed to be saved in the file like this:
<target>Texts before inline tag<bpt id="1">texts inside inline tag</bpt> Texts after inline tag</target>
Can you please tell me how I can prevent the XML special characters ("<" and ">") from being replaced with character entities ("&lt;" and "&gt;") when saving the file? Is there any function or technique in the lxml library to handle this?
ISTM that you are using the wrong approach here. If what you have is text with tags, and you want to insert it into an XML tree, then parse it and insert the result.

Stefan
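
The "parse it and insert the result" advice can be illustrated with a minimal sketch. It is not taken from the thread: the helper name `set_inline_content` and the dummy `<wrapper>` root are assumptions made for illustration, and the sketch falls back to the standard library's ElementTree (which offers the same calls used here) when lxml is not installed.

```python
try:
    from lxml import etree  # the library the thread is about
except ImportError:
    # Fallback: the stdlib module supports the same calls used below.
    import xml.etree.ElementTree as etree


def set_inline_content(element, fragment):
    """Hypothetical helper: replace the content of `element` with `fragment`,
    parsing any inline tags instead of storing them as plain text."""
    # Wrap the fragment in a dummy root so that mixed text and tags
    # form one well-formed document that the parser accepts.
    wrapper = etree.fromstring("<wrapper>" + fragment + "</wrapper>")
    # Remove the old children, then copy over the leading text and the
    # parsed children (each child carries its trailing text in .tail).
    for child in list(element):
        element.remove(child)
    element.text = wrapper.text
    for child in list(wrapper):
        element.append(child)
    return element


target = etree.Element("target")
translated = ('Texts before inline tag<bpt id="1">texts inside inline tag'
              '</bpt> Texts after inline tag')

# Assigning to .text treats the string as character data, so the
# serializer escapes "<" and ">" as entities.
target.text = translated
escaped = etree.tostring(target, encoding="unicode")

# Parsing first yields a real <bpt> child, which is written out as a tag.
set_inline_content(target, translated)
preserved = etree.tostring(target, encoding="unicode")

print(escaped)
print(preserved)
```

The escaping seen in the question is the serializer doing its job: a string assigned to `.text` is character data, so it has to be escaped to keep the file well-formed. Note that this sketch ignores namespaces; in a real XLIFF file with a default namespace, the wrapper would likely need to declare the same namespace so that the parsed `bpt` element ends up in it.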
participants (2)
- Aditi Barua
- Stefan Behnel