[XML-SIG] Content is split into two
"Martin v. Löwis"
martin at v.loewis.de
Sat Apr 5 22:51:30 CEST 2008
> Wow, totally unexpected. Wonder why it's designed as it is? This is
> especially weird to me since the string size isn't big (small buffer)
> and this adds a bit of complexity to the text processing.
There are two reasons:
1. Efficiency. The parser reads a block of input into a buffer, and then
parses out of this buffer. If the buffer is exhausted in the middle of
text content, it passes the text it has so far to the application,
rather than growing the buffer until the text content is complete
(which would involve copying the data, potentially several times).
2. Correctness. If you have an entity reference (such as &copy; in HTML)
in your input, the parser needs to tell the application what the
source entity is (i.e. what system and public identifier it has). If
it returned all data in a single buffer, the source data would
be distributed across different entities, making it impossible to
refer to the source with a single URL.
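
In practice, the application should treat every characters() call as a
fragment and only join the fragments once the element ends. A minimal
sketch with xml.sax follows; the names TextCollector and collect_text
are made up for illustration, and it only assumes that the parser may
split text at arbitrary points:

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Collects the text of each element, however the parser chunks it."""

    def __init__(self):
        super().__init__()
        self._chunks = []  # stack: one list of text fragments per open element
        self.texts = []    # (element name, complete text) pairs, in end-tag order
        self.calls = 0     # number of characters() calls, for inspection only

    def startElement(self, name, attrs):
        # Start a fresh fragment list for this element.
        self._chunks.append([])

    def characters(self, content):
        # May be called several times for one run of text; just remember
        # the fragment and do not assume it is complete.
        self.calls += 1
        if self._chunks:
            self._chunks[-1].append(content)

    def endElement(self, name):
        # Only now is the element's direct text known to be complete.
        self.texts.append((name, "".join(self._chunks.pop())))


def collect_text(xml_bytes):
    """Hypothetical helper: parse xml_bytes and return the filled handler."""
    handler = TextCollector()
    xml.sax.parseString(xml_bytes, handler)
    return handler
```

For example, collect_text(b"<doc>AT&amp;T</doc>").texts gives
[("doc", "AT&T")], no matter how many characters() calls the parser
needed to deliver the text. Note that the sketch keeps only the text
directly inside each element, not the text of its children.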