[ expat-Bugs-580258 ] Problem with GetBuffer/ParseBuffer
noreply@sourceforge.net
Fri Jul 12 13:40:02 2002
Bugs item #580258, was opened at 2002-07-11 15:40
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=580258&group_id=10127
Category: None
Group: None
>Status: Closed
>Resolution: Rejected
Priority: 5
Submitted By: Christopher M. Woods (cmwoods)
Assigned to: Nobody/Anonymous (nobody)
Summary: Problem with GetBuffer/ParseBuffer
Initial Comment:
This is my first post so forgive me if I leave anything
out...
I've encountered a problem while using the
XML_GetBuffer/XML_ParseBuffer methods of the Expat
library. [Using libexpatw.lib, on Win2K w/ MSVC++ 6.0 -
sp5, wide chars (UNICODE), with a UTF-16 encoded
XML file of roughly 33KB of data.] When using these
methods, I've experienced errors from the parser stating
one of the following: not well-formed XML, illegal token,
or unclosed token. Each of the errors appears
consistently for a given file and a given buffer size.
I haven't narrowed down the problem yet - and will
include more information once I get a chance to dig into
the code further. I can tell you that I get no errors on the
file if I read it into my own buffer and use the XML_Parse
method. I can also tell you that I DO GET errors if I
request a buffer large enough for the file, read the file
into the buffer, and then call XML_ParseBuffer... so the
problem appears to be [at least on the surface] with
XML_ParseBuffer.
----------------------------------------------------------------------
>Comment By: Karl Waclawek (kwaclaw)
Date: 2002-07-12 16:39
Message:
Logged In: YES
user_id=290026
Great!
Bug report closed.
----------------------------------------------------------------------
Comment By: Christopher M. Woods (cmwoods)
Date: 2002-07-12 16:34
Message:
Logged In: YES
user_id=576763
Karl,
Your suggestion worked - thank you. You can pull this
from the bug list then.
-Chris
----------------------------------------------------------------------
Comment By: Karl Waclawek (kwaclaw)
Date: 2002-07-12 12:51
Message:
Logged In: YES
user_id=290026
I could get it to work, but it does not produce any
output. Anyway, I had a look at your loop.
It should not work the way it is written.
Expat expects a buffer of bytes, but you are passing
the buffer chunks as null terminated wide strings.
Expat can even handle it if the buffer boundaries are
*within* a wide character.
The following code should work (or come close):
while (bRC && dwSize!=0)
{
    pBuffer = XML_GetBuffer(m_pParser, READ_SIZE);
    if (pBuffer)
    {
        dwSize = fread(pBuffer, 1, READ_SIZE, pInputFile);
        bRC = XML_ParseBuffer(m_pParser, dwSize, dwSize==0);
    } else {
        bRC = false;
    }
}
Please test.
At this point I don't see a bug in Expat.
Karl
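
To illustrate the point about bytes versus wide strings: the original
loop is not shown in this thread, so what follows is only a sketch,
assuming the chunk length was derived wcslen-style instead of being
taken from fread. The helper name length_as_wide_string is hypothetical
and not part of Expat. It shows that scanning for a 16-bit zero
terminator yields a length unrelated to the number of bytes read.

```c
#include <stddef.h>

/* Hypothetical helper (not an Expat function): the byte length a caller
   would compute by treating a chunk as a null-terminated string of
   16-bit units. The scan is capped at cap_bytes so that this sketch
   never reads outside the buffer. */
static size_t length_as_wide_string(const unsigned char *buf, size_t cap_bytes)
{
    size_t i = 0;

    /* Advance two bytes at a time until a 16-bit zero unit is found. */
    while (i + 1 < cap_bytes && !(buf[i] == 0 && buf[i + 1] == 0))
        i += 2;
    return i;
}
```

For example, if 7 bytes of UTF-16LE data are read into a 16-byte buffer
that still holds stale 0xAA bytes, the helper reports 16; with a
zero-filled buffer it reports 8. In both cases the count that belongs in
XML_ParseBuffer is the 7 returned by fread.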
----------------------------------------------------------------------
Comment By: Christopher M. Woods (cmwoods)
Date: 2002-07-12 12:02
Message:
Logged In: YES
user_id=576763
Karl,
Don't know what to say... I'm using Win2K from the cmd
line also...
Sample.cpp <D:\Temp\Sample\Sample.xml
-Chris
----------------------------------------------------------------------
Comment By: Karl Waclawek (kwaclaw)
Date: 2002-07-12 11:39
Message:
Logged In: YES
user_id=290026
Your sample project expects input redirection,
but that does not work on my system, it seems.
Using W2K, cmd.exe.
Karl
----------------------------------------------------------------------
Comment By: Christopher M. Woods (cmwoods)
Date: 2002-07-12 10:59
Message:
Logged In: YES
user_id=576763
File was too large... I had to remove libexpatw_d.dll... you'll
want to rebuild or change project settings to link to
libexpatw.dll (for debug)
----------------------------------------------------------------------
Comment By: Christopher M. Woods (cmwoods)
Date: 2002-07-12 10:56
Message:
Logged In: YES
user_id=576763
Hmm... I'll try again.
----------------------------------------------------------------------
Comment By: Karl Waclawek (kwaclaw)
Date: 2002-07-12 10:09
Message:
Logged In: YES
user_id=290026
I can't see your zip file.
Karl
----------------------------------------------------------------------
Comment By: Christopher M. Woods (cmwoods)
Date: 2002-07-12 09:41
Message:
Logged In: YES
user_id=576763
Hmm... Perhaps I'm missing something in my setup or the
delphi code isn't quite the same? I'm using wchar_t
(UNICODE & _UNICODE defines) and maybe the issue is
related to that? I've included a sample project in the zip file
that exhibits the behavior that I'm seeing.
The libexpatw.lib is right from the win32bin distribution
(1.95.3) and the libexpatw_d.lib is a debug build I made from
the included source code.
If you look in XMLExtractor.cpp, in the Process() function,
starting at line 52, you'll see both methodologies are coded
and one is commented out. The GetBuffer/ParseBuffer
method fails for me and the Parse method (the 2nd one)
seems to work fine. [Separate question: Why does
XML_Parse take const char* and not XML_Char*?]
I'd be perfectly happy if this turns out to be nothing more
than a simple configuration problem on my part. But I'm
curious why it seems to work fine one way and not the other.
Thank you for your time,
-Chris
----------------------------------------------------------------------
Comment By: Karl Waclawek (kwaclaw)
Date: 2002-07-11 21:22
Message:
Logged In: YES
user_id=290026
I cannot reproduce your problem with the current CVS
and also not with 1.95.3. I tested with the same buffer sizes
you gave me.
I compiled the Dll with VC++6 (I believe I have SP5 also).
My test program is written in Delphi, but you can probably
still compare my parsing loop with yours:
const
  XML_READBUFSIZE = 65536;
...
IsFinal := False;
while not IsFinal do
begin
  Buffer := XMLGetBuffer(Parser, XML_READBUFSIZE);
  if Buffer = nil then OutOfMemoryError;
  ReadCount := Stream.Read(Buffer^, XML_READBUFSIZE);
  IsFinal := ReadCount < XML_READBUFSIZE;
  if XMLParseBuffer(Parser, ReadCount, Integer(IsFinal)) = 0 then
  begin
    ErrorCode := Ord(XMLGetErrorCode(Parser));
    ...
    Break;
  end;
end;
Is that different?
Karl
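
The loop above treats a short read as the final chunk. Another common
convention treats only an empty read as final. As a rough illustration,
the C sketch below simulates both conventions without calling Expat.
The function names are hypothetical, and the 33000-byte size used in the
note after the code is an assumption (the report only says roughly
33KB).

```c
#include <stddef.h>

/* Simulated number of parse calls when a short read marks the final
   chunk. Assumes buf_size > 0. */
static size_t calls_final_on_short_read(size_t file_size, size_t buf_size)
{
    size_t calls = 0, remaining = file_size;

    for (;;) {
        /* Simulated read: at most buf_size bytes of what is left. */
        size_t n = remaining < buf_size ? remaining : buf_size;
        remaining -= n;
        calls++;                /* one ParseBuffer-style call */
        if (n < buf_size)       /* short read: this call is the final one */
            return calls;
    }
}

/* Simulated number of parse calls when only a zero-length read marks
   the final chunk. */
static size_t calls_final_on_empty_read(size_t file_size, size_t buf_size)
{
    size_t calls = 0, remaining = file_size;

    for (;;) {
        size_t n = remaining < buf_size ? remaining : buf_size;
        remaining -= n;
        calls++;
        if (n == 0)             /* empty read: this call is the final one */
            return calls;
    }
}
```

With a 33000-byte input and an 8192-byte buffer, the short-read
convention makes 5 calls and the empty-read convention makes 6, the last
of them with zero bytes. Each ends with exactly one final call, so the
choice mainly matters for sources where a short read does not
necessarily mean end of input.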
----------------------------------------------------------------------
Comment By: Christopher M. Woods (cmwoods)
Date: 2002-07-11 17:51
Message:
Logged In: YES
user_id=576763
Fred,
I was/am using the 1.95.3 version of the project. I'll try the
CVS version when I get a chance (very busy).
Karl,
I've attached the sample file that I was using/having
problems with. I tried several buffer sizes including: 32768,
65536, 1000000, and (if I remember correctly) 8192.
Thanks,
-Chris
----------------------------------------------------------------------
Comment By: Karl Waclawek (kwaclaw)
Date: 2002-07-11 16:01
Message:
Logged In: YES
user_id=290026
Would you mind attaching the file and giving me the
buffer size you used?
I will try to duplicate the problem.
Karl
----------------------------------------------------------------------
Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2002-07-11 15:51
Message:
Logged In: YES
user_id=3066
Are you using Expat 1.95.3 or the CVS version of Expat?
If you're not using the CVS version, can you take a look at
that version and try to reproduce the problem? You can get
information on getting the CVS version anonymously at:
http://www.libexpat.org/dev/cvs.html
Thanks!
----------------------------------------------------------------------