From kwaclaw at users.sourceforge.net Thu Sep 4 10:13:12 2003
From: kwaclaw at users.sourceforge.net (Karl Waclawek)
Date: Thu Sep 4 17:14:50 2003
Subject: [Expat-checkins] expat/tests/benchmark README.txt,NONE,1.1
Message-ID:
Update of /cvsroot/expat/expat/tests/benchmark
In directory sc8-pr-cvs1:/tmp/cvs-serv4815
Added Files:
README.txt
Log Message:
Explains usage of benchmark utility.
--- NEW FILE: README.txt ---
Use this benchmark command line utility as follows:
benchmark [-n] <file name> <buffer size> <# iterations>
The command line arguments are:
-n ... optional; if supplied, then namepsace processing is turned on
<file name> ... name/path of test xml file
<buffer size> ... size of processing buffer;
the file is parsed in chunks of this size
<# iterations> ... how often will the file be parsed
Returns:
The time (in seconds) it takes to parse the test file,
averaged over the number of iterations.
From fdrake at users.sourceforge.net Thu Sep 4 10:06:54 2003
From: fdrake at users.sourceforge.net (Fred L. Drake)
Date: Thu Sep 4 17:14:53 2003
Subject: [Expat-checkins] CVSROOT loginfo,1.4,1.5
Message-ID:
Update of /cvsroot/expat/CVSROOT
In directory sc8-pr-cvs1:/tmp/cvs-serv3678
Modified Files:
loginfo
Log Message:
Don't generate email for the large test files.
Index: loginfo
===================================================================
RCS file: /cvsroot/expat/CVSROOT/loginfo,v
retrieving revision 1.4
retrieving revision 1.5
diff -u -d -r1.4 -r1.5
--- loginfo 26 Aug 2002 20:10:07 -0000 1.4
+++ loginfo 4 Sep 2003 16:06:52 -0000 1.5
@@ -26,3 +26,6 @@
#DEFAULT (echo ""; id; echo %{sVv}; date; cat) >> $CVSROOT/CVSROOT/commitlog
DEFAULT $CVSROOT/CVSROOT/syncmail -u %{sVv} expat-checkins@libexpat.org
+
+# Don't generate email for the big files...
+testdata/largefiles true
From fdrake at users.sourceforge.net Thu Sep 4 10:05:05 2003
From: fdrake at users.sourceforge.net (Fred L. Drake)
Date: Thu Sep 4 17:14:54 2003
Subject: [Expat-checkins] testdata/largefiles README.txt,NONE,1.1
Message-ID:
Update of /cvsroot/expat/testdata/largefiles
In directory sc8-pr-cvs1:/tmp/cvs-serv3429/largefiles
Added Files:
README.txt
Log Message:
Add explanations for this directory.
--- NEW FILE: README.txt ---
This directory contains some really large test files, mostly used to
benchmark various aspects of Expat's performance.
(As files are added, they should be described here, including what
benchmark program they're intended to be used with and what that
resulting measurements tell us.)
From fdrake at users.sourceforge.net Thu Sep 4 10:05:04 2003
From: fdrake at users.sourceforge.net (Fred L. Drake)
Date: Thu Sep 4 17:14:55 2003
Subject: [Expat-checkins] testdata README.txt,NONE,1.1
Message-ID:
Update of /cvsroot/expat/testdata
In directory sc8-pr-cvs1:/tmp/cvs-serv3429
Added Files:
README.txt
Log Message:
Add explanations for this directory.
--- NEW FILE: README.txt ---
This directory contains various collections of files used for test
data. It currently contains the following collections:
largefiles/
This contains some really large files; most are used for
benchmarking various aspects of Expat's performance.
From fdrake at users.sourceforge.net Thu Sep 4 10:02:38 2003
From: fdrake at users.sourceforge.net (Fred L. Drake)
Date: Thu Sep 4 17:14:55 2003
Subject: [Expat-checkins] testdata/largefiles - New directory
Message-ID:
Update of /cvsroot/expat/testdata/largefiles
In directory sc8-pr-cvs1:/tmp/cvs-serv3023/largefiles
Log Message:
Directory /cvsroot/expat/testdata/largefiles added to the repository
From fdrake at users.sourceforge.net Thu Sep 4 10:02:24 2003
From: fdrake at users.sourceforge.net (Fred L. Drake)
Date: Thu Sep 4 17:14:56 2003
Subject: [Expat-checkins] testdata - New directory
Message-ID:
Update of /cvsroot/expat/testdata
In directory sc8-pr-cvs1:/tmp/cvs-serv2956/testdata
Log Message:
Directory /cvsroot/expat/testdata added to the repository
From kwaclaw at users.sourceforge.net Thu Sep 4 10:01:56 2003
From: kwaclaw at users.sourceforge.net (Karl Waclawek)
Date: Thu Sep 4 17:14:56 2003
Subject: [Expat-checkins] expat/tests/benchmark benchmark.c, NONE,
1.1 benchmark.dsp, NONE, 1.1 benchmark.dsw, NONE, 1.1
Message-ID:
Update of /cvsroot/expat/expat/tests/benchmark
In directory sc8-pr-cvs1:/tmp/cvs-serv2871
Added Files:
benchmark.c benchmark.dsp benchmark.dsw
Log Message:
Small benchmark utility to test pure parser speed.
Tested on Windows only. Includes MS VC++ 6.0 workspace.
--- NEW FILE: benchmark.c ---
#include <sys/stat.h>
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include "expat.h"
static void
usage(const char *prog, int rc)
{
fprintf(stderr,
"usage: %s [-n] filename bufferSize nr_of_loops\n", prog);
exit(rc);
}
int main (int argc, char *argv[])
{
XML_Parser parser;
char *XMLBuf, *XMLBufEnd, *XMLBufPtr;
FILE *fd;
struct stat fileAttr;
int nrOfLoops, bufferSize, fileSize, i, isFinal;
int j = 0, ns = 0;
clock_t tstart, tend;
double cpuTime = 0.0;
if (argc > 1) {
if (argv[1][0] == '-') {
if (argv[1][1] == 'n' && argv[1][2] == '\0') {
ns = 1;
j = 1;
}
else
usage(argv[0], 1);
}
}
if (argc != j + 4)
usage(argv[0], 1);
if (stat (argv[j + 1], &fileAttr) != 0) {
fprintf (stderr, "could not access file '%s'\n", argv[j + 1]);
return 2;
}
fd = fopen (argv[j + 1], "r");
if (!fd) {
fprintf (stderr, "could not open file '%s'\n", argv[j + 1]);
exit(2);
}
bufferSize = atoi (argv[j + 2]);
nrOfLoops = atoi (argv[j + 3]);
if (bufferSize <= 0 || nrOfLoops <= 0) {
fprintf (stderr,
"buffer size and nr of loops must be greater than zero.\n");
exit(3);
}
XMLBuf = malloc (fileAttr.st_size);
fileSize = fread (XMLBuf, sizeof (char), fileAttr.st_size, fd);
fclose (fd);
i = 0;
XMLBufEnd = XMLBuf + fileSize;
while (i < nrOfLoops) {
XMLBufPtr = XMLBuf;
isFinal = 0;
if (ns)
parser = XML_ParserCreateNS(NULL, '!');
else
parser = XML_ParserCreate(NULL);
tstart = clock();
do {
int parseBufferSize = XMLBufEnd - XMLBufPtr;
if (parseBufferSize <= bufferSize)
isFinal = 1;
else
parseBufferSize = bufferSize;
if (!XML_Parse (parser, XMLBufPtr, parseBufferSize, isFinal)) {
fprintf (stderr, "error '%s' at line %d character %d\n",
XML_ErrorString (XML_GetErrorCode (parser)),
XML_GetCurrentLineNumber (parser),
XML_GetCurrentColumnNumber (parser));
free (XMLBuf);
XML_ParserFree (parser);
exit (4);
}
XMLBufPtr += bufferSize;
} while (!isFinal);
tend = clock();
cpuTime += ((double) (tend - tstart)) / CLOCKS_PER_SEC;
XML_ParserFree (parser);
i++;
}
free (XMLBuf);
printf ("%d loops, with buffer size %d. Average time per loop: %f\n",
nrOfLoops, bufferSize, cpuTime / (double) nrOfLoops);
return 0;
}
--- NEW FILE: benchmark.dsp ---
# Microsoft Developer Studio Project File - Name="benchmark" - Package Owner=<4>
# Microsoft Developer Studio Generated Build File, Format Version 6.00
# ** DO NOT EDIT **
# TARGTYPE "Win32 (x86) Console Application" 0x0103
CFG=benchmark - Win32 Debug
!MESSAGE This is not a valid makefile. To build this project using NMAKE,
!MESSAGE use the Export Makefile command and run
!MESSAGE
!MESSAGE NMAKE /f "benchmark.mak".
!MESSAGE
!MESSAGE You can specify a configuration when running NMAKE
!MESSAGE by defining the macro CFG on the command line. For example:
!MESSAGE
!MESSAGE NMAKE /f "benchmark.mak" CFG="benchmark - Win32 Debug"
!MESSAGE
!MESSAGE Possible choices for configuration are:
!MESSAGE
!MESSAGE "benchmark - Win32 Release" (based on "Win32 (x86) Console Application")
!MESSAGE "benchmark - Win32 Debug" (based on "Win32 (x86) Console Application")
!MESSAGE
# Begin Project
# PROP AllowPerConfigDependencies 0
# PROP Scc_ProjName ""
# PROP Scc_LocalPath ""
CPP=cl.exe
RSC=rc.exe
!IF "$(CFG)" == "benchmark - Win32 Release"
# PROP BASE Use_MFC 0
# PROP BASE Use_Debug_Libraries 0
# PROP BASE Output_Dir "Release"
# PROP BASE Intermediate_Dir "Release"
# PROP BASE Target_Dir ""
# PROP Use_MFC 0
# PROP Use_Debug_Libraries 0
# PROP Output_Dir "Release"
# PROP Intermediate_Dir "Release"
# PROP Target_Dir ""
# ADD BASE CPP /nologo /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /c
# ADD CPP /nologo /W3 /GX /O2 /I "..\..\lib" /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /c
# ADD BASE RSC /l 0x1009 /d "NDEBUG"
# ADD RSC /l 0x1009 /d "NDEBUG"
BSC32=bscmake.exe
# ADD BASE BSC32 /nologo
# ADD BSC32 /nologo
LINK32=link.exe
# ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /machine:I386
# ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /machine:I386
!ELSEIF "$(CFG)" == "benchmark - Win32 Debug"
# PROP BASE Use_MFC 0
# PROP BASE Use_Debug_Libraries 1
# PROP BASE Output_Dir "Debug"
# PROP BASE Intermediate_Dir "Debug"
# PROP BASE Target_Dir ""
# PROP Use_MFC 0
# PROP Use_Debug_Libraries 1
# PROP Output_Dir "Debug"
# PROP Intermediate_Dir "Debug"
# PROP Target_Dir ""
# ADD BASE CPP /nologo /W3 /Gm /GX /ZI /Od /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /GZ /c
# ADD CPP /nologo /W3 /Gm /GX /ZI /Od /I "..\..\lib" /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /GZ /c
# ADD BASE RSC /l 0x1009 /d "_DEBUG"
# ADD RSC /l 0x1009 /d "_DEBUG"
BSC32=bscmake.exe
# ADD BASE BSC32 /nologo
# ADD BSC32 /nologo
LINK32=link.exe
# ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /debug /machine:I386 /pdbtype:sept
# ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /debug /machine:I386 /pdbtype:sept
!ENDIF
# Begin Target
# Name "benchmark - Win32 Release"
# Name "benchmark - Win32 Debug"
# Begin Source File
SOURCE=.\benchmark.c
# End Source File
# End Target
# End Project
--- NEW FILE: benchmark.dsw ---
Microsoft Developer Studio Workspace File, Format Version 6.00
# WARNING: DO NOT EDIT OR DELETE THIS WORKSPACE FILE!
###############################################################################
Project: "benchmark"=.\benchmark.dsp - Package Owner=<4>
Package=<5>
{{{
}}}
Package=<4>
{{{
Begin Project Dependency
Project_Dep_Name expat
End Project Dependency
}}}
###############################################################################
Project: "expat"=..\..\lib\expat.dsp - Package Owner=<4>
Package=<5>
{{{
}}}
Package=<4>
{{{
}}}
###############################################################################
Global:
Package=<5>
{{{
}}}
Package=<3>
{{{
}}}
###############################################################################
From kwaclaw at users.sourceforge.net Thu Sep 4 09:59:01 2003
From: kwaclaw at users.sourceforge.net (Karl Waclawek)
Date: Thu Sep 4 17:14:57 2003
Subject: [Expat-checkins] expat/tests/benchmark - New directory
Message-ID:
Update of /cvsroot/expat/expat/tests/benchmark
In directory sc8-pr-cvs1:/tmp/cvs-serv2292/benchmark
Log Message:
Directory /cvsroot/expat/expat/tests/benchmark added to the repository
From kwaclaw at users.sourceforge.net Thu Sep 4 18:42:49 2003
From: kwaclaw at users.sourceforge.net (Karl Waclawek)
Date: Thu Sep 4 20:42:57 2003
Subject: [Expat-checkins] expat/tests/benchmark README.txt,1.1,1.2
Message-ID:
Update of /cvsroot/expat/expat/tests/benchmark
In directory sc8-pr-cvs1:/tmp/cvs-serv9443
Modified Files:
README.txt
Log Message:
Corrected typo.
Index: README.txt
===================================================================
RCS file: /cvsroot/expat/expat/tests/benchmark/README.txt,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -d -r1.1 -r1.2
--- README.txt 4 Sep 2003 16:13:10 -0000 1.1
+++ README.txt 5 Sep 2003 00:42:46 -0000 1.2
@@ -4,7 +4,7 @@
The command line arguments are:
- -n ... optional; if supplied, then namepsace processing is turned on
+ -n ... optional; if supplied, then namespace processing is turned on
<file name> ... name/path of test xml file
<buffer size> ... size of processing buffer;
the file is parsed in chunks of this size
From kwaclaw at users.sourceforge.net Fri Sep 5 11:20:49 2003
From: kwaclaw at users.sourceforge.net (Karl Waclawek)
Date: Fri Sep 5 13:20:53 2003
Subject: [Expat-checkins] htdocs/dev cvs.html,1.6,1.7
Message-ID:
Update of /cvsroot/expat/htdocs/dev
In directory sc8-pr-cvs1:/tmp/cvs-serv17853
Modified Files:
cvs.html
Log Message:
Updated mailing list links.
Index: cvs.html
===================================================================
RCS file: /cvsroot/expat/htdocs/dev/cvs.html,v
retrieving revision 1.6
retrieving revision 1.7
diff -u -d -r1.6 -r1.7
--- cvs.html 4 Sep 2002 15:46:35 -0000 1.6
+++ cvs.html 5 Sep 2003 17:20:47 -0000 1.7
@@ -75,11 +75,11 @@
change made to the CVS repository via this mailing list:
From kwaclaw at users.sourceforge.net Sat Sep 13 13:30:32 2003
From: kwaclaw at users.sourceforge.net (Karl Waclawek)
Date: Sat Sep 13 13:30:36 2003
Subject: [Expat-checkins] expat/lib xmlparse.c,1.109,1.110
Message-ID:
Update of /cvsroot/expat/expat/lib
In directory sc8-pr-cvs1:/tmp/cvs-serv9344
Modified Files:
xmlparse.c
Log Message:
Applied patch #699487. For details see patch description.
Index: xmlparse.c
===================================================================
RCS file: /cvsroot/expat/expat/lib/xmlparse.c,v
retrieving revision 1.109
retrieving revision 1.110
diff -u -d -r1.109 -r1.110
--- xmlparse.c 7 Mar 2003 15:54:08 -0000 1.109
+++ xmlparse.c 13 Sep 2003 17:30:30 -0000 1.110
@@ -102,12 +102,36 @@
typedef struct {
NAMED **v;
+ unsigned char power;
size_t size;
size_t used;
- size_t usedLim;
const XML_Memory_Handling_Suite *mem;
} HASH_TABLE;
+/* Basic character hash algorithm, taken from Python's string hash:
+ h = h * 1000003(prime number) ^ character.
+*/
+#ifdef XML_UNICODE
+#define CHAR_HASH(h, c) \
+ (((h) * 0xF4243) ^ (unsigned short)(c))
+#else
+#define CHAR_HASH(h, c) \
+ (((h) * 0xF4243) ^ (unsigned char)(c))
+#endif
+
+/* For probing (after a collision) we need a step size relative prime
+ to the hash table size, which is a power of 2. We use double-hashing,
+ since we can calculate a second hash value cheaply by taking those bits
+ of the first hash value that were discarded (masked out) when the table
+ index was calculated: index = hash & mask, where mask = table->size - 1.
+ We limit the maximum step size to table->size / 4 (mask >> 2) and make
+ it odd, since odd numbers are always relative prime to a power of 2.
+*/
+#define SECOND_HASH(hash, mask, power) \
+ ((((hash) & ~(mask)) >> ((power) - 1)) & ((mask) >> 2))
+#define PROBE_STEP(hash, mask, power) \
+ ((unsigned char)((SECOND_HASH(hash, mask, power)) | 1))
+
typedef struct {
NAMED **p;
NAMED **end;
@@ -116,6 +140,7 @@
#define INIT_TAG_BUF_SIZE 32 /* must be a multiple of sizeof(XML_Char) */
#define INIT_DATA_BUF_SIZE 1024
#define INIT_ATTS_SIZE 16
+#define INIT_ATTS_VERSION 0xFFFFFFFF
#define INIT_BLOCK_SIZE 1024
#define INIT_BUFFER_SIZE 1024
@@ -224,6 +249,7 @@
} DEFAULT_ATTRIBUTE;
typedef struct {
+ unsigned long version;
unsigned long hash;
const XML_Char *uriName;
} NS_ATT;
@@ -505,12 +531,13 @@
int m_idAttIndex;
ATTRIBUTE *m_atts;
NS_ATT *m_nsAtts;
- int m_nsAttsSize;
+ unsigned long m_nsAttsVersion;
+ unsigned char m_nsAttsPower;
POSITION m_position;
STRING_POOL m_tempPool;
STRING_POOL m_temp2Pool;
char *m_groupConnector;
- unsigned m_groupSize;
+ unsigned int m_groupSize;
XML_Char m_namespaceSeparator;
XML_Parser m_parentParser;
#ifdef XML_DTD
@@ -609,7 +636,8 @@
#define nSpecifiedAtts (parser->m_nSpecifiedAtts)
#define idAttIndex (parser->m_idAttIndex)
#define nsAtts (parser->m_nsAtts)
-#define nsAttsSize (parser->m_nsAttsSize)
+#define nsAttsVersion (parser->m_nsAttsVersion)
+#define nsAttsPower (parser->m_nsAttsPower)
#define tempPool (parser->m_tempPool)
#define temp2Pool (parser->m_temp2Pool)
#define groupConnector (parser->m_groupConnector)
@@ -757,7 +785,8 @@
ns_triplets = XML_FALSE;
nsAtts = NULL;
- nsAttsSize = 0;
+ nsAttsVersion = 0;
+ nsAttsPower = 0;
poolInit(&tempPool, &(parser->m_mem));
poolInit(&temp2Pool, &(parser->m_mem));
@@ -2419,7 +2448,10 @@
+ XmlNameLength(enc, atts[i].name));
if (!attId)
return XML_ERROR_NO_MEMORY;
- /* detect duplicate attributes by their QNames */
+ /* Detect duplicate attributes by their QNames. This does not work when
+ namespace processing is turned on and different prefixes for the same
+ namespace are used. For this case we have a check further down.
+ */
if ((attId->name)[-1]) {
if (enc == encoding)
eventPtr = atts[i].name;
@@ -2519,64 +2551,87 @@
}
appAtts[attIndex] = 0;
- /* expand prefixed attribute names and
- clear flags that say whether attributes were specified */
+ /* expand prefixed attribute names, check for duplicates,
+ and clear flags that say whether attributes were specified */
i = 0;
if (nPrefixes) {
- int j;
- if ((nPrefixes * 2) > nsAttsSize) {
- NS_ATT *temp = (NS_ATT *)REALLOC(nsAtts, nPrefixes * 2 * sizeof(NS_ATT));
+ int j; /* hash table index */
+ unsigned long version = nsAttsVersion;
+ int nsAttsSize = (int)1 << nsAttsPower;
+ /* size of hash table must be at least 2 * (# of prefixed attributes) */
+ if ((nPrefixes << 1) >> nsAttsPower) { /* true for nsAttsPower = 0 */
+ NS_ATT *temp;
+ /* hash table size must also be a power of 2 and >= 8 */
+ while (nPrefixes >> nsAttsPower++);
+ if (nsAttsPower < 3)
+ nsAttsPower = 3;
+ nsAttsSize = (int)1 << nsAttsPower;
+ temp = (NS_ATT *)REALLOC(nsAtts, nsAttsSize * sizeof(NS_ATT));
if (!temp)
return XML_ERROR_NO_MEMORY;
nsAtts = temp;
- nsAttsSize = nPrefixes * 2;
+ version = 0; /* force re-initialization of nsAtts hash table */
}
- /* clear nsAtts hash table */
- for (j = 0; j < nsAttsSize; j++)
- nsAtts[j].uriName = NULL;
+ /* using a version flag saves us from initializing nsAtts every time */
+ if (!version) { /* initialize version flags when version wraps around */
+ version = INIT_ATTS_VERSION;
+ for (j = nsAttsSize; j != 0; )
+ nsAtts[--j].version = version;
+ }
+ nsAttsVersion = --version;
+ /* expand prefixed names and check for duplicates */
for (; i < attIndex; i += 2) {
const XML_Char *s = appAtts[i];
- if (s[-1] == 2) {
+ if (s[-1] == 2) { /* prefixed */
ATTRIBUTE_ID *id;
const BINDING *b;
unsigned long uriHash = 0;
- ((XML_Char *)s)[-1] = 0;
+ ((XML_Char *)s)[-1] = 0; /* clear flag */
id = (ATTRIBUTE_ID *)lookup(&dtd->attributeIds, s, 0);
b = id->prefix->binding;
if (!b)
return XML_ERROR_UNBOUND_PREFIX;
- /* b->uri includes namespace separator */
+ /* as we expand the name we also calculate its hash value */
for (j = 0; j < b->uriLen; j++) {
const XML_Char c = b->uri[j];
if (!poolAppendChar(&tempPool, c))
return XML_ERROR_NO_MEMORY;
- uriHash = (uriHash << 5) + uriHash + (unsigned char)c;
+ uriHash = CHAR_HASH(uriHash, c);
}
while (*s++ != XML_T(':'))
;
- do {
+ do { /* copies null terminator */
const XML_Char c = *s;
if (!poolAppendChar(&tempPool, *s))
return XML_ERROR_NO_MEMORY;
- uriHash = (uriHash << 5) + uriHash + (unsigned char)c;
+ uriHash = CHAR_HASH(uriHash, c);
} while (*s++);
- /* detect duplicate attributes based on uriName = uri + local name */
- for (j = uriHash & (nsAttsSize - 1);
- nsAtts[j].uriName;
- j == 0 ? j = nsAttsSize - 1 : --j) {
- if (uriHash == nsAtts[j].hash) {
- const XML_Char *s1 = poolStart(&tempPool); /* null-terminated */
- const XML_Char *s2 = nsAtts[j].uriName;
- for (; *s1 == *s2 && *s1 != 0; s1++, s2++);
- if (*s1 == 0)
- return XML_ERROR_DUPLICATE_ATTRIBUTE;
+ { /* Check hash table for duplicate of expanded name (uriName).
+ Derived from code in lookup(HASH_TABLE *table, ...).
+ */
+ unsigned char step = 0;
+ unsigned long mask = nsAttsSize - 1;
+ j = uriHash & mask; /* index into hash table */
+ while (nsAtts[j].version == version) {
+ /* for speed we compare stored hash values first */
+ if (uriHash == nsAtts[j].hash) {
+ const XML_Char *s1 = poolStart(&tempPool);
+ const XML_Char *s2 = nsAtts[j].uriName;
+ /* s1 is null terminated, but not s2 */
+ for (; *s1 == *s2 && *s1 != 0; s1++, s2++);
+ if (*s1 == 0)
+ return XML_ERROR_DUPLICATE_ATTRIBUTE;
+ }
+ if (!step)
+ step = PROBE_STEP(uriHash, mask, nsAttsPower);
+ j < step ? ( j += nsAttsSize - step) : (j -= step);
}
}
- if (ns_triplets) {
+ if (ns_triplets) { /* append namespace separator and prefix */
tempPool.ptr[-1] = namespaceSeparator;
s = b->prefix->name;
do {
@@ -2585,19 +2640,21 @@
} while (*s++);
}
+ /* store expanded name in attribute list */
s = poolStart(&tempPool);
- appAtts[i] = s;
poolFinish(&tempPool);
+ appAtts[i] = s;
- /* fill empty slot with new attribute */
+ /* fill empty slot with new version, uriName and hash value */
+ nsAtts[j].version = version;
nsAtts[j].hash = uriHash;
nsAtts[j].uriName = s;
if (!--nPrefixes)
break;
}
- else
- ((XML_Char *)s)[-1] = 0;
+ else /* not prefixed */
+ ((XML_Char *)s)[-1] = 0; /* clear flag */
}
}
/* clear flags for the remaining attributes */
@@ -5312,7 +5369,7 @@
return 1;
}
-#define INIT_SIZE 64
+#define INIT_POWER 6
static XML_Bool FASTCALL
keyeq(KEY s1, KEY s2)
@@ -5328,7 +5385,7 @@
{
unsigned long h = 0;
while (*s)
- h = (h << 5) + h + (unsigned char)*s++;
+ h = CHAR_HASH(h, *s++);
return h;
}
@@ -5338,31 +5395,38 @@
size_t i;
if (table->size == 0) {
size_t tsize;
-
if (!createSize)
return NULL;
- tsize = INIT_SIZE * sizeof(NAMED *);
+ table->power = INIT_POWER;
+ /* table->size is a power of 2 */
+ table->size = (size_t)1 << INIT_POWER;
+ tsize = table->size * sizeof(NAMED *);
table->v = (NAMED **)table->mem->malloc_fcn(tsize);
if (!table->v)
return NULL;
memset(table->v, 0, tsize);
- table->size = INIT_SIZE;
- table->usedLim = INIT_SIZE / 2;
- i = hash(name) & (table->size - 1);
+ i = hash(name) & ((unsigned long)table->size - 1);
}
else {
unsigned long h = hash(name);
- for (i = h & (table->size - 1);
- table->v[i];
- i == 0 ? i = table->size - 1 : --i) {
+ unsigned long mask = (unsigned long)table->size - 1;
+ unsigned char step = 0;
+ i = h & mask;
+ while (table->v[i]) {
if (keyeq(name, table->v[i]->name))
return table->v[i];
+ if (!step)
+ step = PROBE_STEP(h, mask, table->power);
+ i < step ? (i += table->size - step) : (i -= step);
}
if (!createSize)
return NULL;
- if (table->used == table->usedLim) {
- /* check for overflow */
- size_t newSize = table->size * 2;
+
+ /* check for overflow (table is half full) */
+ if (table->used >> (table->power - 1)) {
+ unsigned char newPower = table->power + 1;
+ size_t newSize = (size_t)1 << newPower;
+ unsigned long newMask = (unsigned long)newSize - 1;
size_t tsize = newSize * sizeof(NAMED *);
NAMED **newV = (NAMED **)table->mem->malloc_fcn(tsize);
if (!newV)
@@ -5370,21 +5434,27 @@
memset(newV, 0, tsize);
for (i = 0; i < table->size; i++)
if (table->v[i]) {
- size_t j;
- for (j = hash(table->v[i]->name) & (newSize - 1);
- newV[j];
- j == 0 ? j = newSize - 1 : --j)
- ;
+ unsigned long newHash = hash(table->v[i]->name);
+ size_t j = newHash & newMask;
+ step = 0;
+ while (newV[j]) {
+ if (!step)
+ step = PROBE_STEP(newHash, newMask, newPower);
+ j < step ? (j += newSize - step) : (j -= step);
+ }
newV[j] = table->v[i];
}
table->mem->free_fcn(table->v);
table->v = newV;
+ table->power = newPower;
table->size = newSize;
- table->usedLim = newSize/2;
- for (i = h & (table->size - 1);
- table->v[i];
- i == 0 ? i = table->size - 1 : --i)
- ;
+ i = h & newMask;
+ step = 0;
+ while (table->v[i]) {
+ if (!step)
+ step = PROBE_STEP(h, newMask, newPower);
+ i < step ? (i += newSize - step) : (i -= step);
+ }
}
}
table->v[i] = (NAMED *)table->mem->malloc_fcn(createSize);
@@ -5404,7 +5474,6 @@
table->mem->free_fcn(table->v[i]);
table->v[i] = NULL;
}
- table->usedLim = table->size / 2;
table->used = 0;
}
@@ -5420,8 +5489,8 @@
static void FASTCALL
hashTableInit(HASH_TABLE *p, const XML_Memory_Handling_Suite *ms)
{
+ p->power = 0;
p->size = 0;
- p->usedLim = 0;
p->used = 0;
p->v = NULL;
p->mem = ms;
@@ -5752,4 +5821,5 @@
}
return ret;
}
+
From kwaclaw at users.sourceforge.net Mon Sep 22 10:14:58 2003
From: kwaclaw at users.sourceforge.net (Karl Waclawek)
Date: Mon Sep 22 10:15:08 2003
Subject: [Expat-checkins] expat/lib xmlparse.c,1.110,1.111
Message-ID:
Update of /cvsroot/expat/expat/lib
In directory sc8-pr-cvs1:/tmp/cvs-serv31618
Modified Files:
xmlparse.c
Log Message:
Improved comment on hash algorithm.
Index: xmlparse.c
===================================================================
RCS file: /cvsroot/expat/expat/lib/xmlparse.c,v
retrieving revision 1.110
retrieving revision 1.111
diff -u -d -r1.110 -r1.111
--- xmlparse.c 13 Sep 2003 17:30:30 -0000 1.110
+++ xmlparse.c 22 Sep 2003 14:14:54 -0000 1.111
@@ -109,7 +109,8 @@
} HASH_TABLE;
/* Basic character hash algorithm, taken from Python's string hash:
- h = h * 1000003(prime number) ^ character.
+ h = h * 1000003 ^ character, the constant being a prime number.
+
*/
#ifdef XML_UNICODE
#define CHAR_HASH(h, c) \
From gstein at users.sourceforge.net Sun Sep 28 17:57:25 2003
From: gstein at users.sourceforge.net (Greg Stein)
Date: Sun Sep 28 17:58:37 2003
Subject: [Expat-checkins] expat/lib internal.h,1.6,1.7
Message-ID:
Update of /cvsroot/expat/expat/lib
In directory sc8-pr-cvs1:/tmp/cvs-serv9203
Modified Files:
internal.h
Log Message:
Suggested fix from jtalkington@users.sf.net.
See bug #765227.
* lib/internal.h:
(FASTCALL, PTRFASTCALL): only define these macros for the GNU C compiler
on i386 platforms. Apparently, they do not work well on PPC ports.
Index: internal.h
===================================================================
RCS file: /cvsroot/expat/expat/lib/internal.h,v
retrieving revision 1.6
retrieving revision 1.7
diff -u -d -r1.6 -r1.7
--- internal.h 14 Mar 2003 17:25:12 -0000 1.6
+++ internal.h 28 Sep 2003 21:57:22 -0000 1.7
@@ -20,7 +20,7 @@
and therefore subject to change.
*/
-#if defined(__GNUC__) && defined(linux)
+#if defined(__GNUC__) && defined(__i386__)
/* We'll use this version by default only where we know it helps.
regparm() generates warnings on Solaris boxes. See SF bug #692878.