From kwaclaw at users.sourceforge.net Thu Sep 4 10:13:12 2003
From: kwaclaw at users.sourceforge.net (Karl Waclawek)
Date: Thu Sep 4 17:14:50 2003
Subject: [Expat-checkins] expat/tests/benchmark README.txt,NONE,1.1
Message-ID:

Update of /cvsroot/expat/expat/tests/benchmark
In directory sc8-pr-cvs1:/tmp/cvs-serv4815

Added Files:
	README.txt
Log Message:
Explains usage of benchmark utility.

--- NEW FILE: README.txt ---
Use this benchmark command line utility as follows:

  benchmark [-n] <file name> <buffer size> <# iterations>

The command line arguments are:

  -n             ... optional; if supplied, then namepsace processing is turned on
  <file name>    ... name/path of test xml file
  <buffer size>  ... size of processing buffer;
                     the file is parsed in chunks of this size
  <# iterations> ... how often will the file be parsed

Returns:

  The time (in seconds) it takes to parse the test file,
  averaged over the number of iterations.

From fdrake at users.sourceforge.net Thu Sep 4 10:06:54 2003
From: fdrake at users.sourceforge.net (Fred L. Drake)
Date: Thu Sep 4 17:14:53 2003
Subject: [Expat-checkins] CVSROOT loginfo,1.4,1.5
Message-ID:

Update of /cvsroot/expat/CVSROOT
In directory sc8-pr-cvs1:/tmp/cvs-serv3678

Modified Files:
	loginfo
Log Message:
Don't generate email for the large test files.

Index: loginfo
===================================================================
RCS file: /cvsroot/expat/CVSROOT/loginfo,v
retrieving revision 1.4
retrieving revision 1.5
diff -u -d -r1.4 -r1.5
--- loginfo 26 Aug 2002 20:10:07 -0000 1.4
+++ loginfo 4 Sep 2003 16:06:52 -0000 1.5
@@ -26,3 +26,6 @@
 #DEFAULT (echo ""; id; echo %{sVv}; date; cat) >> $CVSROOT/CVSROOT/commitlog
 DEFAULT $CVSROOT/CVSROOT/syncmail -u %{sVv} expat-checkins@libexpat.org
+
+# Don't generate email for the big files...
+testdata/largefiles true

From fdrake at users.sourceforge.net Thu Sep 4 10:05:05 2003
From: fdrake at users.sourceforge.net (Fred L. Drake)
Date: Thu Sep 4 17:14:54 2003
Subject: [Expat-checkins] testdata/largefiles README.txt,NONE,1.1
Message-ID:

Update of /cvsroot/expat/testdata/largefiles
In directory sc8-pr-cvs1:/tmp/cvs-serv3429/largefiles

Added Files:
	README.txt
Log Message:
Add explanations for this directory.

--- NEW FILE: README.txt ---
This directory contains some really large test files, mostly used to
benchmark various aspects of Expat's performance.

(As files are added, they should be described here, including what
benchmark program they're intended to be used with and what the
resulting measurements tell us.)

From fdrake at users.sourceforge.net Thu Sep 4 10:05:04 2003
From: fdrake at users.sourceforge.net (Fred L. Drake)
Date: Thu Sep 4 17:14:55 2003
Subject: [Expat-checkins] testdata README.txt,NONE,1.1
Message-ID:

Update of /cvsroot/expat/testdata
In directory sc8-pr-cvs1:/tmp/cvs-serv3429

Added Files:
	README.txt
Log Message:
Add explanations for this directory.

--- NEW FILE: README.txt ---
This directory contains various collections of files used for test
data. It currently contains the following collections:

largefiles/
    This contains some really large files; most are used for
    benchmarking various aspects of Expat's performance.

From fdrake at users.sourceforge.net Thu Sep 4 10:02:38 2003
From: fdrake at users.sourceforge.net (Fred L. Drake)
Date: Thu Sep 4 17:14:55 2003
Subject: [Expat-checkins] testdata/largefiles - New directory
Message-ID:

Update of /cvsroot/expat/testdata/largefiles
In directory sc8-pr-cvs1:/tmp/cvs-serv3023/largefiles

Log Message:
Directory /cvsroot/expat/testdata/largefiles added to the repository

From fdrake at users.sourceforge.net Thu Sep 4 10:02:24 2003
From: fdrake at users.sourceforge.net (Fred L. Drake)
Date: Thu Sep 4 17:14:56 2003
Subject: [Expat-checkins] testdata - New directory
Message-ID:

Update of /cvsroot/expat/testdata
In directory sc8-pr-cvs1:/tmp/cvs-serv2956/testdata

Log Message:
Directory /cvsroot/expat/testdata added to the repository

From kwaclaw at users.sourceforge.net Thu Sep 4 10:01:56 2003
From: kwaclaw at users.sourceforge.net (Karl Waclawek)
Date: Thu Sep 4 17:14:56 2003
Subject: [Expat-checkins] expat/tests/benchmark benchmark.c, NONE, 1.1 benchmark.dsp, NONE, 1.1 benchmark.dsw, NONE, 1.1
Message-ID:

Update of /cvsroot/expat/expat/tests/benchmark
In directory sc8-pr-cvs1:/tmp/cvs-serv2871

Added Files:
	benchmark.c benchmark.dsp benchmark.dsw
Log Message:
Small benchmark utility to test pure parser speed.
Tested on Windows only. Includes MS VC++ 6.0 workspace.

--- NEW FILE: benchmark.c ---
#include <sys/stat.h>
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include "expat.h"

static void
usage(const char *prog, int rc)
{
  fprintf(stderr,
          "usage: %s [-n] filename bufferSize nr_of_loops\n", prog);
  exit(rc);
}

int main (int argc, char *argv[])
{
  XML_Parser  parser;
  char        *XMLBuf, *XMLBufEnd, *XMLBufPtr;
  FILE        *fd;
  struct stat fileAttr;
  int         nrOfLoops, bufferSize, fileSize, i, isFinal;
  int         j = 0, ns = 0;
  clock_t     tstart, tend;
  double      cpuTime = 0.0;

  if (argc > 1) {
    if (argv[1][0] == '-') {
      if (argv[1][1] == 'n' && argv[1][2] == '\0') {
        ns = 1;
        j = 1;
      }
      else
        usage(argv[0], 1);
    }
  }

  if (argc != j + 4)
    usage(argv[0], 1);

  if (stat (argv[j + 1], &fileAttr) != 0) {
    fprintf (stderr, "could not access file '%s'\n", argv[j + 1]);
    return 2;
  }

  fd = fopen (argv[j + 1], "r");
  if (!fd) {
    fprintf (stderr, "could not open file '%s'\n", argv[j + 1]);
    exit(2);
  }

  bufferSize = atoi (argv[j + 2]);
  nrOfLoops = atoi (argv[j + 3]);
  if (bufferSize <= 0 || nrOfLoops <= 0) {
    fprintf (stderr,
             "buffer size and nr of loops must be greater than zero.\n");
    exit(3);
  }

  XMLBuf = malloc (fileAttr.st_size);
  fileSize = fread (XMLBuf, sizeof (char), fileAttr.st_size, fd);
  fclose (fd);

  i = 0;
  XMLBufEnd = XMLBuf + fileSize;
  while (i < nrOfLoops) {
    XMLBufPtr = XMLBuf;
    isFinal = 0;
    if (ns)
      parser = XML_ParserCreateNS(NULL, '!');
    else
      parser = XML_ParserCreate(NULL);
    tstart = clock();
    do {
      int parseBufferSize = XMLBufEnd - XMLBufPtr;
      if (parseBufferSize <= bufferSize)
        isFinal = 1;
      else
        parseBufferSize = bufferSize;
      if (!XML_Parse (parser, XMLBufPtr, parseBufferSize, isFinal)) {
        fprintf (stderr, "error '%s' at line %d character %d\n",
                 XML_ErrorString (XML_GetErrorCode (parser)),
                 XML_GetCurrentLineNumber (parser),
                 XML_GetCurrentColumnNumber (parser));
        free (XMLBuf);
        XML_ParserFree (parser);
        exit (4);
      }
      XMLBufPtr += bufferSize;
    } while (!isFinal);
    tend = clock();
    cpuTime += ((double) (tend - tstart)) / CLOCKS_PER_SEC;
    XML_ParserFree (parser);
    i++;
  }

  free (XMLBuf);

  printf ("%d loops, with buffer size %d. Average time per loop: %f\n",
          nrOfLoops, bufferSize, cpuTime / (double) nrOfLoops);
  return 0;
}

--- NEW FILE: benchmark.dsp ---
# Microsoft Developer Studio Project File - Name="benchmark" - Package Owner=<4>
# Microsoft Developer Studio Generated Build File, Format Version 6.00
# ** DO NOT EDIT **

# TARGTYPE "Win32 (x86) Console Application" 0x0103

CFG=benchmark - Win32 Debug
!MESSAGE This is not a valid makefile. To build this project using NMAKE,
!MESSAGE use the Export Makefile command and run
!MESSAGE
!MESSAGE NMAKE /f "benchmark.mak".
!MESSAGE
!MESSAGE You can specify a configuration when running NMAKE
!MESSAGE by defining the macro CFG on the command line. For example:
!MESSAGE
!MESSAGE NMAKE /f "benchmark.mak" CFG="benchmark - Win32 Debug"
!MESSAGE
!MESSAGE Possible choices for configuration are:
!MESSAGE
!MESSAGE "benchmark - Win32 Release" (based on "Win32 (x86) Console Application")
!MESSAGE "benchmark - Win32 Debug" (based on "Win32 (x86) Console Application")
!MESSAGE

# Begin Project
# PROP AllowPerConfigDependencies 0
# PROP Scc_ProjName ""
# PROP Scc_LocalPath ""
CPP=cl.exe
RSC=rc.exe

!IF "$(CFG)" == "benchmark - Win32 Release"

# PROP BASE Use_MFC 0
# PROP BASE Use_Debug_Libraries 0
# PROP BASE Output_Dir "Release"
# PROP BASE Intermediate_Dir "Release"
# PROP BASE Target_Dir ""
# PROP Use_MFC 0
# PROP Use_Debug_Libraries 0
# PROP Output_Dir "Release"
# PROP Intermediate_Dir "Release"
# PROP Target_Dir ""
# ADD BASE CPP /nologo /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /c
# ADD CPP /nologo /W3 /GX /O2 /I "..\..\lib" /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /c
# ADD BASE RSC /l 0x1009 /d "NDEBUG"
# ADD RSC /l 0x1009 /d "NDEBUG"
BSC32=bscmake.exe
# ADD BASE BSC32 /nologo
# ADD BSC32 /nologo
LINK32=link.exe
# ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /machine:I386
# ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /machine:I386

!ELSEIF "$(CFG)" == "benchmark - Win32 Debug"

# PROP BASE Use_MFC 0
# PROP BASE Use_Debug_Libraries 1
# PROP BASE Output_Dir "Debug"
# PROP BASE Intermediate_Dir "Debug"
# PROP BASE Target_Dir ""
# PROP Use_MFC 0
# PROP Use_Debug_Libraries 1
# PROP Output_Dir "Debug"
# PROP Intermediate_Dir "Debug"
# PROP Target_Dir ""
# ADD BASE CPP /nologo /W3 /Gm /GX /ZI /Od /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /GZ /c
# ADD CPP /nologo /W3 /Gm /GX /ZI /Od /I "..\..\lib" /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /GZ /c
# ADD BASE RSC /l 0x1009 /d "_DEBUG"
# ADD RSC /l 0x1009 /d "_DEBUG"
BSC32=bscmake.exe
# ADD BASE BSC32 /nologo
# ADD BSC32 /nologo
LINK32=link.exe
# ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /debug /machine:I386 /pdbtype:sept
# ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /debug /machine:I386 /pdbtype:sept

!ENDIF

# Begin Target

# Name "benchmark - Win32 Release"
# Name "benchmark - Win32 Debug"
# Begin Source File

SOURCE=.\benchmark.c
# End Source File
# End Target
# End Project

--- NEW FILE: benchmark.dsw ---
Microsoft Developer Studio Workspace File, Format Version 6.00
# WARNING: DO NOT EDIT OR DELETE THIS WORKSPACE FILE!

###############################################################################

Project: "benchmark"=.\benchmark.dsp - Package Owner=<4>

Package=<5>
{{{
}}}

Package=<4>
{{{
    Begin Project Dependency
    Project_Dep_Name expat
    End Project Dependency
}}}

###############################################################################

Project: "expat"=..\..\lib\expat.dsp - Package Owner=<4>

Package=<5>
{{{
}}}

Package=<4>
{{{
}}}

###############################################################################

Global:

Package=<5>
{{{
}}}

Package=<3>
{{{
}}}

###############################################################################

From kwaclaw at users.sourceforge.net Thu Sep 4 09:59:01 2003
From: kwaclaw at users.sourceforge.net (Karl Waclawek)
Date: Thu Sep 4 17:14:57 2003
Subject: [Expat-checkins] expat/tests/benchmark - New directory
Message-ID:

Update of /cvsroot/expat/expat/tests/benchmark
In directory sc8-pr-cvs1:/tmp/cvs-serv2292/benchmark

Log Message:
Directory /cvsroot/expat/expat/tests/benchmark added to the repository

From kwaclaw at users.sourceforge.net Thu Sep 4 18:42:49 2003
From: kwaclaw at users.sourceforge.net (Karl Waclawek)
Date: Thu Sep 4 20:42:57 2003
Subject: [Expat-checkins] expat/tests/benchmark README.txt,1.1,1.2
Message-ID:

Update of /cvsroot/expat/expat/tests/benchmark
In directory sc8-pr-cvs1:/tmp/cvs-serv9443

Modified Files:
	README.txt
Log Message:
Corrected typo.

Index: README.txt
===================================================================
RCS file: /cvsroot/expat/expat/tests/benchmark/README.txt,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -d -r1.1 -r1.2
--- README.txt 4 Sep 2003 16:13:10 -0000 1.1
+++ README.txt 5 Sep 2003 00:42:46 -0000 1.2
@@ -4,7 +4,7 @@

 The command line arguments are:

-  -n             ... optional; if supplied, then namepsace processing is turned on
+  -n             ... optional; if supplied, then namespace processing is turned on
   <file name>    ... name/path of test xml file
   <buffer size>  ... size of processing buffer;
                      the file is parsed in chunks of this size

From kwaclaw at users.sourceforge.net Fri Sep 5 11:20:49 2003
From: kwaclaw at users.sourceforge.net (Karl Waclawek)
Date: Fri Sep 5 13:20:53 2003
Subject: [Expat-checkins] htdocs/dev cvs.html,1.6,1.7
Message-ID:

Update of /cvsroot/expat/htdocs/dev
In directory sc8-pr-cvs1:/tmp/cvs-serv17853

Modified Files:
	cvs.html
Log Message:
Updated mailing list links.

Index: cvs.html
===================================================================
RCS file: /cvsroot/expat/htdocs/dev/cvs.html,v
retrieving revision 1.6
retrieving revision 1.7
diff -u -d -r1.6 -r1.7
--- cvs.html 4 Sep 2002 15:46:35 -0000 1.6
+++ cvs.html 5 Sep 2003 17:20:47 -0000 1.7
@@ -75,11 +75,11 @@
 change made to the CVS repository via this mailing list:

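The chunked parse loop in benchmark.c above (the file is handed to the parser in pieces of at most the given buffer size, and the last piece is flagged as final) can be sketched without Expat as follows. This is only an illustration under stated assumptions: `feed_in_chunks`, `chunk_fn`, and `count_bytes` are hypothetical names that do not appear in the check-in, and the callback merely stands in for a call such as `XML_Parse(parser, buf, len, isFinal)`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type; stands in for XML_Parse(parser, buf, len, isFinal).
   Returns non-zero on success, zero on failure. */
typedef int (*chunk_fn)(const char *buf, int len, int is_final, void *user);

/* Feed buf[0..size) to fn in pieces of at most chunk_size bytes.
   The piece that reaches the end of the buffer is flagged as final.
   Returns the number of pieces delivered, or -1 if fn reports failure. */
static int feed_in_chunks(const char *buf, size_t size, int chunk_size,
                          chunk_fn fn, void *user)
{
    const char *ptr = buf;
    const char *end = buf + size;
    int chunks = 0;
    int is_final = 0;

    assert(chunk_size > 0);
    do {
        size_t remaining = (size_t)(end - ptr);
        int len;
        if (remaining <= (size_t)chunk_size) {
            len = (int)remaining;   /* last piece, possibly short or empty */
            is_final = 1;
        } else {
            len = chunk_size;
        }
        if (!fn(ptr, len, is_final, user))
            return -1;
        ptr += len;
        chunks++;
    } while (!is_final);
    return chunks;
}

/* Example callback (hypothetical): adds up the bytes it is given. */
static int count_bytes(const char *buf, int len, int is_final, void *user)
{
    (void)buf;
    (void)is_final;
    *(size_t *)user += (size_t)len;
    return 1;
}
```

Calling `feed_in_chunks(data, size, 4, count_bytes, &total)` on a 10-byte buffer delivers pieces of 4, 4, and 2 bytes, with only the last one marked final. An empty buffer still produces one final, zero-length piece, which matches how the loop in benchmark.c always makes at least one parse call.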
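The xmlparse.c 1.110 check-in in this archive (patch #699487) replaces linear probing with double hashing, using an odd probe step so that the step is relatively prime to the power-of-2 table size. The sketch below restates the `CHAR_HASH` (non-Unicode variant) and `PROBE_STEP` arithmetic from that patch as plain functions and counts how many slots one probe sequence reaches. The function names and the 256-slot demo limit are assumptions made for illustration; they are not part of Expat.

```c
#include <assert.h>
#include <stddef.h>

/* One step of the character hash: multiply by 1000003 (0xF4243), XOR the byte.
   Mirrors the non-Unicode CHAR_HASH macro from the patch. */
static unsigned long char_hash(unsigned long h, unsigned char c)
{
    return (h * 0xF4243UL) ^ c;
}

/* Hash a NUL-terminated string, as hash() in the patch does. */
static unsigned long string_hash(const char *s)
{
    unsigned long h = 0;
    while (*s)
        h = char_hash(h, (unsigned char)*s++);
    return h;
}

/* Same arithmetic as PROBE_STEP: take hash bits that the index mask
   discarded, limit them to mask >> 2, and force the result to be odd. */
static unsigned char probe_step(unsigned long hash, unsigned long mask,
                                unsigned char power)
{
    unsigned long second;
    assert(power >= 1);
    second = ((hash & ~mask) >> (power - 1)) & (mask >> 2);
    return (unsigned char)(second | 1);
}

/* Follow the probe sequence from the home slot until a slot repeats and
   return how many distinct slots were seen. Demo limit: power <= 8. */
static size_t slots_visited(unsigned long hash, unsigned char power)
{
    unsigned char seen[256] = {0};
    size_t size, i, count = 0;
    unsigned long mask;
    unsigned char step;

    assert(power >= 1 && power <= 8);
    size = (size_t)1 << power;
    mask = (unsigned long)size - 1;
    step = probe_step(hash, mask, power);
    i = (size_t)(hash & mask);
    while (!seen[i]) {
        seen[i] = 1;
        count++;
        if (i < step)           /* step backwards, wrapping around */
            i += size - step;
        else
            i -= step;
    }
    return count;
}
```

Because the step is odd and the table size is a power of 2, `slots_visited` returns the full table size for any hash value, so a probe sequence cannot cycle through a subset of slots and miss a free one.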
From kwaclaw at users.sourceforge.net Sat Sep 13 13:30:32 2003
From: kwaclaw at users.sourceforge.net (Karl Waclawek)
Date: Sat Sep 13 13:30:36 2003
Subject: [Expat-checkins] expat/lib xmlparse.c,1.109,1.110
Message-ID:

Update of /cvsroot/expat/expat/lib
In directory sc8-pr-cvs1:/tmp/cvs-serv9344

Modified Files:
	xmlparse.c
Log Message:
Applied patch #699487. For details see patch description.

Index: xmlparse.c
===================================================================
RCS file: /cvsroot/expat/expat/lib/xmlparse.c,v
retrieving revision 1.109
retrieving revision 1.110
diff -u -d -r1.109 -r1.110
--- xmlparse.c 7 Mar 2003 15:54:08 -0000 1.109
+++ xmlparse.c 13 Sep 2003 17:30:30 -0000 1.110
@@ -102,12 +102,36 @@

 typedef struct {
   NAMED **v;
+  unsigned char power;
   size_t size;
   size_t used;
-  size_t usedLim;
   const XML_Memory_Handling_Suite *mem;
 } HASH_TABLE;

+/* Basic character hash algorithm, taken from Python's string hash:
+   h = h * 1000003(prime number) ^ character.
+*/
+#ifdef XML_UNICODE
+#define CHAR_HASH(h, c) \
+  (((h) * 0xF4243) ^ (unsigned short)(c))
+#else
+#define CHAR_HASH(h, c) \
+  (((h) * 0xF4243) ^ (unsigned char)(c))
+#endif
+
+/* For probing (after a collision) we need a step size relative prime
+   to the hash table size, which is a power of 2. We use double-hashing,
+   since we can calculate a second hash value cheaply by taking those bits
+   of the first hash value that were discarded (masked out) when the table
+   index was calculated: index = hash & mask, where mask = table->size - 1.
+   We limit the maximum step size to table->size / 4 (mask >> 2) and make
+   it odd, since odd numbers are always relative prime to a power of 2.
+*/
+#define SECOND_HASH(hash, mask, power) \
+  ((((hash) & ~(mask)) >> ((power) - 1)) & ((mask) >> 2))
+#define PROBE_STEP(hash, mask, power) \
+  ((unsigned char)((SECOND_HASH(hash, mask, power)) | 1))
+
 typedef struct {
   NAMED **p;
   NAMED **end;
@@ -116,6 +140,7 @@
 #define INIT_TAG_BUF_SIZE 32  /* must be a multiple of sizeof(XML_Char) */
 #define INIT_DATA_BUF_SIZE 1024
 #define INIT_ATTS_SIZE 16
+#define INIT_ATTS_VERSION 0xFFFFFFFF
 #define INIT_BLOCK_SIZE 1024
 #define INIT_BUFFER_SIZE 1024

@@ -224,6 +249,7 @@
 } DEFAULT_ATTRIBUTE;

 typedef struct {
+  unsigned long version;
   unsigned long hash;
   const XML_Char *uriName;
 } NS_ATT;
@@ -505,12 +531,13 @@
   int m_idAttIndex;
   ATTRIBUTE *m_atts;
   NS_ATT *m_nsAtts;
-  int m_nsAttsSize;
+  unsigned long m_nsAttsVersion;
+  unsigned char m_nsAttsPower;
   POSITION m_position;
   STRING_POOL m_tempPool;
   STRING_POOL m_temp2Pool;
   char *m_groupConnector;
-  unsigned m_groupSize;
+  unsigned int m_groupSize;
   XML_Char m_namespaceSeparator;
   XML_Parser m_parentParser;
 #ifdef XML_DTD
@@ -609,7 +636,8 @@
 #define nSpecifiedAtts (parser->m_nSpecifiedAtts)
 #define idAttIndex (parser->m_idAttIndex)
 #define nsAtts (parser->m_nsAtts)
-#define nsAttsSize (parser->m_nsAttsSize)
+#define nsAttsVersion (parser->m_nsAttsVersion)
+#define nsAttsPower (parser->m_nsAttsPower)
 #define tempPool (parser->m_tempPool)
 #define temp2Pool (parser->m_temp2Pool)
 #define groupConnector (parser->m_groupConnector)
@@ -757,7 +785,8 @@
   ns_triplets = XML_FALSE;

   nsAtts = NULL;
-  nsAttsSize = 0;
+  nsAttsVersion = 0;
+  nsAttsPower = 0;

   poolInit(&tempPool, &(parser->m_mem));
   poolInit(&temp2Pool, &(parser->m_mem));
@@ -2419,7 +2448,10 @@
                          + XmlNameLength(enc, atts[i].name));
     if (!attId)
       return XML_ERROR_NO_MEMORY;
-    /* detect duplicate attributes by their QNames */
+    /* Detect duplicate attributes by their QNames. This does not work when
+       namespace processing is turned on and different prefixes for the same
+       namespace are used. For this case we have a check further down.
+    */
     if ((attId->name)[-1]) {
       if (enc == encoding)
         eventPtr = atts[i].name;
@@ -2519,64 +2551,87 @@
   }
   appAtts[attIndex] = 0;

-  /* expand prefixed attribute names and
-     clear flags that say whether attributes were specified */
+  /* expand prefixed attribute names, check for duplicates,
+     and clear flags that say whether attributes were specified */
   i = 0;
   if (nPrefixes) {
-    int j;
-    if ((nPrefixes * 2) > nsAttsSize) {
-      NS_ATT *temp = (NS_ATT *)REALLOC(nsAtts, nPrefixes * 2 * sizeof(NS_ATT));
+    int j;  /* hash table index */
+    unsigned long version = nsAttsVersion;
+    int nsAttsSize = (int)1 << nsAttsPower;
+    /* size of hash table must be at least 2 * (# of prefixed attributes) */
+    if ((nPrefixes << 1) >> nsAttsPower) {  /* true for nsAttsPower = 0 */
+      NS_ATT *temp;
+      /* hash table size must also be a power of 2 and >= 8 */
+      while (nPrefixes >> nsAttsPower++);
+      if (nsAttsPower < 3)
+        nsAttsPower = 3;
+      nsAttsSize = (int)1 << nsAttsPower;
+      temp = (NS_ATT *)REALLOC(nsAtts, nsAttsSize * sizeof(NS_ATT));
       if (!temp)
         return XML_ERROR_NO_MEMORY;
       nsAtts = temp;
-      nsAttsSize = nPrefixes * 2;
+      version = 0;  /* force re-initialization of nsAtts hash table */
     }
-    /* clear nsAtts hash table */
-    for (j = 0; j < nsAttsSize; j++)
-      nsAtts[j].uriName = NULL;
+    /* using a version flag saves us from initializing nsAtts every time */
+    if (!version) {  /* initialize version flags when version wraps around */
+      version = INIT_ATTS_VERSION;
+      for (j = nsAttsSize; j != 0; )
+        nsAtts[--j].version = version;
+    }
+    nsAttsVersion = --version;

+    /* expand prefixed names and check for duplicates */
     for (; i < attIndex; i += 2) {
       const XML_Char *s = appAtts[i];
-      if (s[-1] == 2) {
+      if (s[-1] == 2) {  /* prefixed */
         ATTRIBUTE_ID *id;
         const BINDING *b;
         unsigned long uriHash = 0;
-        ((XML_Char *)s)[-1] = 0;
+        ((XML_Char *)s)[-1] = 0;  /* clear flag */
         id = (ATTRIBUTE_ID *)lookup(&dtd->attributeIds, s, 0);
         b = id->prefix->binding;
         if (!b)
           return XML_ERROR_UNBOUND_PREFIX;

-        /* b->uri includes namespace separator */
+        /* as we expand the name we also calculate its hash value */
         for (j = 0; j < b->uriLen; j++) {
           const XML_Char c = b->uri[j];
           if (!poolAppendChar(&tempPool, c))
             return XML_ERROR_NO_MEMORY;
-          uriHash = (uriHash << 5) + uriHash + (unsigned char)c;
+          uriHash = CHAR_HASH(uriHash, c);
         }
         while (*s++ != XML_T(':'))
           ;
-        do {
+        do {  /* copies null terminator */
           const XML_Char c = *s;
           if (!poolAppendChar(&tempPool, *s))
             return XML_ERROR_NO_MEMORY;
-          uriHash = (uriHash << 5) + uriHash + (unsigned char)c;
+          uriHash = CHAR_HASH(uriHash, c);
         } while (*s++);

-        /* detect duplicate attributes based on uriName = uri + local name */
-        for (j = uriHash & (nsAttsSize - 1);
-             nsAtts[j].uriName;
-             j == 0 ? j = nsAttsSize - 1 : --j) {
-          if (uriHash == nsAtts[j].hash) {
-            const XML_Char *s1 = poolStart(&tempPool);  /* null-terminated */
-            const XML_Char *s2 = nsAtts[j].uriName;
-            for (; *s1 == *s2 && *s1 != 0; s1++, s2++);
-            if (*s1 == 0)
-              return XML_ERROR_DUPLICATE_ATTRIBUTE;
+        { /* Check hash table for duplicate of expanded name (uriName).
+             Derived from code in lookup(HASH_TABLE *table, ...).
+          */
+          unsigned char step = 0;
+          unsigned long mask = nsAttsSize - 1;
+          j = uriHash & mask;  /* index into hash table */
+          while (nsAtts[j].version == version) {
+            /* for speed we compare stored hash values first */
+            if (uriHash == nsAtts[j].hash) {
+              const XML_Char *s1 = poolStart(&tempPool);
+              const XML_Char *s2 = nsAtts[j].uriName;
+              /* s1 is null terminated, but not s2 */
+              for (; *s1 == *s2 && *s1 != 0; s1++, s2++);
+              if (*s1 == 0)
+                return XML_ERROR_DUPLICATE_ATTRIBUTE;
+            }
+            if (!step)
+              step = PROBE_STEP(uriHash, mask, nsAttsPower);
+            j < step ? ( j += nsAttsSize - step) : (j -= step);
           }
         }

-        if (ns_triplets) {
+        if (ns_triplets) {  /* append namespace separator and prefix */
           tempPool.ptr[-1] = namespaceSeparator;
           s = b->prefix->name;
           do {
@@ -2585,19 +2640,21 @@
           } while (*s++);
         }

+        /* store expanded name in attribute list */
         s = poolStart(&tempPool);
-        appAtts[i] = s;
         poolFinish(&tempPool);
+        appAtts[i] = s;

-        /* fill empty slot with new attribute */
+        /* fill empty slot with new version, uriName and hash value */
+        nsAtts[j].version = version;
         nsAtts[j].hash = uriHash;
         nsAtts[j].uriName = s;

         if (!--nPrefixes)
           break;
       }
-      else
-        ((XML_Char *)s)[-1] = 0;
+      else  /* not prefixed */
+        ((XML_Char *)s)[-1] = 0;  /* clear flag */
     }
   }
   /* clear flags for the remaining attributes */
@@ -5312,7 +5369,7 @@
   return 1;
 }

-#define INIT_SIZE 64
+#define INIT_POWER 6

 static XML_Bool FASTCALL
 keyeq(KEY s1, KEY s2)
@@ -5328,7 +5385,7 @@
 {
   unsigned long h = 0;
   while (*s)
-    h = (h << 5) + h + (unsigned char)*s++;
+    h = CHAR_HASH(h, *s++);
   return h;
 }

@@ -5338,31 +5395,38 @@
   size_t i;
   if (table->size == 0) {
     size_t tsize;
-
     if (!createSize)
       return NULL;
-    tsize = INIT_SIZE * sizeof(NAMED *);
+    table->power = INIT_POWER;
+    /* table->size is a power of 2 */
+    table->size = (size_t)1 << INIT_POWER;
+    tsize = table->size * sizeof(NAMED *);
     table->v = (NAMED **)table->mem->malloc_fcn(tsize);
     if (!table->v)
       return NULL;
     memset(table->v, 0, tsize);
-    table->size = INIT_SIZE;
-    table->usedLim = INIT_SIZE / 2;
-    i = hash(name) & (table->size - 1);
+    i = hash(name) & ((unsigned long)table->size - 1);
   }
   else {
     unsigned long h = hash(name);
-    for (i = h & (table->size - 1);
-         table->v[i];
-         i == 0 ? i = table->size - 1 : --i) {
+    unsigned long mask = (unsigned long)table->size - 1;
+    unsigned char step = 0;
+    i = h & mask;
+    while (table->v[i]) {
       if (keyeq(name, table->v[i]->name))
         return table->v[i];
+      if (!step)
+        step = PROBE_STEP(h, mask, table->power);
+      i < step ? (i += table->size - step) : (i -= step);
     }
     if (!createSize)
       return NULL;
-    if (table->used == table->usedLim) {
-      /* check for overflow */
-      size_t newSize = table->size * 2;
+
+    /* check for overflow (table is half full) */
+    if (table->used >> (table->power - 1)) {
+      unsigned char newPower = table->power + 1;
+      size_t newSize = (size_t)1 << newPower;
+      unsigned long newMask = (unsigned long)newSize - 1;
       size_t tsize = newSize * sizeof(NAMED *);
       NAMED **newV = (NAMED **)table->mem->malloc_fcn(tsize);
       if (!newV)
@@ -5370,21 +5434,27 @@
       memset(newV, 0, tsize);
       for (i = 0; i < table->size; i++)
         if (table->v[i]) {
-          size_t j;
-          for (j = hash(table->v[i]->name) & (newSize - 1);
-               newV[j];
-               j == 0 ? j = newSize - 1 : --j)
-            ;
+          unsigned long newHash = hash(table->v[i]->name);
+          size_t j = newHash & newMask;
+          step = 0;
+          while (newV[j]) {
+            if (!step)
+              step = PROBE_STEP(newHash, newMask, newPower);
+            j < step ? (j += newSize - step) : (j -= step);
+          }
           newV[j] = table->v[i];
         }
       table->mem->free_fcn(table->v);
       table->v = newV;
+      table->power = newPower;
       table->size = newSize;
-      table->usedLim = newSize/2;
-      for (i = h & (table->size - 1);
-           table->v[i];
-           i == 0 ? i = table->size - 1 : --i)
-        ;
+      i = h & newMask;
+      step = 0;
+      while (table->v[i]) {
+        if (!step)
+          step = PROBE_STEP(h, newMask, newPower);
+        i < step ? (i += newSize - step) : (i -= step);
+      }
     }
   }
   table->v[i] = (NAMED *)table->mem->malloc_fcn(createSize);
@@ -5404,7 +5474,6 @@
     table->mem->free_fcn(table->v[i]);
     table->v[i] = NULL;
   }
-  table->usedLim = table->size / 2;
   table->used = 0;
 }

@@ -5420,8 +5489,8 @@
 static void FASTCALL
 hashTableInit(HASH_TABLE *p, const XML_Memory_Handling_Suite *ms)
 {
+  p->power = 0;
   p->size = 0;
-  p->usedLim = 0;
   p->used = 0;
   p->v = NULL;
   p->mem = ms;
@@ -5752,4 +5821,5 @@
   }
   return ret;
 }
+

From kwaclaw at users.sourceforge.net Mon Sep 22 10:14:58 2003
From: kwaclaw at users.sourceforge.net (Karl Waclawek)
Date: Mon Sep 22 10:15:08 2003
Subject: [Expat-checkins] expat/lib xmlparse.c,1.110,1.111
Message-ID:

Update of /cvsroot/expat/expat/lib
In directory sc8-pr-cvs1:/tmp/cvs-serv31618

Modified Files:
	xmlparse.c
Log Message:
Improved comment on hash algorithm.

Index: xmlparse.c
===================================================================
RCS file: /cvsroot/expat/expat/lib/xmlparse.c,v
retrieving revision 1.110
retrieving revision 1.111
diff -u -d -r1.110 -r1.111
--- xmlparse.c 13 Sep 2003 17:30:30 -0000 1.110
+++ xmlparse.c 22 Sep 2003 14:14:54 -0000 1.111
@@ -109,7 +109,8 @@
 } HASH_TABLE;

 /* Basic character hash algorithm, taken from Python's string hash:
-   h = h * 1000003(prime number) ^ character.
+   h = h * 1000003 ^ character, the constant being a prime number.
+
 */
 #ifdef XML_UNICODE
 #define CHAR_HASH(h, c) \

From gstein at users.sourceforge.net Sun Sep 28 17:57:25 2003
From: gstein at users.sourceforge.net (Greg Stein)
Date: Sun Sep 28 17:58:37 2003
Subject: [Expat-checkins] expat/lib internal.h,1.6,1.7
Message-ID:

Update of /cvsroot/expat/expat/lib
In directory sc8-pr-cvs1:/tmp/cvs-serv9203

Modified Files:
	internal.h
Log Message:
Suggested fix from jtalkington@users.sf.net. See bug #765227.

* lib/internal.h: (FASTCALL, PTRFASTCALL): only define these macros for the
  GNU C compiler on i386 platforms. apparently, they do not work well on PPC
  ports.
Index: internal.h
===================================================================
RCS file: /cvsroot/expat/expat/lib/internal.h,v
retrieving revision 1.6
retrieving revision 1.7
diff -u -d -r1.6 -r1.7
--- internal.h 14 Mar 2003 17:25:12 -0000 1.6
+++ internal.h 28 Sep 2003 21:57:22 -0000 1.7
@@ -20,7 +20,7 @@
    and therefore subject to change.
 */

-#if defined(__GNUC__) && defined(linux)
+#if defined(__GNUC__) && defined(__i386__)
 /* We'll use this version by default only where we know it helps.

    regparm() generates warnings on Solaris boxes. See SF bug #692878.