
[forwarding to python-dev for general ack]
The changes to re are still throwing off signed/unsigned warnings. Can you please get that fixed up?
My changes to sre have nothing to do with these warnings. These are being caused by the fact that gcc throws a warning even with casts in the following situation: char i = 0; if ((int)i < 256) ... Unlike one might think, this is a valid test because the type of 'i' is not always char. I won't fix these warnings until I find an elegant solution to workaround the gcc stupidity. -- Gustavo Niemeyer http://niemeyer.net

Gustavo Niemeyer wrote:
The changes to re are still throwing off signed/unsigned warnings. Can you please get that fixed up?
My changes to sre have nothing to do with these warnings.
This is not true. I get Modules/_sre.c:811: warning: comparison between signed and unsigned Now, line 811 is DATA_ALLOC(SRE_MATCH_CONTEXT, ctx); This expands to alloc_pos = state->data_stack_base; \ TRACE(("allocating %s in %d (%d)\n", \ SFY(type), alloc_pos, sizeof(type))); \ if (state->data_stack_size < alloc_pos+sizeof(type)) { \ int j = data_stack_grow(state, sizeof(type)); \ if (j < 0) return j; \ if (ctx_pos != -1) \ DATA_STACK_LOOKUP_AT(state, SRE_MATCH_CONTEXT, ctx, ctx_pos); \ } \ ptr = (type*)(state->data_stack+alloc_pos); \ state->data_stack_base += sizeof(type); \ } while (0) The culprit is the line if (state->data_stack_size < alloc_pos+sizeof(type)) { \ because data_stack_size is signed, and sizeof(type) is size_t. This line is yours 2.101 (niemeyer 17-Oct-03): if (state->data_stack_size < alloc_pos+sizeof (type)) { Changing the type of data_stack_size to unsigned makes the warning go away. Please change all uses of sizes/positions to "size_t", and change the special -1 marker to (size_t)-1. Regards, Martin

The changes to re are still throwing off signed/unsigned warnings. Can you please get that fixed up?
My changes to sre have nothing to do with these warnings.
This is not true. I get
Modules/_sre.c:811: warning: comparison between signed and unsigned
Now, line 811 is
DATA_ALLOC(SRE_MATCH_CONTEXT, ctx);
Ohh.. I wasn't aware about this case, since I get no errors at all at this line. Do you see any others? I personally get these: Modules/_sre.c:381: warning: comparison is always true due to limited range of data type Modules/_sre.c:383: warning: comparison is always true due to limited range of data type Modules/_sre.c:390: warning: comparison is always true due to limited range of data type Modules/_sre.c:392: warning: comparison is always true due to limited range of data type Modules/_sre.c: In function `sre_charset': Modules/_sre.c:487: warning: comparison is always true due to limited range of data type Modules/_sre.c: In function `sre_ucharset': Modules/_sre.c:487: warning: comparison is always true due to limited range of data type Suggestions welcome.
The culprit is the line
if (state->data_stack_size < alloc_pos+sizeof(type)) { \
because data_stack_size is signed, and sizeof(type) is size_t. This line is yours
2.101 (niemeyer 17-Oct-03): if (state->data_stack_size < alloc_pos+sizeof (type)) {
Changing the type of data_stack_size to unsigned makes the warning go away.
It will be a pleasure to fix that once I'm back from vacation (dialup tsc tsc), and get the necessary time. Thanks for pointing me out.
Please change all uses of sizes/positions to "size_t", and change the special -1 marker to (size_t)-1.
Ditto. -- Gustavo Niemeyer http://niemeyer.net

[Gustavo Niemeyer]
Ohh.. I wasn't aware about this case, since I get no errors at all at this line.
I think Martin is using a more-recent version of gcc than most people here. Everyone on Windows (MSVC) sees these warnings, though.
Do you see any others?
Windows doesn't complain about the ones your compiler complains about; it complains about 17 others, listed here (although the line numbers have probably changed in the last 3 months <wink>): http://mail.python.org/pipermail/python-dev/2003-October/039059.html I personally get these:
Modules/_sre.c:381: warning: comparison is always true due to limited range of data type Modules/_sre.c:383: warning: comparison is always true due to limited range of data type Modules/_sre.c:390: warning: comparison is always true due to limited range of data type Modules/_sre.c:392: warning: comparison is always true due to limited range of data type Modules/_sre.c: In function `sre_charset': Modules/_sre.c:487: warning: comparison is always true due to limited range of data type Modules/_sre.c: In function `sre_ucharset': Modules/_sre.c:487: warning: comparison is always true due to limited range of data type
Suggestions welcome.
Upgrade to Windows <heh>. Most of those seem to come from lines of the form SRE_LOC_IS_WORD((int) ptr[-1]) : 0; where SRE_CHAR* ptr and either #define SRE_CHAR unsigned char or #define SRE_CHAR Py_UNICODE and #define SRE_LOC_IS_WORD(ch) (SRE_LOC_IS_ALNUM((ch)) || (ch) == '_') #define SRE_LOC_IS_ALNUM(ch) ((ch) < 256 ? isalnum((ch)) : 0) So it's apparently bitching about (((int) ptr[-1])) < 256 when the "unsigned char" expansion of SRE_CHAR is in effect. I suppose that could be repaired by defining SRE_LOC_IS_ALNUM differently depending on how SRE_CHAR is defined. The warning on line 487 comes from if (ch < 65536) where SRE_CODE ch and SRE_CODE is unsigned short or unsigned long, depending on Py_UNICODE_WIDE. This warning is really irritating. I suppose #if defined(Py_UNICODE_WIDE) || SIZEOF_SHORT > 2 if (ch < 65536) #endif block = ((unsigned char*)set)[ch >> 8]; #if defined(Py_UNICODE_WIDE) || SIZEOF_SHORT > 2 else block = -1; #endif would shut it up, but that's sure ugly.

I'm back to life, after a long and nice vacation. Sorry for taking a while to answer this.
I think Martin is using a more-recent version of gcc than most people here. Everyone on Windows (MSVC) sees these warnings, though.
I'm using gcc 3.3.2 here.
Do you see any others?
Windows doesn't complain about the ones your compiler complains about; it complains about 17 others, listed here (although the line numbers have probably changed in the last 3 months <wink>):
http://mail.python.org/pipermail/python-dev/2003-October/039059.html
Ok.. it should give me an idea about where to look. [...]
Suggestions welcome.
Upgrade to Windows <heh>.
Next one please.. :-)
Most of those seem to come from lines of the form
SRE_LOC_IS_WORD((int) ptr[-1]) : 0; [...]
Yes, that's what I reported in my first email. IMO, the compiler should not complain about it, since the char is being casted away. [...]
when the "unsigned char" expansion of SRE_CHAR is in effect. I suppose that could be repaired by defining SRE_LOC_IS_ALNUM differently depending on how SRE_CHAR is defined.
That's an alternative indeed. [...]
and SRE_CODE is unsigned short or unsigned long, depending on Py_UNICODE_WIDE. This warning is really irritating. I suppose [...] would shut it up, but that's sure ugly.
I'm trying to avoid using something ugly to avoid an ugly warning. I'll spend some time on it now that I'm back, and relaxed. :-) Thanks. -- Gustavo Niemeyer http://niemeyer.net

Gustavo Niemeyer wrote:
Ohh.. I wasn't aware about this case, since I get no errors at all at this line. Do you see any others?
As Tim explains, this is likely due to me using a more recent gcc (3.3). I'll attach my current list of warnings below. The signed/unsigned warnings are actually harmless: they cause a problem only when comparing negative and unsigned numbers (where the negative number gets converted to unsigned and may actually compare larger than the positive one). This should not be an issue for SRE - I believe all the numbers are always non-negative, unless they are the -1 marker (for which I'm uncertain whether it is ever relationally compared). Regards, Martin Modules/_sre.c: In function `sre_lower': Modules/_sre.c:149: warning: signed and unsigned type in conditional expression Modules/_sre.c: In function `sre_lower_locale': Modules/_sre.c:162: warning: signed and unsigned type in conditional expression In file included from Modules/_sre.c:296: Modules/_sre.c: In function `sre_at': Modules/_sre.c:381: warning: comparison is always true due to limited range of data type Modules/_sre.c:383: warning: comparison is always true due to limited range of data type Modules/_sre.c:390: warning: comparison is always true due to limited range of data type Modules/_sre.c:392: warning: comparison is always true due to limited range of data type Modules/_sre.c: In function `sre_match': Modules/_sre.c:811: warning: comparison between signed and unsigned Modules/_sre.c:824: warning: comparison between signed and unsigned Modules/_sre.c:980: warning: comparison between signed and unsigned Modules/_sre.c:991: warning: comparison between signed and unsigned Modules/_sre.c:1059: warning: comparison between signed and unsigned Modules/_sre.c:1076: warning: comparison between signed and unsigned Modules/_sre.c:1132: warning: comparison between signed and unsigned Modules/_sre.c:1166: warning: comparison between signed and unsigned Modules/_sre.c:1194: warning: comparison between signed and unsigned Modules/_sre.c:1198: warning: comparison between signed and unsigned Modules/_sre.c:1208: warning: comparison between signed and unsigned Modules/_sre.c:1215: warning: comparison between signed and unsigned Modules/_sre.c:1217: warning: comparison between signed and unsigned Modules/_sre.c:1220: warning: comparison between signed and unsigned Modules/_sre.c:1236: warning: comparison between signed and unsigned Modules/_sre.c:1257: warning: comparison between signed and unsigned Modules/_sre.c:1261: warning: comparison between signed and unsigned Modules/_sre.c:1275: warning: comparison between signed and unsigned Modules/_sre.c:1287: warning: comparison between signed and unsigned Modules/_sre.c:1292: warning: comparison between signed and unsigned Modules/_sre.c:1380: warning: comparison between signed and unsigned Modules/_sre.c:1392: warning: comparison between signed and unsigned Modules/_sre.c: In function `sre_umatch': Modules/_sre.c:811: warning: comparison between signed and unsigned Modules/_sre.c:824: warning: comparison between signed and unsigned Modules/_sre.c:980: warning: comparison between signed and unsigned Modules/_sre.c:991: warning: comparison between signed and unsigned Modules/_sre.c:1051: warning: comparison between signed and unsigned Modules/_sre.c:1059: warning: comparison between signed and unsigned Modules/_sre.c:1076: warning: comparison between signed and unsigned Modules/_sre.c:1132: warning: comparison between signed and unsigned Modules/_sre.c:1166: warning: comparison between signed and unsigned Modules/_sre.c:1194: warning: comparison between signed and unsigned Modules/_sre.c:1198: warning: comparison between signed and unsigned Modules/_sre.c:1208: warning: comparison between signed and unsigned Modules/_sre.c:1215: warning: comparison between signed and unsigned Modules/_sre.c:1217: warning: comparison between signed and unsigned Modules/_sre.c:1220: warning: comparison between signed and unsigned Modules/_sre.c:1236: warning: comparison between signed and unsigned Modules/_sre.c:1257: warning: comparison between signed and unsigned Modules/_sre.c:1261: warning: comparison between signed and unsigned Modules/_sre.c:1275: warning: comparison between signed and unsigned Modules/_sre.c:1287: warning: comparison between signed and unsigned Modules/_sre.c:1292: warning: comparison between signed and unsigned Modules/_sre.c:1380: warning: comparison between signed and unsigned Modules/_sre.c:1392: warning: comparison between signed and unsigned

Ohh.. I wasn't aware about this case, since I get no errors at all at this line. Do you see any others?
As Tim explains, this is likely due to me using a more recent gcc (3.3).
That's the same one I'm using.
I'll attach my current list of warnings below.
Thanks.
The signed/unsigned warnings are actually harmless: they cause a problem only when comparing negative and unsigned numbers (where the negative number gets converted to unsigned and may actually compare larger than the positive one). This should not be an issue for SRE - I believe all the numbers are always non-negative, unless they are the -1 marker (for which I'm uncertain whether it is ever relationally compared).
I'll check it out. Thanks. -- Gustavo Niemeyer http://niemeyer.net

Martin v. Loewis wrote:
Please change all uses of sizes/positions to "size_t", and change the special -1 marker to (size_t)-1.
If sizeof(int) < sizeof(size_t), is it *guaranteed* that (size_t)-1 expands to a bit pattern of all 1's? Also, is it *guaranteed* that you don't get more warnings (converting a negative quantity to unsigned)? I've been using ~(size_t)1 for things like this where these *are* guaranteed. -- Sjoerd Mullender <sjoerd@acm.org>

[Martin v. Loewis]
Please change all uses of sizes/positions to "size_t", and change the special -1 marker to (size_t)-1.
[Sjoerd Mullender]
If sizeof(int) < sizeof(size_t), is it *guaranteed* that (size_t)-1 expands to a bit pattern of all 1's?
Guaranteed (not to mention *guaranteed* <wink>) is a very strong thing, and C explicitly allows for that an integer type other than unsigned char may contain bits that don't contribute to the value (like parity bits, or a not-an-integer bit). This makes "a bit pattern of all 1's" hard to relate to what the standard says. For casts from signed to unsigned, when the original value can't be represented in the new type (which includes all negative original values): the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. If we assume a binary box and that int and size_t don't have "extra" bits (Python assumes all this in more than one place already), then yes, the result is guaranteed to be a string of 1 bits: it's mathematically 2**i-1, where i is the number of bits in a size_t. That resulting mathematical *value* is well-defined even if there are extra bits, although C doesn't define what the extra bits may contain.
Also, is it *guaranteed* that you don't get more warnings (converting a negative quantity to unsigned)?
Well, the standard never guarantees that you won't get a warning. It's traditional for C compilers not to whine about explicit casts, no matter how goofy they are. Casting from signed to unsigned is well-defined, so there's no reason at all to whine about that. Note that there are instances of (size_t)-1 in Python already (cPickle.c and longobject.c), and no reports of warnings from those.
I've been using ~(size_t)1 for things like this where these *are* guaranteed.
I think you mean ~(size_t)0. There are also two instances of that in Python's source now. I think either way is fine.

Sjoerd Mullender wrote:
If sizeof(int) < sizeof(size_t), is it *guaranteed* that (size_t)-1 expands to a bit pattern of all 1's?
As Tim explains: If you assume that the machine uses two's complement, then yes. However, this is irrelevant. More interestingly: Is it guaranteed that you get the largest possible value of size_t? Based on the language spec fragment that Tim cites: yes. This actually isn't really relevant either. Is it guaranteed that you get a value of size_t different from all values that denote real sizes of things? To this, the answer clearly is no: There might be objects whose size is (size_t)-1. However, it is unlikely (perhaps even *unlikely* :-) that you ever meet such an object in real life. By definition, (size_t)-1 == ~(size_t)0 in all conforming implementations. Regards, Martin
participants (4)
-
Gustavo Niemeyer
-
Martin v. Loewis
-
Sjoerd Mullender
-
Tim Peters