
I've just spotted an inconsistency between string and list handling:
Why do strings behave differently here than other sequence types? Is that by design?

Strings have a common use case which lists do not: finding subsequences/substrings. Consider the following:
For strings, the contains operator ("in") has a different meaning than it does for lists. A list contains an object if (and only if) that object is a single element of the list. A string contains another string if (and only if) the other string is a substring of the first string.
-- Matthew Lefavor, NASA GSFC [Microtel, LLC]
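A minimal sketch of the difference described above. The values are made up for illustration and are not taken from the original message:

```python
# Illustrative values only; not from the original thread.
text = "abcdef"
items = ["a", "b", "c", "d", "e", "f"]

# For str, `in` tests for a substring of any length.
substring_found = "bcd" in text

# For list, `in` tests whether the object equals a single element,
# so a run of adjacent elements is not "contained".
sublist_found = ["b", "c", "d"] in items
element_found = "b" in items

# The empty string is a substring of every string, but an empty list
# is only "in" a list if it was stored there as an element.
empty_in_text = "" in text
empty_in_items = [] in items

print(substring_found, sublist_found, element_found)
print(empty_in_text, empty_in_items)
```

Run as-is, this should print `True False True` followed by `True False`.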

On 2012-07-18, at 19:43 , Masklinn wrote:
In fact, things used to work that way in older Python; this was specifically changed to the current behavior *as noted in the documentation*:
A Python string, you may want to note, is a string. Not a sequence of characters. The first item of a 1-character string is itself; all basic (step-less) slices of a string are contained in it (including the string itself and the empty string); you can keep getting the first item of a non-empty string indefinitely; and I'm sure I'm missing plenty.
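The properties listed above can be checked directly. This is a small sketch with arbitrary sample strings, contrasted with a list to show that the behavior is specific to strings:

```python
# Arbitrary sample values, chosen only for illustration.
s = "a"
word = "python"
nums = [1, 2, 3]

# The first item of a 1-character string equals the string itself.
first_is_self = s[0] == s

# Indexing a non-empty string can be repeated indefinitely.
nested_first = word[0][0][0][0]

# Every basic (step-less) slice, including the empty string and the
# whole string, is contained in the original string.
all_slices_contained = all(
    word[i:j] in word
    for i in range(len(word) + 1)
    for j in range(i, len(word) + 1)
)

# A list does not behave this way: a slice is a list, not an element.
list_slice_contained = nums[0:2] in nums
list_first_is_self = nums[:1][0] == nums[:1]

print(first_is_self, nested_first, all_slices_contained)
print(list_slice_contained, list_first_is_self)
```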

On Wed, Jul 18, 2012 at 1:58 PM, Masklinn <masklinn@masklinn.net> wrote:
A Python string, you may want to note, is a string. Not a sequence of characters.
It's both (with the caveat that, in Python, a character is just a string of length 1). (See: http://docs.python.org/reference/datamodel.html#the-standard-type-hierarchy ) -- Devin

On 2012-07-18, at 20:06 , Devin Jeanpierre wrote:
That's playing with words, especially when comparing strings with Python 3 binaries, which *do* actually have a separate "character" type (reified as an integer). Python strings don't have reified characters: a string's item and a slice of size 1 are essentially identical, which is pretty much unique to them (as far as my knowledge of Python's sequences goes). Which is not a bad thing, mind you; it makes working with strings much more pleasant.
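A short sketch of the contrast being drawn here, assuming Python 3 and reading "binaries" as bytes objects. The sample values are arbitrary:

```python
# Python 3 assumed; sample values are arbitrary.
data = b"abc"
text = "abc"

# Indexing bytes yields an int, while a length-1 slice yields bytes.
byte_item = data[0]
byte_slice = data[0:1]

# Indexing str yields a str, and so does a length-1 slice.
str_item = text[0]
str_slice = text[0:1]

item_equals_slice_for_bytes = byte_item == byte_slice
item_equals_slice_for_str = str_item == str_slice

print(type(byte_item).__name__, type(byte_slice).__name__)
print(type(str_item).__name__, type(str_slice).__name__)
print(item_equals_slice_for_bytes, item_equals_slice_for_str)
```

For bytes the item and the slice have different types (`int` and `bytes`), whereas for str both are `str` and compare equal.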

On Wed, Jul 18, 2012 at 2:16 PM, Masklinn <masklinn@masklinn.net> wrote:
No, it isn't. Strings adhere to the sequence protocol. The Python datatype reference echoes what I said, nearly exactly. http://docs.python.org/reference/datamodel.html#the-standard-type-hierarchy
Nothing about that feature makes them not-sequences; instead, it makes them a rather special kind of sequence. -- Devin
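One way to see both halves of this point (strings are sequences, and a special kind of sequence) is to check them against `collections.abc.Sequence`. This is only a sketch with arbitrary values, not a statement about how the reference defines the protocol:

```python
from collections.abc import Sequence

# Arbitrary sample values.
text = "abc"
nums = [1, 2, 3]

# str supports the usual sequence operations.
is_sequence = isinstance(text, Sequence)
length = len(text)
last_item = text[-1]
reversed_items = list(reversed(text))

# The special part: every item of a str is itself a str, and
# therefore itself a sequence. List items need not be.
str_items_are_sequences = all(isinstance(ch, Sequence) for ch in text)
list_items_are_sequences = all(isinstance(n, Sequence) for n in nums)

print(is_sequence, length, last_item, reversed_items)
print(str_items_are_sequences, list_items_are_sequences)
```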

On 2012-07-18, at 20:31 , Devin Jeanpierre wrote:
This has no relevance to my messages; I have not claimed anywhere that strings weren't sequences.
I'm not sure why you're saying that. Again, I have never once claimed they were not sequences (quite the opposite in fact). Why the strawmanning?

On Fri, 20 Jul 2012 14:18:20 -0700 Ethan Furman <ethan@stoneleaf.us> wrote:
"Python 3 binaries" probably means "Python 3 bytes objects" above. Regards, Antoine. -- Software development and contracting: http://pro.pitrou.net

Andrew Svetlov wrote:
Masklinn's explanation is comprehensive and clear to me.
I'm glad that it's clear to someone, because to me the straightforward, literal meaning of Masklinn's explanation (that Python 3 has a character type, and that its values are integers) is wrong. Python has no built-in "Char" type, under any spelling, let alone one which is also a subset of int.
The non-literal meaning is hard to understand. I *guess* that Masklinn is trying to get across that Python 3 strings are Unicode strings, and characters in Unicode are actually code points, which are implemented at the C level as integers. If not that, I have no idea.
I've more or less forgotten why this was important, but I am enjoying watching people try to out-pedant each other :)
-- Steven
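The point about characters and code points can be illustrated with the built-ins `ord` and `chr`. This is a sketch assuming Python 3; the sample string is arbitrary:

```python
# Python 3 assumed; the sample string is arbitrary.
text = "h\u00e9llo"   # "héllo", written with an escape to avoid encoding ambiguity

# An item of a str is a str of length 1, not an int or a separate char type.
ch = text[1]
item_is_str = type(ch) is str
item_is_int = isinstance(ch, int)

# The integer code point is available only through an explicit conversion.
code_point = ord(ch)
round_trip = chr(code_point) == ch

print(item_is_str, item_is_int, code_point, round_trip)
```

Here `ord` returns 233, the code point U+00E9, and `chr` converts it back to the same one-character string.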

participants (10):
- anatoly techtonik
- Andrew Svetlov
- Antoine Pitrou
- Devin Jeanpierre
- Ethan Furman
- Georg Brandl
- Lefavor, Matthew (GSFC-582.0)[MICROTEL LLC]
- Masklinn
- Serhiy Storchaka
- Steven D'Aprano