string method count()
Hi,

There’s an error with the string method count().

x = 'AAA'
y = 'AA'
print(x.count(y))

The output is 1, instead of 2.

I write programs on SoloLearn mobile app.

Warm regards,
Julia Kim
str.count counts non-overlapping instances of the substring. After counting the first 'AA', there is only one 'A' left, so there is no second instance of 'AA'.
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Hi,
From https://docs.python.org/3/library/stdtypes.html#str.count:

str.count(*sub*[, *start*[, *end*]])
Return the number of *non-overlapping* occurrences of substring *sub* in
the range [*start*, *end*]. Optional arguments *start* and *end* are
interpreted as in slice notation.
Best regards,
João Santos
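A small sketch of the documented behaviour may help here. The strings other than 'AAA' and 'AA' are illustrative and do not come from the thread. After each match the scan resumes just past the end of that match, and start and end restrict the search the way slice bounds do.

```python
x = 'AAA'
y = 'AA'

# Non-overlapping: the match at index 0 consumes x[0:2], leaving only 'A'.
print(x.count(y))          # 1

# With four characters there is room for two non-overlapping matches.
print('AAAA'.count(y))     # 2

# start and end behave like slice bounds: this searches x[1:3] == 'AA'.
print(x.count(y, 1, 3))    # 1

# x[1:2] == 'A' is shorter than the substring, so nothing can match.
print(x.count(y, 1, 2))    # 0
```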
On Wed, Apr 25, 2018 at 11:22:24AM -0700, Julia Kim wrote:
Hi,
There’s an error with the string method count().
x = ‘AAA’ y = ‘AA’ print(x.count(y))
The output is 1, instead of 2.
Are you proposing that there ought to be a version of count that looks for *overlapping* substrings? When will this be useful?

-- Steve
"Finding a motif in DNA"
http://rosalind.info/problems/subs/

This is possible with re.search, re.finditer, re.findall (using a lookahead, because the re module returns only non-overlapping matches), regex.findall(..., overlapped=True), or a sliding window:
https://stackoverflow.com/questions/2970520/string-count-with-overlapping-occurrences

n-grams can be by indices or by value. count = len(indices)
https://en.wikipedia.org/wiki/N-gram#Examples
https://en.wikipedia.org/wiki/String_(computer_science)#String_processing_algorithms
https://en.wikipedia.org/wiki/Sequential_pattern_mining
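As a rough sketch of the regular-expression route mentioned above: wrapping the substring in a lookahead makes every match zero-width, so the standard re module reports each start position even when occurrences overlap. The helper name count_overlapping is made up for this example and is not part of any library.

```python
import re

def count_overlapping(string, sub):
    """Count occurrences of sub in string, allowing overlaps."""
    # A lookahead matches without consuming characters, so the scanner
    # advances one position at a time and sees every start index.
    pattern = '(?=' + re.escape(sub) + ')'
    return sum(1 for _ in re.finditer(pattern, string))

print(count_overlapping('AAA', 'AA'))   # 2
print('AAA'.count('AA'))                # 1, for comparison
```

re.escape is used so that substrings containing characters such as '.' or '*' are matched literally.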
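or build it yourself...

def str_count(string, sub):
    # Count occurrences of sub in string, overlapping ones included.
    count = 0
    # The + 1 lets a match that ends at the last character be counted.
    for i in range(len(string) - len(sub) + 1):
        if string.startswith(sub, i):
            count += 1
    return count

(probably some optimizations possible...)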
Or in one line with a generator expression:

def str_count(string, sub):
    # + 1 so the last possible start index is checked too
    return sum(string.startswith(sub, i) for i in range(len(string) - len(sub) + 1))
Regular expressions would probably be at least an order of magnitude faster, if this is a bottleneck for you. But a pure-Python implementation of overlapping counting is a lot easier to write than one for the current str.count().
Regular expressions are not just "an order of magnitude better": they're asymptotically faster. See https://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm for a non-regular-expression algorithm.
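For reference, here is one way the Knuth-Morris-Pratt idea linked above could be applied to counting overlapping occurrences. This is a minimal sketch rather than a tuned implementation, and the name kmp_count is made up for the example. It runs in time proportional to len(text) + len(pattern), because the failure table lets the scan continue without re-reading text.

```python
def kmp_count(text, pattern):
    """Count occurrences of pattern in text, overlaps included."""
    if not pattern:
        return len(text) + 1  # mirror str.count for the empty pattern

    # failure[i] is the length of the longest proper prefix of
    # pattern[:i + 1] that is also a suffix of it.
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k

    count = 0
    k = 0  # number of pattern characters currently matched
    for ch in text:
        while k and ch != pattern[k]:
            k = failure[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            count += 1
            k = failure[k - 1]  # fall back so overlapping matches are found
    return count

print(kmp_count('AAA', 'AA'))   # 2
```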
There are two ‘AA’ in ‘AAA’, one starting at index 0 and the other starting at index 1.

If the ‘AA’ starting at 0 is replaced with ‘BANAN’, ‘AAA’ becomes ‘BANANA’.
If the ‘AA’ starting at 1 is replaced with ‘PPLE’, ‘AAA’ becomes ‘APPLE’.

Depending on which one is chosen, ‘AAA’ can be edited to ‘BANANA’ or ‘APPLE’, two different results.

I wrote a program which edits a part of a text. If the part to be edited occurs more than once, it presents the positions and asks the user to choose which one to edit.

I tried different algorithms. The best one so far is just using find() and collecting the results in a list.
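The find()-based approach described above might look roughly like the following. The function names find_all and replace_at are hypothetical, chosen for this sketch; the actual program was not posted to the list.

```python
def find_all(text, part):
    """Return every start index of part in text, overlaps included."""
    positions = []
    start = text.find(part)
    while start != -1:
        positions.append(start)
        # Resume one character later (not len(part) later) to allow overlaps.
        start = text.find(part, start + 1)
    return positions

def replace_at(text, part, position, replacement):
    """Replace the occurrence of part that starts at position."""
    if not text.startswith(part, position):
        raise ValueError('part does not occur at that position')
    return text[:position] + replacement + text[position + len(part):]

positions = find_all('AAA', 'AA')
print(positions)                              # [0, 1]
print(replace_at('AAA', 'AA', 0, 'BANAN'))    # BANANA
print(replace_at('AAA', 'AA', 1, 'PPLE'))     # APPLE
```

The number of overlapping occurrences is then simply len(positions).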
If this were for a school assignment, I'd probably look at edit distance and
fuzzy string matching next:
https://en.wikipedia.org/wiki/Edit_distance
https://en.wikipedia.org/wiki/String-to-string_correction_problem
- https://pypi.org/search/?q=Levenshtein
- https://pypi.org/project/textdistance/
As a bioinformatics program, this is a bit like CRISPR:
https://en.wikipedia.org/wiki/CRISPR
BioPython Seq has a count_overlap method with a BSD 3-Clause LICENSE:
https://github.com/biopython/biopython/blob/master/LICENSE.rst
Can it be made faster with e.g. itertools.count and a generator
expression?
- Bio.Seq.Seq.count_overlap()
http://biopython.org/DIST/docs/api/Bio.Seq.Seq-class.html#count_overlap
Are there any changes or features necessary in core Python in order to
finish this application?
If not, the python-tutor mailing list or r/learnpython are set up to handle
this sort of thing.
It may or may not be appropriate for core Python to support all of these
string algorithms:
http://rosalind.info/problems/topics/string-algorithms/
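For readers following the edit-distance pointer above, a minimal pure-Python sketch of Levenshtein distance is shown below. It is a textbook dynamic-programming version written for illustration, not the implementation used by any of the linked packages.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character edits turning a into b."""
    # previous[j] holds the distance between the processed prefix of a
    # and b[:j]; it starts as the distance from the empty string.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]

print(levenshtein('kitten', 'sitting'))   # 3
```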
participants (7)
- Alexandre Brault
- Jacco van Dorp
- João Santos
- Julia Kim
- Neil Girdhar
- Steven D'Aprano
- Wes Turner