Hi,
There’s an error with the string method count().
    x = 'AAA'
    y = 'AA'
    print(x.count(y))
The output is 1, instead of 2.
I write programs on the SoloLearn mobile app.
Warm regards, Julia Kim
str.count counts non-overlapping instances of the substring. After the first 'AA' is counted, only one 'A' is left, and that is not a second instance of 'AA'.
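To make the non-overlapping rule concrete, here is a minimal sketch of a scan that gives the same answer as str.count for a non-empty substring. The name count_non_overlapping is made up for illustration, and this is not how CPython implements the method; the only point is that the search resumes after the end of each match.

```python
def count_non_overlapping(text, sub):
    """Count matches like str.count does (illustrative; assumes sub is non-empty)."""
    if not sub:
        raise ValueError("this sketch assumes a non-empty substring")
    count = 0
    start = text.find(sub)
    while start != -1:
        count += 1
        # Resume after the END of the match, so two matches can never overlap.
        start = text.find(sub, start + len(sub))
    return count


print(count_non_overlapping("AAA", "AA"))  # same result as "AAA".count("AA")
```

In 'AAA' the first match covers indices 0 and 1, so the next search starts at index 2, where only a single 'A' remains.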
Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Hi,
From https://docs.python.org/3/library/stdtypes.html#str.count:
str.count(sub[, start[, end]])
Return the number of *non-overlapping* occurrences of substring sub in the range [start, end]. Optional arguments start and end are interpreted as in slice notation.
Best regards, João Santos
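A short illustration of the optional start and end arguments described in that documentation excerpt. The variable names are only for the example; the comparison with slicing follows from "interpreted as in slice notation" and holds for a non-empty substring.

```python
text = "AAAA"

# Whole string: 'AA' at index 0 and 'AA' at index 2 (non-overlapping).
whole = text.count("AA")

# start=1 restricts the search to text[1:], which is 'AAA'.
from_one = text.count("AA", 1)

# start=0, end=3 restricts the search to text[0:3], which is also 'AAA'.
up_to_three = text.count("AA", 0, 3)

# Negative indices behave as they do in slices: text[-2:] is 'AA'.
last_two = text.count("AA", -2)

print(whole, from_one, up_to_three, last_two)
```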
On Wed, Apr 25, 2018 at 11:22:24AM -0700, Julia Kim wrote:
> There's an error with the string method count().
>     x = 'AAA'
>     y = 'AA'
>     print(x.count(y))
> The output is 1, instead of 2.
Are you proposing that there ought to be a version of count that looks for *overlapping* substrings?
When will this be useful?
On Wednesday, April 25, 2018, Steven D'Aprano steve@pearwood.info wrote:
> Are you proposing that there ought to be a version of count that looks for *overlapping* substrings?
> When will this be useful?
"Finding a motif in DNA" http://rosalind.info/problems/subs/
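This is possible with re.finditer or re.findall and a lookahead pattern (the re module has no re.find), with regex.findall(pattern, string, overlapped=True) from the third-party regex module, or with a sliding window: https://stackoverflow.com/questions/2970520/string-count-with-overlapping-occurrences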
n-grams can be collected by index or by value; given the indices, count = len(indices). https://en.wikipedia.org/wiki/N-gram#Examples
https://en.wikipedia.org/wiki/String_(computer_science)#String_processing_algorithms
https://en.wikipedia.org/wiki/Sequential_pattern_mining
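As a sketch of the re-based option: a zero-width lookahead matches at every position where the substring starts without consuming any characters, so overlapping occurrences are all reported. The function names below are made up for illustration, and the indices are 0-based (the Rosalind problem reports 1-based positions).

```python
import re


def overlapping_positions(text, sub):
    """Return the 0-based start index of every (possibly overlapping) match."""
    # re.escape keeps characters such as '.' or '*' in sub from acting as regex syntax.
    pattern = re.compile("(?=" + re.escape(sub) + ")")
    return [match.start() for match in pattern.finditer(text)]


def count_overlapping(text, sub):
    """Count matches by collecting the indices first: count = len(indices)."""
    return len(overlapping_positions(text, sub))


print(overlapping_positions("AAA", "AA"))
print(count_overlapping("GATATATGCATATACTT", "ATAT"))
```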
or build it yourself...
    def str_count(string, sub):
        count = 0
        # + 1 so that the last possible start position is checked as well
        for i in range(len(string) - len(sub) + 1):
            if string.startswith(sub, i):
                count += 1
        return count
(probably some optimizations possible...)
Or in one line with a generator expression:
    def str_count(string, sub):
        return sum(string.startswith(sub, i) for i in range(len(string) - len(sub) + 1))
Regular expressions would probably be at least an order of magnitude faster, if this is a bottleneck for you. But a pure Python implementation of this is a lot easier than it would be for the current str.count().
2018-04-26 8:57 GMT+02:00 Wes Turner wes.turner@gmail.com:
> "Finding a motif in DNA" http://rosalind.info/problems/subs/
Regular expressions are not just "an order of magnitude better": they're asymptotically faster. See https://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm for a non-regular-expression algorithm.
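For reference, a compact sketch of counting overlapping occurrences with Knuth-Morris-Pratt. This is a textbook-style version written for this thread, not code from any library; the function name is made up, and it assumes a non-empty substring. The scan does work proportional to len(text) + len(sub).

```python
def kmp_count_overlapping(text, sub):
    """Count possibly overlapping occurrences of sub in text using KMP."""
    if not sub:
        raise ValueError("this sketch assumes a non-empty substring")

    # fail[i] = length of the longest proper prefix of sub[:i + 1]
    # that is also a suffix of it.
    fail = [0] * len(sub)
    k = 0
    for i in range(1, len(sub)):
        while k and sub[i] != sub[k]:
            k = fail[k - 1]
        if sub[i] == sub[k]:
            k += 1
        fail[i] = k

    count = 0
    k = 0  # number of characters of sub currently matched
    for ch in text:
        while k and ch != sub[k]:
            k = fail[k - 1]
        if ch == sub[k]:
            k += 1
        if k == len(sub):
            count += 1
            # Fall back instead of resetting to 0, so overlapping matches are kept.
            k = fail[k - 1]
    return count


print(kmp_count_overlapping("AAA", "AA"))
```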
On Thursday, April 26, 2018 at 5:45:20 AM UTC-4, Jacco van Dorp wrote:
> regular expressions would probably be at least an order of magnitude better in speed, if it's a bottleneck to you.
There are two occurrences of 'AA' in 'AAA', one starting at index 0 and the other starting at index 1.
If the 'AA' starting at index 0 is deleted and 'BANAN' is inserted in its place, 'AAA' becomes 'BANANA'.
If the 'AA' starting at index 1 is deleted and 'PPLE' is inserted in its place, 'AAA' becomes 'APPLE'.
Depending on which one is chosen, 'AAA' can be edited to 'BANANA' or 'APPLE', two different results.
I wrote a program which edits a part of a text. If the part to be edited occurs more than once, it presents the positions and asks the user to choose which one to edit.
I tried different algorithms. The best one so far just calls find() repeatedly and collects the results in a list.
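The find()-and-collect approach could look roughly like the following. This is a guess at the general shape, not the actual program described above; find_all and replace_at are hypothetical names used only for this sketch.

```python
def find_all(text, sub):
    """Collect the start index of every occurrence of sub, overlapping ones included."""
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Move forward by one character only, so overlapping matches are found too.
        start = text.find(sub, start + 1)
    return positions


def replace_at(text, sub, position, replacement):
    """Replace the occurrence of sub that starts at position with replacement."""
    return text[:position] + replacement + text[position + len(sub):]


positions = find_all("AAA", "AA")
print(positions)
print(replace_at("AAA", "AA", positions[0], "BANAN"))
print(replace_at("AAA", "AA", positions[1], "PPLE"))
```

The list of positions is what would be shown to the user; the chosen index is then passed to replace_at.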
If this were for a school assignment, I'd probably go to edit distance and fuzzy string matching next:
- https://en.wikipedia.org/wiki/Edit_distance
- https://en.wikipedia.org/wiki/String-to-string_correction_problem
- https://pypi.org/search/?q=Levenshtein
- https://pypi.org/project/textdistance/
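For anyone who has not met edit distance before, a minimal sketch of the Levenshtein variant (insertions, deletions and substitutions each cost 1) using the usual dynamic-programming table, kept to two rows. The function name is made up for illustration, and this is not taken from any of the packages linked above.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (two-row dynamic programming)."""
    # previous[j] holds the distance between the prefix of a seen so far and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


print(edit_distance("kitten", "sitting"))
print(edit_distance("AAA", "BANANA"))
```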
As a bioinformatics program, this is a bit like CRISPR: https://en.wikipedia.org/wiki/CRISPR
Biopython's Seq class has a count_overlap method, under a BSD 3-Clause license: https://github.com/biopython/biopython/blob/master/LICENSE.rst
Can it be made faster with e.g. itertools.count and a generator expression?
- Bio.Seq.Seq.count_overlap() http://biopython.org/DIST/docs/api/Bio.Seq.Seq-class.html#count_overlap
Are there any changes or features necessary in core Python in order to finish this application? If not, the python-tutor mailing list or r/learnpython are set up to handle this sort of thing.
It may or may not be appropriate for core Python to support all of these string algorithms: http://rosalind.info/problems/topics/string-algorithms/
On Thursday, April 26, 2018, Julia Kim julia.hiyeon.kim@gmail.com wrote:
> I wrote a program which edits a part of a text. If the part to be edited occurs more than once, it presents the positions and asks the user to choose which one to be edited.