[issue14422] Pack PyASCIIObject fields to reduce memory consumption of pure ASCII strings
STINNER Victor
report at bugs.python.org
Tue Mar 27 13:23:03 CEST 2012
STINNER Victor <victor.stinner at gmail.com> added the comment:
iobench and stringbench results on unpatched Python:
$ ./python Tools/iobench/iobench.py -t
Preparing files...
Python 3.3.0a1+ (default:51016ff7f8c9, Mar 27 2012, 13:19:52)
[GCC 4.6.1]
Unicode: PEP 393
Linux-3.0.0-16-generic-pae-i686-with-debian-wheezy-sid
Text unit = one character (utf8-decoded)
** Text input **
[ 400KB ] read one unit at a time... 5.4 MB/s
[ 400KB ] read 20 units at a time... 68 MB/s
[ 400KB ] read one line at a time... 174 MB/s
[ 400KB ] read 4096 units at a time... 289 MB/s
[ 20KB ] read whole contents at once... 315 MB/s
[ 400KB ] read whole contents at once... 332 MB/s
[ 10MB ] read whole contents at once... 292 MB/s
[ 400KB ] seek forward one unit at a time... 0.304 MB/s
[ 400KB ] seek forward 1000 units at a time... 312 MB/s
** Text append **
[ 20KB ] write one unit at a time... 3.05 MB/s
[ 400KB ] write 20 units at a time... 43 MB/s
[ 400KB ] write 4096 units at a time... 554 MB/s
[ 10MB ] write 1e6 units at a time... 450 MB/s
** Text overwrite **
[ 20KB ] modify one unit at a time... 1.18 MB/s
[ 400KB ] modify 20 units at a time... 18.9 MB/s
[ 400KB ] modify 4096 units at a time... 400 MB/s
$ ./python stringbench/stringbench.py
stringbench v2.0
3.3.0a1+ (default:51016ff7f8c9, Mar 27 2012, 13:19:52)
[GCC 4.6.1]
2012-03-27 13:21:01.217823
bytes unicode
(in ms) (in ms) % comment
========== case conversion -- dense
0.37 0.38 97.9 ("WHERE IN THE WORLD IS CARMEN SAN DEIGO?"*10).lower() (*1000)
0.38 0.38 99.3 ("where in the world is carmen san deigo?"*10).upper() (*1000)
========== case conversion -- rare
0.38 0.38 99.9 ("Where in the world is Carmen San Deigo?"*10).lower() (*1000)
0.43 0.38 113.6 ("wHERE IN THE WORLD IS cARMEN sAN dEIGO?"*10).upper() (*1000)
========== concat 20 strings of words length 4 to 15
1.76 1.69 104.2 s1+s2+s3+s4+...+s20 (*1000)
========== concat two strings
0.08 0.07 107.7 "Andrew"+"Dalke" (*1000)
========== count AACT substrings in DNA example
2.15 2.13 100.7 dna.count("AACT") (*10)
========== count newlines
0.65 0.58 110.8 ...text.with.2000.newlines.count("\n") (*10)
========== early match, single character
0.20 0.19 107.9 ("A"*1000).find("A") (*1000)
0.36 0.05 745.8 "A" in "A"*1000 (*1000)
0.18 0.19 96.4 ("A"*1000).index("A") (*1000)
0.18 0.21 85.5 ("A"*1000).partition("A") (*1000)
0.21 0.20 103.6 ("A"*1000).rfind("A") (*1000)
0.21 0.30 69.8 ("A"*1000).rindex("A") (*1000)
0.37 0.21 171.7 ("A"*1000).rpartition("A") (*1000)
0.38 0.39 98.4 ("A"*1000).rsplit("A", 1) (*1000)
0.37 0.37 100.7 ("A"*1000).split("A", 1) (*1000)
========== early match, two characters
0.20 0.19 107.7 ("AB"*1000).find("AB") (*1000)
0.36 0.05 702.1 "AB" in "AB"*1000 (*1000)
0.18 0.19 96.9 ("AB"*1000).index("AB") (*1000)
0.20 0.24 83.9 ("AB"*1000).partition("AB") (*1000)
0.20 0.20 103.6 ("AB"*1000).rfind("AB") (*1000)
0.20 0.19 102.9 ("AB"*1000).rindex("AB") (*1000)
0.20 0.23 86.7 ("AB"*1000).rpartition("AB") (*1000)
0.39 0.40 97.7 ("AB"*1000).rsplit("AB", 1) (*1000)
0.40 0.42 94.4 ("AB"*1000).split("AB", 1) (*1000)
========== endswith multiple characters
0.17 0.19 92.6 "Andrew".endswith("Andrew") (*1000)
========== endswith multiple characters - not!
0.17 0.18 95.2 "Andrew".endswith("Anders") (*1000)
========== endswith single character
0.17 0.18 92.3 "Andrew".endswith("w") (*1000)
========== formatting a string type with a dict
N/A 0.91 0.0 "The %(k1)s is %(k2)s the %(k3)s."%{"k1":"x","k2":"y","k3":"z",} (*1000)
========== join empty string, with 1 character sep
N/A 0.04 0.0 "A".join("") (*100)
========== join empty string, with 5 character sep
N/A 0.04 0.0 "ABCDE".join("") (*100)
========== join list of 100 words, with 1 character sep
1.37 1.71 80.0 "A".join(["Bob"]*100)) (*1000)
========== join list of 100 words, with 5 character sep
1.50 1.86 80.8 "ABCDE".join(["Bob"]*100)) (*1000)
========== join list of 26 characters, with 1 character sep
0.48 0.49 99.6 "A".join(list("ABC..Z")) (*1000)
========== join list of 26 characters, with 5 character sep
0.49 0.54 91.3 "ABCDE".join(list("ABC..Z")) (*1000)
========== join string with 26 characters, with 1 character sep
N/A 1.17 0.0 "A".join("ABC..Z") (*1000)
========== join string with 26 characters, with 5 character sep
N/A 1.22 0.0 "ABCDE".join("ABC..Z") (*1000)
========== late match, 100 characters
8.48 8.46 100.2 s="ABC"*33; ((s+"D")*500+s+"E").find(s+"E") (*100)
4.19 3.50 119.9 s="ABC"*33; ((s+"D")*500+"E"+s).find("E"+s) (*100)
5.30 5.11 103.7 s="ABC"*33; (s+"E") in ((s+"D")*300+s+"E") (*100)
8.47 8.45 100.2 s="ABC"*33; ((s+"D")*500+s+"E").index(s+"E") (*100)
8.68 8.68 100.0 s="ABC"*33; ((s+"D")*500+s+"E").partition(s+"E") (*100)
6.36 6.37 99.8 s="ABC"*33; ("E"+s+("D"+s)*500).rfind("E"+s) (*100)
2.33 2.27 102.4 s="ABC"*33; (s+"E"+("D"+s)*500).rfind(s+"E") (*100)
6.58 6.58 100.1 s="ABC"*33; ("E"+s+("D"+s)*500).rindex("E"+s) (*100)
7.34 6.56 111.9 s="ABC"*33; ("E"+s+("D"+s)*500).rpartition("E"+s) (*100)
6.69 7.65 87.5 s="ABC"*33; ("E"+s+("D"+s)*500).rsplit("E"+s, 1) (*100)
8.47 8.87 95.4 s="ABC"*33; ((s+"D")*500+s+"E").split(s+"E", 1) (*100)
========== late match, two characters
1.30 1.26 102.7 ("AB"*300+"C").find("BC") (*1000)
1.30 1.27 102.0 ("AB"*300+"CA").find("CA") (*1000)
1.42 1.10 129.6 "BC" in ("AB"*300+"C") (*1000)
1.20 1.20 100.2 ("AB"*300+"C").index("BC") (*1000)
1.16 1.26 92.3 ("AB"*300+"C").partition("BC") (*1000)
0.95 0.94 101.0 ("C"+"AB"*300).rfind("CA") (*1000)
0.90 0.69 131.2 ("BC"+"AB"*300).rfind("BC") (*1000)
0.94 0.94 100.1 ("C"+"AB"*300).rindex("CA") (*1000)
1.02 0.94 108.6 ("C"+"AB"*300).rpartition("CA") (*1000)
1.12 1.08 103.7 ("C"+"AB"*300).rsplit("CA", 1) (*1000)
1.27 1.38 91.8 ("AB"*300+"C").split("BC", 1) (*1000)
========== no match, single character
0.45 0.41 111.1 ("A"*1000).find("B") (*1000)
0.59 0.29 205.4 "B" in "A"*1000 (*1000)
0.30 0.31 97.4 ("A"*1000).partition("B") (*1000)
0.49 0.48 102.5 ("A"*1000).rfind("B") (*1000)
0.36 0.37 96.5 ("A"*1000).rpartition("B") (*1000)
0.77 0.76 101.4 ("A"*1000).rsplit("B", 1) (*1000)
0.83 0.81 101.6 ("A"*1000).split("B", 1) (*1000)
========== no match, two characters
3.80 3.78 100.6 ("AB"*1000).find("BC") (*1000)
4.08 3.68 111.0 ("AB"*1000).find("CA") (*1000)
3.71 3.40 109.2 "BC" in "AB"*1000 (*1000)
3.44 3.42 100.8 ("AB"*1000).partition("BC") (*1000)
2.56 1.86 137.9 ("AB"*1000).rfind("BC") (*1000)
2.69 2.69 100.2 ("AB"*1000).rfind("CA") (*1000)
2.50 1.84 135.6 ("AB"*1000).rpartition("BC") (*1000)
2.03 1.94 104.7 ("AB"*1000).rsplit("BC", 1) (*1000)
3.27 3.56 91.8 ("AB"*1000).split("BC", 1) (*1000)
========== quick replace multiple character match
0.08 0.08 99.7 ("A" + ("Z"*128*1024)).replace("AZZ", "BBZZ", 1) (*10)
========== quick replace single character match
0.08 0.09 89.5 ("A" + ("Z"*128*1024)).replace("A", "BB", 1) (*10)
========== repeat 1 character 10 times
0.06 0.07 87.0 "A"*10 (*1000)
========== repeat 1 character 1000 times
0.13 0.15 89.3 "A"*1000 (*1000)
========== repeat 5 characters 10 times
0.12 0.09 128.8 "ABCDE"*10 (*1000)
========== repeat 5 characters 1000 times
0.33 0.34 94.8 "ABCDE"*1000 (*1000)
========== replace and expand multiple characters, big string
1.83 2.11 86.4 "...text.with.2000.newlines...replace("\n", "\r\n") (*10)
========== replace multiple characters, dna
3.21 3.23 99.5 dna.replace("ATC", "ATT") (*10)
========== replace single character
0.18 0.25 70.9 "This is a test".replace(" ", "\t") (*1000)
========== replace single character, big string
0.65 0.92 70.1 "...text.with.2000.lines...replace("\n", " ") (*10)
========== replace/remove multiple characters
0.27 0.34 78.7 "When shall we three meet again?".replace("ee", "") (*1000)
========== split 1 whitespace
0.12 0.14 82.7 ("Here are some words. "*2).partition(" ") (*1000)
0.08 0.11 75.9 ("Here are some words. "*2).rpartition(" ") (*1000)
0.23 0.26 87.4 ("Here are some words. "*2).rsplit(None, 1) (*1000)
0.24 0.25 95.9 ("Here are some words. "*2).split(None, 1) (*1000)
========== split 2000 newlines
1.59 1.75 90.8 "...text...".rsplit("\n") (*10)
1.64 1.68 97.5 "...text...".split("\n") (*10)
1.83 2.03 90.1 "...text...".splitlines() (*10)
========== split newlines
0.26 0.29 88.8 "this\nis\na\ntest\n".rsplit("\n") (*1000)
0.27 0.29 92.2 "this\nis\na\ntest\n".split("\n") (*1000)
0.26 0.30 85.8 "this\nis\na\ntest\n".splitlines() (*1000)
========== split on multicharacter separator (dna)
2.18 1.86 117.5 dna.rsplit("ACTAT") (*10)
2.53 2.48 102.0 dna.split("ACTAT") (*10)
========== split on multicharacter separator (small)
0.53 0.59 88.8 "this--is--a--test--of--the--emergency--broadcast--system".rsplit("--") (*1000)
0.59 0.57 102.6 "this--is--a--test--of--the--emergency--broadcast--system".split("--") (*1000)
========== split whitespace (huge)
1.50 1.73 86.9 human_text.rsplit() (*10)
1.49 1.75 85.5 human_text.split() (*10)
========== split whitespace (small)
0.43 0.50 87.0 ("Here are some words. "*2).rsplit() (*1000)
0.40 0.50 79.4 ("Here are some words. "*2).split() (*1000)
========== startswith multiple characters
0.17 0.18 92.0 "Andrew".startswith("Andrew") (*1000)
========== startswith multiple characters - not!
0.17 0.17 99.5 "Andrew".startswith("Anders") (*1000)
========== startswith single character
0.17 0.18 94.0 "Andrew".startswith("A") (*1000)
========== strip terminal newline
0.07 0.15 46.9 s="Hello!\n"; s[:-1] if s[-1]=="\n" else s (*1000)
0.06 0.07 78.1 "\nHello!".rstrip() (*1000)
0.05 0.13 42.1 "Hello!\n".rstrip() (*1000)
0.06 0.07 77.1 "\nHello!\n".strip() (*1000)
0.06 0.07 77.6 "\nHello!".strip() (*1000)
0.05 0.07 75.0 "Hello!\n".strip() (*1000)
========== strip terminal spaces and tabs
0.06 0.08 74.2 "\t \tHello".rstrip() (*1000)
0.06 0.07 79.4 "Hello\t \t".rstrip() (*1000)
0.04 0.05 87.1 "Hello\t \t".strip() (*1000)
========== tab split
0.44 0.51 87.2 GFF3_example.rsplit("\t", 8) (*1000)
0.42 0.47 89.9 GFF3_example.rsplit("\t") (*1000)
0.39 0.44 88.7 GFF3_example.split("\t", 8) (*1000)
0.41 0.47 86.1 GFF3_example.split("\t") (*1000)
158.46 160.84 98.5 TOTAL
*****************
iobench and stringbench results on patched Python (pack the 3 structures):
$ ./python Tools/iobench/iobench.py -t
Preparing files...
Python 3.3.0a1+ (default:51016ff7f8c9+, Mar 27 2012, 13:11:28)
[GCC 4.6.1]
Unicode: PEP 393
Linux-3.0.0-16-generic-pae-i686-with-debian-wheezy-sid
Text unit = one character (utf8-decoded)
** Text input **
[ 400KB ] read one unit at a time... 5.4 MB/s
[ 400KB ] read 20 units at a time... 68.5 MB/s
[ 400KB ] read one line at a time... 163 MB/s
[ 400KB ] read 4096 units at a time... 295 MB/s
[ 20KB ] read whole contents at once... 322 MB/s
[ 400KB ] read whole contents at once... 336 MB/s
[ 10MB ] read whole contents at once... 289 MB/s
[ 400KB ] seek forward one unit at a time... 0.32 MB/s
[ 400KB ] seek forward 1000 units at a time... 325 MB/s
** Text append **
[ 20KB ] write one unit at a time... 2.99 MB/s
[ 400KB ] write 20 units at a time... 44 MB/s
[ 400KB ] write 4096 units at a time... 556 MB/s
[ 10MB ] write 1e6 units at a time... 456 MB/s
** Text overwrite **
[ 20KB ] modify one unit at a time... 1.16 MB/s
[ 400KB ] modify 20 units at a time... 19.5 MB/s
[ 400KB ] modify 4096 units at a time... 401 MB/s
$ ./python stringbench/stringbench.py
stringbench v2.0
3.3.0a1+ (default:51016ff7f8c9+, Mar 27 2012, 13:11:28)
[GCC 4.6.1]
2012-03-27 13:17:42.363789
bytes unicode
(in ms) (in ms) % comment
========== case conversion -- dense
0.37 0.38 98.6 ("WHERE IN THE WORLD IS CARMEN SAN DEIGO?"*10).lower() (*1000)
0.37 0.38 98.4 ("where in the world is carmen san deigo?"*10).upper() (*1000)
========== case conversion -- rare
0.37 0.38 98.6 ("Where in the world is Carmen San Deigo?"*10).lower() (*1000)
0.37 0.38 98.4 ("wHERE IN THE WORLD IS cARMEN sAN dEIGO?"*10).upper() (*1000)
========== concat 20 strings of words length 4 to 15
1.86 1.85 100.9 s1+s2+s3+s4+...+s20 (*1000)
========== concat two strings
0.08 0.07 108.0 "Andrew"+"Dalke" (*1000)
========== count AACT substrings in DNA example
2.16 2.12 101.8 dna.count("AACT") (*10)
========== count newlines
0.59 0.58 101.3 ...text.with.2000.newlines.count("\n") (*10)
========== early match, single character
0.18 0.17 103.7 ("A"*1000).find("A") (*1000)
0.36 0.05 775.5 "A" in "A"*1000 (*1000)
0.17 0.17 102.0 ("A"*1000).index("A") (*1000)
0.17 0.20 84.7 ("A"*1000).partition("A") (*1000)
0.19 0.19 102.2 ("A"*1000).rfind("A") (*1000)
0.19 0.38 50.7 ("A"*1000).rindex("A") (*1000)
0.18 0.20 90.0 ("A"*1000).rpartition("A") (*1000)
0.59 0.36 166.9 ("A"*1000).rsplit("A", 1) (*1000)
0.34 0.36 93.5 ("A"*1000).split("A", 1) (*1000)
========== early match, two characters
0.18 0.19 95.8 ("AB"*1000).find("AB") (*1000)
0.44 0.05 891.0 "AB" in "AB"*1000 (*1000)
0.23 0.31 73.4 ("AB"*1000).index("AB") (*1000)
0.22 0.31 70.7 ("AB"*1000).partition("AB") (*1000)
0.19 0.19 101.2 ("AB"*1000).rfind("AB") (*1000)
0.19 0.19 102.0 ("AB"*1000).rindex("AB") (*1000)
0.17 0.21 78.7 ("AB"*1000).rpartition("AB") (*1000)
0.35 0.38 93.0 ("AB"*1000).rsplit("AB", 1) (*1000)
0.39 0.42 93.0 ("AB"*1000).split("AB", 1) (*1000)
========== endswith multiple characters
0.16 0.17 93.0 "Andrew".endswith("Andrew") (*1000)
========== endswith multiple characters - not!
0.16 0.16 101.4 "Andrew".endswith("Anders") (*1000)
========== endswith single character
0.16 0.17 93.7 "Andrew".endswith("w") (*1000)
========== formatting a string type with a dict
N/A 0.86 0.0 "The %(k1)s is %(k2)s the %(k3)s."%{"k1":"x","k2":"y","k3":"z",} (*1000)
========== join empty string, with 1 character sep
N/A 0.04 0.0 "A".join("") (*100)
========== join empty string, with 5 character sep
N/A 0.04 0.0 "ABCDE".join("") (*100)
========== join list of 100 words, with 1 character sep
1.42 1.74 81.3 "A".join(["Bob"]*100)) (*1000)
========== join list of 100 words, with 5 character sep
1.62 1.95 83.3 "ABCDE".join(["Bob"]*100)) (*1000)
========== join list of 26 characters, with 1 character sep
0.51 0.57 89.7 "A".join(list("ABC..Z")) (*1000)
========== join list of 26 characters, with 5 character sep
0.58 0.53 108.1 "ABCDE".join(list("ABC..Z")) (*1000)
========== join string with 26 characters, with 1 character sep
N/A 1.30 0.0 "A".join("ABC..Z") (*1000)
========== join string with 26 characters, with 5 character sep
N/A 1.22 0.0 "ABCDE".join("ABC..Z") (*1000)
========== late match, 100 characters
8.50 8.45 100.6 s="ABC"*33; ((s+"D")*500+s+"E").find(s+"E") (*100)
3.70 3.46 107.0 s="ABC"*33; ((s+"D")*500+"E"+s).find("E"+s) (*100)
5.11 5.08 100.6 s="ABC"*33; (s+"E") in ((s+"D")*300+s+"E") (*100)
8.62 8.47 101.7 s="ABC"*33; ((s+"D")*500+s+"E").index(s+"E") (*100)
8.80 8.67 101.5 s="ABC"*33; ((s+"D")*500+s+"E").partition(s+"E") (*100)
6.39 6.46 99.0 s="ABC"*33; ("E"+s+("D"+s)*500).rfind("E"+s) (*100)
2.31 2.18 105.9 s="ABC"*33; (s+"E"+("D"+s)*500).rfind(s+"E") (*100)
6.41 6.35 100.9 s="ABC"*33; ("E"+s+("D"+s)*500).rindex("E"+s) (*100)
7.41 6.56 112.9 s="ABC"*33; ("E"+s+("D"+s)*500).rpartition("E"+s) (*100)
6.59 6.59 100.0 s="ABC"*33; ("E"+s+("D"+s)*500).rsplit("E"+s, 1) (*100)
8.00 8.69 92.0 s="ABC"*33; ((s+"D")*500+s+"E").split(s+"E", 1) (*100)
========== late match, two characters
1.20 1.21 99.6 ("AB"*300+"C").find("BC") (*1000)
1.29 1.25 103.1 ("AB"*300+"CA").find("CA") (*1000)
1.41 1.07 130.9 "BC" in ("AB"*300+"C") (*1000)
1.20 1.21 99.3 ("AB"*300+"C").index("BC") (*1000)
1.17 1.20 97.5 ("AB"*300+"C").partition("BC") (*1000)
0.95 0.93 101.4 ("C"+"AB"*300).rfind("CA") (*1000)
0.90 0.69 129.3 ("BC"+"AB"*300).rfind("BC") (*1000)
0.95 0.94 101.2 ("C"+"AB"*300).rindex("CA") (*1000)
1.01 0.94 106.8 ("C"+"AB"*300).rpartition("CA") (*1000)
1.11 1.10 101.5 ("C"+"AB"*300).rsplit("CA", 1) (*1000)
1.28 1.37 93.6 ("AB"*300+"C").split("BC", 1) (*1000)
========== no match, single character
0.41 0.40 101.2 ("A"*1000).find("B") (*1000)
0.59 0.29 203.8 "B" in "A"*1000 (*1000)
0.29 0.30 95.7 ("A"*1000).partition("B") (*1000)
0.49 0.48 101.4 ("A"*1000).rfind("B") (*1000)
0.37 0.38 97.3 ("A"*1000).rpartition("B") (*1000)
0.76 0.75 101.1 ("A"*1000).rsplit("B", 1) (*1000)
0.76 0.75 100.9 ("A"*1000).split("B", 1) (*1000)
========== no match, two characters
3.53 3.52 100.2 ("AB"*1000).find("BC") (*1000)
3.92 3.67 106.9 ("AB"*1000).find("CA") (*1000)
3.71 3.39 109.6 "BC" in "AB"*1000 (*1000)
3.40 3.42 99.5 ("AB"*1000).partition("BC") (*1000)
2.55 1.90 134.2 ("AB"*1000).rfind("BC") (*1000)
2.69 2.68 100.1 ("AB"*1000).rfind("CA") (*1000)
2.43 1.81 133.9 ("AB"*1000).rpartition("BC") (*1000)
2.02 1.92 104.8 ("AB"*1000).rsplit("BC", 1) (*1000)
3.27 3.54 92.4 ("AB"*1000).split("BC", 1) (*1000)
========== quick replace multiple character match
0.09 0.08 107.7 ("A" + ("Z"*128*1024)).replace("AZZ", "BBZZ", 1) (*10)
========== quick replace single character match
0.09 0.08 108.7 ("A" + ("Z"*128*1024)).replace("A", "BB", 1) (*10)
========== repeat 1 character 10 times
0.06 0.07 87.5 "A"*10 (*1000)
========== repeat 1 character 1000 times
0.16 0.12 135.0 "A"*1000 (*1000)
========== repeat 5 characters 10 times
0.11 0.10 104.9 "ABCDE"*10 (*1000)
========== repeat 5 characters 1000 times
0.35 0.37 93.7 "ABCDE"*1000 (*1000)
========== replace and expand multiple characters, big string
1.78 2.04 87.3 "...text.with.2000.newlines...replace("\n", "\r\n") (*10)
========== replace multiple characters, dna
3.20 3.25 98.5 dna.replace("ATC", "ATT") (*10)
========== replace single character
0.17 0.24 73.0 "This is a test".replace(" ", "\t") (*1000)
========== replace single character, big string
0.62 0.88 69.7 "...text.with.2000.lines...replace("\n", " ") (*10)
========== replace/remove multiple characters
0.25 0.32 78.3 "When shall we three meet again?".replace("ee", "") (*1000)
========== split 1 whitespace
0.10 0.13 78.9 ("Here are some words. "*2).partition(" ") (*1000)
0.08 0.11 76.8 ("Here are some words. "*2).rpartition(" ") (*1000)
0.23 0.25 91.7 ("Here are some words. "*2).rsplit(None, 1) (*1000)
0.23 0.26 87.1 ("Here are some words. "*2).split(None, 1) (*1000)
========== split 2000 newlines
1.60 1.75 91.7 "...text...".rsplit("\n") (*10)
1.56 1.65 94.3 "...text...".split("\n") (*10)
1.78 2.04 87.0 "...text...".splitlines() (*10)
========== split newlines
0.27 0.29 92.6 "this\nis\na\ntest\n".rsplit("\n") (*1000)
0.27 0.29 94.2 "this\nis\na\ntest\n".split("\n") (*1000)
0.26 0.29 90.4 "this\nis\na\ntest\n".splitlines() (*1000)
========== split on multicharacter separator (dna)
2.09 1.92 108.5 dna.rsplit("ACTAT") (*10)
2.56 2.64 96.9 dna.split("ACTAT") (*10)
========== split on multicharacter separator (small)
0.72 0.89 81.1 "this--is--a--test--of--the--emergency--broadcast--system".rsplit("--") (*1000)
0.75 0.65 114.5 "this--is--a--test--of--the--emergency--broadcast--system".split("--") (*1000)
========== split whitespace (huge)
1.50 1.73 86.3 human_text.rsplit() (*10)
2.25 2.68 83.8 human_text.split() (*10)
========== split whitespace (small)
0.42 0.51 82.0 ("Here are some words. "*2).rsplit() (*1000)
0.41 0.48 86.7 ("Here are some words. "*2).split() (*1000)
========== startswith multiple characters
0.16 0.18 88.9 "Andrew".startswith("Andrew") (*1000)
========== startswith multiple characters - not!
0.19 0.17 112.0 "Andrew".startswith("Anders") (*1000)
========== startswith single character
0.16 0.18 88.2 "Andrew".startswith("A") (*1000)
========== strip terminal newline
0.07 0.16 45.5 s="Hello!\n"; s[:-1] if s[-1]=="\n" else s (*1000)
0.05 0.07 79.2 "\nHello!".rstrip() (*1000)
0.05 0.07 76.5 "Hello!\n".rstrip() (*1000)
0.06 0.07 80.9 "\nHello!\n".strip() (*1000)
0.06 0.07 80.7 "\nHello!".strip() (*1000)
0.05 0.07 77.4 "Hello!\n".strip() (*1000)
========== strip terminal spaces and tabs
0.06 0.08 77.6 "\t \tHello".rstrip() (*1000)
0.06 0.07 81.8 "Hello\t \t".rstrip() (*1000)
0.04 0.05 77.5 "Hello\t \t".strip() (*1000)
========== tab split
0.47 0.50 94.5 GFF3_example.rsplit("\t", 8) (*1000)
0.43 0.47 91.3 GFF3_example.rsplit("\t") (*1000)
0.38 0.43 88.7 GFF3_example.split("\t", 8) (*1000)
0.40 0.46 87.4 GFF3_example.split("\t") (*1000)
157.65 160.53 98.2 TOTAL
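To make the two stringbench runs easier to compare, here is a minimal sketch that computes the relative change between the TOTAL rows. The numbers are copied from the two TOTAL lines above. The helper names (`percent_change`, `ratio_column`) are hypothetical and not part of stringbench. The reading of the "%" column as bytes time divided by unicode time is an assumption, though it agrees with both printed totals.

```python
# TOTAL rows copied from the two stringbench runs above (times in ms).
UNPATCHED = {"bytes": 158.46, "unicode": 160.84}
PATCHED = {"bytes": 157.65, "unicode": 160.53}


def percent_change(before, after):
    """Relative change from before to after, in percent (negative means faster)."""
    return (after - before) / before * 100.0


def ratio_column(bytes_ms, unicode_ms):
    """Assumed meaning of the '%' column: bytes time as a percentage of unicode time."""
    return bytes_ms / unicode_ms * 100.0


for kind in ("bytes", "unicode"):
    delta = percent_change(UNPATCHED[kind], PATCHED[kind])
    print("{:8s} {:7.2f} -> {:7.2f} ms ({:+.2f}%)".format(
        kind, UNPATCHED[kind], PATCHED[kind], delta))

for label, run in (("unpatched", UNPATCHED), ("patched", PATCHED)):
    print("{:10s} bytes/unicode = {:.1f}%".format(
        label, ratio_column(run["bytes"], run["unicode"])))
```

Both totals move by well under 1%, while individual rows differ far more between the two runs (for example, `human_text.split()` goes from 1.49 ms to 2.25 ms for bytes), so the aggregate difference is likely within measurement noise.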
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14422>
_______________________________________