[Cython] How to improve the performance when doing string/unicode replace and search?
Yunfan Jiang
jyf1987 at gmail.com
Wed Mar 30 08:41:22 CEST 2011
Sorry for asking on the wrong list. What I meant by "not use it" is this: I
downloaded the latest Cython source and installed it, and I checked that the
changed code is already there, but I can still reproduce my bug.
On Wed, Mar 30, 2011 at 1:02 PM, Stefan Behnel <stefan_ml at behnel.de> wrote:
> Yunfan Jiang, 30.03.2011 05:33:
>
>> Hi, I asked a string processing question here before and found a bug. It
>> seems you guys fixed the bug but not use it
>>
>
> Not sure what you mean by "not use it".
>
>
>
>> This time my problem is about performance.
>> I need to write a filter that searches for all sorts of keywords in the
>> target string and stops at the first match.
>> This requires unicode input/output, so I wrote a trie-like module to
>> do it. It works, but I found it is much slower than using the regex module.
>> Could you guys give some tips on string processing performance?
>>
>
> Note that the right place to ask usage related questions is the Cython
> users mailing list, not the core developers mailing list. I set a follow-up
> to point you there.
>
> Generally speaking, a trie isn't necessarily fast, and it's certainly not
> the best algorithmic approach for keyword search. You should read up on
> Aho-Corasick and friends. I also wrote a simple Cython module that
> implements a keyword search algorithm ("acora", it's on PyPI), but it's
> unusable for large sets of keywords due to state explosion. It's pretty fast
> for smaller sets though.
>
> Stefan
>
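For reference, here is a rough sketch of the Aho-Corasick idea mentioned above, in plain Python working on unicode strings. The names build_automaton and find_first are made up for illustration; this is not the acora API and not the trie module discussed in this thread, and it makes no claim about speed compared to the regex module. It only shows the technique: a trie plus failure links, so the text is scanned once and the search stops at the first keyword found.

```python
from collections import deque


def build_automaton(keywords):
    """Build Aho-Corasick tables (hypothetical helper, for illustration)."""
    goto = [{}]    # goto[state][char] -> next state
    fail = [0]     # fail[state] -> state to fall back to on a mismatch
    out = [None]   # out[state] -> a keyword ending at this state, or None

    # Phase 1: insert every keyword into a trie.
    for word in keywords:
        if not word:
            continue
        state = 0
        for ch in word:
            nxt = goto[state].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto[state][ch] = nxt
                goto.append({})
                fail.append(0)
                out.append(None)
            state = nxt
        out[state] = word

    # Phase 2: breadth-first pass that fills in the failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit a shorter keyword that ends at the same position.
            if out[nxt] is None:
                out[nxt] = out[fail[nxt]]
    return goto, fail, out


def find_first(text, automaton):
    """Return (start_index, keyword) for the first match, or None."""
    goto, fail, out = automaton
    state = 0
    for pos, ch in enumerate(text):
        # Follow failure links until some state accepts this character.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        if out[state] is not None:
            word = out[state]
            return pos - len(word) + 1, word
    return None


if __name__ == "__main__":
    automaton = build_automaton(["he", "she", "his", "hers"])
    print(find_first("ushers", automaton))
```

The automaton is built once per keyword set and can be reused for every input string. The inner loops are still Python-level dict lookups, so typing the tables in Cython would be a separate step.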
--
ME = {
"name": [ "jyf", "yunfan", "wuxian" ],
"im": {
"gtalk": "jyf1987 at gmail.com",
"msn": "geek42 at live.cn"
},
"job": "python engineer",
"site": "http://hi.baidu.com/jyf1987",
"interested": {
"tech": [ "linux", "python", "lua", "php", "html5", "c", "nosql"],
"history": ["chinese history", "global history"],
"SF": [ "hard SF", "Thought experiment" ],
"music": [ "New Age", "Chinese old theme", "Electronic music",
"Strange Music :}"]
}
}