[Cython] How to improve the performance when doing string/unicode replace and search?

Yunfan Jiang jyf1987 at gmail.com
Wed Mar 30 08:41:22 CEST 2011


Sorry for asking in the wrong group. What I meant by "not use it" is that I
have downloaded the latest Cython source and installed it.

I checked that the changed code is already there, but my bug can still be
reproduced.


On Wed, Mar 30, 2011 at 1:02 PM, Stefan Behnel <stefan_ml at behnel.de> wrote:

> Yunfan Jiang, 30.03.2011 05:33:
>
>  Hi, I asked a string processing question here before and found a bug. It
>> seems you guys fixed the bug but not use it.
>>
>
> Not sure what you mean by "not use it".
>
>
>
>  This time, my problem is about performance.
>> I need to write a filter that searches for all sorts of keywords in the
>> target string and stops as soon as one matches.
>> This requires unicode input/output, so I wrote a trie-like module to
>> do it. It works, but I found it is much slower than using the regex module.
>> Could you give me some tips on string processing performance?
>>
>
> Note that the right place to ask usage related questions is the Cython
> users mailing list, not the core developers mailing list. I set a follow-up
> to point you there.
>
> Generally speaking, a trie isn't necessarily fast, and it's certainly not
> the best algorithmic approach for keyword search. You should read up on
> Aho-Corasick and friends. I also wrote a simple Cython module that
> implements a keyword search algorithm ("acora", it's on PyPI), but it's
> unusable for large sets of keywords due to state explosion. It's pretty fast
> for smaller sets though.
>
> Stefan
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel
>
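
For reference, the Aho-Corasick approach mentioned in the quoted reply can be
sketched in plain Python roughly as follows. This is only a minimal
illustration under stated assumptions, not the implementation used by acora:
the function names build_automaton and find_first are hypothetical, and states
are kept in simple lists and dicts rather than in anything optimised. It works
on unicode strings and stops at the first match, as the filter described above
requires.

```python
from collections import deque


def build_automaton(keywords):
    """Build goto, failure and output tables for a set of keywords.

    Hypothetical helper for illustration; state 0 is the root.
    """
    goto = [{}]   # goto[state] maps a character to the next state
    fail = [0]    # fail[state] is the fallback state on a mismatch
    out = [[]]    # out[state] lists the keywords that end at this state

    # Phase 1: insert every keyword into a trie.
    for word in keywords:
        state = 0
        for ch in word:
            nxt = goto[state].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto[state][ch] = nxt
                goto.append({})
                fail.append(0)
                out.append([])
            state = nxt
        out[state].append(word)

    # Phase 2: breadth-first pass to compute the failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Keywords ending at the fallback state also end here.
            out[nxt].extend(out[fail[nxt]])
    return goto, fail, out


def find_first(text, automaton):
    """Return (start_index, keyword) for the first match, or None."""
    goto, fail, out = automaton
    state = 0
    for i, ch in enumerate(text):
        # Follow failure links until a transition on ch exists.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        if out[state]:
            word = out[state][0]
            return i - len(word) + 1, word
    return None
```

The text is scanned once, character by character, regardless of how many
keywords there are. A naive trie search, by contrast, restarts from the root at
every position in the text. A pure Python version like this one is unlikely to
beat the regex module on speed; the sketch is only meant to show the algorithm
before it is typed and compiled with Cython.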



-- 
ME = {
   "name": [ "jyf", "yunfan", "wuxian" ],
   "im": {
        "gtalk": "jyf1987 at gmail.com",
        "msn": "geek42 at live.cn"
          },
   "job": "python engineer",
   "site": "http://hi.baidu.com/jyf1987",
   "interested":  {
       "tech": [ "linux", "python", "lua", "php", "html5", "c", "nosql"],
       "history": ["chinese history", "global history"],
       "SF": [ "hard SF", "Thought experiment" ],
       "music": [ "New Age", "Chinese old theme", "Electronic music",
"Strange Music :}"]
     }
 }