GSoC: Cython-rewrite and improvement for scipy.cluster

Hi all,

I am Richard Tsai, currently a second-year undergraduate student in Computer Science at Sun Yat-sen University. I wish to take part in this year's GSoC. I'm learning machine learning with scipy/sklearn now, and I have been contributing some code to SciPy since last term.

I've read Ralf's [Roadmap to Scipy 1.0][1] and I'm interested in the `cluster` part. I want to help finish the cython-rewrite work and make some improvements to it as my GSoC project.

I noticed that there's a `_vq_rewrite.pyx` in `scipy/cluster`, but I think it still needs further work. I want to start with some issues related to `cluster` as a warm-up, then re-implement the `cluster.vq` module in cython and try some optimizations. I'm familiar with it since I once used it for a little SNS text mining research with my classmates in a contest. As for the `cluster.hierarchy` module, I do not know much about hierarchical clustering because I haven't used it in practice. I may start by reading some papers and writing some examples for the documentation. Then I will start the cython-rewrite of the `hierarchy` module. Finally, I plan to make some enhancements to the package, perhaps automatically determining the number of clusters with the Elbow Method? I don't have a detailed plan yet.

Since this idea is not listed on the ideas page, I don't know whether it is suitable as a GSoC project. If you have any suggestions, please let me know. I'd appreciate any guidance/opinions/suggestions.

Regards,
Richard

[1]: https://github.com/rgommers/scipy/blob/roadmap/doc/ROADMAP.rst.txt
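For readers unfamiliar with the Elbow Method mentioned in the proposal, here is a minimal sketch of one common way to automate it. It is not taken from the thread or from SciPy: the helper names `elbow_index` and `distortion_curve` are hypothetical, and the "farthest point below the chord" rule is just one of several heuristics for locating the elbow. The only SciPy calls used are `scipy.cluster.vq.whiten` and `scipy.cluster.vq.kmeans`, whose second return value is the mean distortion.

```python
def elbow_index(distortions):
    """Return the index of the 'elbow' in a decreasing distortion curve.

    Heuristic (an assumption, not a SciPy feature): rescale both axes to
    [0, 1], draw the chord from the first point to the last, and pick the
    point that lies farthest below that chord.
    """
    n = len(distortions)
    if n < 3:
        raise ValueError("need at least three points to locate an elbow")
    first, last = distortions[0], distortions[-1]
    span = first - last
    if span <= 0:
        raise ValueError("distortion should decrease from first to last k")

    best_i, best_gap = 0, -1.0
    for i, d in enumerate(distortions):
        x = i / (n - 1)          # position along the k axis, in [0, 1]
        y = (d - last) / span    # rescaled distortion, 1 at first, 0 at last
        gap = 1.0 - x - y        # how far the point sits below the chord
        if gap > best_gap:
            best_i, best_gap = i, gap
    return best_i


def distortion_curve(obs, ks):
    """Run k-means for each k in ks and collect the mean distortion.

    SciPy is imported here so that elbow_index stays usable without it.
    """
    from scipy.cluster.vq import kmeans, whiten

    features = whiten(obs)  # scale each feature to unit variance
    return [kmeans(features, k)[1] for k in ks]


# Made-up distortion values for k = 1..5, only to show the helper in use.
ks = [1, 2, 3, 4, 5]
example = [10.0, 5.0, 2.0, 1.8, 1.7]
print(ks[elbow_index(example)])  # prints 3 for these made-up values
```

With real data, `distortion_curve(obs, ks)` would supply the list passed to `elbow_index`. Because `kmeans` starts from random centroids, the curve is not guaranteed to be strictly decreasing, so the result should be treated as a suggestion rather than a definitive cluster count.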

Hi Richard,

On Wed, Feb 26, 2014 at 1:17 PM, Richard Tsai <richard9404@gmail.com> wrote:

> Hi all,
>
> I am Richard Tsai, currently a second-year undergraduate student in Computer Science at Sun Yat-sen University. I wish to take part in this year's GSoC. I'm learning machine learning with scipy/sklearn now, and I have been contributing some code to SciPy since last term.
>
> I've read Ralf's [Roadmap to Scipy 1.0][1] and I'm interested in the `cluster` part.

We should merge that thing. It's not mine, it's the product of a lot of discussion between most of the core devs.

> I want to help finish the cython-rewrite work and make some improvements to it as my GSoC project.

Great!

> I noticed that there's a `_vq_rewrite.pyx` in `scipy/cluster`, but I think it still needs further work. I want to start with some issues related to `cluster` as a warm-up, then re-implement the `cluster.vq` module in cython and try some optimizations. I'm familiar with it since I once used it for a little SNS text mining research with my classmates in a contest. As for the `cluster.hierarchy` module, I do not know much about hierarchical clustering because I haven't used it in practice. I may start by reading some papers and writing some examples for the documentation. Then I will start the cython-rewrite of the `hierarchy` module. Finally, I plan to make some enhancements to the package, perhaps automatically determining the number of clusters with the Elbow Method? I don't have a detailed plan yet.
>
> Since this idea is not listed on the ideas page, I don't know whether it is suitable as a GSoC project. If you have any suggestions, please let me know. I'd appreciate any guidance/opinions/suggestions.

I think that there's definitely enough work there for one GSoC. However, I don't know much about cluster, so I'll let one of the experts comment on that.

Cheers,
Ralf

> Regards,
> Richard
>
> [1]: https://github.com/rgommers/scipy/blob/roadmap/doc/ROADMAP.rst.txt
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-dev