Unique idea presentation

Respected sir/madam, I would like to propose an idea/feature (if it does not already exist) for your mailing web app as part of GSoC. Since it runs on Python, I would like to integrate NLP text summarisation using the NLTK library.
Sometimes when people are on the go they don't want to read the whole email and would rather see a summary of just the relevant parts. This feature would let them do that with the click of a button and see the result in a pop-up or modal.
I hope my idea has been conveyed properly. Let me know if it can be integrated into your web app and also whether I should add it as a proposal for the GSoC selection process.
Thanking you,
Kurian Thomas Pulimoottil

Hi, Kurian!
(Is that how you would like to be addressed in the future?)
kurian thomas (kk) writes:
> Respected sir/madam,
You don't need to be so formal. "Hello, I am Kurian and I'm interested in working with Mailman on GSoC" the first time, and something like "Hi folks" after that if you're not replying to a specific person, is pretty common practice in open source projects. Although formally you're applying for a mentor-student (i.e., hierarchical) relationship, in our project we want to think of you as a future colleague. Of course experience and demonstrated skill matter in getting ideas accepted, but being inexperienced or new to the project is not a barrier to participation, especially when it comes to offering ideas.
I think this is a very interesting idea in the long run, but for a couple of reasons I doubt it's a good proposal for GSoC this year.
Google specifically excludes requirements, specification, and design from the project. That doesn't mean we don't value your ideas. It means that we (the mentors) had better have a very clear idea of spec and design by the time the project starts, and be able to implement it ourselves. Otherwise it's not fair to you, because you will be graded on completing your proposed effort, and there's only limited wiggle room for passing you based on effort. Google would not be happy with us if we accepted a project we're not confident a student can complete. (Organizations that have abused their students that way end up with very restricted allocations, or even out of GSoC entirely.)
In your idea, "summarize" is very vague. A simple spec would not use NLP at all, but simply pull in the first few lines. If those are quoted, it would then get as many more lines from the first few unquoted ones; otherwise it would pull in more unquoted lines until the desired size is reached. I doubt a project like that is "big enough" to keep you busy for a whole summer. :-)
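To make the "top lines" idea concrete, here is a minimal sketch of one possible reading of it. Everything in it is an assumption for illustration: the names `is_quoted` and `top_lines_summary`, the `head` and `size` parameters, and the rule that a line is "quoted" when it starts with `>`. It is not existing Mailman code.

```python
def is_quoted(line: str) -> bool:
    """Assumption: a quoted line starts with '>' (usual mail convention)."""
    return line.lstrip().startswith(">")


def top_lines_summary(body: str, head: int = 3, size: int = 5) -> list[str]:
    """Return the first `head` non-blank lines plus some unquoted follow-up.

    If any of the opening lines are quoted, add as many unquoted lines as
    there were quoted ones. Otherwise keep adding unquoted lines until the
    summary holds `size` lines.
    """
    lines = [ln for ln in body.splitlines() if ln.strip()]
    summary = lines[:head]
    later_unquoted = [ln for ln in lines[head:] if not is_quoted(ln)]

    quoted_count = sum(is_quoted(ln) for ln in summary)
    if quoted_count:
        extra = quoted_count
    else:
        extra = max(size - len(summary), 0)
    return summary + later_unquoted[:extra]


if __name__ == "__main__":
    message = (
        "> Respected sir/madam,\n"
        "You don't need to be so formal.\n"
        "\n"
        "Hi folks is fine.\n"
        "> another quoted line\n"
        "More of the reply.\n"
        "Even more of the reply.\n"
    )
    print("\n".join(top_lines_summary(message)))
```

Even this toy version raises spec questions (what counts as quoted, whether signatures and blank lines count), which is the kind of thing that would have to be settled before the project starts.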
If, by the end of the application period, you can make clear (1) what filtering the NLP would be able to do, (2) what algorithms are likely to do this well enough to be useful, and (3) which existing modules implement some of those algorithms, and also (4) provide a plan and schedule for implementation, sure, I'd love to have a go at it. But that is VERY risky for you. It's a lot of work for you, but we'd have to reject a proposal we're not confident you can implement in one summer. I know *I* couldn't do it in one summer (I don't have the NLP knowledge, and won't have time to study until the summer).
An alternative project would be to create an API for plugging in summarization algorithms. Then integrate the API into at least one, preferably more, of the following processes: (1) digest table of contents (in core), at user option, (2) moderator's view of messages being held, or (3) the HyperKitty list summary view. Finally, implement two algorithms as proof of concept, one being the simple "top lines" algorithm described above, the other being any easy-to-implement NLP-based algorithm. This is sufficiently well-defined, but doesn't match your expressed interest in NLP and "web apps" very well, except that experience with NLP would surely be helpful in designing such an API. And it's still much more work than writing a proposal for another project would involve.
My recommendation is that this year you apply for something more straightforward, and work with us in the autumn to specify a task based on summaries using NLP for next year. I don't think we could "reserve" it for you (AIUI, that's against Google rules), but you'd have a head start and definitely be the favorite to implement in a future GSoC.
Again, for the applications I gave in the alternate "API" project, I think this is really interesting. But it's very risky for you (that is, it's unlikely to be accepted unless you know a lot more than you've said so far) unless Abhilash thinks he has a pretty good idea of how to implement it well.

Participants (2): kurian thomas (kk), Stephen J. Turnbull