[soc2008-general] Applying to PSF vs applying to Google.
James Tauber
jtauber at jtauber.com
Mon Mar 24 03:43:38 CET 2008
This would not be accepted by PSF for two reasons:
1. while you are implementing in Python, the result isn't really a
contribution to the Python community.
2. we prefer projects that involve building on existing work that will
let you work with existing developers and users.
That doesn't mean it isn't a good GSoC proposal, just not for PSF. I
would definitely try Google as a mentoring org.
James
On Mar 23, 2008, at 9:59 PM, Wojciech Walczak wrote:
> Hello,
>
> The time for submitting applications is coming, and I am not sure where
> I should apply with the project described below.
> The project is written in Python, so theoretically I could apply to PSF,
> but on the other hand it is a "What if there is no organization doing
> the kind of open source work I'm doing?" case, so I could apply to Google
> (I have already found somebody at my university willing to be a mentor).
> It will be hard to be accepted by Google. What I wanted to ask is:
> will it be harder to be accepted by PSF with this one?
>
>
> Details:
>
> I. Project description
>
> Since the project I would like to propose is not only an idea but
> already working code, in this chapter I will just describe what it is
> and what has already been done. I started working on it in mid-December
> 2007.
>
> Chain/Friends Framework is a set of tools designed to analyze the social
> networks people create when adding 'friends' to their 'buddy lists'
> in social network services (such as MySpace, Facebook or Bebo).
> In short, it treats networks of connections between buddies as its main
> source of information; it connects, splits and compares the data to show
> how much we say about ourselves just by adding friends to our buddy lists.
>
> 'Chain' searches for a path between two given users. Even though it is
> based on the idea of the 'Small world experiment'[1], it seems to prove
> that the world is actually smaller, just like it is in the Six Degrees
> of Kevin Bacon[2] game. The source of this difference is that it is
> cheaper to search for a link in a digital database than in the real world.
>
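To make the 'Chain' idea concrete, here is a minimal sketch of a path search between two users. It assumes a breadth-first search over an in-memory friendship graph; the proposal does not say which search Chain actually uses, and the names `find_chain` and `network` are made up for illustration.

```python
from collections import deque


def find_chain(friends, start, goal):
    """Return a shortest list of users linking start to goal, or None.

    `friends` maps each user to the set of users on their buddy list
    (a hypothetical in-memory stand-in for the downloaded data).
    """
    if start == goal:
        return [start]
    previous = {start: None}  # user -> user we reached them from
    queue = deque([start])
    while queue:
        user = queue.popleft()
        for buddy in friends.get(user, ()):
            if buddy in previous:
                continue  # already visited
            previous[buddy] = user
            if buddy == goal:
                # Walk back from goal to start, then reverse.
                chain = [buddy]
                while previous[chain[-1]] is not None:
                    chain.append(previous[chain[-1]])
                return chain[::-1]
            queue.append(buddy)
    return None  # the two users are not connected


network = {
    "A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}, "E": set(),
}
print(find_chain(network, "A", "D"))  # ['A', 'B', 'C', 'D']
```

Breadth-first search finds a chain with the fewest intermediate users, which is one reasonable reading of a "blind" search that ignores who the users are.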
> 'Friends' is designed to analyze the buddy list of a given person and
> produce a report about the groups that person lives in. The tool shows
> that (and how) one's social world is scattered; it also tries to identify
> what glues the groups together (is it age, same city, school, family?).
> The algorithm idea: if a buddy from my buddy list has a number of people
> in his list that are common to both of us, and if everyone in that group
> of shared buddies knows each other, then we are all a group of friends.
>
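The group rule quoted above could be sketched roughly as follows. This is only one reading of the description, assuming friendships are mutual and stored as sets; `friend_groups`, `min_shared` and the sample names are hypothetical and not taken from the project.

```python
from itertools import combinations


def friend_groups(friends, me, min_shared=1):
    """Return the groups of friends around `me` as a set of frozensets.

    For each buddy, take the people we both list. If every pair of those
    shared buddies also list each other, then the shared buddies, the
    buddy and `me` together form one group.
    """
    groups = set()
    for buddy in friends[me]:
        shared = friends[me] & friends.get(buddy, set())
        if len(shared) < min_shared:
            continue  # nothing in common besides knowing each other
        everyone_knows_everyone = all(
            b in friends.get(a, set()) for a, b in combinations(shared, 2)
        )
        if everyone_knows_everyone:
            groups.add(frozenset(shared | {me, buddy}))
    return groups


network = {
    "me": {"ann", "bob", "cid", "dan", "eve"},
    "ann": {"me", "bob", "cid"},
    "bob": {"me", "ann", "cid"},
    "cid": {"me", "ann", "bob"},
    "dan": {"me", "eve"},
    "eve": {"me", "dan"},
}
for group in sorted(friend_groups(network, "me"), key=len):
    print(sorted(group))
```

With this sample data the sketch reports two separate groups, {me, dan, eve} and {me, ann, bob, cid}, which is the "scattered social world" the report is meant to show.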
> As a proof of concept the framework comes with a library that handles
> data gathering for Nasza-klasa.pl (the largest Polish social network
> service; more than 8 million users). As part of the politeness strategy
> the library saves downloaded data in a database (multiple files with
> pickled data) so that the same data is not downloaded twice. As part of
> the privacy strategy only the basic data about each user is stored (phone
> numbers, instant messenger numbers etc. _are not_ saved).
>
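A download cache of the kind described above might look something like this. It is a sketch under assumptions: one pickle file per user, user ids that are safe to use as file names, and a caller-supplied `fetch` function standing in for the real downloader. None of these names come from the actual library.

```python
import pickle
import tempfile
from pathlib import Path


class PickleCache:
    """Store each downloaded buddy list in its own pickle file."""

    def __init__(self, directory, fetch):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch  # hypothetical function: user_id -> buddy set

    def buddy_list(self, user_id):
        path = self.directory / f"{user_id}.pickle"
        if path.exists():
            # Already downloaded once: read it from disk instead.
            with path.open("rb") as handle:
                return pickle.load(handle)
        buddies = self.fetch(user_id)
        with path.open("wb") as handle:
            pickle.dump(buddies, handle)
        return buddies


downloads = []


def fake_fetch(user_id):
    """Pretend downloader that records every request it receives."""
    downloads.append(user_id)
    return {"B", "C"}


with tempfile.TemporaryDirectory() as tmp:
    cache = PickleCache(tmp, fake_fetch)
    first = cache.buddy_list("A")
    second = cache.buddy_list("A")

print(first == second, downloads)  # True ['A']
```

The second lookup is served from disk, so the service is contacted only once per user. Note that unpickling runs arbitrary code from the file, so pickle files should only be loaded from sources you trust.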
> II. Todo list
>
> In this chapter I list the things that I would like to complete during
> (and thanks to) GSoC:
>
> 1. Chain: implement a 'link nature detector', so that for a chain of
> people A - B - C - D we can identify what is common to A and B, B and C,
> C and D (besides simple mechanisms comparing X and Y, advanced deduction
> based on the character of a group that both X and Y belong to is planned).
> See also: 'Future development' chapter (1).
>
> 2. Friends: implement the group character detector. This task is
> generally similar to the one above, but the future development plans
> aren't: see also 'Future development' chapter (2, 3).
>
> 3. Lib: optional PostgreSQL database handling (a big thing for me, as I
> have never coded such a thing before).
>
> 4. Chain/Friends: Google Maps handling (again: a big thing).
>
> 5. Lib: distributed database system, as part of the politeness strategy.
> Imagine that three researchers are working with these tools; it is
> possible that they will partly work on the same data, so joining their
> databases will avoid double downloads.
>
> 6. Restructure library code to:
> - make it easier to add other social network services handling
> - make it easier to maintain and create new tools
>
> 7. Add a buddy list update mechanism. At the moment, once a buddy list
> is downloaded it is not updated until it is deleted.
>
> 8. Add command line support to Chain and Friends.
>
> 9. Add some READMEs.
>
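Item 5 of the todo list (joining the databases of several researchers) can be illustrated with a small sketch. It assumes each database is just a mapping from user to buddy set and that overlapping entries are combined by union; both are guesses, and `merge_databases` and `still_to_download` are hypothetical names.

```python
def merge_databases(*databases):
    """Combine several user -> buddy-set mappings into one."""
    merged = {}
    for database in databases:
        for user, buddies in database.items():
            # Union the lists when two researchers hold the same user.
            merged.setdefault(user, set()).update(buddies)
    return merged


def still_to_download(wanted, merged):
    """Return the wanted users that no researcher has downloaded yet."""
    return sorted(set(wanted) - set(merged))


alice = {"A": {"B", "C"}, "B": {"A"}}
bob = {"B": {"A", "D"}, "D": {"B"}}
carol = {"C": {"A"}}

merged = merge_databases(alice, bob, carol)
print(sorted(merged["B"]))                          # ['A', 'D']
print(still_to_download(["A", "D", "E"], merged))   # ['E']
```

After the merge only user E still has to be fetched, which is the "avoid double downloads" effect the item describes.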
> III. Future development
>
> 1. With my GSoC goals accomplished I could use Chain to conduct research
> to find out what kinds of relations between buddies in chains are the
> most common. Based on the results I would like to implement a second
> algorithm for searching for buddies. The one already implemented is
> blind: it pays no attention to the categories humans care about (like
> similar age, possible family relations, geographical closeness or similar
> schools). The purpose of the second algorithm is to act more like humans
> do. With that done, I could conduct 'algorithm war' research. The
> questions are: which algorithm needs more steps to find a chain? Which
> needs to analyze more data? Is teaching computers to 'think' in human
> categories reasonable?
>
> 2. Develop an antipathies seeker.
>
> 3. Develop a connectors seeker. The idea of connectors[3] comes from
> Malcolm Gladwell's book 'The Tipping Point'.
>
>
> [1] http://en.wikipedia.org/wiki/Small_world_phenomenon
> [2] http://en.wikipedia.org/wiki/Six_Degrees_of_Kevin_Bacon
> [3] http://en.wikipedia.org/wiki/The_Tipping_Point_(book)
>
> I think that after reading this it is not hard to guess that I am
> studying sociology ;-)
>
>
> Regards,
> Wojtek Walczak
> _______________________________________________
> soc2008-general mailing list
> soc2008-general at python.org
> http://mail.python.org/mailman/listinfo/soc2008-general