Disable automatic interning
fetchinson at googlemail.com
Thu Mar 19 02:36:19 CET 2009
>> > I'm working on some graph generation problem where the node identity
>> > is significant (e.g. "if node1 is node2: # do something") but ideally I
>> > wouldn't want to impose any constraint on what a node is (i.e. require
>> > a base Node class). It's not a show stopper, but it would be
>> > problematic if something broke when nodes happen to be (small)
>> > integers or strings.
>> But if two different nodes are both identified by, let's say the
>> string 'x' then you surely consider this an error anyway, don't you?
>> What's the point of identifying two different nodes by the same
> In this particular problem the graph represents web surfing behavior
> and in the simplest case the nodes are plain URLs. Now suppose a
> session log has recorded the URL sequence [u1, u2, u1]. There are two
> scenarios for the second occurrence of u1: it's either caused by a
> forward action (e.g. clicking on a link to u1 from page u2) or a back
> action (i.e. the user clicked the back button). If this information is
> available, it makes sense to differentiate them. One way to do so is
> to represent the result of every forward action with a brand-new node
> and the result of a back action with an existing node. So even though
> the states of the two occurrences of u1 are the same, they are not
> necessarily represented by a single node.
Okay, I think I understand what you want to accomplish but in this
case I would use a different data structure such that u1 is always
represented by the same string, same identifier, whatever, let's say
'x', and then I'd be happy with 'x' is 'x' being always True. The
relationship between u1 and u2, and u2 and u1 would be represented by
additional data so the difference between the first u1 and the second
u1 would be clear once this additional data is available, because it
would be used in the comparison explicitly.
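
As a rough illustration of what I mean, here is a minimal sketch. The
function name label_visits, the "forward"/"back" action strings and the
simple back-stack model are all assumptions made up for the example, not
anything taken from your code. Each visit carries an explicit integer node
id that is compared with ==, so nothing depends on whether two equal
strings happen to be the same object.

```python
def label_visits(log):
    """Attach an explicit node id to every visit in a session log.

    `log` is a list of (url, action) pairs, where action is "forward" or
    "back" (hypothetical format). A forward action gets a fresh id; a back
    action reuses the id of the page the user returns to. This assumes a
    simple browser-style back stack.
    """
    labeled = []
    history = []   # stack of node ids, most recent page last
    next_id = 0
    for url, action in log:
        if action == "back" and len(history) >= 2:
            history.pop()            # leave the current page
            node_id = history[-1]    # return to the previous one
        else:
            node_id = next_id        # brand-new node
            next_id += 1
            history.append(node_id)
        labeled.append((url, node_id))
    return labeled


with_back = label_visits([("u1", "forward"), ("u2", "forward"), ("u1", "back")])
with_forward = label_visits([("u1", "forward"), ("u2", "forward"), ("u1", "forward")])

print(with_back[0] == with_back[2])        # same URL, same node id
print(with_forward[0] == with_forward[2])  # same URL, different node id
```

The URL strings are only ever compared by value here; the node id is the
additional data that tells the two occurrences of u1 apart.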
> If it was always possible to make a copy of a string instance (say,
> with a str.new() classmethod), then it would be sufficient to pass
> "map(str.new, session_urls)" to the graph generator. Equality would still
> work as before but all instances in the sequence would be guaranteed
> to be unique. Thankfully, as Martin mentioned, this is easy even
> without str.new(), simply by wrapping each url in an instance of a
> small Node class.
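
For the record, a minimal sketch of such a wrapper might look like the
following. The class name Node and its attribute are only illustrative
assumptions, not code from this thread. Equality delegates to the wrapped
value, while every instance remains a distinct object, so "is" tests do
not depend on string or integer interning.

```python
class Node(object):
    """Hypothetical wrapper: equal by value, unique by identity."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Node):
            return self.value == other.value
        return NotImplemented

    def __ne__(self, other):
        result = self.__eq__(other)
        if result is NotImplemented:
            return result
        return not result

    def __hash__(self):
        # Hash by value so that equal nodes hash alike.
        return hash(self.value)

    def __repr__(self):
        return "Node(%r)" % (self.value,)


session_urls = ["u1", "u2", "u1"]
nodes = list(map(Node, session_urls))

print(nodes[0] == nodes[2])  # True: the wrapped URLs are equal
print(nodes[0] is nodes[2])  # False: two separate instances
```

One caveat with this design: because equal nodes also hash alike, using
them directly as dict keys or set members would merge the two occurrences
of u1. If the graph is keyed by node, either key it on id(node) or drop
__eq__ and __hash__ so that the default identity-based behavior applies.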
Psss, psss, put it down! - http://www.cafepress.com/putitdown