Query regarding set([])?

Terry Reedy tjreedy at udel.edu
Fri Jul 10 16:47:06 EDT 2009


vox wrote:
> Hi,
> I'm constructing a simple compare script and thought I would use
> set([]) to generate the difference output. But I'm obviously doing
> something wrong.
> 
> file1 contains 410 rows.
> file2 contains 386 rows.
> I want to know what rows are in file1 but not in file2.
> 
> This is my script:
> s1 = set(open("file1"))
> s2 = set(open("file2"))
> s3 = set([])
> s1temp = set([])
> s2temp = set([])
> 
> s1temp = set(i.strip() for i in s1)
> s2temp = set(i.strip() for i in s2)
> s3 = s1temp-s2temp
> 
> print len(s3)
> 
> Output is 119. AFAIK 410-386=24. What am I doing wrong here?

That arithmetic assumes every line in s2 is also in s1. If s2 contains
lines that are not in s1, then the number of lines in s1 but not in s2
will be larger than 24: s1 - s2 removes only the intersection of s1 and
s2 from s1, so len(s1 - s2) == len(s1) - len(s1 & s2).
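
A minimal sketch of that point, using made-up stand-in data (the real
contents of file1 and file2 are not shown in this thread, so the values
below are purely illustrative). It is written to run unchanged on both
Python 2 and Python 3:

```python
# Hypothetical stand-in data; not the poster's actual files.
s1 = set(["a", "b", "c", "d", "e"])   # stripped rows of "file1"
s2 = set(["d", "e", "x", "y"])        # stripped rows of "file2"

only_in_s1 = s1 - s2   # rows in s1 but not in s2
common = s1 & s2       # rows present in both
only_in_s2 = s2 - s1   # rows in s2 but not in s1

# The size of the difference depends on the overlap, not on len(s2).
assert len(only_in_s1) == len(s1) - len(common)

print(len(s1) - len(s2))   # 1: the naive subtraction
print(len(only_in_s1))     # 3: the actual set difference
print(len(only_in_s2))     # 2: rows of s2 that s1 never had
```

Printing len(s2temp - s1temp) in the original script would likely show
how many rows of file2 are missing from file1. Note also that a set
drops duplicate rows, so len(s1temp) may itself be smaller than 410.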



