GadFly - MemoryError
Tres Seaver
tseaver at palladion.com
Wed Apr 7 09:32:51 EDT 1999
Oleg Broytmann wrote:
>
> Hello!
>
> I tried to add yet another database backend to my project "Bookmarks
> database". My database now contains about 3000 URLs, not too many, I think.
> I subclassed my BookmarksParser to parse bookmarks.html into a gadfly database
> and got a database of 500 Kbytes - a very small database, I hope.
> Then I tried to find duplicates (there are duplicates). I ran the query:
>
> SELECT b1.rec_no, b2.rec_no, b1.URL
> FROM bookmarks b1, bookmarks b2
> WHERE b1.URL = b2.URL
> AND b1.rec_no < b2.rec_no
How many duplicates are there? Something like
SELECT URL FROM bookmarks GROUP BY URL HAVING COUNT(*) > 1
will produce the URLs that have duplicates; you could then do
SELECT rec_no, URL FROM bookmarks
WHERE URL IN
(SELECT URL FROM bookmarks GROUP BY URL HAVING COUNT(*) > 1)
or create a temp table first with the results of the subquery, then join it in a
separate query.
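
A minimal sketch of both suggestions, in case it helps to try them on a small
sample first. Note the assumptions: it uses Python's standard sqlite3 module as
a stand-in rather than Gadfly itself (whose SQL subset may not accept every
statement here), the table layout is reduced to the two columns rec_no and URL
from the query above, and the names find_duplicates and dup_urls are made up
for illustration.

```python
import sqlite3


def find_duplicates(rows):
    """Return duplicate bookmarks found two ways: subquery and temp table.

    rows is an iterable of (rec_no, URL) tuples. sqlite3 is only a stand-in
    for Gadfly here; the schema is an assumption based on the quoted query.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE bookmarks (rec_no INTEGER, URL TEXT)")
    conn.executemany("INSERT INTO bookmarks VALUES (?, ?)", rows)

    # Approach 1: restrict the table to URLs that occur more than once.
    via_subquery = conn.execute(
        "SELECT rec_no, URL FROM bookmarks "
        "WHERE URL IN "
        "(SELECT URL FROM bookmarks GROUP BY URL HAVING COUNT(*) > 1) "
        "ORDER BY URL, rec_no"
    ).fetchall()

    # Approach 2: store the duplicated URLs in a temp table (hypothetical
    # name dup_urls), then join against it in a separate query.
    conn.execute(
        "CREATE TEMP TABLE dup_urls AS "
        "SELECT URL FROM bookmarks GROUP BY URL HAVING COUNT(*) > 1"
    )
    via_temp_table = conn.execute(
        "SELECT b.rec_no, b.URL FROM bookmarks b, dup_urls d "
        "WHERE b.URL = d.URL "
        "ORDER BY b.URL, b.rec_no"
    ).fetchall()

    conn.close()
    return via_subquery, via_temp_table


if __name__ == "__main__":
    sample = [
        (1, "http://a.example/"),
        (2, "http://b.example/"),
        (3, "http://a.example/"),
        (4, "http://c.example/"),
    ]
    for rec_no, url in find_duplicates(sample)[0]:
        print(rec_no, url)
```

Either way the result has one row per duplicated bookmark, rather than one row
per pair of matching records as in the self-join.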
--
=========================================================
Tres Seaver tseaver at palladion.com 713-523-6582
Palladion Software http://www.palladion.com