sql - Find duplicates takes a long time -


I have the following table layout: 4 different tables, each containing around 10 to 15 million entries. 3 string attributes of each table are the same (let's call them id, name1, name2). We want to read all entries that have the same value in the id column but different (name1, name2) tuples. We estimate that less than 0.5% of the entries match.

We've created a view allentries (basically a union of the relevant attributes of the 4 tables), and our query looks like this:

select id from allentries group by id having count(distinct name1) > 1 or count(distinct name2) > 1
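To make the setup concrete, here is a minimal sketch of the view and the group by/having query on toy data, run through Python's built-in sqlite3 module. The table names t1 to t4 and the sample rows are hypothetical stand-ins, not the real schema. The view uses union all as an assumption: count(distinct ...) ignores duplicate rows anyway, so the result is the same as with union, and union all skips the duplicate-elimination step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical stand-ins for the four real tables.
for table in ("t1", "t2", "t3", "t4"):
    cur.execute(f"create table {table} (id text, name1 text, name2 text)")

cur.executemany("insert into t1 values (?, ?, ?)", [("a", "x", "y"), ("b", "x", "y")])
cur.executemany("insert into t2 values (?, ?, ?)", [("a", "x", "y"), ("c", "p", "q")])
cur.executemany("insert into t3 values (?, ?, ?)", [("b", "x", "z")])
cur.executemany("insert into t4 values (?, ?, ?)", [("c", "r", "q")])

# Assumption: union all instead of union (same result for this query).
cur.execute("""
    create view allentries as
        select id, name1, name2 from t1
        union all select id, name1, name2 from t2
        union all select id, name1, name2 from t3
        union all select id, name1, name2 from t4
""")

# Ids that occur with more than one distinct name1 or name2.
conflicting_ids = [row[0] for row in cur.execute("""
    select id
    from allentries
    group by id
    having count(distinct name1) > 1 or count(distinct name2) > 1
    order by id
""")]
print(conflicting_ids)
```

In this toy data, id a appears twice with identical names and is not reported, while b differs in name2 and c differs in name1.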

Executing the query on our test database with 2 million entries in each table (i.e. 8 million entries in the view) takes around 2 to 3 minutes (on a nice server).

Q: Is there any way to improve the performance?

Try a CTE with row_number() instead of the traditional group by/having approach:

;with ctedups as (
    select  *
            ,row_number() over(partition by name1 order by id) rn1
            ,row_number() over(partition by name2 order by id) rn2
    from    allentries
)
select  *
from    ctedups
where   rn1 > 1
    or  rn2 > 1
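Note that partitioning by name1 and name2 flags rows whose name value has already appeared, independent of id. If the goal is the one stated in the question (rows sharing an id but differing in name1 or name2), one possible variant is to partition by id and compare the minimum and maximum of each name column. The following is only a sketch under that assumption, run against SQLite (3.25 or later, for window functions) through Python's sqlite3 module; the sample rows are made up, and a plain table stands in for the allentries view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# A plain table stands in for the allentries view; rows are made up.
cur.execute("create table allentries (id text, name1 text, name2 text)")
cur.executemany("insert into allentries values (?, ?, ?)", [
    ("a", "x", "y"), ("a", "x", "y"),   # same id, identical names
    ("b", "x", "y"), ("b", "x", "z"),   # same id, name2 differs
    ("c", "p", "q"), ("c", "r", "q"),   # same id, name1 differs
    ("d", "m", "n"),                    # id occurs once
])

# Within each id, min <> max means at least two distinct values exist.
rows = cur.execute("""
    with flagged as (
        select id, name1, name2,
               min(name1) over (partition by id) as min1,
               max(name1) over (partition by id) as max1,
               min(name2) over (partition by id) as min2,
               max(name2) over (partition by id) as max2
        from allentries
    )
    select id, name1, name2
    from flagged
    where min1 <> max1 or min2 <> max2
    order by id, name1, name2
""").fetchall()

for row in rows:
    print(row)
```

Unlike the group by/having query, this returns the full conflicting rows rather than only the ids. Whether it is actually faster depends on the database engine and its indexes, so it should be measured on the real data.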
