Monday, April 25, 2011

How do I delete duplicate data in SQL Server?

I have sql table but, some value added more than once, I need to delete one of them. Can you give me a simple query?

From stackoverflow
  • From here. If you don't already have an "ID" field that uniquely identifies each row, you'll need to create one for this to work (you can always just drop the column after you are done):

    DELETE
    FROM MyTable
    WHERE ID NOT IN
    (
    SELECT MAX(ID)
    FROM MyTable
    GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn2)
    

    Also, do a search on this site for "delete duplicate rows sql server" and you'll see that this question has already been answered many times here.

    alextansc : One such link in SO: http://stackoverflow.com/questions/18932/sql-how-can-i-remove-duplicate-rows
    Lieven : +1 for being a correct answer, commenting on the referential integrity problem and teaching me something new all in one go.
  • Use:
    Set RowCount 1

    Then run your delete sql

    This will delete 1 row, regardless if there are multiple rows matching.

    Then either close the query window, or use Set RowCount 0 to reset it.

  • DECLARE @Duplicates TABLE (AValue VARCHAR(32))
    
    INSERT INTO @Duplicates VALUES ('No Duplicate')
    INSERT INTO @Duplicates VALUES ('Duplicate')
    INSERT INTO @Duplicates VALUES ('Duplicate')
    
    
    SELECT ID = IDENTITY(INT, 1, 1), * 
    INTO #Duplicates
    FROM @Duplicates
    
    DELETE FROM #Duplicates
    FROM #Duplicates d
         INNER JOIN (
           SELECT ID = MIN(ID)
           FROM #Duplicates
           GROUP BY AValue
           HAVING COUNT(*) > 1
          ) id ON id.ID = d.ID
    
    DELETE FROM @Duplicates
    
    INSERT INTO @Duplicates
    SELECT AValue FROM #Duplicates
    
    DROP TABLE #Duplicates
    
    SELECT * FROM @Duplicates
    
    • Create temporary table with the same schema
    • INSERT INTO temptable SELECT DISTINCT * FROM oldtable
    • DELETE FROM oldtable
    • INSERT INTO oldtable SELECT * FROM temptable
    • DROP temptable

    And don't forget:

    • Refactor your tables to not allow this to happen again
  • Not your answer, but I guess all the others (that gave answers) dont care for referential integrity... :( Just wildly deleting dupes is crazy!

    Eric Petroelje : Great point, but referential integrity essentially makes this an unanswerable question without creating a massive data movement script.
    leppie : Exactly! Best would be to turn on DELETE CASCADES and then try.
  • How about this article delete duplicate records in sql server

0 comments:

Post a Comment