A this time I'd like to show some techniques to deal with duplicates
CREATE TABLE TableWithDups
(
row_id int identity(1,1),
col1 int,
col2 int
)
INSERT INTO TableWithDups (col1 ,col2 ) VALUES (1,10)
INSERT INTO TableWithDups (col1 ,col2 ) VALUES (1,10)
INSERT INTO TableWithDups (col1 ,col2 ) VALUES (4,2)
INSERT INTO TableWithDups (col1 ,col2 ) VALUES (3,3)
INSERT INTO TableWithDups (col1 ,col2 ) VALUES (3,3)
INSERT INTO TableWithDups (col1 ,col2 ) VALUES (9,20)
PRINT 'Duplicates in Table'
SELECT * FROM TableWithDups
WHERE EXISTS
(SELECT * FROM TableWithDups TWD
WHERE TWD.row_id <> TableWithDups.row_id
AND TWD.col1 = TableWithDups.col1
AND TWD.col2 = TableWithDups.col2)
PRINT 'Duplicates to be Deleted'
SELECT * FROM TableWithDups
WHERE EXISTS
(SELECT * FROM TableWithDups TWD
WHERE TWD.row_id < TableWithDups.row_id
AND TWD.col1 = TableWithDups.col1
AND TWD.col2 = TableWithDups.col2 )
PRINT 'Executig Deletion of Duplicates '
DELETE FROM TableWithDups
WHERE EXISTS
(SELECT * FROM TableWithDups TWD
WHERE TWD.row_id < TableWithDups.row_id
AND TWD.col1 = TableWithDups.col1
AND TWD.col2 = TableWithDups.col2 )
SELECT * FROM TableWithDups
If you are using SQL Server 2005 take a look at the below script
WITH Duplicates
AS
( SELECT *, ROW_NUMBER()OVER(PARTITION BY col1 , col2 ORDER BY col1 , col2 ) AS cnt FROM TableWithDups)
DELETE FROM Duplicates WHERE cnt> 1;
Recently my good friend Doron Farber wrote great article about how efficiently remove duplicates on huge tables based on real world example.
http://www.dfarber.com/computer-consulting-blog/2011/12/26/remove-duplicate-records-in-sql.aspx
Thursday, February 22, 2007
Subscribe to:
Post Comments (Atom)
19 comments:
Hi Uri,
Surely the true test is when there is no identifier on the table (i.e. a badly designed table)?
The SQL Server 2005 method works for this but in SQL 2000 the best I've managed to come up with is using a temp table and SELECT DISTINCT - which gets a lot more complicated if the duplicate rows have foreign key references :-/
Any thoughts?
By definition this is not a table at all -- it has no key! What this usually models is a commodity, we need a column for the quantity, which can be an absolute scale (count) or a ratio scale (units of measure).
CREATE TABLE Inventory
(item_name VARCHAR(25) NOT NULL PRIMARY KEY,
item_qty INTEGER NOT NULL
CHECK (item_qty > 0));
INSERT INTO Inventory VALUES ('No. 5 Machine Bolts', 123);
INSERT INTO Inventory VALUES ('No. 5 Hex Nuts', 127);
etc.
This sort of design comes from mimicking a paper form or other physical data source directly in SQL. It should never be allowed inthe first place.
These articles are fantastic; the information you show us is interesting for everybody and really good written. It’s just great!!
Agriculture Logo
Hi, I like your blog very much! It is found me so interesting and informative... Thanks very much for sharing this amazing information over here…Ready-Made Logo
This is an excellent site, I like your blog and your writing style, thanks for sharing your knowledge, keep up the good work. Cheap Essay Writing
Hi, I like your blog very much! It is found me so interesting and informative. Thanks very much for sharing this amazing information over here. Essay Writing Services
I really like this website. This is a great resources, provide you with your free distribution. It gives in depth information. Thank you for this valuable information. Paper Writing Services
Thesis Paper Writing
Research Paper Writing
Letter Writing Services
Appreciation for nice Updates.I like your blog very much! It is found me so interesting and informative.Database Solutions
Hello Uri,
Your code is great regarding removing duplicate records. Here is an article that covers that topic in depth:
http://www.dfarber.com/computer-consulting-blog/2011/12/26/remove-duplicate-records-in-sql.aspx
Regards,
Doron
Http://www.dFarber.Com
Remote DBA:
http://www.dfarber.com/microsoft-solutions/mssql-server/ms-sql-remote-dba-services.aspx
Hi Uri,
Here is an article to cover this topic at:
Remove duplicate records in sql
Regards,
Doron
The Farber Consulting Group, Inc.
such a nice work! Happy to find this great work keep it up. Thanks for shearing.
Search Strings
This was a great and interesting article to read. I found many interesting things from this site.
SEO Company
Good work by the blog writter! such a nice and informative article keep shearing more articles.
Google Adwords
Have a good day! Thanks for sharing. :0
Cheap Essay
This was a great and interesting article to read. I found many interesting things from this site.
Social Media
Good to see this blog really informative.
Conversion Optimization
ARINET DBA Services is a Chicago, IL based organization established in 2013. Uniting more than 100+ years of combined involvement in giving quality Oracle database Support administrations to American organizations, we've given the diverse options about remote dba experts, remote dba organizations, remote dba reinforce, remote database, prophet remote, dba remote, database association, prophet dba reinforce, Oracle sponsorship and Oracle Consultants.
Post a Comment