How to remove duplicate rows from a table in SQL Server

  • How-Tos FAQs
  • December 17, 2018
Supercharge Your Snowflake SQL
with Datameer's
Data Transformation

The best way to remove duplicate rows from a table in SQL server is by using the combination of

Common Table Expressions (CTE) and ROW_NUMBER window function.

Let us set up an ‘orders’ table as below. The table has duplicated rows for ‘Jack’ and ‘Rose.’ All the columns are the same for them.

CREATE TABLE orders (
       order_id INT,
       customer_name VARCHAR(100),
       order_date DATETIME,
       total_orders INT
);

INSERT INTO  orders
SELECT 1,'Jack', '2020-02-03', 4 UNION ALL
SELECT 1,'Jack', '2020-02-03', 4 UNION ALL
SELECT 2, 'Rose', '2020-01-09', 19 UNION ALL
SELECT 2, 'Rose', '2020-01-09', 19 UNION ALL
SELECT 3,'John', '2020-02-03', 45

-- Window function ROW_NUMBER() adds a unique row number for each duplicate row.
;WITH CTE AS
(
SELECT *, ROW_NUMBER()
OVER (PARTITION BY order_id, customer_name, order_date, total_orders
      ORDER BY order_id
      ) AS row_num
FROM orders
)
-- This new unique column (row_num) is used to delete all the duplicate rows leaving only one copy.

DELETE FROM CTE WHERE row_num > 1

SELECT * FROM orders

-- Only a single copy of a unique row is reminded leaving no duplicate data
order_id  customer_name   order_date                 total_orders
---------------------------------------------------------------------------
1         Jack            2020-02-03 00:00:00.000    4
2         Rose            2020-01-09 00:00:00.000    19
3         John            2020-02-03 00:00:00.000    45

Up Next:

Read: How to return the duplicate data on multiple columns in SQL