How to find the duplicate data in a table in SQL?

How-Tos FAQs
December 16, 2018

Supercharge Your Snowflake SQL with Datameer's Data Transformation

Let us set up a ‘customers’ table with their email addresses:

CREATE TABLE customers (
    customers_id INT,
    customer_name VARCHAR(100),
    email VARCHAR(100)
);

INSERT INTO  customers
SELECT 1, 'Jack', 'jack@email.com' UNION ALL
SELECT 2, 'July', 'jack@email.com' UNION ALL
SELECT 3, 'John', 'john@email.com' UNION ALL
SELECT 4, 'Rose', 'rose@email.com';

We can find duplicate data using different approaches.

Using GROUP BY

We can group the table by email column and count the rows with the same email using the HAVING clause.

SELECT   
    email ,
    COUNT(1) email_count
FROM customers
GROUP BY email
HAVING COUNT(1) > 1;



#Output
# email       email_count
------------------------------------------------------
jack@email.com   2

Using Window function

We can use the ROW_NUMBER function and partition the data by email:

SELECT customer_name
, email
, IF( ROW_NUMBER () OVER (PARTITION BY email ORDER BY email) > 1,'Yes','No') is_duplicate
FROM customers

#Output



# customer_name   email     is_duplicate
--------------------------------------------
Jack          jack@email.com   No
July          jack@email.com   Yes
John          john@email.com  No
Rose          rose@email.com  No

Up Next:

Read How do I calculate ratios using the data in two columns to create a ratio in Tableau?

How to find the duplicate data in a table in SQL?

Up Next:

Related Posts

How to INSERT values in a table using SELECT query in SQL?

How to UPDATE a table with a SELECT statement in SQL Server?

How to use PIVOT to convert rows to columns in SQL SERVER...

Product

Company

Resources

Sign up for our newsletter

Follow us on