hit tracker

Duplicate Records

Home » Duplicate Records

Possible ways to remove Duplicate Records in a Database Table – SQL Server

in Database, Programming by Balamanigandan B Comments are off

Structured Query Language (SQL) is one of the most popular languages today to querying database and there are many reasons for this. It has many qualities that make it suitable for development for users at most all levels. Let’s discuss how efficiently remove duplicate records in a SQL Server Database table.

SQL Server is a Relational Database Management System (RDBMS) and the most widely used client–server model RDBMS in the current trend applications. RDBMS is perhaps the most important Database system used in the workplace today.

Plan your Data Structure and table design before you start your application. It purely depends on the amount of Data and type of application.

There are two types of table structures, they are

  1. With Unique Column
  2. Without Unique Column

First we discuss about “with Unique Column”. Let us consider the SQL Query to create an EmplyeeInfo Table for our further illustration purpose.

  1. Table Contains Unique Column : Number 
    Create Table:
    EmployeeInfo

SQL Server Syntax:

Insert Data:

Insert the data into the EmployeeInfo table.

Select Data:

Select the data from the EmployeeInfo table.

 

Id firstName lastName dob gender
1 Bala Manigandan 1985-12-06 1
2 Bala Manigandan 1985-12-06 1
3 Bala Manigandan 1985-12-06 1
4 Prayag Pavithran 1985-09-15 1
5 Bala Manigandan 1985-12-06 1
6 Sakthivel G 1986-12-11 1
7 Prayag Pavithran 1985-09-15 1
8 Bala Manigandan 1985-12-06 1
9 Sakthivel G 1986-12-11 1
10 Prayag Pavithran 1985-09-15 1

In the above EmployeeInfo table, id 1, 4, 6 has duplicate values with unique id

Case 1: without Column ‘id’

 

firstName lastName dob Gender
Bala Manigandan 1985-12-06 1
Prayag Pavithran 1985-09-15 1
Sakthivel G 1986-12-11 1

 

Case 2: with Column ‘id’

 

Id firstName lastName dob Gender
1 Bala Manigandan 1985-12-06 1
4 Prayag Pavithran 1985-09-15 1
6 Sakthivel G 1986-12-11 1

The possible way to delete duplicate record in the above table using the following approach, for that first we have the select the duplicate rows in the said table. Let us see the SQL for selecting duplicate rows

 

Id firstName lastName Dob gender
2 Bala Manigandan 1985-12-06 1
3 Bala Manigandan 1985-12-06 1
5 Bala Manigandan 1985-12-06 1
7 Prayag Pavithran 1985-09-15 1
8 Bala Manigandan 1985-12-06 1
9 Sakthivel G 1986-12-11 1
10 Prayag Pavithran 1985-09-15 1

The above SQL find out’s the duplicate rows perfectly, so by using this query we can delete the duplicate rows from the above said database table EmployeeInfo. The SQL is

Table Contains Unique Column: Varchar (Alphabets | Numbers | Special Character)

Create Table: EmployeeInfo

Insert Data:

Insert the data into the EmployeeInfo table.

Select Data:

Select the data from the EmployeeInfo table.

Id firstName lastName dob gender
E001 Bala Manigandan 1985-12-06 1
E002 Bala Manigandan 1985-12-06 1
E003 Bala Manigandan 1985-12-06 1
E004 Prayag Pavithran 1985-09-15 1
E005 Bala Manigandan 1985-12-06 1
E006 Sakthivel G 1986-12-11 1
E007 Prayag Pavithran 1985-09-15 1
E008 Bala Manigandan 1985-12-06 1
E009 Sakthivel G 1986-12-11 1
E010 Prayag Pavithran 1985-09-15 1

In the above EmployeeInfo table, id 1, 4, 6 has duplicate values with unique id

The possible way to delete duplicate record in the above table using the following approach, for that first we have the select the duplicate rows in the said table. Let us see the SQL for selecting duplicate rows

 

Id firstName lastName Dob Gender
E002 Bala Manigandan 1985-12-06 1
E003 Bala Manigandan 1985-12-06 1
E005 Bala Manigandan 1985-12-06 1
E007 Prayag Pavithran 1985-09-15 1
E008 Bala Manigandan 1985-12-06 1
E009 Sakthivel G 1986-12-11 1
E010 Prayag Pavithran 1985-09-15 1

The above SQL find out’s the duplicate rows perfectly, so by using this query we can delete the duplicate rows from the above said database table EmployeeInfo. The SQL is

  1. Table Contains Non – Unique Column

Create Table: EmployeeInfo

Insert Data:

Insert the data into the EmployeeInfo table.

Select Data:

Select the data from the EmployeeInfo table.

 

Id firstName lastName dob gender
1 Bala Manigandan 1985-12-06 1
1 Bala Manigandan 1985-12-06 1
1 Bala Manigandan 1985-12-06 1
2 Prayag Pavithran 1985-09-15 1
1 Bala Manigandan 1985-12-06 1
3 Sakthivel G 1986-12-11 1
2 Prayag Pavithran 1985-09-15 1
1 Bala Manigandan 1985-12-06 1
3 Sakthivel G 1986-12-11 1
4 Ram Kumar 1968-11-02 1

 

In the above EmployeeInfo table, id 1, 2, 3 has duplicate values. The following SQL gives the DISTINCT records

 

 

Id firstName lastName dob Gender
1 Bala Manigandan 1985-12-06 1
2 Prayag Pavithran 1985-09-15 1
3 Sakthivel G 1986-12-11 1
4 Ram Kumar 1968-11-02 1

The possible way to delete duplicate record in the above table using the following approach, for that first we have the select the duplicate rows in the said table. Let us see the SQL for selecting duplicate rows

 

Id firstName lastName dob Gender
1 Bala Manigandan 1985-12-06 1
2 Prayag Pavithran 1985-09-15 1
3 Sakthivel G 1986-12-11 1

The above SQL find out’s the duplicate rows perfectly, append the Row Number with the SELECT Statement, then the SQL is

 

Id firstName lastName dob gender RN
1 Bala Manigandan 1985-12-06 1 1
1 Bala Manigandan 1985-12-06 1 2
2 Bala Manigandan 1985-12-06 1 3
2 Prayag Pavithran 1985-09-15 1 1
1 Bala Manigandan 1985-12-06 1 4
3 Sakthivel G 1986-12-11 1 1
2 Prayag Pavithran 1985-09-15 1 2
1 Bala Manigandan 1985-12-06 1 5
3 Sakthivel G 1986-12-11 1 2
4 Ram Kumar 1968-11-02 1 1

A common table expression (CTE) can be thought of as a temporary result set that is defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. A CTE is similar to a derived table in that it is not stored as an object and lasts only for the duration of the query, by using this query we can delete the duplicate rows from the above said database table EmployeeInfo. The SQL is

Conclusion: 

This article targets at understanding how efficiently we delete duplicate records with both scenario with Unique Value / without Unique Value This article is intended for the beginner/intermediate level. I hope, this post will solve all your basic functionality deletion of duplicate records in a SQL Server Database. Think Big… Start Small… Do Fast…