In this article, we will learn, How to get duplicate Rows based on few column values using the SQL server.

If you are looking to get duplicate rows from my database table based on some column then have come to the right place.

Recently I’m working on an e-commerce website Project in which I have to Find Duplicates From a Table in SQL Server.

There are many different ways for getting duplicate Rows based on few columns with varying efficiency, and the relative size of your database tables. In this post, We will discuss some efficient method to do this task.

Let’s take an example, I have tables Company_Customer in SQL DB.Table has some records of the Customers.Now I want is to select all duplicate the records from Company_Customer.

get duplicate records

Method-1 Using group by

select CustomerName,CustomerCity,CustomerSalary, count(*)
  from Company_Customer
  group by CustomerName,CustomerCity,CustomerSalary
  having count(*) > 1

Generic Query 

select column1,column2,column3, count(*)
  from TableName
  group by column1,column2,column3
  having count(*) > 1

Query Reseult:

result

Method-2 Using row_number()

If you have SQL Server 2005 or lastest version, then you can use row_number() functions to return the duplicate records from the database table.

select    *
from(select   CustomerName
            , CustomerCity
            , CustomerSalary          
            , row_number() over (partition by   CustomerName
                                                , CustomerCity
                                                , CustomerSalary
                                    order by Id ) as occurrence
          from      Company_Customer) x
where     occurrence > 1

Generic Query

select    *
from(select   colm1
            , colm2
            , colm3          
            , row_number() over (partition by    colm1
                                                , colm2
                                                , colm3
                                    order by colm1) as occurrence
          from      TableName) x
where     occurrence > 1

Query Reseult:

ocuurence rownumber

Method-3 Using Rank()

Above is the easiest solution with SQL Server 2005. it will return all records record except for the first one if there are multiple occurrences.

SELECT Id
    , CustomerName
    , CustomerCity
    , CustomerSalary
FROM
(
    SELECT Id
      , CustomerName
      , CustomerCity
      , CustomerSalary
        , RANK() OVER (PARTITION BY CustomerName, CustomerCity, CustomerSalary ORDER BY Id ASC) AS [rank]
    FROM Company_Customer
) a
WHERE [rank] > 1

Generic Query

SELECT Id
    , cloumn1
    , cloumn2
    , cloumn3
FROM
(
    SELECT Id
      , cloumn1
      , cloumn2
      , cloumn3
        , RANK() OVER (PARTITION BY cloumn1, cloumn2, cloumn3 ORDER BY Id ASC) AS [rank]
    FROM Company_Customer
) a
WHERE [rank] > 1

ranK result

Method-4 Using CTE


with MYCTE  as (
    select row_number() over ( partition by CustomerName,CustomerCity,CustomerSalary  order by Id) rown, *
    from Company_Customer  
    ) 
select * from MYCTE where rown >1

Sql Table Create and Insert Script

 

CREATE TABLE [dbo].[Company_Customer](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [CustomerName] [nvarchar](150) NOT NULL,
    [CustomerCity] [nvarchar](500) NOT NULL,
    [CustomerSalary] [decimal](18, 2) NOT NULL, 
)

GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (4, 'John', 'California', CAST(1000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (5, 'Marry', 'Washington dc', CAST(2000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (1012, 'Ashok', 'New Delhi', CAST(2000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (1014, 'John', 'California', CAST(1000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (1015, 'Marry', 'Washington dc', CAST(2000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (1016, 'Pankaj Patel', 'Kanpur', CAST(5000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (1019, 'Neeraj Singh', 'Banglore', CAST(6000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (1020, 'Aakankshi Gupta', 'London', CAST(40000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (1022, 'Ashish Thapliyal', 'Paris', CAST(5000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (1023, 'Marry', 'Washington dc', CAST(2000.00 AS Decimal(18, 2)))
GO

If you have any doubt or query then comment.

The post [Solved]-Find all duplicate rows based on one or two columns appeared first on Software Development | Programming Tutorials.



Read More Articles