In this article, we will learn, How to get duplicate Rows based on few column values using the SQL server.
If you are looking to get duplicate rows from my database table based on some column then have come to the right place.
Recently I’m working on an e-commerce website Project in which I have to Find Duplicates From a Table in SQL Server.
There are many different ways for getting duplicate Rows based on few columns with varying efficiency, and the relative size of your database tables. In this post, We will discuss some efficient method to do this task.
Let’s take an example, I have tables Company_Customer in SQL DB.Table has some records of the Customers.Now I want is to select all duplicate the records from Company_Customer.
Method-1 Using group by
select CustomerName,CustomerCity,CustomerSalary, count(*)
from Company_Customer
group by CustomerName,CustomerCity,CustomerSalary
having count(*) > 1
Generic Query
select column1,column2,column3, count(*)
from TableName
group by column1,column2,column3
having count(*) > 1
Query Reseult:
Method-2 Using row_number()
If you have SQL Server 2005 or lastest version, then you can use row_number() functions to return the duplicate records from the database table.
select *
from(select CustomerName
, CustomerCity
, CustomerSalary
, row_number() over (partition by CustomerName
, CustomerCity
, CustomerSalary
order by Id ) as occurrence
from Company_Customer) x
where occurrence > 1
Generic Query
select *
from(select colm1
, colm2
, colm3
, row_number() over (partition by colm1
, colm2
, colm3
order by colm1) as occurrence
from TableName) x
where occurrence > 1
Query Reseult:
Method-3 Using Rank()
Above is the easiest solution with SQL Server 2005. it will return all records record except for the first one if there are multiple occurrences.
SELECT Id
, CustomerName
, CustomerCity
, CustomerSalary
FROM
(
SELECT Id
, CustomerName
, CustomerCity
, CustomerSalary
, RANK() OVER (PARTITION BY CustomerName, CustomerCity, CustomerSalary ORDER BY Id ASC) AS [rank]
FROM Company_Customer
) a
WHERE [rank] > 1
Generic Query
SELECT Id
, cloumn1
, cloumn2
, cloumn3
FROM
(
SELECT Id
, cloumn1
, cloumn2
, cloumn3
, RANK() OVER (PARTITION BY cloumn1, cloumn2, cloumn3 ORDER BY Id ASC) AS [rank]
FROM Company_Customer
) a
WHERE [rank] > 1
Method-4 Using CTE
with MYCTE as (
select row_number() over ( partition by CustomerName,CustomerCity,CustomerSalary order by Id) rown, *
from Company_Customer
)
select * from MYCTE where rown >1
Sql Table Create and Insert Script
CREATE TABLE [dbo].[Company_Customer](
[Id] [int] IDENTITY(1,1) NOT NULL,
[CustomerName] [nvarchar](150) NOT NULL,
[CustomerCity] [nvarchar](500) NOT NULL,
[CustomerSalary] [decimal](18, 2) NOT NULL,
)
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (4, 'John', 'California', CAST(1000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (5, 'Marry', 'Washington dc', CAST(2000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (1012, 'Ashok', 'New Delhi', CAST(2000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (1014, 'John', 'California', CAST(1000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (1015, 'Marry', 'Washington dc', CAST(2000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (1016, 'Pankaj Patel', 'Kanpur', CAST(5000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (1019, 'Neeraj Singh', 'Banglore', CAST(6000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (1020, 'Aakankshi Gupta', 'London', CAST(40000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (1022, 'Ashish Thapliyal', 'Paris', CAST(5000.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[Company_Customer] ([Id], [CustomerName], [CustomerCity], [CustomerSalary]) VALUES (1023, 'Marry', 'Washington dc', CAST(2000.00 AS Decimal(18, 2)))
GO
If you have any doubt or query then comment.
The post [Solved]-Find all duplicate rows based on one or two columns appeared first on Software Development | Programming Tutorials.
Read More Articles
- Linq to SQL Group by and Sum in Select | Linq To SQL Group By and Sum
- How send an HTTP POST request to a server from Excel using VBA?
- What is Difference between where and having clauses in SQL server
- How to Use EXCEPT Operator with an example in Sql
- How to use merge statement in the SQL server
- How to write Re-runnable SQL server query
- How to create Cursor in sql server with example
- How to generate random data in sql server for performance testing
- How to write subquery in select statement in Sql
- How to Log record changes in SQL server in an audit table
- Why SQL DELETE query is not working while using with LIKE Keyword?
- Why my visual studio 2015 SQL Server doesn't connect with Azure SQL Database?
- SQL Server 2005: Update rows in a specified order (like ORDER BY)?
- Return the most recent record according to a snapshot date
- Replace the subquery with JOIN/APPLY
- How to handle iOS emojis in sql server database?
- receive an alert when job is *not* running
- Please help me with this traversing query
- Db concurrency: field vs UPDLOCK
- Using SQL Server databases on shared hosting
- Inserting rows into a table with a predefined identity
- Special Character in SQL
- Generate Crystal Reports using SQL queries
- Remove invalid FK data after bulk load
- Using SQL Server stored procedures from Python (pyodbc)
- OPENROWSET load XML with different type
- Connection fails, database has '.' in name
- SQL query to select dates between two dates and exclude dates between two dates
- Keyword SQL Server stored procedure
- a large paged GridView, Rollup and Grand Totals with SQLdataSource
- SqlCommandBuilder() creates insert/update for underlying tables instead for a view
- How do I return an auto generated ID using Slick plain SQL with SQL Server
- Check if opened window has been closed
- Alternative to for loop in SQL Server
- Denial of service for proxy /sql server
- Access the files from Onedrive and load in to sql server table using SSIS package 2015
- What is wrong in the following connection string?
- Full text search in sql server
- How to select boolean column based on the other column value in SQL?
- Select all columns from two tables grouping by all columns in table1 and specific column in table2
- How to import flat file with duplicate primary key records into SQL table
- Distribute large quantities over multiple rows
- EF Core - Missing related entity in class despite foreign key after running Scaffold-DbContext
- Client with IP address is not allowed to access the server Azure SQL database
- SQL Server: drop table primary key, without knowing its name
- Calculate extra column in group by
- SQL split based on delimeter, keep only second element
- why to use same table alias and do SELF JOIN
- Mark a Table as obsolete in Sql Server
- simpler way to breakdown string sql server
- Multiple SQL Server connection strings in app.config file
- SQL "Incorrect syntax near '1900-01-01 00:00:00'." when trying to execute a stored procedure
- Solved-- Why columns value change into NULL when try to update only one column?
- Include an external javascript file in an SSRS report
- How to query the previous record that is in another table?
- Entity framework, insert object based on specific condition
- Sql Ansi to manage DateTime values
- Ensure Uniqueness with Triggers
- what is the best way to delete millions of records in TSQL?
- SQL Server table collation with greek
- SQL: How to get COUNT when using EXCEPT
- SQL Server Agent job completes but VBScript doesn't run
- Handling database changes/scripts in .NET?
- Login page for asp.net page
- Create loop to return return key-value-pair-XML-elements generically
- SQL server pivot query not taking null count
- Get a record position in a group + record count of the group on each record
- exception calling ".ctor" with "2" argument(s): "setparent failed for login 'yyy'."
- What is the difference between Primary Key and unique key constraint?
- django-pyodbc-azure SQL Server Parameter Limitation
- How to compare the current row with the aggregation of next n rows in Redshift?
- Handling large volume of data using Web API
- SELECT - Performance
- Sql Server Update Rows where like '%#186;%'
- perfmon same trace, different results?
- SQL Server 2012 Table Columns & Performance
- Dealing with missing values in SQL Server with Binary Column
- AWS DMS - Migrating from SQL Server to S3 Bucket. How I can add the header in the .csv?
- Remotely execute SSIS package multiple times simultaneously
- SQL Server 2017 CDC: 60 second polling time
- SQL Server with Service Broker
- How do I publish/update a winforms project and not drop tables in the database
- Order results inside and outside of union
- Need T-SQL Query for Groups and Islands
- Insert manual data into database
- Can't restore backup or move mdf file from upper compatability level to lower one
- Convert varchar in format mon-yy to datetime in SQL server
- How do I mark ids that have had occurrences in the past X days? SQL Server
- Import csv file via SSMS where text fields contain extra quotes
- SQL Server 2008 R2 query need guidance
- Entity Framework SaveChanges() not reflecting on entities
- Transaction, Table Truncate and Parametrized SQL Query
- Aggregate function within table I'm creating
- FILLFACTOR on empty database
- pivot table summing based on between date
- SQL Compare with one column, but returns all columns if matched
- Reading data from reader in C# with Sql Server
- Why can't one join with a table returned from a table valued function?
- Does a COUNT execute twice when in SELECT and ORDER BY?
- Creating a stored procedure with many parameters from the website
- Connecting Python to MSSQL using Pymssql
- Secondary Replica SQL Server(Availability Groups )
- SQL Server - Get the final sum of a product of two columns
- Behaviour of update statistics in case when is transaction rollback
- SSRS expression to get 2% report only daily
- Passing C# DateTime to SQL Server stored procedure as DateTime parameter
- SQL/Qlikview, Return all of the MAX values
- Entity Framework 6 Context duplicate while query
- How to protect a database from the Server Administrator in Sql Server
- XML from webservice to SQL Server using SSIS
- SQL Insert into Select failure due to one of the set of records
- populate null columns in database
- Re-assigning IDs in a non-IDENTITY type field in SQL Server database
- OR in WHERE statement slowing things down dramatically
- sql query to get earliest date
- Pivot a fixed multiple column table in sql server
- Microsoft SQL Server - Copy column in table, modify value and type
- SQL Server Foreign key/Insert issue
- Is it possible to asynchronously query SQL Server from Python (3.4)?
- Conceptually, how does database sharding differ from a federation
- SQLAlchemy and SQL Server Datetime field overflow
- SQL Server: Tempdb: per Instance not per Database?
- SELECT Statement in CASE
- SQL Server - how to use like keyword in records that have period in the middle of value
- T-SQL: CTE row_number over() wrong result for the first line
- Execute dynamic sql "alter database" inside trigger
- Select latest inserted record in SQL Server
- Get Output Parameter Value in C#
- SQL Server - Adding header and footer fields in each group
- SQL Server Allocation of amount in bill
- Looping through Dataset
- SQL CASE expression grouping "issue"
- SQL Azure: How to alter dbo user with new login
- I want to display output in a single row in T-SQL
- Deserialize XML object in T-SQL
- Sum multiple nodes value on based on specific condition in XML in SQL Server 2014
- SQL Split Tab Delimited Column
- Raise flag if date exceeds 24 hours excluding weekends
- Avoid column update using a SQL Server trigger
- Programmatically script SSIS package - Dynamic XML to OLE DB
- How to combine multiple dynamic query
- simple select is taking huge time on the table
- Convert payload from VARBINARY into VARCHAR in SQL
- C# SqlBulkCopy resume on error
- Small change in SQL Query
- JSON_QUERY to do a "Select Many"
- SQL Server Update Slow with Join
- intelligent database search
- Get Target Framework Version of SQLCLR Assemblies on SQL Server via Query
- Performance implications of opening multiple SQLConnections in poor-man's ORM
- I need to handle overlapping dates but if the end date is null then it to be assumed that the process has not stopped
- Is there a function similar to mysql_real_escape_string for php's sqlsrv library?
- "Conversion failed when converting the varchar value 'NULL' to data type int"
- Paging Large Datasets - SQL Server (Best Practice)
- SQL Server using TypeORM: Error Timeout: Request failed to complete when inserting records by batch
- INSERT TRIGGERS and the UPDATE() function
- SQL Table Comparison Using CASE T SQL
- Why is @@Identity returning null?
- How to return the output of stored procedure into a variable in sql server
- SQL Server 2012- Server collation and database collation