First, let’s look at the data set we want to deduplicate. It is comprised of sales for agents in specific regions.

Screenshot of a spreadsheet with a data set comprising 3 columns: Sales Agent, Region, Sales
Sample data set

Before we start removing duplicates, we must ask ourselves, “What constitutes a duplicate?”

  • Should a row of the table be removed if the [Region] on one row matches the [Region] of another row?
  • Should a row of the table be removed if the [Sales Agent] on one row matches the [Sales Agent] of another row?
  • Should the entire row match in all aspects to another row to be a candidate for removal?
  • Should any combination of the above be required for removal?

We will go with the requirement that the entire row of the table must be the same as another row in the same table to be removed.

NOTE:  You need to decide whether to retain a copy of the original data set or replace it with the deduplicated result.

Method #1: Excel’s Remove Duplicates Feature

This method of duplicate data removal works in all versions of Excel.

Before invoking the Remove Duplicates feature, let’s make a copy of the data set by:

  1. clicking in the data and pressing <CTRL-A> two times,
  2. pressing <CTRL-C> to copy the selected data into memory,
  3. clicking in a new location on the sheet and pressing <CTRL-V> to paste the copy of the data into the new location.
Screenshot of a spreadsheet with 2 versions of the same data set consisting of 3 columns: Sales Agent, Region, Sales.
Original data set and its copy

To remove the duplicate rows from the copy of the data set:

  1. Click in the newly copied data set.
  2. Select Data (tab) -> Data Tools (group) -> Remove Duplicates.
Fragment of the Excel ribbon showing the Data Tools group, with Remove Duplicates button highlighted
Location of the Remove Duplicates feature in the Data tab of the Excel ribbon.
  1. In the Remove Duplicates dialog box, ensure all the listed column headings and the option labeled “My data has headers” are checked and press OK.
Remove Duplicates dialog box with 3 checked columns: Sales Agent, Region, Sales
Remove Duplicates dialog box

The system reports back how many duplicate values were removed as well as how many unique values remain.

Original dataset with duplicates still present on the left (19 rows), and a shorter dataset with duplicates removed on the right (14 rows).
Before and After Removing Duplicates
Excel message box with the text: "5 duplicate values found and removed; 14 unique values remain".
Message box with the number of duplicates removed and unique values remaining.

Removing duplicate values based on a single column

If you wanted to remove all duplicated rows of the table based solely on the [Sales Agent] column, you would follow the above process but only check the box for [Sales Agent] in the Remove Duplicates dialog box.

Remove Duplicates dialog box with 3 columns: Sales Agent, Region, Sales, but only Sales Agent column checked.
Remove Duplicates dialog box

This would result in the following “before and after” snapshot of the data.

Original dataset with duplicates still present on the left, and a shorter dataset with duplicates removed on the right. Excel message box with the text: "15 duplicate values found and removed; 4 unique values remain".
Before and After removing duplicates from the [Sales Agent] column.

NOTE: Only the row for the first encountered [Sales Agent] name is retained.  All rows that follow with the same [Sales Agent] name are removed.

PROS:

This method is simple to implement and available to any Excel user, regardless of version. It’s the quickest way to remove duplicate values ‘in situ’ when you don’t need to keep the original data set. If you need to keep the original as is, the other two methods might be more efficient (they skip the extra step of copying the data).

CONS:

If new records are created in the original table that duplicate existing values, the deduplicated copy will no longer accurately reflect the source.  The “old” list of unique values would need to be deleted and the Remove Duplicates process would need to be re-executed.

Finding Duplicates (Before Removal)

If your data keeps changing and there’s a risk that values will be repeated, you may want to include visual indication that duplicates are present. There are several methods of finding duplicates in Excel. The simplest one is right there on the Home tab – conditional formatting.

Select the range (avoid selecting entire columns, e.g., A:A). Go to Home (tab) -> Styles (group) -> Conditional Formatting -> Highlight Cells Rules -> Duplicate Values.

Location of Conditional Formatting - Highlight Cells Rules - Duplicate Values on the Home tab.
Location of Conditional Formatting of Duplicate Values

Select the style you want and click OK to apply.

Excel table with duplicate values in each column highlighted in light red.
Duplicate values in each column highlighted.

Note that the duplicate check occurs on column-by-column basis rather than row by row. If you want to compare row to row, you will need to modify your approach. For example, concatenate the values to check the whole row in a single column. (You can hide the concatenated values with Custom Number Format: ;;;).

Excel table with duplicate rows from columns A to C indicated by light red cell fill in column D. The formula =CONCAT($A2:$C2) applied in cell D2.
Duplicate rows highlighted in the helper column.

By conditionally highlighting duplicate values, you get a signal that you need to run Remove Duplicates again – or use one of the dynamic methods described below.

Featured Course

Excel Essentials for the Real World

Learn time-saving hacks to work smarter in Excel. Our members also consistently report a newfound enthusiasm for using Excel after taking our course.
Learn More

Method #2 – How to Remove Duplicates in Excel Using the UNIQUE Function

This method of duplicate data removal works for subscribers of Office 365 and users of Excel for the Web and Excel 2021.

The UNIQUE function has been introduced in 2021 along with several other powerful functions that can manipulate arrays. Its purpose is pretty self-explanatory – it returns unique values.

There is one (optional) step of preparation before we put it to use: make a copy of our header row from the original table of data. You can skip this if you just need the unique values and don’t intend to present the deduplicated data set.

Data set in an Excel spreadsheet with columns Sales Agent, Region, Sales, and those same headers repeated next to the dataset.
Source data set and headers placeholder

Select the cell directly below the [Sales Agent] heading (in this case, cell E2) and enter the following formula:

=UNIQUE(A2:C20)

If you have converted your data set to an official Excel Table, you can use the table’s name in place of the cell range reference.  This is easier to understand and easier to write.

=UNIQUE(TSales)
On the right an Excel table called TSales with columns Sales Agent, Region, Sales; on the left, a spilled array under the same headings, with formula in cell E2 =UNIQUE(TSales).
Source table and the UNIQUE formula in action.

The UNIQUE function can be extremely useful when creating dependent dropdown lists. Combine it with the even newer Excel functions like TOCOL/TOROW to deduplicate values across rows as well as columns at the same time.

CONS:

The downside of this method is that the number formats (see the [Sales] column) do not carry forward to the result.  You need to format the cells to match if you need to retain that comma style. Also, UNIQUE, like other dynamic array formulas, cannot be used in official Excel tables. As you saw, it happily accepts table references as arguments (input), but you can’t output its results in a table.

PROS:

The UNIQUE function, as opposed to the Remove Duplicates feature, is dynamic.  If new records are created that duplicate existing records, the duplicates will be automatically removed – as long as you are using an official Excel table as your source. If you’re using a range of cells, you may have to manually adjust the range in the formula.

Featured Course

Master NEW Excel Functions in Office 365 & Office 2021

Ready to supercharge your Excel skills? This course will help you master the new set of functions introduced to Excel for Office 365. You’ll create professional-grade reports in a fraction of the time it used to take you.
Learn More
Excel new functions course cover

Method #3 – Remove Duplicate Values with Power Query

This method of duplicate data removal works in all versions of Excel from version 2013 onward.

Power Query is a data transformation tool available in Excel that can perform simple to highly complex data cleansing procedures on “dirty” data.

To use Power Query to remove the duplicates in the table, perform the following steps:

  1. Click in the data to be deduplicated.
  2. Select Data (tab) -> Get & Transform (group) -> From Table/Range.
Fragment of the Excel ribbon showing the Get & Transform Data group, with From Table/Range button highlighted
Location of From Table/Range in the Data tab of the Excel ribbon.

This will load the table into Power Query.

View of the data comprising 3 columns: Sales Agent, Region, Sales and 19 rows in the Power Query Editor.
View of the data in the Power Query Editor.
  1. Click any of the column heading and press <CTRL-A> to select all columns.
View of the data in the Power Query Editor will all columns (Sales Agent, Region, Sales) selected.
View of the data in the Power Query Editor will all columns selected.
  1. Select Home (tab) -> Reduce Rows (group) -> Remove Rows -> Remove Duplicates.
Location of the Remove Duplicates feature under expanded Remove Rows group in the Home tab of Power Query Editor.
Location of the Remove Duplicates feature in Power Query Editor.

You are left with only the unique rows.

View of the data comprising 3 columns: Sales Agent, Region, Sales and 15 rows in the Power Query Editor.
View of the data in the Power Query Editor after removing duplicates.
  1. Load the results of the query back into the Excel spreadsheet by clicking the lower part of the Close & Load button (far left of Home ribbon) and selecting “Close & Load To…”.
Location of "Close & Load To" in the Home tab of the Power Query Editor.
Location of “Close & Load To” in the Power Query Editor.
  1. In the Import Data dialog box, select the option for Table and Existing Worksheet. Select a cell beside the original data (e.g., cell E1 of the Data sheet) and click OK.
Import Data dialog box with Table and Existing worksheet selected.
Import Data dialog box
Original data set (20 rows including headers) on the right and output of the deduplicated query (16 rows including headers) on the left.
Original data set and output of the deduplicated query.

CONS:

This method is semi-dynamic, meaning you must refresh the query’s output to see the updated results.

This can be done by right-clicking on the query output and selecting Refresh or by clicking the Refresh All button on the Data tab.

Location of the Refresh All button on the Data tab. Refresh All button expanded to show Refresh All, Refresh, and (highlighted) Connection Properties.
Location of the Refresh All button on the Data tab.

PROS:

Data can be set to refresh automatically when one of 3 events occur:

  • The user opens the file (see Query Properties image below; you access Query Properties by expanding the Refresh All options and selecting Connection Properties (see above) or right-clicking on the query name in the Queries & Connections pane and selecting Properties).
  • A timer is established to refresh the results every N-minutes (see Query Properties image below).
  • Data is changed in the source table (initiated via an event-driven macro using VBA).
Query Properties dialog box with the following Refresh controls checked: Enable background refresh; Refresh every 60 minutes; Refresh data when opening the file; Refresh this connection on Refresh All.
Query Properties dialog box

Featured Course

Master Excel Power Query – Beginner to Pro

Power Query is essential for Excel users who work with lots of data. This course teaches you how to use Excel in Power Mode and create meaningful reports with far less effort.
Learn More
Power Query Course cover

Did you catch the “mistake”?

Look closely at the Power Query output.

Notice that “Stevie Bridge” in “Asia” with “4378” in [Sales] is listed twice.

Excel table comprising 3 columns (Sales Agent, Region and Sales) with two rows highlighted: "Stevie Bridge"-"Asia"-"4378" and "Stevie bridge"-"Asia"-"4378". The only difference being upper and lower case B in Bridge.
“Missed” duplicates in the Power Query output.

The reason this was “missed” is because Power Query is case-sensitive when it comes to analyzing data.

Both the Remove Duplicates feature and the UNIQUE function are case-insensitive; they don’t discriminate between differing cases.  Power Query will treat these two records as separate and distinct.

Fixing the Problem

To force Power Query to remove the duplicate value it “missed”, we must add a step to the query that will standardize the casing of the agent names.

  1. In the Queries & Connections pane, either double-click the “TSales” query or right-click the “TSales” query and select Edit. This will reopen the Power Query Editor.
Queries & Connections pane in Excel with TSales query visible and context menu open with Edit highlighted.
Queries & Connections pane in Excel.
  1. In the Query Settings pane (on the right) click the step labeled “Changed Type”. If you have more steps in your query, just remember that it should be a step before the “Removed Duplicates” step.
Power Query Editor open on the TSales query with an arrow pointing to the Changed Type step.
Power Query Editor open on the TSales query.
  1. Select the heading of the [Sales Agent] column.
  2. Capitalize the first letter of each word by selecting Transform (tab) -> Text Column (group) -> Format -> Capitalize Each Word.

NOTE: You can also do this by right-clicking the column heading and selecting Transform -> Capitalize Each Word.

Location of the Capitalize Each Word feature under expanded Format group in the Transform tab of the Power Query Editor.
Location of the Capitalize Each Word feature in the Power Query Editor.
  1. In the Insert Step notification, click Insert to add the new step to the query.
Insert Step dialog box with option to Insert or Cancel.
Insert Step dialog box.
  1. Save the updated query back to the original output table by clicking Home (tab) -> Close & Load.

We now have only one “Stevie Bridge” in “Asia” with “4378” in [Sales]. Data successfully deduplicated.

Original data set (20 rows including headers) on the right and output of the deduplicated query (15 rows including headers) on the left.
Original data set and output of the (properly) deduplicated query.

Final Thoughts

Next time you’re faced with the dilemma how to remove duplicates in Excel, you’ll be more than up to the task. Consider your specific scenario and choose the method that will work best. Avoid any traps by understanding the limitations of each approach and say goodbye to messy data.

Practice Workbook

Feel free to Download the Workbook HERE.

Excel Download Practice file

Featured Course

Black Belt Excel Package

What would your life look like if you became an Excel Black Belt? You’ll get the best deal with this (cost savings) package. It will take you from Excel Newbie to Grand Master.
Learn More
Excel Black Belt course cover

Leila Gharani

I'm a 6x Microsoft MVP with over 15 years of experience implementing and professionals on Management Information Systems of different sizes and nature.

My background is Masters in Economics, Economist, Consultant, Oracle HFM Accounting Systems Expert, SAP BW Project Manager. My passion is teaching, experimenting and sharing. I am also addicted to learning and enjoy taking online courses on a variety of topics.