Excel Practice Test

โ–ถ

Duplicate data is one of the most common problems in Excel spreadsheets, and it causes real damage if you don't catch it before you analyze or share your data. A list of customers with the same email address entered twice skews your counts. A sales report where the same transaction appears in multiple rows inflates your totals. An inventory sheet with duplicated part numbers creates confusion about actual stock levels.

Removing duplicates isn't a cleanup task you do when you have spare time โ€” it's a core data quality step that determines whether your analysis is accurate, your reports are trustworthy, and your decisions are based on real numbers rather than inflated counts. Data that enters a spreadsheet from multiple external sources, database imports, or manual entry by different team members almost always carries duplicates that need to be found and removed before any meaningful work can happen.

Excel provides several ways to remove duplicates, ranging from a built-in one-click tool to formula-based approaches to Power Query transformations. Which method is right depends on your situation: whether you want to permanently delete duplicate rows or keep the original data intact, whether you need to deduplicate based on one column or multiple columns, whether you're working with a static table or a dynamic dataset that updates regularly, and whether you need the deduplication to happen automatically whenever new data comes in.

The most direct approach is the Remove Duplicates feature built into Excel's Data tab. It's fast, works on any table or range, lets you choose which columns to consider when identifying duplicates, and permanently removes the duplicate rows in one click. Most Excel users who need to clean a list for one-time use will reach for this tool first โ€” and for good reason. It handles the majority of deduplication tasks in under a minute.

For situations where you want to keep your original data untouched and create a separate clean list, the UNIQUE function (available in Excel 365 and Excel 2021) is the better choice. It returns a deduplicated array of values that updates dynamically โ€” if you add data to the source range, the unique list updates automatically without any additional steps. The trade-off is that it requires a newer version of Excel, and it works differently from the built-in tool in ways that matter for column-by-column deduplication.

Power Query is the most powerful option for complex deduplication scenarios: large datasets, regular data imports, multi-step transformations, or workflows where you need to document and repeat the same cleaning steps. Power Query's Remove Duplicates command is non-destructive โ€” your source data stays untouched โ€” and the transformation is saved as a repeatable query you can refresh whenever new data arrives. It takes more setup than the built-in tool, but for production workflows it pays off quickly.

Before diving into methods, it's worth understanding exactly what Excel considers a duplicate. By default, the Remove Duplicates feature considers two rows identical only if every value in every selected column matches exactly โ€” including case, spaces, and number formatting.

A cell containing "Smith" and a cell containing "smith" are not treated as duplicates by the built-in tool (it's case-insensitive for the built-in tool specifically, but formula-based approaches can differ). A value of 42 and a value of 42.00 formatted differently may or may not be treated as duplicates depending on the underlying data type. These nuances matter in practice, and understanding them prevents the frustrating situation where duplicates appear to remain after you've already removed them. The complete guide to removing duplicates in Excel covers these edge cases with specific examples.

Excel Duplicate Removal: Key Methods

1 Click
Built-In Tool
Dynamic
UNIQUE Function
Repeatable
Power Query
Highlight
Cond. Formatting
All
Excel Versions

The built-in Remove Duplicates tool is the fastest way to clean a list when you're ready to permanently delete the duplicate rows. Here's exactly how to use it. First, select any cell within the data range you want to deduplicate โ€” you don't need to select the entire range, just one cell inside it. Excel will automatically detect the boundaries of your table. Then go to the Data tab in the ribbon and click Remove Duplicates in the Data Tools group. A dialog box opens showing all columns in your table or range, each with a checkbox.

The column selection step is where many users make their first mistake. Each checked column is included in the comparison that determines what counts as a duplicate. If all columns are checked, Excel only removes rows where every single column value is identical to another row. This is the right setting for most use cases โ€” it prevents accidentally deleting rows that look similar but differ in important columns.

But if you want to deduplicate based on a specific identifier โ€” say, removing customers with the same email address regardless of whether their name field differs โ€” uncheck all columns except the email column. Excel will then keep only the first instance of each unique email and remove all subsequent rows with the same email, even if other columns are different.

After clicking OK, Excel tells you how many duplicate values were found and removed, and how many unique values remain. This confirmation dialog is important โ€” read it before clicking OK again. If the numbers look wrong (for example, it says zero duplicates when you know there are some, or it removed far more rows than expected), click Undo immediately with Ctrl+Z and reassess. The Remove Duplicates operation cannot be undone after you close the confirmation dialog unless you've already saved an undo history.

One critical limitation of the built-in tool: it permanently deletes the duplicate rows. There's no way to recover them after the fact if you've saved your file. Best practice before using Remove Duplicates on a dataset you care about is to either work on a copy of the file or duplicate your data to a separate sheet before running the tool.

This sounds obvious in hindsight but many users skip this step and regret it. Another limitation worth knowing: Remove Duplicates works on static ranges. If your data changes frequently, you'll need to run the tool again each time โ€” it doesn't automatically keep your list clean as new rows are added. For dynamic deduplication, the UNIQUE function or Power Query is the right choice.

The built-in tool also handles data with headers correctly as long as "My data has headers" is checked in the dialog โ€” which it is by default when Excel detects headers. If your first row is data rather than headers, uncheck that option so Excel doesn't treat your first data row as a header and skip it in the deduplication process. For more detail on the three fastest variations of this method, the how to find duplicates in Excel guide covers identification strategies before and after removal.

The built-in Remove Duplicates tool is permanent. Once you confirm the operation and save your file, the deleted rows cannot be recovered. Always take one of these precautions first:

  • Work on a copy of the original file (Save As โ†’ new filename)
  • Duplicate your data to a separate sheet before running Remove Duplicates
  • Use Highlight Duplicates (conditional formatting) to review which rows will be affected before deleting
  • Use the UNIQUE function or Power Query instead if you want to preserve the original data

If you delete rows by mistake: press Ctrl+Z immediately to undo, before saving the file. Once saved, the deleted rows are gone.

Practice Microsoft Excel Managing Tables and Data Questions

When you need a deduplicated list without modifying your original data, the UNIQUE function is the cleanest solution in modern Excel. Available in Excel 365 and Excel 2021 (Windows and Mac), UNIQUE returns an array of distinct values from a range โ€” and that array updates automatically when the source data changes.

The syntax is straightforward: =UNIQUE(array, [by_col], [exactly_once]). The array argument is your source range. The optional by_col argument controls whether deduplication happens across rows (FALSE, the default) or columns (TRUE). The optional exactly_once argument, when set to TRUE, returns only values that appear exactly once โ€” not just unique values, but values with no duplicates at all.

A basic UNIQUE formula for deduplicating a single column looks like this: =UNIQUE(A2:A100). This returns a spilled array of distinct values from the range A2:A100, automatically filling however many rows are needed. If you add more values to A2:A100 later, the output updates on its own โ€” you don't need to re-run anything. For a multi-column table where you want rows that are unique across all columns, use =UNIQUE(A2:D100) where columns A through D make up your table. Excel treats each row as a unit and returns rows where the entire combination of values across all four columns is distinct.

One important thing to know about UNIQUE: it's case-sensitive by default for text comparisons. "Apple" and "apple" are treated as distinct values. The built-in Remove Duplicates tool is case-insensitive โ€” it treats them as the same. If your data has inconsistent capitalization and you're using UNIQUE, you may want to wrap the source array in LOWER() first: =UNIQUE(LOWER(A2:A100)). This normalizes case before deduplication, though it means your output will be in lowercase.

Power Query takes deduplication to a different level. To access it, go to Data โ†’ Get & Transform Data โ†’ From Table/Range. Excel will load your data into the Power Query editor. Once there, select the column or columns you want to deduplicate on (hold Ctrl to select multiple columns), right-click, and choose Remove Duplicates.

Power Query removes duplicate rows based on the selected columns, keeping the first occurrence. Click Close & Load to return the result to your worksheet as a separate table that you can refresh at any time. The original source data is untouched, and the deduplication logic is saved as part of the query โ€” reproducible every time you click Refresh.

Conditional formatting offers a non-destructive way to identify duplicates before you decide what to do with them. Select the range you want to check, go to Home โ†’ Conditional Formatting โ†’ Highlight Cells Rules โ†’ Duplicate Values. Excel highlights all cells in the selection that appear more than once. This doesn't remove anything โ€” it just makes duplicates visible. From there, you can review them manually and decide which ones to delete.

This approach is useful when you're working with messy data and need human judgment about which duplicate row to keep. The how to highlight duplicates in Excel guide walks through the conditional formatting steps in detail, including how to highlight unique values instead of duplicates, and how to extend the check across multiple columns. For removing duplicates after you've identified them, combine the highlighting step with the Delete Duplicates in Excel methods to complete the workflow.

5 Methods to Remove Duplicates in Excel

๐Ÿ”ด Remove Duplicates Tool

Data tab โ†’ Remove Duplicates. One-click permanent removal. Best for one-time list cleanup where you're ready to delete the duplicate rows.

๐ŸŸ  UNIQUE Function

=UNIQUE(range) returns a dynamic deduplicated list. Best for Excel 365/2021 users who want to preserve original data and keep the clean list auto-updated.

๐ŸŸก Power Query

Data โ†’ Get & Transform โ†’ Remove Duplicates. Non-destructive and repeatable. Best for large datasets, regular data imports, and production cleaning workflows.

๐ŸŸข Conditional Formatting

Home โ†’ Conditional Formatting โ†’ Highlight Cells โ†’ Duplicate Values. Best for visually identifying duplicates before deciding which rows to delete.

๐Ÿ”ต COUNTIF Formula

=COUNTIF($A$2:$A$100,A2)>1 flags duplicates with TRUE/FALSE. Best for adding a helper column that marks duplicates for manual review or filter-based deletion.

The most common reason Remove Duplicates seems to fail โ€” when you run it and duplicates appear to remain โ€” is that the cells aren't actually identical even though they look the same on screen. Excel is comparing the underlying cell values, not what you see after formatting is applied. The most frequent culprits are trailing spaces, leading spaces, inconsistent capitalization, numbers stored as text versus numbers, and date values with different underlying serial numbers.

Trailing and leading spaces are particularly common in data that was pasted from external sources, downloaded from a database, or imported from a CSV file. A cell containing "Johnson " (with a space after the name) and a cell containing "Johnson" (no trailing space) look identical in most views but are not treated as equal by Excel.

The fix is to run a TRIM function on the affected columns before removing duplicates: in a helper column, use =TRIM(A2), copy the formula down, paste the results as values over the original column, and then run Remove Duplicates. TRIM removes leading and trailing spaces and collapses multiple internal spaces to single spaces.

Numbers stored as text are another common issue. If a column contains what looks like customer IDs or product codes โ€” values that are numeric but treated as text because of how they were imported โ€” Excel may treat "00123" (text) and 123 (number) as different values.

Look for a small green triangle in the corner of cells in a column โ€” that's Excel's indicator that a number is stored as text. Select the affected cells, click the warning icon that appears, and choose Convert to Number before running Remove Duplicates. Alternatively, use the VALUE() function in a helper column to convert text-formatted numbers to actual numbers.

Date duplicates are a subtler problem. Dates in Excel are stored as serial numbers (the number of days since January 1, 1900). Two cells can display the same date visually but have different serial numbers if one includes a time component โ€” "1/15/2026" and "1/15/2026 9:00 AM" look the same when formatted as dates but are not equal. If you're deduplicating date columns and getting unexpected results, use the INT() function to strip the time component: =INT(A2) in a helper column converts any date-time value to just the date serial number.

For partial duplicate situations โ€” where you want to find rows that share one key column but differ in others โ€” the built-in Remove Duplicates tool handles this when you configure it correctly. Uncheck all columns in the Remove Duplicates dialog except the key column you're deduplicating on. Excel keeps the first instance of each unique value in that column and removes all subsequent rows, regardless of what other columns contain.

If you need more control over which instance to keep (the most recent, the one with the highest value, etc.), sort your data by the preference column first, then run Remove Duplicates on the key column โ€” Excel always keeps the first row of each group it encounters. The Excel formulas guide covers the COUNTIF approach for creating helper columns that flag duplicates before you decide which ones to remove, giving you full control over which rows survive the cleanup.

Practice Microsoft Excel Formulas and Functions Questions

Before You Remove Duplicates: Preparation Checklist

Save a backup copy of your file before running Remove Duplicates โ€” the operation is permanent and cannot be undone after saving
Use TRIM() to remove leading and trailing spaces from text columns โ€” spaces cause "identical" values to be treated as different
Check for numbers stored as text (green triangles in cells) and convert them before deduplicating
Decide which columns to include in the comparison โ€” checking only the key identifier column deduplicates on that value alone
Sort your data by a preference column first if you want to control which duplicate row survives (Remove Duplicates always keeps the first occurrence)
Use Highlight Duplicates (conditional formatting) to preview which rows will be affected before running the permanent removal
For dynamic datasets that update regularly, use UNIQUE function or Power Query instead of the built-in tool

Remove Duplicates: Method by Use Case

๐Ÿ“‹ Built-In Tool

Best for: One-time cleanup of a static list where you're ready to permanently delete duplicate rows.

Steps:

  1. Click any cell in your data range.
  2. Go to Data tab โ†’ Remove Duplicates.
  3. In the dialog, check only the columns that define uniqueness (or check all for full-row matching).
  4. Click OK. Review the confirmation message โ€” it tells you how many rows were removed and how many unique rows remain.
  5. Click OK again to confirm.

Limitations: Permanent and irreversible after saving. Doesn't update automatically when new data is added. Case-insensitive for text comparisons.

๐Ÿ“‹ UNIQUE Function

Best for: Excel 365 or 2021 users who want a dynamic, auto-updating deduplicated list without modifying the original data.

Basic syntax: =UNIQUE(A2:A100) โ€” returns distinct values from column A.

Multi-column: =UNIQUE(A2:D100) โ€” returns rows where the entire combination of A:D is unique.

Exactly once: =UNIQUE(A2:A100,,TRUE) โ€” returns only values that appear exactly once (no repeats at all).

Key notes: Results spill automatically. The output updates when source data changes. Case-sensitive for text by default โ€” wrap in LOWER() to normalize. Requires Excel 365 or 2021.

๐Ÿ“‹ Power Query

Best for: Large datasets, recurring imports, complex multi-step transformations, and workflows that need to be repeatable.

Steps:

  1. Select any cell in your data. Go to Data โ†’ Get & Transform Data โ†’ From Table/Range.
  2. In the Power Query editor, select the column(s) to deduplicate on (Ctrl+click for multiple).
  3. Right-click a selected column header โ†’ Remove Duplicates.
  4. Click Close & Load to load the clean table back to your worksheet.

Result: Original data unchanged. Clean table appears as a new Excel table. Click Refresh (in Data tab) to re-run the query on updated source data.

๐Ÿ“‹ COUNTIF Approach

Best for: Situations where you want to flag duplicates in a helper column and manually review or filter them before deciding what to delete.

Helper column formula: =COUNTIF($A$2:$A$100,A2) โ€” returns the count of how many times the value in A2 appears in the range. Values of 2 or more are duplicates.

Flag formula: =IF(COUNTIF($A$2:A2,A2)>1,"Duplicate","First") โ€” labels the first occurrence "First" and all subsequent duplicates "Duplicate". Useful when you want to filter and delete only the flagged rows.

To use: Add the helper column, filter for "Duplicate" rows, select and delete them, then remove the helper column. This gives you full control over which duplicate rows survive.

Remove Duplicates Tool vs. UNIQUE Function

Pros

  • Available in all modern Excel versions โ€” no 365 subscription required
  • Works on any table or range with a single click, no formula knowledge needed
  • Permanently cleans the data in place โ€” no extra columns or outputs to manage
  • Handles multi-column deduplication through a simple checkbox interface
  • Case-insensitive by default โ€” treats 'Smith' and 'SMITH' as the same

Cons

  • Requires Excel 365 or Excel 2021 โ€” not available in older versions
  • Returns a separate output array that doesn't modify the source data
  • Requires formula knowledge and understanding of dynamic array behavior
  • Case-sensitive by default โ€” 'Smith' and 'SMITH' are treated as different values
  • Output needs to be placed in a cell with enough empty space for the spilled array

Excel Questions and Answers

How do I remove duplicates in Excel?

The fastest way is the built-in Remove Duplicates tool. Click any cell in your data, go to the Data tab, and click Remove Duplicates. In the dialog box, check the columns that define uniqueness, then click OK. Excel permanently removes all rows after the first occurrence of each duplicate combination. For a non-destructive approach that keeps your original data, use the UNIQUE function (Excel 365/2021) or Power Query.

Does Remove Duplicates keep the first or last occurrence?

Excel's Remove Duplicates tool always keeps the first occurrence of each duplicate group and removes all subsequent occurrences. To control which instance survives, sort your data by a preference column before running Remove Duplicates. For example, if you want to keep the most recent transaction for each customer, sort by date (newest first) before removing duplicates on the customer ID column โ€” the first row Excel encounters will be the most recent.

Why didn't Remove Duplicates find my duplicates?

The most common cause is that the cells look the same but aren't actually identical. Check for trailing spaces (use TRIM), numbers stored as text (look for green triangles in cells), inconsistent capitalization, and date values that include hidden time components. Also verify that you've selected the right columns in the Remove Duplicates dialog โ€” if you're checking all columns, rows must match exactly across every column to be considered a duplicate.

Can I undo Remove Duplicates?

Yes, but only if you act immediately. Press Ctrl+Z right after running Remove Duplicates to undo the operation. However, once you save the file, the undo history is cleared and the deleted rows cannot be recovered. This is why creating a backup copy of your file before running Remove Duplicates is a critical precaution whenever you're working with data you can't afford to lose.

What is the UNIQUE function in Excel?

UNIQUE is a dynamic array function available in Excel 365 and Excel 2021 that returns a list of distinct values from a range. The syntax is =UNIQUE(array, [by_col], [exactly_once]). Unlike the Remove Duplicates tool, UNIQUE doesn't modify your original data โ€” it creates a separate output that updates automatically when the source data changes. It's ideal for situations where you need a continuously clean version of a list alongside the original raw data.

How do I remove duplicates based on one column only?

In the Remove Duplicates dialog, uncheck all columns except the one you want to deduplicate on. For example, to remove rows with duplicate customer emails while keeping all other columns, uncheck every column except the email column. Excel will keep the first row for each unique email address and remove all subsequent rows, regardless of whether other columns differ. The same column-selection approach works in Power Query's Remove Duplicates command.
โ–ถ Start Quiz