
A 60 KB spreadsheet ballooned to 47 MB and you have no idea why. The data hasn't grown. The sheet count is the same. You delete a few rows, save, and somehow the file gets bigger. If that sounds familiar, you've hit Excel bloat, the silent disease that turns nimble workbooks into email-attachment-rejecting monsters. The good news is that almost every cause of bloat is fixable, and most fixes take less than a minute. The trick is knowing where to look.

This guide walks through the eight real causes of Excel file bloat and the specific clicks that fix each one. We'll start with the most common culprit, invisible formatting that stretches all the way to row 1,048,576, and work through images, hidden pivot caches, defined names you forgot you created, and the trick of switching to the binary .xlsb format.

Each fix is independent, so you can grab the one that matches your situation and skip the rest. By the end your file should be a fraction of its current size, and you'll know what to watch for next time.

Before you start, save a copy. Some of these fixes are destructive: once you clear formatting from blank cells or strip a pivot cache, there's no undo button after you close the file. Hit File > Save a Copy, name it something like workbook_backup.xlsx, and keep it until you're sure the slimmed version works. Then we can dig in.

What Bloat Looks Like in Real Workbooks

  • 50-70%: typical size cut when switching from .xlsx to .xlsb
  • 1.04M: rows Excel formats by accident, even empty ones
  • 5 MB+: added by a single uncompressed phone photo
  • 1 click: to refresh-on-open and drop the pivot cache from disk

Why Excel Files Get Bloated in the First Place

Excel stores more than just the numbers you can see. Behind every workbook is a zipped folder of XML files describing every cell, every formatting choice, every defined name, every pivot table cache, and every embedded object. Most of that runs lean: a clean 10,000-row dataset is usually under a megabyte. Bloat happens when one of those internal files explodes, often because something you did months ago left a trail Excel never cleaned up.

The usual suspects are predictable. You highlighted a column to set its color and the format stuck to all 1,048,576 cells in that column, not just the 200 with data. You pasted an image and didn't compress it. A pivot table you deleted left its cache sitting in the file because you never told Excel to drop it.

A vendor copy-paste pulled in 5,000 defined names from another workbook. A conditional formatting rule got duplicated every time someone re-applied it, and now there are 800 overlapping rules where there should be one. None of these show up in the sheet view, but every one of them adds weight.

The fastest way to diagnose is to rename a copy of your workbook from .xlsx to .zip and look inside. The XML files are plain text. Open xl/worksheets/sheet1.xml and check its size: if a sheet with 50 rows of data has a 30 MB XML file, you have an excess-formatting problem. Look at xl/pivotCache/ too: every file in there is a cached copy of source data, sometimes triplicated. Check xl/media/ for images. Whichever folder is biggest tells you which fix to try first.

Find your bloat before you fix it

Make a copy of the workbook. Rename the copy from .xlsx to .zip. Double-click to open it like a regular folder. The four places that matter are the folders xl/worksheets (one XML per sheet), xl/media (images), and xl/pivotCache (pivot source caches), plus the file xl/sharedStrings.xml (every text value in the workbook). Whichever is largest is your target. A sheet XML over 5 MB usually means excess formatting. A pivotCache folder over 10 MB means orphaned pivot data. A media folder over 5 MB means uncompressed images. This takes thirty seconds and tells you exactly where the weight is.
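The same check can be scripted. Below is a minimal sketch using only Python's standard library; the function names `size_by_area` and `report` are made up for illustration, and the sketch assumes the workbook is an ordinary unencrypted .xlsx (which is already a zip, so no renaming is needed). It totals the uncompressed size of each area named above.

```python
import zipfile
from collections import defaultdict

# Areas of the package the diagnostic looks at; anything else is grouped as "other".
TARGETS = ("xl/worksheets/", "xl/media/", "xl/pivotCache/", "xl/sharedStrings.xml")


def size_by_area(workbook):
    """Return {area: uncompressed bytes} for an .xlsx path or file-like object."""
    totals = defaultdict(int)
    with zipfile.ZipFile(workbook) as zf:
        for info in zf.infolist():
            area = next((t for t in TARGETS if info.filename.startswith(t)), "other")
            totals[area] += info.file_size  # size after unzipping, not on disk
    return dict(totals)


def report(workbook):
    """Print the areas from largest to smallest."""
    ranked = sorted(size_by_area(workbook).items(), key=lambda kv: -kv[1])
    for area, size in ranked:
        print(f"{size / 1_048_576:8.2f} MB  {area}")
```

Calling `report("workbook_backup.xlsx")` on the backup copy prints one line per area. The largest line points at the fix to try first.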

Fix #1: Delete the Unused Rows and Columns Beyond Your Data

This is the most common cause of bloat by a wide margin. When you highlight a whole column to apply a color, Excel doesn't just paint the cells with data; it paints all 1,048,576 cells in that column. The same trap catches you when you highlight whole rows. Hit Ctrl+End on a bloated workbook and you'll often see Excel jump to row 1,048,576 or column XFD, miles past where your actual data ends. That whole region is being tracked as formatted, and the XML for the sheet stores every one of those cells.

The fix takes three steps. Click the row header of the first empty row below your real data. Press Ctrl+Shift+Down to extend the selection to the bottom of the sheet. From the Home tab, click Clear > Clear All. Then do the same for columns: click the header of the first empty column to the right of your data, press Ctrl+Shift+Right, and Clear All.

Save and reopen the file. Hit Ctrl+End again: the marker should now land near your real data instead of in the empty quadrant. On a workbook with this problem you can see file size drop by 80% or more from this single fix.

One subtlety. If you have entire-row formatting that you actually want (alternating row colors across an Excel table, for example), don't clear all rows. Instead, convert the data to an Excel Table with Ctrl+T. Tables apply formatting dynamically to the data range only, so the format follows the data instead of being baked into a million cells. After converting, clear the leftover formatting from the cells outside the table.
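To confirm phantom formatting without scrolling through the sheet, you can count how many stored cells are actually empty. This is a rough sketch with hypothetical function names (`count_cells`, `scan_workbook`). It assumes the standard SpreadsheetML layout, where each stored cell is a `c` element and its content sits in a `v`, `f`, or `is` child.

```python
import xml.etree.ElementTree as ET
import zipfile

NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def count_cells(sheet_xml):
    """Return (cells with content, empty stored cells) for one worksheet XML stream."""
    filled = empty = 0
    for _, elem in ET.iterparse(sheet_xml, events=("end",)):
        if elem.tag == NS + "c":
            # Content means a value, a formula, or an inline string.
            if any(elem.find(NS + tag) is not None for tag in ("v", "f", "is")):
                filled += 1
            else:
                empty += 1  # stored only because it carries formatting
        elif elem.tag == NS + "row":
            elem.clear()  # release finished rows so huge sheets stay manageable
    return filled, empty


def scan_workbook(workbook):
    """Return {sheet part name: (filled, empty)} for every worksheet in the package."""
    results = {}
    with zipfile.ZipFile(workbook) as zf:
        for name in zf.namelist():
            if name.startswith("xl/worksheets/") and name.endswith(".xml"):
                with zf.open(name) as stream:
                    results[name] = count_cells(stream)
    return results
```

A sheet where the empty count dwarfs the filled count is a strong candidate for this fix.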

The Eight Fixes, Mapped to Symptoms

Sheet XML huge, no images

Excess formatting outside the data range. Use Clear All on the cells past your real data. Fixes #1 and #2.

Media folder fat

Uncompressed images. Select any picture > Picture Format > Compress Pictures > Email (96 ppi). Fix #3.

PivotCache enormous

Orphaned pivot caches. Refresh-on-open, delete unused pivots, save. Fix #4.

File feels slow + huge

Switch from .xlsx to .xlsb binary format. Same data, half the size, faster open. Fix #5.

Conditional rules everywhere

Conditional format duplication. Manage Rules > clear and re-apply once. Fix #6.

Vendor file with 5,000 names

Inherited defined names and macros. Name Manager > Filter > Delete. Fix #7.

Source data pasted in

Replace with Power Query or external connections. Fix #8.

Still huge after everything

Open the XML directly to find the orphan: sharedStrings.xml or a stray drawing.

Fix #2: Clear the Excess Formatting That Clear All Misses

Sometimes Clear All on blank cells doesn't fully solve it. The format is bound to the cell style rather than the cell, so the cells look empty but each one carries a style definition.

Excel has a built-in cleaner for this. Open File > Options > Add-Ins, in the Manage box at the bottom pick COM Add-ins, click Go, and check Inquire. The Inquire tab now appears in the ribbon. Click Clean Excess Cell Formatting, choose All sheets in this workbook, and let it run. It walks every cell, strips redundant style references, and rebuilds the format storage from scratch.

Inquire ships with Office Professional Plus editions and Microsoft 365 Apps for enterprise. If you can't find it, the manual equivalent works fine. Select your data range, open the Cell Styles gallery on the Home tab, right-click Normal, and pick Apply. That resets every selected cell to the Normal style without touching values or formulas. Then reapply your real formatting on top. You'll often see another 20-40% file size drop after this pass, especially on workbooks that have been edited by ten different people over the years.

Watch out for one trap. The Cell Styles gallery itself can bloat. Open it and you might find dozens of styles named Normal 2, Normal 3, Normal 4, all inherited from copy-paste between workbooks. Right-click each duplicate and pick Delete. Each style you delete shrinks the file's style storage. Keep only the styles you actually use.
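Style buildup can also be measured from outside Excel. The sketch below uses a hypothetical helper, `style_summary`, and assumes the usual xl/styles.xml layout: cell formats under `cellXfs` and named styles under `cellStyles`. It treats a name such as "Normal 2" as a copy only when the base name "Normal" also exists, which is a heuristic and not an official rule.

```python
import re
import xml.etree.ElementTree as ET
import zipfile

NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def style_summary(workbook):
    """Return (cell format count, named style count, names that look like copies)."""
    with zipfile.ZipFile(workbook) as zf:
        root = ET.fromstring(zf.read("xl/styles.xml"))
    cell_formats = root.findall(f"{NS}cellXfs/{NS}xf")
    names = [s.get("name", "") for s in root.findall(f"{NS}cellStyles/{NS}cellStyle")]
    known = set(names)
    # Heuristic: "<existing style name> <number>" is probably a pasted-in duplicate.
    copies = [
        n for n in names
        if (m := re.fullmatch(r"(.+) (\d+)", n)) and m.group(1) in known
    ]
    return len(cell_formats), len(names), copies
```

Thousands of cell formats, or a long list of suspected copies, suggests the cleanup in this section is worth running.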

Excess Formatting: Four Ways to Hunt It Down

Ctrl+End test

Press Ctrl+End on every sheet. If the cursor lands far past your real data, you have phantom formatting. Click the header of the first row below your data, press Ctrl+Shift+Down, then right-click the selected row headers and pick Delete. Repeat for columns with Ctrl+Shift+Right. Save, close, reopen, and run Ctrl+End again to confirm. The marker should now sit on your actual last cell.

Inquire add-in

File > Options > Add-Ins > COM Add-ins > Go > check Inquire > OK. The Inquire tab appears in the ribbon. Click Clean Excess Cell Formatting, pick All Sheets, save the file. Inquire walks every cell, drops redundant style links, and rebuilds the styles dictionary. Available in Office Professional Plus editions and Microsoft 365 Apps for enterprise.

Style reset

Open the Cell Styles gallery on the Home tab. Right-click Normal, pick Apply. This pushes every selected cell back to the Normal style. Then delete duplicate styles (Normal 2, Normal 3, etc.) by right-clicking them in the gallery and choosing Delete. The fewer styles, the smaller the styles XML.

Conditional rules

Home > Conditional Formatting > Manage Rules > set the dropdown to This Worksheet, and check each sheet in turn. If you see hundreds of rules, most are duplicates that piled up from drag-fill or paste-special. Delete them all and reapply your real rules from scratch. One rule per range, not one rule per cell, keeps the formatting XML small.

Fix #3: Compress or Replace the Images

Drag a phone photo into Excel and you've just added 4 or 5 MB to the file. The image gets stored at full resolution, even though the cell it sits in might only render a 200-pixel thumbnail. Excel has a one-click compressor built in. Click any image, switch to the Picture Format tab, and click Compress Pictures.

In the dialog, uncheck Apply only to this picture to compress every image in the workbook at once, choose Email (96 ppi) for maximum savings, and click OK. A workbook with twenty embedded photos can drop from 90 MB to 4 MB after this single click.

For better results, replace the source images before you paste them in. Resize them in any photo tool to about 800 pixels wide for visual content, or to whatever your sheet actually displays. Save them as .jpg for photos (smaller than PNG) or as .png for screenshots with text and crisp edges. Then paste.

Excel won't recompress, but it also won't bloat the file with data it can't display anyway. If you only need an image as a reference, consider linking instead of embedding: choose Insert > Pictures > This Device, select the file, then open the dropdown on the Insert button and pick Link to File. The file stays small and the image still appears, as long as the linked file stays where Excel can reach it.

One more trick. Logos, headers, and chart backgrounds often arrive as high-res PNGs with embedded transparency channels. Right-click the image, pick Save as Picture, save it locally, open it in any image editor, flatten the layers, and re-save as a lighter file. Then delete the original from Excel and re-insert the compressed version. You keep the visual and can lose as much as 95% of the bytes.
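To find which embedded images are oversized, you can read their pixel dimensions straight from the package. The sketch below handles PNG files only, reading width and height from the fixed-position IHDR header. The helper names are invented, and the 800-pixel default simply mirrors the guideline above. JPEGs would need a different parser and are skipped here.

```python
import struct
import zipfile

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data):
    """Return (width, height) from the first 24 bytes of a PNG, or None if not a PNG."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        return None
    return struct.unpack(">II", data[16:24])  # two big-endian 32-bit integers


def oversized_pngs(workbook, max_width=800):
    """List (part name, width, height, bytes) for embedded PNGs wider than max_width."""
    hits = []
    with zipfile.ZipFile(workbook) as zf:
        for info in zf.infolist():
            if info.filename.startswith("xl/media/"):
                with zf.open(info) as f:
                    size = png_size(f.read(24))
                if size and size[0] > max_width:
                    hits.append((info.filename, size[0], size[1], info.file_size))
    return hits
```

Anything the function returns is a candidate for resizing before re-inserting, or for Compress Pictures.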


Fix #4: Strip Pivot Caches and Refresh on Open

Pivot tables are powerful, but they pay for that power with size. Each pivot stores a full snapshot of its source data so it can re-aggregate without reading the source again. That snapshot, the pivot cache, lives inside the workbook. If you've built five pivots from the same source, you might have five copies of that data in the file.

The first fix is to share the cache. Build the first pivot, then create each additional pivot by copying the first one (Ctrl+C, Ctrl+V) and rearranging its fields. A copied pivot shares the original's cache, so all five pivots read from one snapshot instead of five.

The bigger win is dropping the cache from disk entirely. Right-click any pivot, pick PivotTable Options, switch to the Data tab. Uncheck Save source data with file. Check Refresh data when opening the file. Excel stops writing the cache to disk during save.

When you reopen the file, the cache gets rebuilt from the source, which can be either a sheet in the same workbook or an external connection. The file is smaller, the open time is a hair longer (you'll feel it on a million-row source, not on small data), and the cached data stays accurate because it's recreated fresh each session.

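Orphan caches are even worse. If you deleted a pivot table but its cache stayed behind, or a forgotten pivot survives on a hidden sheet, that data keeps riding along in the file. The Excel UI doesn't list caches directly, so start by unhiding every sheet and deleting the pivots you no longer need; a cache that no pivot table uses is normally dropped the next time you save. Caches also hoard items that no longer exist in the source. To purge those, hit Alt+F11 to open the VBA editor, press Ctrl+G to show the Immediate window, paste For Each pc In ActiveWorkbook.PivotCaches: pc.MissingItemsLimit = xlMissingItemsNone: Next pc, press Enter, then refresh the pivots and save the file. If you'd rather not touch VBA, the same setting lives in PivotTable Options > Data > Number of items to retain per field > None, and third-party add-ins such as ASAP Utilities may offer comparable cleanup commands. Either approach can recover megabytes from old workbooks.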
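You can check from outside Excel whether several caches hold the same source. The sketch below assumes each cache is described by an xl/pivotCache/pivotCacheDefinitionN.xml part whose `worksheetSource` element names the sheet, range, or table it reads. The function names are hypothetical. Caches fed by external connections are listed individually and never grouped.

```python
import xml.etree.ElementTree as ET
import zipfile
from collections import defaultdict

NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"
PREFIX = "xl/pivotCache/pivotCacheDefinition"


def caches_by_source(workbook):
    """Group pivot cache definition parts by the worksheet range or table they read."""
    groups = defaultdict(list)
    with zipfile.ZipFile(workbook) as zf:
        for name in zf.namelist():
            if name.startswith(PREFIX) and name.endswith(".xml"):
                root = ET.fromstring(zf.read(name))
                src = root.find(f"{NS}cacheSource/{NS}worksheetSource")
                if src is None:
                    key = ("non-worksheet source", name)  # unique, so never grouped
                else:
                    key = (src.get("sheet"), src.get("ref"), src.get("name"))
                groups[key].append(name)
    return dict(groups)


def duplicated_sources(workbook):
    """Return only the sources that are cached more than once."""
    return {k: v for k, v in caches_by_source(workbook).items() if len(v) > 1}
```

If `duplicated_sources` returns anything, those pivots are candidates for rebuilding on a shared cache.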

Quick Size-Reduction Checklist

Save a backup copy before making destructive changes.
Press Ctrl+End on every sheet to find phantom formatting.
Clear all formatting from rows and columns past your real data range.
Run the Inquire add-in's Clean Excess Cell Formatting tool.
Compress all images to Email (96 ppi) in Picture Format > Compress Pictures.
Open PivotTable Options > Data: turn off Save source data, turn on Refresh on open.
Save the file as .xlsb binary format and compare the file sizes.
Delete unused defined names in Formulas > Name Manager.
Replace pasted source data with Power Query or external connections.
Rename a copy to .zip and confirm xl/worksheets, xl/media, xl/pivotCache are all reasonable.

Fix #5: Save as .xlsb (Binary) Instead of .xlsx

The .xlsx format is XML wrapped in a zip: readable, portable, and slightly chunky. The .xlsb format is the binary equivalent. Same data, same formulas, same charts, but stored as raw binary instead of marked-up XML. On a typical bloated workbook, switching from .xlsx to .xlsb cuts file size by 50-70% and roughly halves open and save times. The data is identical and Excel reads both formats natively, with no add-in, no converter, and no quality loss.

The switch is one click. Open the file, File > Save As, change the dropdown to Excel Binary Workbook (*.xlsb), save. That's it. Every formula, chart, pivot, slicer, and macro carries over. The file still works in Excel for the web (with a couple of edge-case limitations), still works on Mac, still works in Excel mobile. The only thing it loses is human-readability: the parts inside the zip are binary, so you can't read them in a text editor anymore. For most teams that doesn't matter, and the size saving is worth it.

When to skip .xlsb. If your workbook is consumed by a programmatic pipeline that reads XML directly (a Python script using openpyxl, for example, or a Power Automate flow that parses the XLSX), the pipeline will likely fail on binary; openpyxl, for one, does not read .xlsb. Test before you switch. Some compliance environments also block binary spreadsheets because they can carry compiled macros. Check policy. Otherwise, default to .xlsb for any file you don't share with strangers.
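A pipeline can guard against the wrong format before it tries to parse anything. This sketch looks inside the package and ignores the file extension. It assumes a binary workbook stores its main part as xl/workbook.bin and an XML workbook as xl/workbook.xml. The function name `workbook_kind` and its return labels are invented for this example.

```python
import zipfile


def workbook_kind(workbook):
    """Return 'xlsb' (binary parts), 'xlsx-style' (XML parts), or 'unknown'."""
    if not zipfile.is_zipfile(workbook):
        return "unknown"  # for example a legacy .xls or a password-encrypted file
    with zipfile.ZipFile(workbook) as zf:
        names = set(zf.namelist())
    if "xl/workbook.bin" in names:
        return "xlsb"
    if "xl/workbook.xml" in names:
        return "xlsx-style"  # covers .xlsx and .xlsm
    return "unknown"
```

A script that only understands XML can call this first and fail with a clear message when it gets anything other than "xlsx-style".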

Fix #6: Trim Conditional Formatting and Data Validation

Conditional formatting is a frequent bloat source because rules duplicate. Drag-fill a cell with a conditional rule and Excel often copies the rule for every destination cell instead of extending the original rule's range. Do that a thousand times across a workbook and you've got 30,000 overlapping rules where you wanted 30.

Open Home > Conditional Formatting > Manage Rules, change the Show formatting rules for dropdown to This Worksheet, and scroll the list. Repeat for each sheet. If the count is wildly higher than the number of unique rules you intentionally created, the rules have duplicated. Clear them with Home > Conditional Formatting > Clear Rules > Clear Rules from Entire Sheet, then reapply each rule once with the correct range.

Data validation does the same thing. Each cell with a dropdown list stores a copy of the validation criteria. If you've copy-pasted dropdown cells across thousands of rows, each cell stores the rule separately. The lighter approach is to apply data validation to an Excel Table column once, then add rows; the validation extends automatically without duplicating its storage. For sheets that aren't tables, select the range, apply validation in one shot via Data > Data Validation, and avoid the cell-by-cell copy that bloats the file.
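Rule counts are easy to pull from the sheet XML when the Manage Rules list is too long to eyeball. The sketch below counts `cfRule` and `dataValidation` elements per worksheet. It assumes the standard SpreadsheetML element names and will miss newer rule types that Excel stores in extension lists. The function name `rule_counts` is hypothetical.

```python
import xml.etree.ElementTree as ET
import zipfile

NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def rule_counts(workbook):
    """Return {sheet part: (conditional format rules, data validation rules)}."""
    counts = {}
    with zipfile.ZipFile(workbook) as zf:
        for name in zf.namelist():
            if name.startswith("xl/worksheets/") and name.endswith(".xml"):
                cf = dv = 0
                with zf.open(name) as stream:
                    for _, elem in ET.iterparse(stream, events=("end",)):
                        if elem.tag == NS + "cfRule":
                            cf += 1
                        elif elem.tag == NS + "dataValidation":
                            dv += 1
                        elif elem.tag == NS + "row":
                            elem.clear()  # cell data is not needed for this count
                counts[name] = (cf, dv)
    return counts
```

A sheet reporting thousands of rules when you remember creating a handful is the duplication pattern described above.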

.xlsb Binary vs .xlsx Standard

Pros

  • File size typically 50-70% smaller for the same data
  • Open and save times roughly halved on large workbooks
  • Native support in every modern Excel, no converter needed
  • Formulas, charts, pivots, slicers, macros all carry over unchanged
  • Works in Excel for the web with minor limitations
  • Encryption and password protection still work normally

Cons

  • Not human-readable: you can't open the zip and inspect XML
  • Some third-party tools (openpyxl, certain ETL pipelines) don't parse it
  • Compliance environments sometimes block binary spreadsheets for macro risk
  • Slightly slower in Power Automate Excel actions versus .xlsx
  • Differential save (track every keystroke) less efficient than .xlsx
  • Recovery tools have weaker results on corrupted .xlsb than on .xlsx

Fix #7: Clean Out Defined Names and Macros You Don't Use

Defined names are tiny on their own, but they pile up. Copy a sheet from another workbook and Excel pulls every named range and named formula along with it, often invisibly. Open Formulas > Name Manager and you might see thousands of entries: _xlnm.Print_Area, _xlfn.UNIQUE, Some.Vendor.Constant.001, dozens of internal names you never created.

Some are needed; most aren't. Click the Filter dropdown at the top of Name Manager and pick Names with Errors. Those refer to ranges that no longer exist and are pure dead weight. Select all, click Delete. Then switch the filter to Defined Names and repeat for any entries you don't recognize. Each name you remove shrinks the file's name storage and speeds up formula recalculation.

Macros are a different beast. A VBA project gets stored in a binary blob inside the file, and the size depends on how much code is in there. If you've never written or imported a macro, your file should have no macro project. To confirm, press Alt+F11 and check that the project tree holds no modules with code, or look for xl/vbaProject.bin inside the renamed .zip copy.

If the file has a macro project you don't recognize (a common consequence of opening a file from a vendor or template), open the VBA editor with Alt+F11, find any module you didn't author, right-click, and pick Remove. Save as .xlsx (not .xlsm) to strip the macro project entirely. The file is smaller and the macro warning at open time goes away.
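Both checks in this section can be run against the package directly. The sketch below lists the defined names in xl/workbook.xml, flags those whose formula contains #REF!, and reports whether a VBA project part is present. It assumes the standard part names, and the function name `name_report` is made up for illustration.

```python
import xml.etree.ElementTree as ET
import zipfile

NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def name_report(workbook):
    """Return (all defined names, names containing #REF!, whether a VBA project exists)."""
    with zipfile.ZipFile(workbook) as zf:
        root = ET.fromstring(zf.read("xl/workbook.xml"))
        has_vba = "xl/vbaProject.bin" in zf.namelist()
    entries = root.findall(f"{NS}definedNames/{NS}definedName")
    all_names = [e.get("name") for e in entries]
    # A #REF! in the formula text means the target range no longer exists.
    broken = [e.get("name") for e in entries if "#REF!" in (e.text or "")]
    return all_names, broken, has_vba
```

Unlike Name Manager, this also shows names that a vendor template marked as hidden, which is often where the bulk of the leftovers sit.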

Fix #8: Use External Data Connections Instead of Embedded Source Data

If your workbook contains a million-row table that's a copy of a database query, you're carrying that data twice โ€” once in the source system and once inside the spreadsheet. The lighter pattern is to connect instead of copy.

Use Data > Get Data > From Database (or From SharePoint, From Web, From File) to pull data through Power Query at refresh time. Excel stores the query definition, not the rows. The pivot, the chart, and the formula references all work, but the actual data lives outside the workbook. File size drops dramatically and the data stays current.

For files that need to travel offline, the middle ground is to use Power Query but load to the data model instead of to a sheet. Get Data > query > Load To > Only Create Connection > Add this data to the Data Model. The data model compresses heavily โ€” a 50 MB source table can sit in the model in 3-5 MB thanks to columnar storage. Pivots and Power Pivot measures work against the model directly. The file stays small even with the full data inside.


Putting It All Together: A Workflow for the Bloated Workbook

Hit a 40 MB file and you can guess your way through fixes, but a systematic pass usually wins. Start with the diagnostic: rename to .zip and see which folder is biggest. If xl/worksheets dominates, run the Ctrl+End test and clear excess formatting first. If xl/media is bigger, compress images. If xl/pivotCache is the heaviest, attack pivot caches with refresh-on-open. After each fix, save the file, close it, reopen it, and check the new file size. Don't skip the close-and-reopen step, because Excel doesn't always reclaim space until the file is saved fresh.

Once the structural fixes are in, try the format switch. Save a copy as .xlsb and compare file sizes. If the workbook is small enough to share again, you're done. If it's still huge, the leftover bloat is almost always one of three things: a duplicated chart embedded as a picture, a deeply nested defined-name graph (vendor templates love these), or an unrecognized OLE object lurking on a hidden sheet. Open the zip, browse xl/embeddings/ and xl/drawings/, and you'll usually spot the orphan. Delete the source object inside Excel, save, and the bloat goes with it.

The maintenance habit that prevents future bloat is simple. Apply formatting to ranges, not to whole columns. Convert recurring datasets to Excel Tables so format follows data. Compress images when you insert them, not later. Refresh pivots from external connections rather than from pasted source data. Save as .xlsb for personal files, keep .xlsx only when you have a specific compatibility reason. None of these take extra time once you build the habit, and your future self will thank you when no workbook ever crosses 5 MB again.

Excel Questions and Answers

Why is my Excel file so large when it has so little data?

The most common cause is formatting applied to whole columns or rows instead of just the data range. Excel tracks formatting for every cell in the formatted region, even if those cells are empty, which can balloon a sheet's storage to tens of megabytes. Other common culprits are embedded uncompressed images, orphan pivot caches from deleted pivots, duplicate conditional formatting rules, and inherited defined names from vendor templates.

How do I reduce Excel file size without losing data?

Every recommended fix preserves your data. Clear formatting from blank cells past Ctrl+End, compress images via Picture Format > Compress Pictures, set pivot tables to refresh on open instead of saving the cache, save the file as .xlsb binary, and delete unused defined names in Name Manager. None of these touch the numbers, formulas, charts, or sheet structure you actually care about.

What is the .xlsb format and is it safe to use?

The .xlsb format is Excel Binary Workbook: the same data as .xlsx but stored in binary instead of XML inside the zip. It cuts file size by 50-70% on most workbooks and roughly halves open and save times. It supports every Excel feature including macros, charts, pivots, and slicers. It works in Excel for the web, Mac, and mobile. The only caveat is that some third-party tools and compliance environments prefer XML-based formats, so test before switching for shared workflows.

How do I compress images in Excel to reduce file size?

Click any image, go to the Picture Format tab, click Compress Pictures, uncheck Apply only to this picture, pick Email (96 ppi) for maximum savings, and click OK. Excel recompresses every image in the workbook at once. A file with twenty embedded photos can drop from 90 MB to under 5 MB from this single action.

Why does my Excel file get bigger after I delete data?

Deleting cell contents removes values but often leaves the formatting and style references behind. The cells look empty but Excel is still tracking formatting for them. Use Home > Clear > Clear All on the affected ranges, not just Delete, to drop both content and format together. After clearing, save and reopen the file. Excel rewrites the sheet XML during a fresh save, which is when the actual size reduction lands.

How do I find what is making my Excel file so large?

Save a copy of the workbook, rename it from .xlsx to .zip, and open it like a regular folder. Look at the sizes of xl/worksheets (excess formatting), xl/media (images), and xl/pivotCache (pivot caches). Whichever folder is largest tells you which fix to apply first. This diagnostic takes thirty seconds and is more accurate than guessing.

Can I reduce Excel file size in Excel for the web?

Excel for the web has fewer cleanup tools than the desktop app: there is no Inquire add-in and no Save As .xlsb option from the browser. You can still compress images, delete unused sheets, clear formatting from blank cells, and remove defined names in Name Manager. For deeper cleanup, download the file, run the desktop fixes, and re-upload.

Does saving as .xlsb break anything compared to .xlsx?

For everyday workbooks, no. .xlsb supports every feature .xlsx supports: formulas, charts, pivots, slicers, macros, conditional formatting, data validation, password protection. The only practical differences are reduced human-readability of the underlying file (binary instead of XML), slightly weaker recovery if the file corrupts, and reduced compatibility with a small set of older third-party tools that only read .xlsx.