Thursday, June 25, 2026

SQL Server 2025 ZSTD LEVEL Isn't Exactly What You Think It Is

SQL Server 2025's new ZSTD backup compression is a real upgrade. It's also one of the more over-hyped ones. Ask the internet. You'll hear it shrinks your files 30-50%, halves your RTO, and even powers row and page compression. One of those is half-true, one is shaky, and one is just wrong. Here's the version backed by people who actually ran the backups.

First, the good news: there's nothing to 'turn on'. No configuration setting to enable and no restart. ZSTD is just a value you pass to the COMPRESSION option in your backup command like this:

BACKUP DATABASE YourDB
TO DISK = 'C:\MSSQL\Backup\YourDB.bak'
WITH COMPRESSION (ALGORITHM = ZSTD);   -- defaults to LEVEL = LOW

That one word -- ZSTD -- is the whole feature. But what does it do? How is it any different from the compression you were already using?

Three compression algos

That question has three possible answers, because SQL Server 2025 gives you three backup compression algorithms to choose from. The one real, unarguable win for ZSTD over the 2022 addition is that it needs no special hardware.

Algorithm Since Special hardware?
MS_XPRESS 2008 No (still the default)
Intel QAT 2022 Yes -- Intel QAT
ZSTD 2025 No

But picking ZSTD is only half the decision. Unlike MS_XPRESS, ZSTD lets you dial the effort with a LEVEL setting, and that's where the real choice lives. There are three, and if you don't name one you get LOW:

... WITH COMPRESSION (ALGORITHM = ZSTD, LEVEL = LOW);      -- the default
... WITH COMPRESSION (ALGORITHM = ZSTD, LEVEL = MEDIUM);
... WITH COMPRESSION (ALGORITHM = ZSTD, LEVEL = HIGH);

The instinct might be to read LOW, MEDIUM, HIGH as 'a little', 'some', and 'a lot' of compression, and then reach for HIGH -- but that's wrong. You have to know what the three level settings actually do to understand why.

Think of the LEVEL setting as more of a speed dial than a size dial.

Here's what those three levels really influence. Moving from LOW to HIGH always trades more CPU and backup time for additional compression, but on many real-world databases the extra space savings are surprisingly small. Think of the LEVEL setting as more of a speed dial than a size dial, and reaching for HIGH because it sounds better may not have the impact you're looking for.

Run the demo below against a real database and you'll watch it happen. On my ~2 GB test, ZSTD LOW finished about 38% faster than MS_XPRESS (roughly 3.0 versus 4.8 seconds) while producing a bak file almost exactly the same size. LOW already matches the old default and beats it on the clock. And here's the kicker: LOW actually finished faster than backing up with no-compression at all, because compression left so much less to write to disk.

In my test, MEDIUM and HIGH shaved well under 1% off the size, and HIGH took about 3x as long as LOW to do it. Triple the work to shave a rounding error. My maintenance windows are already too tight to spend that without a better gain.

The speed win travels. On a far larger ~389 GB real-world set, Anthony Nocentino clocked ZSTD LOW about 44% faster than MS_XPRESS -- but on his data the LOW file came out slightly larger than MS_XPRESS, where on mine it came out a touch smaller. That's the real takeaway. LOW's most consistent win is the clock, not the byte count. Which way the size tips depends on your data, so don't bank on LOW to beat MS_XPRESS on size. Bank on it to beat MS_XPRESS on time -- that part holds.

Level Speed vs MS_XPRESS Size vs MS_XPRESS Reach for it when
LOW (default) Faster About the same Almost always -- the sweet spot
MEDIUM Slower A hair smaller You want every last MB and can spare the time
HIGH Much slower Barely smaller than MEDIUM Archival / offsite, time truly doesn't matter

Prove it with your own numbers

Compression ratios and timings swing with your data, so the only numbers that matter are yours. Here's a self-contained demo that builds a ~2 GB lab database, backs it up five ways while timing each to the millisecond, and prints a scoreboard. Point it at a real dev database for even cleaner gaps. ZSTDdemo and C:\MSSQL\Backup\ are placeholders -- rename yours to suit.

1. Build and inflate a lab database

The trick to realistic numbers is mixing data that compresses with data that doesn't. Like a repetitive text column that squashes down nicely, plus a random binary column that won't budge -- that's the type of blend you'd see in real pages. This demo builds about 2 GB. Plan for ~5 GB of .bak files across the five runs, so be sure you've got enough space free on your backup drive.

USE master;
GO
CREATE DATABASE ZSTDdemo;
GO

ALTER DATABASE ZSTDdemo SET RECOVERY SIMPLE;
ALTER DATABASE ZSTDdemo MODIFY FILE (NAME = 'ZSTDdemo', SIZE = 3GB, FILEGROWTH = 512MB);
GO

USE ZSTDdemo;
GO

-- heap (no clustered key) so the bulk load can be minimally logged
CREATE TABLE dbo.Filler
(
    ID INT IDENTITY(1,1) NOT NULL,
    Padding CHAR(2000) NOT NULL,      -- repetitive -> compresses
    Noise VARBINARY(2000) NOT NULL    -- random     -> will not compress
);
GO

SET NOCOUNT ON;
DECLARE @batches INT = 100, @i INT = 0;   -- ~20 MB/batch; 100 ~ 2 GB. Bump for more.
WHILE @i < @batches
BEGIN
    INSERT dbo.Filler WITH (TABLOCK) (Padding, Noise)
    SELECT TOP (5000)
           REPLICATE(CONVERT(CHAR(50),
               CONCAT('cust-', a.object_id, '-region-', a.object_id % 7, '-')), 40),
           CRYPT_GEN_RANDOM(2000)
    FROM sys.all_objects a CROSS JOIN sys.all_objects b;

    SET @i += 1;
    IF @i % 20 = 0 CHECKPOINT;            -- keep the log small
END;
GO

2. Back it up five ways, timed, and read the scoreboard

Backup history only records start/finish to the second, which rounds sub-second backups to zero. So the demo times itself to the millisecond into a temp table and then joins to history for the sizes. Run this whole block in one window -- the temp table only lives for this session.

USE master;
SET NOCOUNT ON;

IF OBJECT_ID('tempdb..#t') IS NOT NULL DROP TABLE #t;
CREATE TABLE #t (Test sysname, StartedAt datetime2(3), EndedAt datetime2(3));
DECLARE @s datetime2(3);

SET @s = SYSDATETIME();
BACKUP DATABASE ZSTDdemo TO DISK = 'C:\MSSQL\Backup\ZSTDdemo_none.bak' 
  WITH INIT, FORMAT, NO_COMPRESSION, NAME = 'NONE';
INSERT #t VALUES ('NONE', @s, SYSDATETIME());

SET @s = SYSDATETIME();
BACKUP DATABASE ZSTDdemo TO DISK = 'C:\MSSQL\Backup\ZSTDdemo_xpress.bak'
  WITH INIT, FORMAT, COMPRESSION (ALGORITHM = MS_XPRESS), NAME = 'MS_XPRESS';
INSERT #t VALUES ('MS_XPRESS', @s, SYSDATETIME());

SET @s = SYSDATETIME();
BACKUP DATABASE ZSTDdemo TO DISK = 'C:\MSSQL\Backup\ZSTDdemo_zstd_low.bak'
  WITH INIT, FORMAT, COMPRESSION (ALGORITHM = ZSTD, LEVEL = LOW), NAME = 'ZSTD LOW';
INSERT #t VALUES ('ZSTD LOW', @s, SYSDATETIME());

SET @s = SYSDATETIME();
BACKUP DATABASE ZSTDdemo TO DISK = 'C:\MSSQL\Backup\ZSTDdemo_zstd_med.bak'
  WITH INIT, FORMAT, COMPRESSION (ALGORITHM = ZSTD, LEVEL = MEDIUM), NAME = 'ZSTD MEDIUM';
INSERT #t VALUES ('ZSTD MEDIUM', @s, SYSDATETIME());

SET @s = SYSDATETIME();
BACKUP DATABASE ZSTDdemo TO DISK = 'C:\MSSQL\Backup\ZSTDdemo_zstd_high.bak'
  WITH INIT, FORMAT, COMPRESSION (ALGORITHM = ZSTD, LEVEL = HIGH), NAME = 'ZSTD HIGH';
INSERT #t VALUES ('ZSTD HIGH', @s, SYSDATETIME());

-- scoreboard: timing from the harness, sizes from backup history
SELECT r.Test,
       DATEDIFF(MILLISECOND, r.StartedAt, r.EndedAt) Millis,
       CAST(bs.backup_size / 1048576.0 AS DECIMAL(10,1)) RawMB,
       CAST(bs.compressed_backup_size / 1048576.0 AS DECIMAL(10,1)) FileMB,
       CAST(bs.backup_size * 1.0
            / NULLIF(bs.compressed_backup_size, 0) AS DECIMAL(5,2)) Ratio
FROM #t AS r
CROSS APPLY (
    SELECT TOP (1) backup_size, compressed_backup_size
    FROM msdb.dbo.backupset
    WHERE database_name = 'ZSTDdemo' 
    AND name = r.Test
    ORDER BY backup_finish_date DESC
) AS bs
ORDER BY r.StartedAt;

IF OBJECT_ID('tempdb..#t') IS NOT NULL
DROP TABLE #t;

My scoreboard, on a ~2 GB database:

Read it like this. NONE is your uncompressed baseline, and MS_XPRESS is the old default you're trying to beat. Compare every ZSTD row against the MS_XPRESS row. FileMB answers 'is it smaller?' -- not by much -- and Millis answers 'is it faster?'. LOW and MEDIUM, yes, but what happened to HIGH? We have significantly increased duration for almost no shrinkage at all.

3. Confirm what actually landed in the file

Don't take it on faith -- RESTORE HEADERONLY reads the header without restoring, and the CompressionAlgorithm column is your receipt. Run this and be sure it says 'ZSTD'.

RESTORE HEADERONLY
FROM DISK = 'C:\MSSQL\Backup\ZSTDdemo_zstd_low.bak';

The fine print they skip

30-50% is a ceiling, not a promise. That figure is Microsoft's internal benchmark. Plenty of independent testers never hit it. Remember, this depends entirely on your data shape. Treat it as best-case, not as expected.

Encrypted data barely compresses -- with any algorithm. If you see a claim that ZSTD can find hidden patterns in column-encrypted data or TDE, ignore it. Encrypted bytes are high-entropy by design, and Microsoft's own docs are clear that compressing encrypted backups may not shrink them much at all. You should always compress before you encrypt anyway. If you compress after encryption, the storage savings will be negligible with too much overhead.

Faster restores are real, but secondary. ZSTD does decompress efficiently, and Nocentino measured restore improvements -- so it's a genuine perk. Just don't expect it to halve your RTO on its own. Restore time is dominated by I/O, not decompression.

This is backup compression, full stop. ZSTD in SQL Server 2025 applies to BACKUP. It does not change ROW or PAGE data compression, regardless of what you may read elsewhere.

Version requirements

ZSTD backup compression requires SQL Server 2025 (17.x) or later, any edition, Windows or Linux, no special hardware. It's T-SQL only -- there's no checkbox for it in SSMS and you don't run sp_configure. It's just in your BACKUP command like the examples above. Also important, Ola Hallengren's backup solution already supports ZSTD, so you're good there, if you're running that like so many of the rest of us. Thank you, Ola!

If you only remember one thing from this post, it should be that you should benchmark your own data. ZSTD's LEVEL changes both backup time and compression ratio, but on many real-world databases, the difference in backup time is far more than the difference in backup size.

More to Read

Backup Compression (SQL Server) -- Microsoft Learn
BACKUP (Transact-SQL) -- Microsoft Learn
Anthony Nocentino -- Using ZSTD Backup Compression
Simple Talk -- SQL Server 2025 ZSTD Backup Compression Guide

No comments:

Post a Comment