Topics

Saturday, April 4, 2026

Query Store Hints: Code-Free Performance Tuning

That query is killing your server. You know which one. You found the plan with sp_BlitzCache a couple weeks ago and spit out your coffee. But it belongs to a vendor application, and you cannot touch it.

Your SQL Server doesn't care about any of that.

And you do have an option. In SQL Server 2022 and later, you can 'fix' the query without touching it. No code changes. No procedure rewrites. No begging the business rep to contact the vendor. You just inject a hint through Query Store, and the next execution performs much better. That's Query Store hints — and if you're not using them, you've left a very powerful DBA tool sitting on the shelf.

This is Code-Free Performance Tuning.

Query Store: The Flight Data Recorder

If you read my PSP Optimization post, you already know that Query Store is the foundation under all of SQL Server's modern Intelligent Query Processing features. Think of it as the flight data recorder for your database — it captures query text, execution plans, and runtime statistics across every execution, and it survives a restart. Unlike DMVs, which vanish on restart, Query Store remembers. That history is what makes hints possible — and you don't have to set up anything new to perform diagnostics before you can use it. How cool is that? 😆

Query Store is on by default in SQL Server 2022. On SQL Server 2019 or earlier, turn it on with this:

ALTER DATABASE YourDatabase
SET QUERY_STORE = ON (
    OPERATION_MODE = READ_WRITE,
    MAX_STORAGE_SIZE_MB = 1024
);

What a Query Store Hint Actually Does

You pick a hint — MAXDOP, RECOMPILE, a join strategy, etc. You attach it to a specific query_id in Query Store. SQL Server intercepts the query at execution time and injects the hint before the optimizer sees it. The application sends the same T-SQL it always has, but the optimizer now processes it differently.

Step 1: Find the Query ID

Everything starts with the query_id. Query Store already assigned one to every query that's touched your database. Run this to find your worst offenders:

-- Top 20 queries by average logical reads in Query Store
SELECT TOP 20
    qsqt.query_sql_text,
    qsq.query_id,
    qsrs.avg_logical_io_reads,
    qsrs.avg_duration / 1000.0  AS avg_duration_ms,
    qsrs.count_executions,
    qsp.plan_id
FROM sys.query_store_query_text qsqt
JOIN sys.query_store_query qsq
    ON qsqt.query_text_id = qsq.query_text_id
JOIN sys.query_store_plan qsp
    ON qsq.query_id = qsp.query_id
JOIN sys.query_store_runtime_stats qsrs
    ON qsp.plan_id = qsrs.plan_id
ORDER BY qsrs.avg_logical_io_reads DESC;

Write down the query_id. That's your target.

Step 2: Inject the Hint

The system procedure is sys.sp_query_store_set_hints. Pass the query_id and whatever hint you'd normally write in an OPTION clause:

-- Query hammering CPU with bad parallelism? Cap it.
EXEC sys.sp_query_store_set_hints
    @query_id    = 1234,
    @query_hints = N'OPTION (MAXDOP 1)';
-- Bad cached plan? Force a recompile every time. Use carefully.
EXEC sys.sp_query_store_set_hints
    @query_id    = 1234,
    @query_hints = N'OPTION (RECOMPILE)';
-- Optimizer keeps choosing nested loops. Make it stop.
EXEC sys.sp_query_store_set_hints
    @query_id    = 1234,
    @query_hints = N'OPTION (HASH JOIN)';
-- Stack them.
EXEC sys.sp_query_store_set_hints
    @query_id    = 1234,
    @query_hints = N'OPTION (MAXDOP 4, HASH JOIN)';

Done. No code change. No restart. No cache flush. The hint fires on the next call.

Verify It Applied

SELECT
    query_hint_id,
    query_id,
    query_hint_text,
    last_query_hint_failure_reason_desc,
    is_enabled_by_system
FROM sys.query_store_query_hints
WHERE query_id = 1234;

Check last_query_hint_failure_reason_desc. If it's not NO_FAILURE, the hint didn't apply — and that column tells you exactly why. No guessing.

Remove It When You're Done

The vendor shipped a fix and the underlying problem is gone. Pull the hint:

EXEC sys.sp_query_store_clear_hints
    @query_id = 1234;

Why This Buries Plan Guides

Plan Guides have been around since SQL Server 2005 and allow you to optimize queries when you cannot change the code - but the implementation is hellish. A Plan Guide requires an exact text match: whitespace, parameter declarations, the sp_executesql wrapper, all of it. One extra space and the guide silently does nothing. You won't know. It just won't work.

Query Store Hints match on query_id. That's it. No text matching. Failures surface in a DMV instead of disappearing into the void.
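
For contrast, here's a sketch of the old Plan Guide approach to the same MAXDOP fix. The table and statement text here are hypothetical — the point is that @stmt must match the application's T-SQL byte-for-byte, which is exactly where plan guides fall apart:

```sql
-- Hypothetical example: forcing MAXDOP 1 the old Plan Guide way.
-- @stmt must match the application's query text EXACTLY -- one extra
-- space or a different parameter declaration and the guide is silently ignored.
EXEC sp_create_plan_guide
    @name            = N'Guide_CapParallelism',
    @stmt            = N'SELECT OrderID, Total FROM dbo.Orders WHERE CustomerID = @cust',
    @type            = N'SQL',
    @module_or_batch = NULL,
    @params          = N'@cust int',
    @hints           = N'OPTION (MAXDOP 1)';
```

Compare that to sp_query_store_set_hints, which only needs the query_id.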

                   Plan Guides                    Query Store Hints
Matching method    Exact T-SQL text match         query_id — no text matching
Silent failures    Yes — whitespace breaks them   No — failure reason surfaced in DMV
SSMS visibility    sys.plan_guides only           Integrated into QS interface
Minimum version    SQL Server 2005                SQL Server 2022

The Per-Query Compatibility Level Trick

This one is underused and underappreciated. You can use a Query Store hint to force a specific database compatibility level for a SINGLE query while leaving everything else at the current level. That means you can upgrade a database to a newer compatibility level for the new features and keep one problem query pinned to an older CL until you sort out why it always goes sideways.

-- Force compat level 150 for one query while the DB runs at 160
EXEC sys.sp_query_store_set_hints
    @query_id    = 1234,
    @query_hints = N'OPTION (USE HINT (''QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_150''))';

If you've ever delayed a compatibility level upgrade because some queries went south with the new CL, this is your exit ramp. Pin the problem query and upgrade the database CL. Fix it later when your schedule allows.

SQL Server 2025: The Nuclear Option

Query Store hints were introduced in SQL Server 2022 and significantly extended in SQL Server 2025. The biggest addition is ABORT_QUERY_EXECUTION: the query tries to run, SQL Server kills it, and the caller gets an error. That's it.

-- Block a query from executing entirely. SQL Server 2025+.
EXEC sys.sp_query_store_set_hints
    @query_id    = 1234,
    @query_hints = N'OPTION (USE HINT (''ABORT_QUERY_EXECUTION''))';

The caller gets:

Msg 8778, Level 16, State 1
Query execution has been aborted because the ABORT_QUERY_EXECUTION hint was specified.

Think about what that means. A report doing a full scan of a 500GB table every time someone hits a button, or a runaway ETL query that drags your server to its knees every Monday at 9AM. You can block it right now, in 30 seconds, without touching the application ...and now you've got time to have another discussion with whoever owns that code.

To unblock:

EXEC sys.sp_query_store_clear_hints @query_id = 1234;

SQL Server 2025: Hints on Secondary Replicas

Before SQL Server 2025, Query Store hints could only target the primary replica. In SQL Server 2025, you can also apply hints on your readable secondary replicas, completely independent of the primary.

-- Find your replica group ID first
SELECT * FROM sys.query_store_replicas;

-- Apply the hint only on that secondary
EXEC sys.sp_query_store_set_hints
    @query_id         = 1234,
    @query_hints      = N'OPTION (MAXDOP 2)',
    @replica_group_id = 1;  -- from sys.query_store_replicas

Maybe you've got all reporting going to your secondary, but it's behaving differently than the primary. Different statistics or data distribution? Now you can tune the secondary separately without touching the primary.

The Query Hint Recommendation Tool in SSMS v22

SSMS v22 ships with a Query Hint Recommendation tool that tests different hint combinations against your query and tells you which ones actually help. Brent Ozar put it through the wringer and the results were genuinely useful — but it does run the query multiple times and has a performance overhead of 3-5%. Don't point it at production at the wrong time of day.

One catch: it's not installed by default. Open the Visual Studio Installer, select Modify on your SSMS 22 installation, go to the Code Tools workload, and check 'Query Hint Recommendation Tool'. Once installed, you'll find it under Tools → Query Hint Recommendation Tool in SSMS. Highlight a SELECT query, click Start, and it runs it repeatedly with different hint combinations. Definitely worth installing, but remember, it will be executing your query multiple times. You want to be smart about when you run it.

The Bottom Line

Query Store hints are not a workaround. They're a legit, production-grade mechanism for fixing execution plans on code you don't own and cannot modify. Vendor apps, third-party components, reporting tools — find the query_id, apply the hint, done.

If you think your Query Store hints are being ignored or not working, or you need some help identifying what is killing your server, call me. Let's consider a SQL Server Health Check.

More to Read:

SQL Server 2022's Parameter Sensitive Plan Optimization — sqlfingers.com
Query Store Hints — Microsoft Learn
Query Store Hints Best Practices — Microsoft Learn
SSMS v22 Query Hint Recommendation Tool — Brent Ozar

Thursday, April 2, 2026

Your Nightly Index Rebuild Job May Be On Its Way Out

Microsoft's announcement of Automatic Index Compaction is titled 'Stop defragmenting and start living'. That is not an accident. Brent Ozar has been making the case for years that defragmenting indexes is largely maintenance theater — that external fragmentation barely matters on modern SSDs and shared storage and that nightly rebuild jobs hammer your transaction log and I/O for gains that are difficult to measure.

His sessions on the topic have been circulating for over a decade, and now Microsoft's own documentation states it plainly: 'For most workloads, a higher index fragmentation doesn't affect query performance or resource consumption.' I believe that may be Brent's argument almost verbatim in their official docs.

What Microsoft is shipping instead is Automatic Index Compaction — a continuous, low-overhead background process that keeps page density high as your data changes, without a scheduled job, without a maintenance window, and without the collateral damage of a full rebuild. Here is what it actually is, how it works, and what the honest limitations are.

What It Does

Index bloat happens because pages get partially empty as data changes — deletes leave gaps, updates move rows, and over time you're reading twice the pages for the same data. Automatic Index Compaction attacks that continuously in the background. It piggybacks on the PVS cleaner — the process that removes obsolete row versions from the persistent version store after DML. As the cleaner visits recently modified pages, it checks whether rows from those pages can be consolidated onto the current page. If that move frees at least one full page, it makes the move and deallocates the empty page. Fewer pages, higher page density, less I/O to read the same data.

The critical distinction from REORGANIZE and REBUILD is the scope. REORGANIZE and REBUILD process every page in an index. Automatic compaction only touches pages that have been recently modified. That is what makes the overhead minimal rather than punishing.

How to Enable It

One command, per database, no restart required. Compaction starts or stops within minutes:

-- Enable
ALTER DATABASE [YourDatabase]
SET AUTOMATIC_INDEX_COMPACTION = ON;

-- Disable
ALTER DATABASE [YourDatabase]
SET AUTOMATIC_INDEX_COMPACTION = OFF;

How It Compares to What You're Already Doing

Consideration                 REORGANIZE / REBUILD                                   Auto Compaction
Scope                         All pages in the index                                 Recently modified pages only
Overhead                      High — hammers I/O and log during maintenance window   Minimal — runs continuously in background
Reduces fragmentation         Yes                                                    No — improves page density, not fragmentation
Updates statistics            REBUILD does; REORGANIZE does not                      No
Requires scheduled job        Yes                                                    No
Space required in data files  REBUILD needs free space equal to index size           None
Blocking risk                 REBUILD can block; REORGANIZE is always online         Short-term millisecond locking only, rare

Note on fragmentation vs. page density. There are two different metrics in sys.dm_db_index_physical_stats: avg_fragmentation_in_percent and avg_page_space_used_in_percent. Compaction targets page density, not fragmentation. You may actually see fragmentation numbers go up when compaction is running. Microsoft says this is expected and, for most workloads, not a concern.
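
Those two metrics are easy to spot-check for a single hot table rather than the whole database. A quick sketch — dbo.Orders is a placeholder for your own table:

```sql
-- Spot-check page density vs. fragmentation for one table's indexes.
-- avg_page_space_used_in_percent is the number compaction should push up over time;
-- avg_fragmentation_in_percent may rise while compaction runs, which is expected.
SELECT
    i.name                              AS index_name,
    ips.page_count,
    ips.avg_page_space_used_in_percent  AS page_density_pct,
    ips.avg_fragmentation_in_percent    AS fragmentation_pct
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Orders'), NULL, NULL, 'SAMPLED') ips
JOIN sys.indexes i
    ON ips.object_id = i.object_id
   AND ips.index_id  = i.index_id
WHERE ips.alloc_unit_type_desc = 'IN_ROW_DATA';
```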

The Gotchas

It is not available for on-prem SQL Server 2025 yet.

Don't get too excited. The Microsoft Learn documentation for SQL Server 2025 index maintenance references automatic compaction as an alternative — but the feature itself currently applies only to Azure SQL Database, Azure SQL Managed Instance (with the Always-up-to-date update policy) and SQL database in Fabric. On-prem SQL Server 2025 is not in the 'Applies to' list. Watch this one. It is clearly on its way, but it is not here yet.

It maintains compactness better than it remediates bloat.

This is the limitation Jeff Moden surfaced in Brent's comment thread when the announcement dropped, and Dimitri Furman from Microsoft confirmed it. The process works on recently modified pages. If you have a badly bloated index today, compaction will improve it gradually over time as DML touches those pages — but how fast depends entirely on your workload patterns. A table with low DML activity may take a very long time to compact. If you have significant existing bloat, the guidance is to run a one-time REBUILD first to get page density up, then let auto compaction maintain it from that point forward.

Fill factor edge case.

Compaction never fills a page above the fill factor. But if DML has already pushed a page above the fill factor, compaction will not reduce it. Those pages stay as-is. For most shops running at the default fill factor of 100%, this is not a practical issue.
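
If you're not sure whether this edge case applies to you, it's easy to audit which indexes run at a non-default fill factor. A quick sketch — note that a fill_factor of 0 in sys.indexes means the server default, i.e. 100%:

```sql
-- Indexes using a non-default fill factor in the current database.
SELECT
    OBJECT_SCHEMA_NAME(i.object_id) AS schema_name,
    OBJECT_NAME(i.object_id)        AS table_name,
    i.name                          AS index_name,
    i.fill_factor
FROM sys.indexes i
WHERE i.fill_factor NOT IN (0, 100)   -- 0 means the server default (100%)
  AND OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
ORDER BY i.fill_factor;
```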

It does not update statistics.

If your current rebuild job is also your statistics update strategy — and in many shops it is — you will need a separate statistics update job to cover that gap. Auto compaction does not touch statistics.
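
The minimal replacement is a scheduled job that does nothing but statistics. A simple sketch, assuming a database and table name of your own (dbo.Orders is hypothetical) — sp_updatestats skips statistics with no row modifications since the last update:

```sql
-- Minimal nightly statistics maintenance, independent of any index rebuild.
USE YourDatabase;
EXEC sp_updatestats;

-- Or target a known-critical table with a full scan:
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;
```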

Transaction log impact.

Moving rows between pages generates log writes. For write-intensive workloads, you may see an increase in log I/O and larger transaction log backups. The Microsoft FAQ says this impact is not noticeable for most workloads, but I would still be sure to monitor after enabling.

It suspends under pressure.

Compaction is deprioritized when the PVS cleaner is under load. It suspends entirely when PVS size exceeds 150 GB or when aborted transactions exceed 1,000. In those conditions, PVS cleanup takes priority and compaction stops until the backlog clears.
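
You can check how close you are to those suspension thresholds with the Accelerated Database Recovery version store DMV (available since SQL Server 2019):

```sql
-- PVS size and aborted-transaction backlog vs. the compaction suspension
-- thresholds (suspends above ~150 GB PVS or ~1,000 aborted transactions).
SELECT
    DB_NAME(database_id)                          AS database_name,
    persistent_version_store_size_kb / 1048576.0  AS pvs_size_gb,
    current_aborted_transaction_count
FROM sys.dm_tran_persistent_version_store_stats;
```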

What it will not compact:

Not eligible
Heap tables
ROW_OVERFLOW_DATA and LOB_DATA allocation units
Columnstore compressed rowgroups
Memory-optimized tables
System tables (msdb is the sole exception)
Indexes with page locks disabled
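
That last exclusion is worth auditing, because disabling page locks was a common tweak years ago, and it silently opts an index out of compaction. Finding those indexes is straightforward:

```sql
-- Indexes that auto compaction will skip because page locks are disabled.
SELECT
    OBJECT_SCHEMA_NAME(i.object_id) AS schema_name,
    OBJECT_NAME(i.object_id)        AS table_name,
    i.name                          AS index_name
FROM sys.indexes i
WHERE i.allow_page_locks = 0
  AND OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1;
```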

Monitoring It

If you enable this in Azure today, the query below will give you page count, average page density, and average fragmentation across all eligible indexes — the three numbers to track before and after enabling compaction:

SELECT
    COALESCE(OBJECT_SCHEMA_NAME(ips.object_id), '<Total>') AS [schema_name],
    COALESCE(OBJECT_NAME(ips.object_id), '<Total>')        AS [object_name],
    COALESCE(i.name, '<Total>')                            AS [index_name],
    AVG(ips.avg_page_space_used_in_percent)                AS [avg_page_density],
    AVG(ips.avg_fragmentation_in_percent)                  AS [avg_fragmentation],
    SUM(ips.page_count)                                    AS [page_count]
FROM sys.dm_db_index_physical_stats(DB_ID(), DEFAULT, DEFAULT, DEFAULT, 'SAMPLED') ips
INNER JOIN sys.indexes AS i
    ON ips.object_id = i.object_id
    AND ips.index_id = i.index_id
WHERE i.type_desc IN ('CLUSTERED', 'NONCLUSTERED', 'XML', 'SPATIAL')
    AND ips.index_level = 0
    AND ips.page_count > 0
    AND ips.alloc_unit_type_desc = 'IN_ROW_DATA'
GROUP BY ROLLUP(ips.object_id, i.name, i.type_desc, ips.partition_number)
HAVING (ips.object_id IS NULL AND i.name IS NULL)
    OR (ips.object_id IS NOT NULL AND i.name IS NOT NULL)
ORDER BY IIF(ips.object_id IS NULL, 0, 1), page_count DESC;

Change 'SAMPLED' to 'DETAILED' for precise results — but be aware that DETAILED mode does a full scan, which can take a long time on large databases.

The Bottom Line

What it is not is a replacement for all index maintenance. Statistics still need to be maintained separately. Severely bloated indexes benefit from a one-time REBUILD first, and for workloads with heavy random deletes causing genuine size growth, a targeted rebuild on the worst offenders is still the right tool.

But the underlying message from Microsoft — that fragmentation percentages are less important than page density, and that nightly full-index rebuilds are often more overhead than benefit — is the same argument that has been circulating in this community for years. The feature is the engine finally making it operational rather than just philosophical.

On-prem shops: watch the Microsoft Learn docs for an update to the 'Applies to' line on the automatic compaction page. That is when the conversation gets practical for most of your environments.

More to Read:

Stop Defragmenting and Start Living: Introducing Auto Index Compaction — Microsoft Community Hub
Automatic Index Compaction (preview) — Microsoft Learn
Optimize Index Maintenance to Improve Query Performance — Microsoft Learn
Stop Worrying About SQL Server Index Fragmentation — Brent Ozar
Why Index Fragmentation Doesn't Matter — Brent Ozar

Wednesday, April 1, 2026

Azure Is Selling You Half a Core at Full SQL Server Price

The Azure migration pitch is convincing. Lower overhead, no hardware refresh cycles, pay for what you use. Management loves it. The CFO approved the budget. The project kicked off. And then — somewhere between the on-prem decommission and the first Azure invoice — somebody notices the SQL Server licensing line is larger than expected. Quite a bit larger.

This is not a billing error. It is a VM selection problem — and it is very easy to walk straight into if nobody flags it during the planning phase.

The Short Version

Azure VMs are sized in vCPUs — virtual CPUs. On most Azure VM series, one vCPU is one hyper-threaded logical core (hyper-threading is the common name for SMT — Simultaneous Multithreading), not one physical core. A physical core with hyper-threading enabled exposes two logical cores to the operating system. SQL Server sees both logical cores. SQL Server licensing counts both logical cores. But under any real CPU load, those two logical cores share the physical core's execution resources. You are paying for two licenses to run on what is, functionally, one core's worth of throughput.

On-premises, this distinction rarely bites you. VMware will preferentially schedule a VM's vCPUs across different physical cores rather than packing two threads onto the same core. You tend to get close to physical core performance per vCPU. Azure does not offer that guarantee. In a multi-tenant cloud environment, your VM gets the vCPUs you paid for — logical cores, fairly distributed — and that's it.

Joe Obbish documented this thoroughly last week over at Darling Data. The research is solid. Go read it. What I want to do here is translate what it means for you — and for your customers — before the migration truck shows up.

What the Numbers Actually Look Like

This is not a theoretical concern. The cost difference between VM families, expressed as effective SQL Server licensing cost per physical core, is significant enough to change a project budget.

VM Series    SMT        Monthly Cost   License / Physical Core
E16bds v5    Enabled    $5,892.56      $547.50
F16ams v7    Disabled   $5,962.64      $259.52

The monthly sticker price is nearly identical but the effective licensing cost per physical core is more than double. The workload that fits on a 16-physical-core on-prem box may need 32 vCPUs on a hyper-threaded Azure VM to deliver equal throughput — which means 32 licensed cores instead of 16. On SQL Server Enterprise Edition, that difference is not a rounding error. It is a budget conversation you do not want to have after the migration is done.

A commenter on Joe's post said it plainly: 'This seems intentional, honestly.' Whether it is or not, it doesn't change the math. What matters is that you know about it before your customer's first invoice, not after.

Why Nobody Catches This in the Planning Phase

The Azure sizing conversation usually goes like this: 'You're on a 16-core on-prem VM, so let's look at 16-vCPU Azure VMs.' The vCPU count matches. The memory looks right. The pricing seems reasonable. The migration plan gets approved. What the conversation missed was a question about whether those 16 vCPUs represent 16 physical cores or 8 physical cores with hyper-threading, and what that means for both performance and licensing.

The Azure documentation I've seen does not surface this loudly. The VM series pages describe vCPU counts, memory, and storage. The specific marketing language about 'full physical cores' only appears on the series that don't use SMT — i.e., the ones where the distinction is a selling point. On the series that do use hyper-threading, there is no equivalent language flagging it. You have to know to look.

There is also a natural assumption carried over from on-prem: hyper-threading doesn't affect SQL Server licensing on physical hardware. Microsoft's own licensing guide says so. That rule applies to physical servers. Azure VMs are a different licensing surface, and the assumption does not transfer.

Joe Obbish calls it a licensing trap. That is exactly the right word for it — because it only catches you if you don't see it coming.

What to Check While Planning Your Customer's Move to Azure

First, know exactly what you're moving. How many physical cores is the current SQL Server licensed for, and what is the actual CPU usage pattern? A lightly loaded instance at 10% CPU is a different migration conversation than one at 60–70% usage.

-- Current core count and SQL Server edition
SELECT
    cpu_count                          AS logical_cpus,
    socket_count,
    cores_per_socket,
    socket_count * cores_per_socket    AS physical_cores,
    SERVERPROPERTY('Edition')          AS edition
FROM sys.dm_os_sys_info;

If logical_cpus comes back as 16 and physical_cores as 8, you have 8 physical cores with hyper-threading enabled. That 8-physical-core workload moved to a hyper-threaded Azure VM will need 16 vCPUs to match throughput — which is 16 licensed cores. If your current licensing is based on 8 physical cores, this has just become a licensing change, not just a migration.

Second, check average and peak CPU utilization. Moving an instance that rarely exceeds 20% CPU is much less sensitive to the hyper-threading distinction than one running hot. High utilization is exactly when hyper-threaded logical cores start delivering only 50% of physical core performance, and exactly when the licensing math hurts most.

-- Recent CPU utilization from the ring buffer (last ~256 minutes)
SELECT TOP 30
    DATEADD(ms, -1 * (ts.ms_ticks - rb.[timestamp]),  GETDATE()) AS sample_time,
    rec.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') AS sql_cpu_pct,
    100 - rec.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int')   AS total_cpu_pct
FROM sys.dm_os_ring_buffers rb
CROSS JOIN sys.dm_os_sys_info ts
CROSS APPLY (SELECT CONVERT(xml, rb.record)) x(rec)
WHERE rb.ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR'
ORDER BY rb.[timestamp] DESC;

Third, before you commit to a VM series, review the docs and verify one thing: does this VM give me a full physical core per vCPU, or a hyper-threaded logical core? The answer determines how many SQL Server licenses you will need. Most Azure VM series are hyper-threaded — one physical core exposes two vCPUs, and SQL Server licenses both of them at full price. A small number of series ship without hyper-threading, where one vCPU genuinely equals one physical core. Microsoft only calls this out when it is true. The Famsv7 series explicitly says 'vCPU is mapped to a full physical core.' The Ebdsv5 says no such thing — because it doesn't apply. If the documentation does not say it, assume you are buying logical cores and license accordingly.

The Escape Routes, and Their Honest Trade-offs

There is no clean answer here. Each option has a cost:

Option                                  The Catch
Choose an SMT-disabled VM series        Limited families; e.g., no local temp storage on Famsv7
Disable hyper-threading inside the VM   Preview feature; BYOL only — doesn't help on Microsoft-licensed VMs
Use constrained vCPU sizes              No guarantee of physical core scaling
Right-size a lightly loaded instance    No help for heavy OLTP

The takeaway is not 'Don't go to Azure.' It's 'Know what you're buying before you commit'. A 16-vCPU hyper-threaded Azure VM is a fundamentally different licensing and performance proposition than a 16-physical-core on-prem server, and the migration plan should account for that difference explicitly.

Where This Shows Up in a Health Check

When I do a SQL Server health check on an instance being considered for cloud migration, the CPU utilization profile and current licensing baseline are part of the conversation. An instance running at 65% CPU on 8 physical cores does not migrate cleanly to a hyper-threaded 16-vCPU VM and come out with the same performance and license cost. The performance will likely be lower. The license cost will likely be higher. That's a finding, not a footnote.

If you are in the middle of an Azure migration conversation right now, it's worth a quick look at the target VM spec before the project sign off. The pricing difference between a hyper-threaded VM and a physical-core VM at the same vCPU count can easily exceed $200 per physical core per month on SQL Server Enterprise. At scale, and over a three-year Azure commitment, that math becomes real very fast.

Get the VM selection right before you move. It is a much easier conversation than explaining the bill afterward.

More to Read:

Hyper-threading on Azure VMs is a SQL Licensing Trap — Joe Obbish, Darling Data
Famsv7-series Azure VMs (no SMT) — Microsoft Learn
Constrained vCPU sizes for Azure VMs — Microsoft Learn
SQL Server Health Check — sqlfingers.com