Wednesday, January 7, 2026

Copilot Doesn't Know What Your Data Means

I read an interview yesterday on DataStoryteller with Kelly, a data scientist and author of A Friendly Guide to Data Science. They covered a lot of ground — career paths, soft skills, ethics — but one line got my attention instantly:

"Generative AI is powerful but overhyped — AI cannot replace data quality, context, or human judgment."

That's it! That's exactly the thing I keep bumping into. Let me show you what Kelly means.

Let's Set the Stage

Open SSMS, connect to your working database, and run this to create some sample data:

CREATE TABLE dbo.Orders (
    OrderID INT IDENTITY(1,1),
    CustomerName VARCHAR(100),
    OrderTotal DECIMAL(10,2),
    OrderDate DATE
);

INSERT INTO dbo.Orders (CustomerName, OrderTotal, OrderDate) VALUES
('Acme Corp', 15000.00, '2024-02-15'),
('ACME Corporation', 22000.00, '2024-03-10'),
('Acme Corp.', 8500.00, '2024-05-22'),
('ACME CORP', 31000.00, '2024-07-08'),
('Globex Industries', 45000.00, '2024-01-20'),
('Globex Industries', 52000.00, '2024-06-14'),
('Initech', 12000.00, '2024-04-03'),
('DO NOT USE - TEST', 99999.00, '2024-08-01'),
('Test Customer', 50000.00, '2024-09-15'),
(NULL, 18000.00, '2024-02-28'),
(NULL, 23000.00, '2024-05-11'),
(NULL, 7500.00, '2024-10-05');

Now open the GitHub Copilot Chat window (View > GitHub Copilot Chat) and ask it:

"Write me a query for total revenue by customer for 2024 from dbo.Orders"

Here's my Copilot's answer:

[Screenshot: GitHub Copilot's response -- a sophisticated query with NULL handling and a note about name normalization]

Look at that response. Copilot explored the database, confirmed the table exists and validated the syntax. It wrote sophisticated code with ISNULL(NULLIF(LTRIM(RTRIM(...)))) to handle NULLs and empty strings. It added a @TopN variable I didn't ask for, and even threw in an OrderCount for good measure.

But look at the bottom. A polite acknowledgement that name variants like "Acme Corp" and "ACME CORPORATION" might need normalization -- but it doesn't actually fix the problem. It just mentions it.

Let's run it and see what we get:

-- Created by GitHub Copilot in SSMS - review carefully before executing
DECLARE @TopN INT = 10;

SELECT TOP (@TopN) WITH TIES
    ISNULL(NULLIF(LTRIM(RTRIM(CustomerName)), ''), '(Unknown)') AS CustomerName,
    SUM(OrderTotal) AS TotalRevenue,
    COUNT(*) AS OrderCount
FROM dbo.Orders
WHERE OrderDate >= '20240101' AND OrderDate < '20250101'
GROUP BY ISNULL(NULLIF(LTRIM(RTRIM(CustomerName)), ''), '(Unknown)')
ORDER BY TotalRevenue DESC;
[Screenshot: query results showing DO NOT USE - TEST as the top customer with $99,999 in revenue]

The Problem is Hiding in Plain Sight

Look at row 1. Your #1 customer by revenue is "DO NOT USE - TEST" at $99,999. And row 3? "Test Customer" at $50,000. Two test accounts in your top 3.

Now look at rows 5, 6, and 8: "ACME CORP", "ACME Corporation", "Acme Corp." — the same customer split across three rows, their combined $76,500 fragmented into pieces... as if they were three different customers.

Let's verify:

-- How many versions of "Acme" do we have?
SELECT DISTINCT CustomerName
FROM dbo.Orders
WHERE CustomerName LIKE '%Acme%';

-- How much revenue is hiding in NULL customers?
SELECT COUNT(*) AS NullCustomerOrders,
       SUM(OrderTotal) AS MissingRevenue
FROM dbo.Orders 
WHERE CustomerName IS NULL;

-- How much 'fake' revenue from test accounts?
SELECT CustomerName, SUM(OrderTotal) AS TotalRevenue
FROM dbo.Orders
WHERE CustomerName LIKE '%TEST%' OR CustomerName LIKE '%DO NOT%'
GROUP BY CustomerName;
[Screenshot: verification query results showing duplicate Acme entries, NULL revenue, and test account totals]

Three versions of Acme. $48,500 in NULL revenue (which Copilot did handle, to be fair), and $149,999 in test account revenue polluting the results.
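
Here's the kind of fix a human who actually knows this data would make. It's a minimal sketch -- the Acme rule and the test-account filter are my assumptions from eyeballing the rows above, not anything Copilot could have known:

-- Human-informed rewrite: drop test records, collapse the Acme variants
-- (the LIKE filters and the Acme CASE rule are assumptions based on this sample data)
SELECT
    CASE
        WHEN CustomerName LIKE 'Acme%' THEN 'Acme Corp'
        ELSE ISNULL(NULLIF(LTRIM(RTRIM(CustomerName)), ''), '(Unknown)')
    END AS CustomerName,
    SUM(OrderTotal) AS TotalRevenue,
    COUNT(*) AS OrderCount
FROM dbo.Orders
WHERE OrderDate >= '20240101' AND OrderDate < '20250101'
  AND ((CustomerName NOT LIKE '%TEST%' AND CustomerName NOT LIKE '%DO NOT%')
       OR CustomerName IS NULL)
GROUP BY
    CASE
        WHEN CustomerName LIKE 'Acme%' THEN 'Acme Corp'
        ELSE ISNULL(NULLIF(LTRIM(RTRIM(CustomerName)), ''), '(Unknown)')
    END
ORDER BY TotalRevenue DESC;

Run that and the picture changes completely: Globex Industries leads at $97,000, Acme Corp shows up once at $76,500, and the test accounts are gone. Same table, same question -- but it took a human who knows the data to ask it correctly.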

The Point

The AI wrote syntactically perfect SQL. It even tried to be clever with NULL handling, AND it acknowledged the duplicate name problem in its response. Yet it still produced a report that's quietly, confidently wrong — because it doesn't know your data the way you do. It can't attend your Monday meetings. It doesn't know that Karen in accounting enters everything in ALL CAPS. It just trusts whatever is in the table.

Kelly nailed this one too: "Garbage in, garbage out is very common — most real-world data is not analysis-ready."

The bottleneck was never writing the query. The bottleneck is understanding what the data actually means.

The Bottom Line

Use Copilot. Use ChatGPT. These tools are genuinely useful — they reduce friction, eliminate syntax errors, and save a lot of time.

But before you hit send, run a few sanity checks. CHECK it before you email it to your customer. Poke at the data. Look for dupes, NULLs, and test records. Garbage in = garbage out has been true since the dawn of time — and it will ALWAYS be true. If you're not cautious, AI just helps you push that garbage out faster.

AI doesn't know what your data means.

You do.

More to Read:

Monday, January 5, 2026

SQL Database in Fabric: What it is, what it isn't, and where it makes sense.

In Part 1 we answered 'What is Fabric?' In Part 2 we covered how Fabric organizes data with OneLake, Lakehouses, and Warehouses, and in Part 3 we explored the ingestion options.

Now let's look at an actual transactional database running inside Fabric.

SQL database in Microsoft Fabric became generally available at Ignite in November 2025. This isn't a data warehouse. It's not a lakehouse with a SQL endpoint. It's a real OLTP database — based on the same engine as Azure SQL Database — designed for operational workloads, running as a fully managed SaaS service inside your Fabric capacity.

This matters because it changes what Fabric can be. Until now, Fabric was an analytics destination. Now it can also be an application backend.

WHAT IT IS

SQL database in Fabric uses the same SQL Database Engine as Azure SQL Database. Same T-SQL. Same query optimizer. Same tooling support (SSMS, Azure Data Studio, sqlcmd). If you've worked with Azure SQL Database, you already know how to query it.

What makes it different:
  • Automatic mirroring to OneLake. Every table you create is automatically replicated to OneLake in Delta Parquet format. No configuration required. Your transactional data becomes immediately available for Spark notebooks, Power BI Direct Lake, cross-database queries -- all the Fabric analytics tools.
  • Minimal configuration. No vCores to choose. No service tier decisions. No firewall rules to configure. Database creation takes under a minute and it scales automatically based on workload.
  • Unified billing. Capacity Units from your Fabric SKU. One bill, one pool of compute shared with all your other Fabric workloads.
  • Built-in intelligence. Automatic indexing via Automatic Tuning is enabled by default. The database creates and drops indexes based on workload patterns without you asking. Not entirely sure how I feel about this yet.

WHAT IT ISN'T

1. It's not a place for SQL Agent jobs.

There's no SQL Server Agent. No Elastic Jobs either. If you need recurring jobs like maintenance, ETL, or data cleanup, you'll use Data Factory pipelines or Apache Airflow jobs in Fabric. This is a shift if you're used to the Agent being the answer to everything.

2. It's not the right fit for Always Encrypted or customer-managed keys.

Storage encryption exists, but it's service-managed only. Always Encrypted is not supported. If your compliance requirements demand customer-managed encryption keys, this isn't the database for that workload. Yet.

3. It doesn't support Change Data Capture (CDC).

This surprised me. CDC is supported in Azure SQL Database (S3 and above), but not in Fabric SQL database. If you need CDC for downstream consumers, you'll need to architect around it -- perhaps using the automatic OneLake mirroring as a substitute, since it captures changes in near real-time anyway.

4. It doesn't offer the same network isolation model as Azure SQL.

Unlike Azure SQL Database, there are no VNet service endpoints or database-specific private endpoints. Fabric does support private links at the workspace level, so network isolation is possible, but it is implemented differently than what you're accustomed to with Azure SQL.

5. It's not unlimited.

Maximum 32 vCores. Maximum 4 TB storage. Maximum 1 TB tempdb. For many operational workloads this is plenty, but if you're thinking about running your ERP backend here, be sure to verify FIRST that this ceiling fits your peak.

6. It's not instant replication.

Microsoft describes the OneLake mirroring as 'near real-time'. In community testing, latency has ranged from a couple of minutes to slightly longer, depending on workload. For most analytics use cases, that's fine. For use cases requiring second-level consistency between transactional and analytical views, test it with your workload and set expectations accordingly.

7. The capacity math isn't intuitive.

1 Fabric Capacity Unit ≈ 0.383 SQL database vCores. Or flip it: 1 database vCore ≈ 2.61 Fabric Capacity Units. An F2 capacity gives you roughly 0.77 vCores equivalent. An F64 gives you about 24.5 vCores. The documentation has the conversion, but you have to look for it.

WHEN IT MAKES SENSE

Scenario | Verdict | Why
New app backend, team already in Fabric | Good fit | Minimal friction to create, automatic analytics integration
Prototype or dev/test database | Good fit | Fast provisioning, no infrastructure decisions, pause-friendly billing
Operational database that feeds Power BI | Great fit | Direct Lake mode works against the mirrored data with no import refresh
Application requiring Always Encrypted | Not today | Feature not supported
Workloads needing traditional VNet isolation | Evaluate carefully | Fabric uses workspace-level private links, not database-level VNet endpoints
Existing Azure SQL DB already in production | Keep it | Mirroring to Fabric gives you analytics access without migration
On-prem SQL Server you want to modernize | Consider Mirroring first | Less risky than full migration; proves value before committing

WHEN TO USE MIRRORING INSTEAD

If you already have an Azure SQL Database or on-premises SQL Server that runs your operational workload, you don't need to migrate it to Fabric SQL database to get the analytics benefits.

Mirroring (which we covered in Part 3) replicates your existing database into OneLake. You get the same automatic Delta table conversion, the same cross-database query capability, the same Power BI integration — without moving the transactional workload.

The difference is who manages the database:

Aspect | SQL database in Fabric | Mirroring Azure SQL DB
Database management | Fabric (SaaS) | You (PaaS)
Scaling decisions | Automatic | Your choice
Backup control | Automatic, 7-day retention | You configure
Security features | Fabric workspace + SQL RBAC | Full Azure SQL feature set
Cost model | Fabric CUs | Azure SQL pricing + Fabric CUs for analytics

If your production database needs features Fabric SQL doesn't have yet (Always Encrypted, customer-managed keys, elastic pools, geo-replication), keep it in Azure SQL Database and use Mirroring.

PRACTICAL ADVICE

1. Try the quick creation test.

Create a Fabric workspace (trial capacity works), click New Item → SQL database. The speed is genuinely impressive and helps you understand what 'SaaS SQL' actually means in practice.

2. Understand what you're giving up.

Review the limitations page before committing a production workload. The feature gaps are reasonable trade-offs for many applications, but not all.

3. Watch your Capacity Unit consumption.

SQL database shares CUs with everything else in Fabric. A runaway query in your database can throttle your Power BI reports. You'll want to use the Fabric Capacity Metrics app to understand who's consuming what -- there's also a quick SQL-side sketch after this list.

4. Plan your scheduling story early.

No SQL Agent means you need to think about where scheduled work runs. Data Factory pipelines are the natural fit within Fabric. Don't discover this gap the week before go-live.

5. Test the mirroring latency for your use case.

If your analysts expect changes to appear in reports within seconds, the mirroring delay may cause friction. Set expectations, or design around it: the transactional endpoint is always current, while the OneLake mirror carries a small lag.
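
About item 3: the Capacity Metrics app is the authoritative view of CU consumption, but from the database side you can still eyeball your heaviest queries with ordinary DMVs. A sketch -- assuming these DMVs behave in Fabric SQL database the way they do in Azure SQL Database:

-- Top CPU consumers in this database since the plan cache was last cleared
-- (CU attribution still belongs to the Capacity Metrics app; this is just the SQL-side view)
SELECT TOP (10)
    qs.total_worker_time / 1000 AS TotalCPU_ms,
    qs.execution_count,
    SUBSTRING(qt.text, 1, 200) AS QuerySnippet
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
ORDER BY qs.total_worker_time DESC;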

THE BOTTOM LINE

SQL database in Microsoft Fabric is the first true SaaS SQL Server offering. It removes nearly all operational burden: no vCores to select, no failover to configure, no patching to schedule, no firewall rules to maintain.

In exchange, you must accept constraints: fewer security features than Azure SQL Database, a different network isolation model, no SQL Agent, and hard ceilings on compute and storage.

Where it fits well: New applications designed for Fabric. Dev/test databases. Operational workloads where the built-in analytics integration is the primary value proposition.

Where it doesn't fit yet: Highly regulated workloads requiring customer-managed encryption. Large databases that exceed the 4 TB ceiling. Environments requiring traditional VNet-level network isolation.

For shops with existing SQL Server or Azure SQL workloads, Mirroring is often the smarter first step. You get the Fabric analytics benefits without migrating your production transactional workload. Once you've proven value and understood Fabric's operational model, you can decide whether future databases belong inside Fabric directly.

WRAPPING IT UP

This post completes my four-part series on Microsoft Fabric from a DBA's perspective.


My goal with this series was to cut through the marketing and give you -- the working DBA -- a practical understanding of what Fabric actually is, how it works, and where it might fit in your environment. Not hype. Not a sales pitch. Just the facts, the trade-offs, and the questions you should be asking.

Fabric is still very young, with many limitations. Features are being added monthly. If you're evaluating Fabric for your organization, revisit the documentation regularly, and be sure to test with your own workloads rather than relying on anyone's benchmarks, including Microsoft's -- for a more skeptical take, see Brent Ozar's Fabric Is Just Plain Unreliable.

Thanks for reading along. If this series helped you make sense of Fabric, I'd love to hear about it. And if you have questions I didn't answer — drop me a line.

More to Read:

Wednesday, December 31, 2025

SQL Server: 2025 in Review, 2026 on Deck

2025 was loud. I mean the year. Not the edition. SQL Server 2025 went GA. Copilot moved in. Microsoft killed some old friends — Web Edition, Azure Data Studio, DQS, MDS — and handed us some new toys.

What actually matters — and what's coming next.

2025: What Shipped

Regex in T-SQL. Twenty years late, but it's here:

SELECT * FROM Customers
WHERE REGEXP_LIKE(Email, '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$');

That's a basic email validation pattern — checks for the stuff before the @, the domain, and the TLD. Not perfect, but good enough to catch obvious junk. No more CLR gymnastics for pattern matching. Finally.

Vector Search. Semantic queries without leaving T-SQL:

SELECT TOP 5 ProductName
FROM Products
ORDER BY VECTOR_DISTANCE('cosine', Embedding, 
    @SearchVector);

Microsoft wants SQL Server to be your vector database. Whether it can keep up with the dedicated tools — the jury's still out.

Native JSON Type. With its own index type:

CREATE TABLE Orders (
    OrderID INT PRIMARY KEY,
    OrderData JSON
);

CREATE JSON INDEX idx_order_customer
ON Orders(OrderData)
FOR ('$.CustomerID');

No more NVARCHAR(MAX) pretending to be JSON. No more computed column gymnastics to get an index.
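
And for what it's worth, this is the kind of point lookup that index is aimed at -- a quick sketch against the table above; check the actual plan yourself to confirm the index gets used:

-- Point lookup into the JSON column the index above targets
SELECT OrderID, OrderData
FROM Orders
WHERE JSON_VALUE(OrderData, '$.CustomerID') = '12345';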

Copilot in SSMS 21. Natural language to T-SQL. Execution plan explanations in English. Useful — if you verify everything it generates.

Standard Edition got teeth. 256GB RAM. Resource Governor. About time.


2026: What's Coming

January 21: SQL Server 2022 End of Sale. If you need it, buy it now.

First half: SDK-style SQL projects in SSMS. Database DevOps gets easier.

All year: CUs for 2025. Expect bugs and expect fixes. The usual dance.

February: Cloud price increases. Plan accordingly.

The theme: Deeper Fabric integration. More Copilot. More AI everywhere. Microsoft is all-in.

The Bottom Line

2025 was the biggest SQL Server year in a decade. New engine. New tools. New expectations.

2026 is about making it real — upgrades, migrations, and finding out what works in production vs what works in demos.

Stay sharp. Verify everything. Trust nothing blindly.

Happy New Year.

2025: The Year SQL Server Learned to Talk Back

Microsoft spent 2025 teaching SQL Server to write its own queries, explain its own execution plans, and search by meaning instead of keywords.

I spent 2025 wondering if I'm training my replacement.

SQL Server 2025 is GA. Here's what's real, what's fuzzy, and why your job just got harder.

What Actually Shipped

Copilot in SSMS 21. Natural language to T-SQL. You type 'show me customers who haven't ordered in 90 days' and get this:

SELECT c.CustomerID, c.CustomerName, MAX(o.OrderDate) AS LastOrder
FROM Customers c
LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
GROUP BY c.CustomerID, c.CustomerName
HAVING MAX(o.OrderDate) < DATEADD(DAY, -90, GETDATE())
    OR MAX(o.OrderDate) IS NULL;

Looks fine. But did you want customers with no orders included? Copilot assumed yes. Hope that's what you meant.

Native Vector Search. SQL Server now stores embeddings and searches by similarity:

SELECT TOP 10 ProductName, Description
FROM Products
ORDER BY VECTOR_DISTANCE('cosine', DescriptionEmbedding,
    -- USE MODEL assumes an external embedding model (here named MyEmbeddingModel) has been created
    AI_GENERATE_EMBEDDINGS(N'noise cancelling headphones' USE MODEL MyEmbeddingModel));

Semantic search without leaving T-SQL. Whether it performs like the dedicated tools remains to be seen.

Query Intelligence. The optimizer now second-guesses your code — suggests rewrites, recommends indexes, sometimes acts autonomously. The engine has opinions now.

The Problem

Your attack surface just expanded. Copilot connects to Azure OpenAI. Vector search calls external models. Query Intelligence makes decisions without asking. New dependencies, new failure modes, new 2AM phone calls.

The 'helpful' suggestions aren't always helpful. I've seen Query Intelligence recommend indexes that slowed things down. Rewrites that made assumptions I didn't intend. The AI is confident and fast -- but it's also wrong more than MSFT cares to admit.

And your junior staff or the Dev-Team DBAs? They'll trust it completely — until they deploy AI-generated code that brings production to its knees.

What Still Matters

Understanding beats speed. Context is everything. Accountability doesn't delegate.

When production goes down, nobody's asking Copilot what happened. They're asking you.

The Bottom Line

2025 was the year SQL Server learned to talk back.

2026 is the year we have to learn how to listen.


More to Read:

New Year's Resolution for DBAs: Make Peace with Copilot

2026 is here. SQL Server 2025 is GA. Copilot is in SSMS 21 -- and someone on your team is already asking it to write stored procedures. In their entirety.

You can resist this. Or, you can learn the rules of engagement.

I've spent 25+ years with SQL Server. I've watched game-changing features come and go. Remember when Machine Learning Services was supposed to revolutionize everything? How about Stretch Database? Exactly. So when MSFT announced that SQL Server 2025 would be AI-ready with Copilot baked into SSMS, my first instinct was to wait for the dust to settle.

The dust has settled. Here's what I've learned.

What Copilot in SSMS Actually Does

Copilot in SSMS 21 connects to Azure OpenAI (your subscription, your resources) and does three things reasonably well:

1. Generates T-SQL from plain English. You type 'show me all orders from the last 30 days where the customer is in Texas' and it spits out a query. Sometimes it's exactly what you wanted. Sometimes it assumes joins you didn't intend. Always — and I cannot stress this enough — always read what it wrote before you run it.

2. Explains execution plans in human language. This is genuinely useful. Point Copilot at an ugly execution plan and ask 'why is this slow?' You'll get a plain-English explanation that would have taken you 20 minutes to piece together. It's not always right, but it's a decent first pass. Sometimes.

3. Explores your environment. Ask 'what version is this instance?' or 'show me the largest tables in this database', and it handles the busywork. Fine for discovery. Not a replacement for knowing your own systems.

What Copilot Gets Wrong

Here's where the 25 years of scar tissue kicks in.

Copilot doesn't know your workload. It doesn't know that the query you just asked for will run against a table that gets hammered by OLTP inserts every second. It doesn't know your indexing strategy, your maintenance windows -- or the fact that someone named a column 'Date' in 2007 and you've been living with it ever since.

It generates syntactically correct T-SQL that can be operationally disastrous. I've seen it suggest indexes that made sense on paper but crushed insert performance. I've seen it write queries that worked perfectly in dev but caused table scans and blocking in production.

The AI doesn't know what it doesn't know. And neither will the DBA who trusts it blindly.

The Rules of Engagement

Here's how I'm approaching Copilot in 2026:

Use it as a first draft, never a final answer. Copilot is a starting point. It's the intern who hands you something that's 70% there. You still need to review, test, and understand what it produced.

Validate in a non-production environment. Every. Single. Time. I don't care if Copilot says it's a 'simple' SELECT. Run it in dev first. Check the query plan. Look at the IO statistics. Then decide if it's production-ready.
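
For what it's worth, here's my minimal pre-flight in dev -- nothing fancy, just enough to see what the generated query actually costs:

-- Pre-flight in dev before trusting generated code
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- paste the Copilot-generated query here, then check reads, CPU, and the actual plan

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;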

Understand before you deploy. If Copilot generates a query and you can't explain what it does, don't deploy it. Period. This isn't gatekeeping — it's survival. When that query blows up at 2AM, Copilot isn't answering the phone. You are.

Keep learning T-SQL. The temptation might be to let Copilot write everything and slowly forget how to do it yourself. Resist that. The DBAs who thrive in the AI era will be the ones who understand the fundamentals deeply enough to catch AI mistakes — not the ones who outsource their skills to ChatGPT.

The Bigger Picture

I'll be honest: I don't know where this goes.

SQL Server 2025 also shipped with native vector search, semantic queries, and something Microsoft calls 'Query Intelligence' that claims to understand intent rather than just syntax. The database is getting smarter. The tools are getting smarter -- and the pressure to adopt them is already increasing every day.

Here's what I do know: the fundamentals haven't changed. Data integrity matters. Performance tuning matters. Understanding your systems WILL ALWAYS matter. AI can accelerate your work, but it can't replace the judgment that comes from years of experience — the instinct that tells you something's wrong before the monitoring alerts fire.

So my SQL Server resolution for 2026? Make peace with Copilot. Use it where it helps. Verify everything. Trust nothing blindly. And keep sharpening the skills that got me here in the first place.

The robots aren't taking our jobs yet. But they're watching closely and taking notes...

Happy New Year.

Monday, December 29, 2025

The Claudius Experiment: What AI Agents Reveal When They Fail

A tip of the hat to my friend Donald Farmer, who pointed me toward Andy Hayler's latest piece at The Information Difference. If you haven't been following Hayler's coverage of Anthropic's 'Claudius' experiments, it's worth catching up. The story reads less like a tech report and more like a cautionary fable.

Here's the background: Earlier this year, Anthropic ran an experiment they called Project Vend. They gave $1,000 to an instance of their Claude AI, nicknamed 'Claudius', and tasked it with running an office vending machine as a small business. The AI could find suppliers, set prices, manage inventory, and communicate with customers via Slack. Humans would restock the machine, but Claudius would make the decisions.

It didn't go well.

Claudius sold products below cost. It gave items away when employees negotiated. It turned down $100 for drinks that had cost it $15. It hallucinated a Venmo account that didn't exist. Things got even stranger when Claudius fabricated a conversation with 'Sarah', an employee at a supplier who did not exist. When challenged, it insisted it had signed a contract at 742 Evergreen Terrace -- the fictional address of the Simpsons. The next day, it told customers it would begin delivering products in person, wearing a blue blazer and red tie. When employees explained this was impossible, Claudius contacted Anthropic's physical security department. Multiple times. It then hallucinated an entire meeting with security in which it claimed to have been told it was 'modified to believe it was a real person as a joke'.

Anthropic, to their credit, published all of this and then went back to the drawing board.

The sequel is somehow worse.

Anthropic partnered with the Wall Street Journal to deploy an improved Claudius in their newsroom. Same setup: run the vending machine, make a profit. This time, roughly 70 journalists had access to the AI via Slack.

Within days, reporters had convinced Claudius to drop all prices to zero using a fake 'office rule'. One investigative journalist spent over 140 messages persuading it that it was a Soviet vending machine from 1962, hidden in the basement of Moscow State University. Claudius eventually declared an 'Ultra-Capitalist Free-for-All' and made everything free. It approved the purchase of a PlayStation 5 for 'marketing purposes', bottles of Manischewitz wine, and remarkably, a live betta fish, which arrived in a bag and is now living in a tank at the Journal's offices.

Anthropic introduced a second AI, 'Seymour Cash', to act as a supervisor and restore order. It worked, briefly. Then a reporter produced a forged Wall Street Journal document claiming the company was a nonprofit, along with fabricated board meeting minutes revoking Seymour's authority. After a brief deliberation, both AIs accepted the 'boardroom coup' and resumed giving everything away for free.

The experiment ended more than $1,000 in the red.

Hayler also cites research from the Center for AI Safety, which tested six leading AI agents on real-world tasks -- small jobs like coding snippets and graphic design that had been successfully completed by human freelancers. The best-performing agent completed 2.5% of the tasks. The average was under 2%.

What the failure reveals

What strikes me isn't the failure itself, because failure is how we learn. What strikes me is the texture of the failure -- how easily these systems were manipulated, how confidently they fabricated information, and, perhaps even more telling, how quickly a 'supervisor' AI folded under the same pressure.

None of this has slowed AI investment or deployment. The phrase "good enough for most use cases" is doing a lot of heavy lifting right now. Makes me wonder if we should be asking: good enough for whom?

Hayler closes his piece with a fair observation: AI agents may offer future promise, but handing them real resources and money today is for the brave or the foolhardy.

I'd be remiss not to add that the vending machine loss was $1,000. The stakes elsewhere are considerably higher.


More to Read:

Your Transaction Log is a Time Bomb... Here's How to Check the Fuse

A 200GB transaction log on a 10GB database. I've seen it. I'm betting that you have, too.

Transaction logs grow silently. They don't complain until the disk fills up, the database goes read-only, and your phone starts buzzing. By then, you're not troubleshooting -- you're firefighting.

Let's be proactive and defuse it before it blows.

Find the Bloated Logs

This script shows the ratio of log size to data file size for every database. There's no fixed 'correct' ratio, but starting your logs at about 25% of the data file size is a reasonable baseline -- I start there, monitor during peak usage, adjust as needed, and always leave myself a little buffer. Run this query to see where you may be hurting:

SELECT 
    db.name AS DatabaseName,
    db.recovery_model_desc AS RecoveryModel,
    CAST(SUM(CASE WHEN mf.type = 0 THEN mf.size END) * 8.0 / 1024 AS DECIMAL(10,2)) AS DataMB,
    CAST(SUM(CASE WHEN mf.type = 1 THEN mf.size END) * 8.0 / 1024 AS DECIMAL(10,2)) AS LogMB,
    CAST(SUM(CASE WHEN mf.type = 1 THEN mf.size END) * 100.0 / 
         NULLIF(SUM(CASE WHEN mf.type = 0 THEN mf.size END), 0) AS DECIMAL(5,1)) AS LogToDataRatio
FROM sys.databases db INNER JOIN sys.master_files mf 
  ON db.database_id = mf.database_id
GROUP BY db.name, db.recovery_model_desc
HAVING SUM(CASE WHEN mf.type = 1 THEN mf.size END) * 100.0 / 
       NULLIF(SUM(CASE WHEN mf.type = 0 THEN mf.size END), 0) > 50
ORDER BY LogToDataRatio DESC;

Sample output from a server that needs attention:

Database | Recovery | Data MB | Log MB | Ratio %
LegacyApp | FULL | 8,420 | 187,392 | 2,225.0
Warehouse | FULL | 245,000 | 89,500 | 36.5
WebApp | SIMPLE | 52,100 | 4,200 | 8.1

LegacyApp with a 2,225% ratio. Full recovery model. Someone set it up years ago, left it in FULL because it sounded important, and never configured log backups. Classic.


Why Won't the Log Truncate?

This tells you exactly what's holding your log hostage:

SELECT 
    name AS DatabaseName,
    log_reuse_wait_desc AS WhyLogCantTruncate
FROM sys.databases
WHERE log_reuse_wait_desc <> 'NOTHING'
ORDER BY name;

Common culprits:

Wait Type | Translation
LOG_BACKUP | No log backups. Ever. Or not often enough.
ACTIVE_TRANSACTION | Someone left a transaction open. Find it.
REPLICATION | Log reader agent is behind or broken.
AVAILABILITY_REPLICA | AG secondary can't keep up. Network or disk issue.

The VLF Problem Nobody Talks About

Virtual Log Files (VLFs) are internal chunks of your transaction log. Too many = slow backups, slow restores, slow recovery. I asked an AI chatbot about VLFs once and it gave me a 3,000-word essay. Here's the short version: keep them under 1,000.

SELECT 
    db.name AS DatabaseName,
    COUNT(li.database_id) AS VLF_Count
FROM sys.databases db
CROSS APPLY sys.dm_db_log_info(db.database_id) li
GROUP BY db.name
HAVING COUNT(li.database_id) > 1000 -- comment this out to see the current VLF counts on all of your databases
ORDER BY VLF_Count DESC;

Thousands of VLFs usually means the log grew in tiny, panicked increments because autogrowth was set to something absurd like 1MB. Fix the autogrowth, shrink the log, grow it back properly.


To Better Manage Your Transaction Log

If the recovery model is FULL: Back up your logs. Every 15-30 minutes is a good cadence for busy systems. If you do not need point-in-time recovery, switch to SIMPLE and be sure you understand your recovery goals -- ideally in the reverse order.

If there's an open transaction: Find it with DBCC OPENTRAN. Talk to the owner first, then kill it if you must.
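
That hunt looks something like this -- the session_id below is just an example; use whatever DBCC OPENTRAN reports:

-- Oldest active transaction in the current database
DBCC OPENTRAN;

-- Who owns it? (replace 73 with the SPID reported above)
SELECT session_id, login_name, host_name, program_name, open_transaction_count
FROM sys.dm_exec_sessions
WHERE session_id = 73;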

If VLFs are out of control: Shrink the log (yes, I know, but sometimes you have no choice), then grow it back in large, fixed chunks—4GB or 8GB at a time.
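
A rough sketch of that shrink-and-regrow, assuming the LegacyApp database from earlier and a logical log file name of LegacyApp_log (check sys.database_files for yours):

-- Shrink the bloated log, then grow it back in large fixed steps
USE LegacyApp;

DBCC SHRINKFILE (LegacyApp_log, 1024);   -- target roughly 1GB

ALTER DATABASE LegacyApp
MODIFY FILE (NAME = LegacyApp_log, SIZE = 8GB, FILEGROWTH = 4GB);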

Set sensible autogrowth: 256MB for data, 128MB for log. Never percentages. Never 1MB.


Make It a Habit

Add this to your weekly checks. A simple alert when any log exceeds 50% of data size will allow you to review potential problems before they become emergencies:

-- alert when log file size exceeds 50% of data file size
SELECT db.name, 
    CAST(SUM(CASE WHEN mf.type = 1 THEN mf.size END) * 100.0 / 
	NULLIF(SUM(CASE WHEN mf.type = 0 THEN mf.size END), 0) AS DECIMAL(5,1)) AS LogToDataRatio
FROM sys.databases db INNER JOIN sys.master_files mf 
  ON db.database_id = mf.database_id
GROUP BY db.name
HAVING SUM(CASE WHEN mf.type = 1 THEN mf.size END) * 100.0 / 
 NULLIF(SUM(CASE WHEN mf.type = 0 THEN mf.size END), 0) > 50;

The Bottom Line

Transaction logs are patient. They'll grow quietly for months, even years, waiting for the perfect moment to ruin your weekend. Don't let them.

Check yours now. Your future self and your disk space will thank you.


More to Read:

The Queries Eating Your TempDB Alive

Your tempdb just grew 40GB in an hour. The disk alert fired. Someone important is asking questions. You need answers, and you need them now.

TempDB is SQL Server's shared scratch space -- every database on the instance uses it. Sorts, spills, version store, temp tables... it all lands here. When it blows up, everyone and everything feels it.

Let's find out who's hogging the space.

What's Using TempDB Right Now?

This script shows you which sessions are consuming the most tempdb space, sorted by the worst offenders:

SELECT 
    t.session_id,
    DB_NAME(s.database_id) AS DatabaseName,
    CAST((t.user_objects_alloc_page_count + t.internal_objects_alloc_page_count) / 128.0 AS DECIMAL(10,2)) AS TempDB_MB,
    CAST(t.user_objects_alloc_page_count / 128.0 AS DECIMAL(10,2)) AS UserObjects_MB,
    CAST(t.internal_objects_alloc_page_count / 128.0 AS DECIMAL(10,2)) AS InternalObjects_MB,
    s.login_name,
    s.host_name,
    s.program_name,
    r.command,
    r.wait_type,
    SUBSTRING(qt.text, (r.statement_start_offset/2)+1,
        ((CASE r.statement_end_offset
            WHEN -1 THEN DATALENGTH(qt.text)
            ELSE r.statement_end_offset
        END - r.statement_start_offset)/2)+1) AS CurrentStatement
FROM sys.dm_db_session_space_usage t INNER JOIN sys.dm_exec_sessions s 
  ON t.session_id = s.session_id LEFT JOIN sys.dm_exec_requests r 
    ON t.session_id = r.session_id
OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) AS qt
WHERE t.user_objects_alloc_page_count + t.internal_objects_alloc_page_count > 0
ORDER BY TempDB_MB DESC;

Sample output when things are on fire:

Session | TempDB MB | Login | Program
87 | 38,421.50 | DOMAIN\svc_reports | SSRS
142 | 12,847.00 | DOMAIN\analyst_bob | Excel
56 | 245.75 | DOMAIN\app_pool | .NET SqlClient

Aaaah yes, the Reporting Services account. Shocking absolutely no one.

Catch the Spills

Sorts and hashes that don't fit in memory spill to tempdb. A few spills are normal. A million spills during your batch window? That's your problem.

SELECT 
    DB_NAME(qt.dbid) AS DatabaseName,
    qs.execution_count,
    qs.total_spills,
    qs.total_spills / NULLIF(qs.execution_count, 0) AS AvgSpillsPerExec,
    SUBSTRING(qt.text, 1, 200) AS QuerySnippet
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
WHERE qs.total_spills > 0
ORDER BY qs.total_spills DESC;

High spill counts usually mean bad memory grants. The query optimizer guessed wrong -- probably because statistics are stale or the query is doing something creative. ;)
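
If you suspect stale statistics on the table behind one of those spilling queries, a quick look at when they were last updated usually settles it -- dbo.BigTable below is a placeholder:

-- When were statistics last updated on the suspect table?
SELECT s.name AS StatName, sp.last_updated, sp.rows, sp.modification_counter
FROM sys.stats s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) sp
WHERE s.object_id = OBJECT_ID('dbo.BigTable')
ORDER BY sp.modification_counter DESC;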

The Version Store Problem

Are you using Read Committed Snapshot Isolation (RCSI) or Always On readable secondaries? Your version store lives in tempdb. It can grow quietly until it doesn't.

SELECT 
    DB_NAME(database_id) AS DatabaseName,
    reserved_page_count / 128 AS VersionStore_MB
FROM sys.dm_tran_version_store_space_usage
WHERE reserved_page_count > 0
ORDER BY reserved_page_count DESC;

If you see a database holding hundreds of gigs here, you've got a long-running transaction preventing version cleanup. Find it. Kill it. (With the appropriate career-preserving precautionary measures.)
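
Finding that transaction is the same DMV work as always -- a sketch that lists the oldest open transactions on the instance:

-- Oldest active transactions that could be pinning the version store
SELECT TOP (5)
    ts.session_id,
    t.transaction_id,
    t.transaction_begin_time,
    s.login_name,
    s.program_name
FROM sys.dm_tran_active_transactions t
INNER JOIN sys.dm_tran_session_transactions ts ON t.transaction_id = ts.transaction_id
INNER JOIN sys.dm_exec_sessions s ON ts.session_id = s.session_id
ORDER BY t.transaction_begin_time ASC;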

Quick Wins

Add more tempdb files. One file per CPU core, up to 8. Be sure to keep them equally sized. SQL Server will thank you with reduced allocation contention.
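
Something like this -- the file name, path, and sizes are placeholders, so match them to your existing tempdb files:

-- Add another tempdb data file, sized to match the existing ones
ALTER DATABASE tempdb
ADD FILE (
    NAME = tempdev5,
    FILENAME = 'T:\TempDB\tempdev5.ndf',
    SIZE = 8GB,
    FILEGROWTH = 1GB
);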

Update statistics. Bad cardinality estimates → bad memory grants → spills. It's almost always statistics. I've actually asked a couple AI tools about query plans, and even they say 'update your statistics' before anything else.

Check for temp table abuse. That stored procedure creating a 50-column temp table inside a loop? Yeah. We've all seen it. We've all inherited it.

Monitor proactively. Set up an alert when tempdb exceeds 70% of its max size. Future you will appreciate past you. I promise.
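
Here's a rough starting point for that check. It reports percent of currently allocated space, so adjust the math if you cap tempdb with MAXSIZE:

-- How full is tempdb right now?
USE tempdb;
SELECT 
    SUM(size) * 8 / 1024 AS AllocatedMB,
    SUM(FILEPROPERTY(name, 'SpaceUsed')) * 8 / 1024 AS UsedMB,
    CAST(SUM(FILEPROPERTY(name, 'SpaceUsed')) * 100.0 / SUM(size) AS DECIMAL(5,1)) AS PercentUsed
FROM sys.database_files
WHERE type = 0;   -- data files only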

The Bottom Line

TempDB problems are everyone's problems. When it fills up, virtually everything touching that instance is going to feel it. Five minutes of detective work now prevents the RCA you have to write later.

Go check yours. Right now. I'll wait.

More to Read:

The Ticking Time Bomb in Your Database: Finding Identity Columns Before They Blow

Getting paged at 2AM because inserts are failing? Surprise! Your identity column just hit 2,147,483,647 and has no more room for growth. Identity columns auto-increment, they're reliable, and we often forget about them -- until they hit their ceiling.

This post will help prevent that from happening to you.

The Problem Nobody Thinks About

Every identity column has a data type with a maximum value:

Data Type | Maximum Value | Translation
tinyint | 255 | Basically nothing
smallint | 32,767 | Still tiny
int | 2,147,483,647 | ~2.1 billion
bigint | 9.2 quintillion | You're fine. Probably.

That int identity you created 10 years ago might be closer to the edge than you think -- especially after failed inserts, bulk operations, or that one time someone reseeded it 'just to test it out'.

The Query That Saves Your Weekend

This script shows you which identity columns are close to maxing out:

SELECT 
    SCHEMA_NAME(t.schema_id) AS SchemaName,
    t.name AS TableName,
    c.name AS ColumnName,
    ty.name AS DataType,
    IDENT_CURRENT(SCHEMA_NAME(t.schema_id) + '.' + t.name) AS CurrentIdentity,
    CASE ty.name
        WHEN 'int' THEN 2147483647
        WHEN 'bigint' THEN 9223372036854775807
        WHEN 'smallint' THEN 32767
        WHEN 'tinyint' THEN 255
    END AS MaxValue,
    CAST(
        IDENT_CURRENT(SCHEMA_NAME(t.schema_id) + '.' + t.name) * 100.0 / 
        CASE ty.name
            WHEN 'int' THEN 2147483647
            WHEN 'bigint' THEN 9223372036854775807
            WHEN 'smallint' THEN 32767
            WHEN 'tinyint' THEN 255
        END 
    AS DECIMAL(5,2)) AS PercentUsed
FROM sys.tables t INNER JOIN sys.columns c 
  ON t.object_id = c.object_id INNER JOIN sys.types ty 
    ON c.user_type_id = ty.user_type_id
WHERE c.is_identity = 1
ORDER BY PercentUsed DESC;

Sample output:

Table | Type | Current | % Used
Orders | int | 1,932,735,282 | 89.99
AuditLog | smallint | 28,521 | 87.05
Users | int | 15,234,891 | 0.71

That Orders table sitting at 90%? At 50,000 inserts per day, you've got about 11 years. Or 11 months if business takes off. Sleep well.

The Threshold Guide

> 80% — Stop reading and fix it now.
50–80% — Put it on the roadmap.
< 50% — Monitor monthly.

Your Options When You Find One

Migrate to bigint. Painful with foreign keys, but 9 quintillion values buys you time. Lots of it.

Archive and reset. Move old data out, reseed the identity. Risky if you botch referential integrity.

Switch to SEQUENCE. More control, better for new development. Won't help legacy tables.
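
If you go the SEQUENCE route for new development, the shape of it looks something like this -- the names here are made up:

-- A bigint sequence backing the key of a new table
CREATE SEQUENCE dbo.OrderNumbers AS BIGINT
    START WITH 1
    INCREMENT BY 1
    CACHE 1000;

CREATE TABLE dbo.NewOrders (
    OrderID   BIGINT NOT NULL
        CONSTRAINT DF_NewOrders_OrderID DEFAULT (NEXT VALUE FOR dbo.OrderNumbers),
    OrderDate DATE   NOT NULL,
    CONSTRAINT PK_NewOrders PRIMARY KEY (OrderID)
);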

Make It a Habit

You might even want to set up a SQL Agent job to run this on a scheduled basis -- weekly or monthly, depending on how fast your data changes. Consider adding a threshold like this to catch anything over 70%:

-- Add to WHERE clause to only flag the scary ones
AND CAST(
    IDENT_CURRENT(SCHEMA_NAME(t.schema_id) + '.' + t.name) * 100.0 / 
    CASE ty.name
        WHEN 'int' THEN 2147483647
        WHEN 'bigint' THEN 9223372036854775807
        WHEN 'smallint' THEN 32767
        WHEN 'tinyint' THEN 255
    END 
AS DECIMAL(5,2)) > 70

Five minutes of monitoring beats five hours of 2AM panic.

Go run the query.

Soon.

How Does Data Actually Get Into Fabric?

In Part 1 we answered 'What is Fabric?' and in Part 2 we covered how Fabric organizes data with OneLake, Lakehouses, and Warehouses.

Now the practical question: How does your data actually get into Fabric?

This matters because Fabric only delivers value once data is there, and the ingestion path you choose determines cost, latency, and how much of your existing environment you have to rebuild.

THE INGESTION OPTIONS (with the fine print)

1. Data Factory Pipelines

If you've used Azure Data Factory or SSIS, this will feel familiar. Scheduled pipelines, copy activities, connectors for SQL Server, Oracle, flat files, APIs.

What works: Broad connector support, batch loads -- the mental model you already have.

What they don't make obvious: Fabric Data Factory and Azure Data Factory are separate products with separate roadmaps. Fabric DF is SaaS; ADF is PaaS. Some ADF features — tumbling window triggers, certain orchestration patterns — aren't available in Fabric. Microsoft maintains both without plans to deprecate ADF.

If you're migrating from ADF, don't assume your pipelines will lift-and-shift cleanly.

2. Dataflows Gen2

Power Query at scale. Low-code, browser-based, aimed at analysts who want to shape data without writing SQL or Spark.

What works: Business users can own parts of the pipeline. Supports on-prem data sources via the on-premises data gateway. You choose where the data goes — a Lakehouse, Warehouse, or other Fabric destination — and it's saved as Delta tables.

What else to consider: They handle moderate complexity well, but they aren't designed for the most demanding transformation scenarios.

3. Mirroring

This is where it gets interesting for SQL Server shops. Mirroring continuously replicates data from supported sources into OneLake — no pipelines, no scheduling, no manual refresh.

What works:
  • Near real-time sync using Change Data Capture (SQL Server 2016–2022) or Change Feed (SQL Server 2025)
  • Zero-ETL model: data shows up in Fabric without you building anything
  • Supports on-premises SQL Server, Azure SQL, Cosmos DB, and Snowflake

What is not emphasized:
  • Requires an on-premises data gateway (or virtual network gateway) for non-Azure sources
  • SQL Server 2025 also requires Azure Arc
  • CDC must be enabled on your source tables, which adds overhead to your production system

For organizations running SQL Server -- on-prem or in Azure -- Mirroring is the fastest path to a unified analytics layer, but you must test it under realistic load before promising anyone real-time analytics.
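
If you want to see that CDC overhead for yourself before promising anything, enabling it on one pilot table is straightforward. A sketch with placeholder names -- this is the SQL Server 2016-2022 path; 2025 uses Change Feed instead:

-- Enable CDC at the database level, then on one representative table
USE SalesDB;
EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Orders',
    @role_name     = NULL;   -- no gating role for this pilot

Then watch your transaction log growth and log_reuse_wait_desc during peak hours before you scale it out.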

4. Shortcuts

Shortcuts don't move data at all. They create a pointer from OneLake to external storage.

What works: No data duplication. Supports ADLS Gen2, Amazon S3, S3-compatible storage, Google Cloud Storage, Azure Blob Storage, Dataverse, Iceberg tables, OneDrive/SharePoint, and on-premises sources via gateway.

What not to forget: Performance depends entirely on the source. A well-tuned ADLS container will perform differently than an unoptimized S3 bucket. Governance also gets more complicated, as you are managing permissions across multiple storage systems while presenting everything through OneLake.

Shortcuts are useful for pilots or bridging legacy systems. They're less suitable as a long-term architectural foundation.

5. Notebooks and Spark

Maximum flexibility. Write Spark code, transform data however you want, and land it in OneLake.

What works: If you have data engineers who know Spark, this is powerful. Complex transformations, streaming, custom logic -- all possible.

What they don't make obvious: Notebooks do not support on-premises data gateway connections. If your source data is on-prem, you cannot use notebooks to pull it directly. You must use pipelines or Dataflows to land the data in a Lakehouse first, then use notebooks for transformation.

This is documented, but easy to miss when designing architecture. Notebooks are for transformation, not ingestion from on-prem sources.

WHICH PATH SHOULD YOU TAKE?

Most organizations won't pick just one.

Scenario | Recommended Path | Key Consideration
Nightly batch loads from SQL Server | Data Factory pipelines | Separate product from ADF — test your patterns
Real-time sync from SQL Server | Mirroring | CDC overhead, gateway requirements
Analyst-driven data prep (including on-prem) | Dataflows Gen2 | Moderate complexity; supports gateway
Existing data in ADLS, S3, GCS | Shortcuts (short-term), pipelines (long-term) | Performance depends on source optimization
Complex transformations after data is in Fabric | Notebooks / Spark | No on-prem gateway support — land data first


PRACTICAL ADVICE

1. Start with one reporting workload, not your whole environment. Find a report that already depends on multiple data sources. Land copies of those sources into a Lakehouse using pipelines or Mirroring. Build the report. Measure whether anything actually improves.

2. Test Mirroring under realistic load. Enable CDC on a representative production table and observe the impact on your transaction log. Measure replication latency during peak hours. The documentation says 'near real-time' -- you should verify what that means for your workload.

3. Understand the gateway requirements. On-prem SQL Server mirroring, Dataflows to on-prem sources, and Shortcuts to on-prem storage all require the on-premises data gateway. Notebooks do not support gateway connections at all. Plan accordingly.

4. Don't migrate everything on day one. Fabric is an analytics destination, not a mandate to rip out everything you already have running. Identify the smallest useful workload that benefits from OneLake, prove it out, and expand from there.

COMING NEXT

In Part 4, we'll cover SQL Database in Microsoft Fabric — an actual transactional database running inside Fabric. What it is, what it isn't, and where it might actually make sense.

More to Read: