Your tempdb just grew 40GB in an hour. The disk alert fired. Someone important is asking questions. You need answers, and you need them now.
TempDB is SQL Server's shared scratch space -- every database on the instance uses it. Sorts, spills, version store, temp tables... it all lands here. When it blows up, everyone and everything feels it.
Let's find out who's hogging the space.
What's Using TempDB Right Now?
This script shows you which sessions are consuming the most tempdb space, sorted by the worst offenders:
SELECT
t.session_id,
DB_NAME(s.database_id) AS DatabaseName,
CAST((t.user_objects_alloc_page_count + t.internal_objects_alloc_page_count) / 128.0 AS DECIMAL(10,2)) AS TempDB_MB,
CAST(t.user_objects_alloc_page_count / 128.0 AS DECIMAL(10,2)) AS UserObjects_MB,
CAST(t.internal_objects_alloc_page_count / 128.0 AS DECIMAL(10,2)) AS InternalObjects_MB,
s.login_name,
s.host_name,
s.program_name,
r.command,
r.wait_type,
SUBSTRING(qt.text, (r.statement_start_offset/2)+1,
((CASE r.statement_end_offset
WHEN -1 THEN DATALENGTH(qt.text)
ELSE r.statement_end_offset
END - r.statement_start_offset)/2)+1) AS CurrentStatement
FROM sys.dm_db_session_space_usage t INNER JOIN sys.dm_exec_sessions s
ON t.session_id = s.session_id LEFT JOIN sys.dm_exec_requests r
ON t.session_id = r.session_id
OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) AS qt
WHERE t.user_objects_alloc_page_count + t.internal_objects_alloc_page_count > 0
ORDER BY TempDB_MB DESC;
Sample output when things are on fire:
| Session | TempDB MB | Login | Program |
|---|---|---|---|
| 87 | 38,421.50 | DOMAIN\svc_reports | SSRS |
| 142 | 12,847.00 | DOMAIN\analyst_bob | Excel |
| 56 | 245.75 | DOMAIN\app_pool | .NET SqlClient |
Aaaah yes, the Reporting Service account. Shocking absolutely no one.
Catch the Spills
Sorts and hashes that don't fit in memory spill to tempdb. A few spills are normal. A million spills during your batch window? That's your problem.
SELECT
DB_NAME(qt.dbid) AS DatabaseName,
qs.execution_count,
qs.total_spills,
qs.total_spills / NULLIF(qs.execution_count, 0) AS AvgSpillsPerExec,
SUBSTRING(qt.text, 1, 200) AS QuerySnippet
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
WHERE qs.total_spills > 0
ORDER BY qs.total_spills DESC;
High spill counts usually mean bad memory grants. The query optimizer guessed wrong -- probably because statistics are stale or the query is doing something creative. ;)
The Version Store Problem
Are you using Read Committed Snapshot Isolation (RCSI) or AlwaysOn readable secondaries? Your version store lives in tempdb. It can grow quietly until it doesn't.
SELECT
DB_NAME(database_id) AS DatabaseName,
reserved_page_count / 128 AS VersionStore_MB
FROM sys.dm_tran_version_store_space_usage
WHERE reserved_page_count > 0
ORDER BY reserved_page_count DESC;
If you see a database holding hundreds of gigs here, you've got a long-running transaction preventing version cleanup. Find it. Kill it. (With the appropriate career-preserving precautionary measures.)
Quick Wins
Add more tempdb files. One file per CPU core, up to 8. Be sure to keep them equally sized. SQL Server will thank you with reduced allocation contention.
Update statistics. Bad cardinality estimates → bad memory grants → spills. It's almost always statistics. I've actually asked a couple AI tools about query plans, and even they say 'update your statistics' before anything else.
Check for temp table abuse. That stored procedure creating a 50-column temp table inside a loop? Yeah. We've all seen it. We've all inherited it.
Monitor proactively. Set up an alert when tempdb exceeds 70% of its max size. Future you will appreciate past you. I promise.
The Bottom Line
TempDB problems are everyone's problems. When it fills up, virtually everything touching that instance is going to feel it. Five minutes of detective work now prevents the RCA you have to write later.
Go check yours. Right now. I'll wait.
More to Read:
No comments:
Post a Comment