Tuesday, January 20, 2026

I/O requests taking longer than 15 seconds to complete

When diagnosing storage or latency issues, one SQL Server message factors in more than many:

“SQL Server has encountered X occurrence(s) of I/O requests taking longer than 15 seconds to complete on file...”

Where X might be 1, 5 or 50, and it could list a file from any one of your databases. When you see this, the next good question is when did it happen and where.

But -- what if you have a lot of SQL Servers?   How can you find this in any of them quickly?   The goal in this post is simple: traverse your SQL Server error logs for this warning and return a daily count so you will see everywhere you need to investigate the warning.

I ran this code on a CMS (Central Management Server) to locate the warning from many SQL Servers with only one query:

DECLARE @LogNumber int = 0;
DECLARE @MaxLogNumber int;

IF OBJECT_ID('tempdb..#ErrorLogs') IS NOT NULL DROP TABLE #ErrorLogs;
IF OBJECT_ID('tempdb..#Hits') IS NOT NULL DROP TABLE #Hits;

CREATE TABLE #ErrorLogs
(
    ArchiveNumber int,
    LogDate datetime,
    LogSizeMB int
);

CREATE TABLE #Hits
(
    LogDate datetime,
    ProcessInfo nvarchar(50),
    [Text] nvarchar(max)
);

INSERT INTO #ErrorLogs
EXEC sys.xp_enumerrorlogs;

SELECT @MaxLogNumber = MAX(ArchiveNumber)
FROM #ErrorLogs;

WHILE @LogNumber <= @MaxLogNumber
BEGIN
    INSERT INTO #Hits
    EXEC sys.xp_readerrorlog
        @LogNumber,
        1,
        N'I/O requests taking longer than 15';

    SET @LogNumber += 1;
END;

SELECT
    CAST(LogDate AS date) AS LogDate,
    COUNT(*) AS OccurrenceCount
FROM #Hits
GROUP BY CAST(LogDate AS date)
ORDER BY LogDate;

DROP TABLE #Hits;
DROP TABLE #ErrorLogs;

Sample results:

Where we go from here is a much larger discussion.  I will follow up with another post, but for a quick pointer, I would begin here:

Diagnose the I/O Subsystem

  • The long I/O warning is an indicator of disk performance issues. This may not be a SQL Server problem. You should first investigate the underlying storage.
  • Check hardware health. Coordinate with your hardware or SAN Team to review the physical disk(s) and RAID configuration hosting the affected file (for example, GDB\tempdb mssql 6.ndf).  Review the Windows Event Viewer for disk, controller, or path-related errors.
  • Analyze performance counters. Use Windows Performance Monitor (Perfmon) to capture disk metrics for the affected volume. Focus on:
    • Avg. Disk Queue Length
    • Avg. Disk Read Time
    • Avg. Disk Write Time
  • Review latency using DMVs. Query sys.dm_io_virtual_file_stats to examine read and write latency at the database file level. This helps determine whether the problem is systemic or isolated to a specific file or workload.

More to Read

No comments:

Post a Comment