Tuesday, May 26, 2026

SIL — Because Holding 1,000 KEY Locks Is So 2022

Back in September I wrote about Optimized Locking in SQL Server 2025. Two components: Transaction ID (TID) locking and Lock After Qualification (LAQ). I told you the engine had stopped 'locking first and asking questions later'. That was true, but it was also only half the story.

Microsoft has since gone back and fine-tuned the feature, giving us two improvements that make a database more lock-free without you touching anything. The headliner is called Skip Index Locks (SIL), and the numbers are pretty good:

Measurement Result
Key locks skipped (Microsoft demo) 99.6%
Page locks skipped (Microsoft demo) 79%
Nonclustered index locks skipped (Azure fleet, ~25K DBs) 81%

No configuration. No hint to add. If you're on SQL Server 2025 with Optimized Locking enabled, SIL is already in there working for you. This post explains what SIL is, how it works, and a demo you can run yourself to see the locks vanish.

Quick recap: TID and LAQ

Transaction ID (TID) locking. Every modified row gets stamped with the ID of the transaction that touched it last. Row and page locks are still taken, but they're released the moment each row is modified, instead of being held through to the commit. The only thing held until commit is one X lock on the TID. Lock memory drops, and lock escalation drops with it.

Lock After Qualification (LAQ). The engine evaluates the predicate against the latest committed row version first, then locks only the rows that actually qualify. Pre v2025, a WHERE OrderDate < '2025-01-01' UPDATE took a U lock on every row it scanned, qualified or not. With LAQ, locks land only on the rows that match.

Both require Accelerated Database Recovery and LAQ also requires Read Committed Snapshot Isolation. Both of these shipped in SQL Server 2025 at launch.

The improvement: Skip Index Locks (SIL)

Under TID locking with RCSI and LAQ in the mix, those short-duration row and page locks are only necessary if some other query is also touching the row and expecting it to stay stable. The kind of query that needs that guarantee is one running under Repeatable Read or Serializable, or using locking hints like 'HOLDLOCK' or 'READCOMMITTEDLOCK'. Microsoft calls these Row Locking Queries (RLQ).

If no RLQ is touching the page, those row and page locks aren't preventing any conflict. Meaning, there's nothing there to conflict with and SIL recognizes that and skips them. The row still gets modified, the change is still ACID-safe, and the lock manager never gets the call. Nice.

The mechanism: Every page in the buffer pool now carries a single No-RLQ bit. When that bit is set, it tells the engine that no Row Locking Query is currently touching any row on the page. A DML statement modifying a row on that page skips the row lock and the page lock entirely, and takes a short exclusive page latch instead. The latch is brief. Each lock would have been a round trip into the Lock Manager, and for a transaction touching thousands of rows, those round trips will add up. That is the overhead that SIL minimizes for us.

Per Microsoft's Azure telemetry, RLQ-style queries are uncommon in real-world workloads using RCSI or snapshot isolation. That is why the skip ratio is so high. Most pages, most of the time, have nobody using them in a way that demands a lock.

What SIL skips, and what it doesn't

SIL IS used for:

  • INSERT statements on heaps - the intent-exclusive (IX) page lock is skipped.
  • UPDATE statements on clustered, nonclustered indexes, and heaps - both the IX page lock and the exclusive (X) row lock are skipped. The engine leans on the page latch and the TID lock instead.

SIL is NOT used for:

  • DELETE statements.
  • UPDATE on heaps where the row has existing forwarding pointers, or where the update creates new ones.
  • Rows with LOB columns - VARCHAR(MAX), NVARCHAR(MAX), VARBINARY(MAX), JSON.
  • Rows on pages that were split in the same transaction.

The exclusions matter. If your hot tables are LOB-heavy or your workload is heavy deletes, your skip ratio will be lower than the headline numbers. Those headline numbers come from update-heavy OLTP under RCSI, which is one of the most common workloads out there.

The other half: LAQ

SIL got the headline, but it shipped with a sibling. Lock After Qualification, or LAQ. LAQ is optimistic. It evaluates query predicates without placing locks on the rows while scanning. In short, it checks the latest committed version of the row and acquires the necessary lock only after the row has qualified for the modification.

The demo

Here is a full working demo for SQL Server 2025 and ADR and RCSI are required.

Step 1: Create the database and enable everything.

CREATE DATABASE SILDemo;
ALTER DATABASE SILDemo SET ACCELERATED_DATABASE_RECOVERY = ON;
ALTER DATABASE SILDemo SET READ_COMMITTED_SNAPSHOT ON;
ALTER DATABASE SILDemo SET OPTIMIZED_LOCKING = ON;
GO

Step 2: Verify SIL is enabled.

USE SILDemo;
SELECT DATABASEPROPERTYEX(DB_NAME(), 'IsOptimizedLockingOn') AS is_optimized_locking_enabled;
GO

Step 3: Create a table with a clustered index and add data.

CREATE TABLE dbo.TestSIL
(
  ID int IDENTITY(1,1) PRIMARY KEY CLUSTERED,
  Value int NOT NULL
);
GO

INSERT INTO dbo.TestSIL (Value)
SELECT TOP (1000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects a
CROSS JOIN sys.all_objects b;
GO

Step 4: SIL active — clustered index update.

Open a transaction, update rows, and see what locks the engine holds.

BEGIN TRANSACTION;

UPDATE dbo.TestSIL
SET Value = Value + 10;

SELECT resource_type,
       request_mode,
       COUNT(*) AS lock_count
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID
  AND resource_type IN ('KEY','PAGE','RID','OBJECT','XACT')
GROUP BY resource_type, request_mode
ORDER BY resource_type, request_mode;

COMMIT TRANSACTION;
GO

If SIL is working correctly, we will see ZERO KEY locks. Under classic locking, an update of 1,000 rows would hold 1,000 X KEY locks (one per modified row) all the way to the commit. SIL skipped every one of them. We still see a small number of PAGE locks (IX mode) and an OBJECT lock (IX), but those are expected and are not held long-term. The KEY lock count is where the story is, and the KEY lock count is zero.

In a nutshell, under classic locking without SIL, updating 1,000 rows would hold 1,000 X KEY locks. With SIL, those KEY locks are skipped entirely.

What you need, and why it matters

SIL turns on for free if the rest of the stack is in place:

  • SQL Server 2025 — it shipped with GA release.
  • Accelerated Database Recovery enabled on the database.
  • Read Committed Snapshot Isolation enabled — without it, LAQ never engages and SIL has nothing to build on.
  • Optimized Locking enabled — ALTER DATABASE [dbname] SET OPTIMIZED_LOCKING = ON;
  • A workload that mostly INSERTs or UPDATEs indexed tables, under RCSI, light on LOB columns.

Azure SQL Database and SQL database in Fabric get all of this automatically. Azure SQL Managed Instance is rolling out. On-prem SQL Server 2025 needs it setup at the database-level.

When it's working, the payoff is the kind a DBA notices: lock memory drops, blocking chains on update-heavy OLTP shorten or even disappear, and you got there with no code changes and no new hints scattered throughout your queries. This is server level. There is no way to opt a single statement into SIL, and there doesn't need to be.

SQL Server 2025's Optimized Locking was already a win. I think SIL is the part that makes the numbers even more worth bragging about.

More to Read

Microsoft Tech Community: Introducing optimized locking v2
Microsoft Learn: Optimized Locking
Microsoft Learn: Accelerated Database Recovery
sqlfingers inc: Optimized Locking in SQL Server 2025 — Concurrency Gets Smarter (September 2025)

Monday, May 25, 2026

Your SQL Server 2016 Upgrade Will Succeed. Then Monday Happens.

SQL Server 2016 goes out of support on July 14, 2026. You know that already. What you may not know is how many seemingly successful upgrades are still going to fail.

Not during the installer or post-upgrade checks. Not during the Friday night smoke test. It's Monday morning -- when the applications reconnect, the linked servers are called, and the ETL starts up -- that things that worked quietly for a decade start throwing SSL errors nobody saw coming.

The upgrade succeeded. The reconnect didn't. That is this post.

The engine is not the problem

When upgrades go bad, it's typically not the engine. It's everything standing around it. Newer drivers. Stricter TLS defaults. Certificate validation that suddenly matters -- and an old environment that never had to think about any of it.

Your installer finishes, the databases come online, SSMS connects, the smoke tests pass, and everyone signs off. Then the application reconnects with a newer OLE DB or ODBC driver -- one that validates the certificate by default -- and the SQL Server that ran fine for years starts refusing connections because nobody ever put a real certificate on it.

Here is what actually changed. Somewhere between 2016-era connectivity and current Microsoft providers, certificate validation became strict by default. Old environments leaned on self-signed certificates, internal certificates that nothing ever really trusted, or no certificate story at all -- because the clients were not checking. The gap was always there, but nothing ever pushed it to the table.

Until now — and this is what it looks like:

A connection was successfully established with the server, but then an error
occurred during the login process. (provider: SSL Provider, error: 0 -
The certificate chain was issued by an authority that is not trusted.)

Read that first line again. The connection to SQL Server succeeded. The network is fine. The SQL Server is fine. The client reached the server, looked at the certificate, and stopped right there. The login never happens. It's a TLS trust problem wearing a login error as a costume.

And why didn't your testing catch it? Your DBA workstation probably already trusts that certificate chain -- so SSMS worked perfectly for everything you tested. The application servers and ETL servers do not trust it. You go home Friday believing the environment is healthy. Production tells you otherwise Monday morning.

Catch this before you upgrade

You don't have to wait until you've upgraded to see this happen. The switch is sitting right in the connection string, and you can test it today against any dev or non-prod SQL Server you have. The keyword is TrustServerCertificate.

This is the old behavior -- the one many of us leaned on for years:

Server=YourServer;Database=YourDb;
Encrypt=True;TrustServerCertificate=True;

And this is the modern default:

Server=YourServer;Database=YourDb;
Encrypt=True;TrustServerCertificate=False;

Test that second one against a SQL Server running a self-signed certificate and the SSL Provider error shows up on demand. That is the whole failure.

It is also why setting TrustServerCertificate=True is not a fix, even though it makes the error vanish instantly. It doesn't solve anything -- it just switches the check off. The traffic is still encrypted, but you are no longer verifying who you connected to. That workaround is all over the forums because it restores connectivity fast, but it's also the kind of shortcut you don't want running in production for another ten years.

The real fix is to put a properly trusted certificate on SQL Server before the migration, then validate the chain from the application and ETL servers that actually have to reconnect. Confirm the environment behaves under modern driver defaults before upgrade weekend -- not after.

Linked servers are a common place to miss this

Linked servers get overlooked because people don't picture them as client connections. But that's exactly what they are -- it's just that SQL Server is the client.

After an upgrade, a linked server inherits the same provider behavior and certificate rules as everything else. So a linked server that has run untouched since 2016 can start failing SSL validation against its partner, even though nothing in the linked server definition changed. The upgrade log won't explain it. The instance looks healthy. You find out when the nightly ETL fails, the reports stop building, or a cross-server query throws an SSL error at 2 AM.

I've seen the identical linked servers work from one SQL Server and fail from another, because the providers and trust stores underneath were different. Same definition. Same target. Completely different TLS behavior. The linked server didn't change. Everything underneath it did.

What about Extended Security Updates?

Not every shop is going to make the July 14 date. Miss it, and 2016 keeps running -- it's the patching that stops. To cover that gap, Microsoft sells Extended Security Updates for up to three more years. ESU buys security update coverage and nothing else. No features, no fixes, no performance. It just keeps the lights on while your upgrade clock keeps ticking -- and the price climbs high every year.

If you genuinely need the runway, ESU can make sense. Just treat it as what it is -- a temporary holdover, not a long-term plan. For a good breakdown of how expensive that path gets, Brent Ozar covered it here.

The bottom line

The upgrade is usually the easy part. The surrounding reconnects are where the breakage live.

Certificates, drivers, linked servers, old providers -- assumptions that survived quietly for a decade. These are what break in an upgrade. Test the connection layer now, while it is still your project and not everybody else's July emergency.

More to Read

Microsoft Lifecycle: SQL Server 2016
Microsoft Learn: Configure SQL Server Encryption
Microsoft Learn: SQL Server Extended Security Updates
Brent Ozar: SQL Server 2016 Support Ends in 90 Days

Thursday, May 21, 2026

Your JSON Column Was Never a JSON Column

For roughly a decade, the way you stored a JSON document in SQL Server was to not store a JSON document at all. You stored a string. NVARCHAR(MAX), an ISJSON check constraint to keep the garbage out, and a quiet agreement with yourself that the column was JSON because you said so.

SQL Server 2025 ships an actual json data type. Binary storage, parsed once, queryable without re-parsing every row. This post is the demo: the old pain, the new behavior, and the one word in the Microsoft docs you need to read before you put this into production.

The old way

Here is the pattern every shop has somewhere. A string column, pretending.

USE tempdb;
GO

CREATE TABLE dbo.OrdersOld
(
    OrderID INT IDENTITY(1,1) PRIMARY KEY,
    Payload NVARCHAR(MAX)
        CONSTRAINT ck_OrdersOld_IsJson_Payload CHECK (ISJSON(Payload) = 1)
);
GO

INSERT dbo.OrdersOld (Payload) VALUES
(N'{"customer":"Acme","total":1200.00,"items":["Coffee","Tea"]}'),
(N'{"customer":"Globex","total":450.00,"items":["Cookies"]}');
GO

It works. It has always worked. But look at what it actually is:

What you wanted What NVARCHAR(MAX) gave you
A JSON document A string to parse
Validation A check constraint you remembered to add
Fast reads A full re-parse of the text on every JSON_VALUE call
Cheap updates Rewrite the whole document to change one key

The check constraint is the giveaway. The engine does not know the column is JSON. You are bolting validation on from the outside, and if a table gets created without that constraint -- it happens -- the column happily accepts 'not json at all' and nobody finds out until a report breaks.

The new way

Same table, one keyword different.

CREATE TABLE dbo.OrdersNew
(
    OrderID INT IDENTITY(1,1) PRIMARY KEY,
    Payload JSON NOT NULL
);
GO

INSERT dbo.OrdersNew (Payload) VALUES
(N'{"customer":"Acme","total":1200.00,"items":["Coffee","Tea"]}'),
(N'{"customer":"Globex","total":450.00,"items":["Cookies"]}');
GO

No check constraint. The validation is in the data type now as it should be. The engine will not let an invalid document into a json column. Try it:

INSERT dbo.OrdersNew (Payload) VALUES (N'not json at all');

That is the difference, right there in the screenshot. The old table only caught bad data if you remembered the constraint. The new table catches it because the JSON data type itself will not accept anything that is not valid JSON.

It is stored parsed, not as text

This is the part that matters for performance, and it is invisible until you go looking. A json column is not stored as the characters you typed -- it is stored in a binary format, parsed once when the row is written.

That changes the math on reads. With the old method, NVARCHAR(MAX), every JSON_VALUE call parses the string from scratch: the engine walks it character by character, finds your key, hands back the value, throws the parse away, and does it all again on the next row. With the json type, that work is already done, and the reads navigate the binary structure directly.

The functions you already know will work against it as-is, no syntax changes:

SELECT
    OrderID,
    Customer = JSON_VALUE(Payload, '$.customer'),
    Total = JSON_VALUE(Payload, '$.total')
FROM dbo.OrdersNew;

Identical T-SQL to what you would write against the old NVARCHAR(MAX) column. That is deliberate -- Microsoft kept the surface area the same so your existing code keeps running. What changed is underneath it.

That is the claim. Here is the measurement. Build the same document set -- 200,000 rows of a heavier document -- in both an NVARCHAR(MAX) column and a json column:

CREATE TABLE dbo.JsonString (id INT PRIMARY KEY, Payload NVARCHAR(MAX));
CREATE TABLE dbo.JsonNative (id INT PRIMARY KEY, Payload JSON);
GO

INSERT dbo.JsonString (id, Payload)
SELECT TOP (200000)
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
    N'{"customer":"Acme Industries","region":"West","tier":"Gold",
"total":1200.00,"tax":96.00,"shipping":"expedited","priority":7,
"notes":"recurring quarterly order, net 30 terms on file",
"items":["Coffee","Tea","Cake","Cookies","Sandwich","Juice"],
"contact":{"name":"Jane Roe","email":"jane@acme.test","phone":"555-0100"}}'
FROM sys.all_objects a CROSS JOIN sys.all_objects b;

INSERT dbo.JsonNative (id, Payload)
SELECT id, Payload FROM dbo.JsonString;
GO

Now extract several values per row -- each JSON_VALUE call against the string column is another full parse of that document:

SET STATISTICS TIME ON;

-- The old way: re-parse the string on every call, every row
SELECT COUNT(*)
FROM dbo.JsonString
WHERE JSON_VALUE(Payload, '$.customer')      IS NOT NULL
  AND JSON_VALUE(Payload, '$.region')        IS NOT NULL
  AND JSON_VALUE(Payload, '$.tier')          IS NOT NULL
  AND JSON_VALUE(Payload, '$.contact.name')  IS NOT NULL
  AND JSON_VALUE(Payload, '$.contact.email') IS NOT NULL
  AND JSON_VALUE(Payload, '$.shipping')      IS NOT NULL;

-- The new way: document already parsed, navigated directly
SELECT COUNT(*)
FROM dbo.JsonNative
WHERE JSON_VALUE(Payload, '$.customer')      IS NOT NULL
  AND JSON_VALUE(Payload, '$.region')        IS NOT NULL
  AND JSON_VALUE(Payload, '$.tier')          IS NOT NULL
  AND JSON_VALUE(Payload, '$.contact.name')  IS NOT NULL
  AND JSON_VALUE(Payload, '$.contact.email') IS NOT NULL
  AND JSON_VALUE(Payload, '$.shipping')      IS NOT NULL;

SET STATISTICS TIME OFF;

On my box the string column ran 5,672 ms of CPU. The json column ran the identical query in 3,828 ms -- roughly a third faster, for changing one keyword in a CREATE TABLE. Same query, same data, same result. The only difference is that the string column re-parses the whole document on every one of those six JSON_VALUE calls, on every row, while the json column was parsed once on insert and just reads the structure. Your numbers will differ, but the shape will not: the more keys you pull and the bigger the document, the wider that gap gets.

A note on JSON_MODIFY

On a string column, changing one key means rewriting the entire document -- read it, modify the text, write the whole thing back. The json type adds a modify method that is built to do in-place updates where the new value fits, instead of rewriting the document in full. Very useful, and on the list of things I plan to lab, but don't miss the version note below. The modify method specifically is one of the pieces still marked preview.

Change existing column to json

You do not have to rebuild the table. ALTER TABLE ... ALTER COLUMN converts a character column in place:

-- The check constraint references Payload -- must be dropped first
ALTER TABLE dbo.OrdersOld
DROP CONSTRAINT ck_OrdersOld_IsJson_Payload;
GO

-- Now take the old string column to the real type
ALTER TABLE dbo.OrdersOld
ALTER COLUMN Payload JSON NOT NULL;
GO

One thing to know before you run that against a real table: the conversion validates every row. Any row holding text that is not well-formed JSON will fail the ALTER. That is not a bug -- it is the data type doing its job -- but it means the ALTER to json doubles as a data-quality check. On a column that has been collecting strings for years with a check constraint that may not have always been trusted, you can probably expect the first attempt to find something. That is not a bad thing.

The order of operations for a production column json change: add the new json column alongside the old one, backfill in batches, find and fix the rows that will not convert, then drop the old column when you know it's clean.

The one thing you should read first

Here is the honest version, because the headlines blur it.

The native JSON data type is generally available -- on Azure SQL Database and Azure SQL Managed Instance. For the boxed product, SQL Server 2025, Microsoft's own documentation still describes the json type as being in preview. The type ships in the box, it works under every database compatibility level, and you can build on it today in a dev environment. But 'in the box' and 'GA' are not the same thing, and the modify method and the JSON aggregate functions are explicitly called out as preview on SQL Server 2025.

Where json type status
Azure SQL Database Generally available
Azure SQL Managed Instance GA (SQL Server 2025 / Always-up-to-date policy)
SQL Server 2025 (box) In preview, per Microsoft Learn

What that means in practice: lab it, learn it, play with it. Do not bet a production schema on the preview-flagged pieces until Microsoft moves them to GA. Preview features can change between CUs, and you don't want to build on a preview feature that gets turned off later. Be sure to check the docs against your current build before you commit.

The bottom line

The NVARCHAR(MAX)-plus-ISJSON pattern was never wrong. It was just the best we had. Very much like the FOR XML PATH string trick was the best we had before STRING_AGG. SQL Server 2025 finally makes JSON a real data type instead of a string with a usage note attached: validated by the engine, stored parsed, cheaper to read and to modify.

It is not a license to turn your relational database into a document store. JSON is still for the variable-shaped data -- the flexible attribute bag, API payloads and unmodeled log records. Do not use it as a shortcut to avoid defining columns that should have been native columns from day one. But for the JSON you legitimately have, this is your upgrade. Just keep an eye on that 'preview' status -- it's a dev-box feature on SQL Server 2025 until Microsoft says otherwise.

More to Read

Microsoft Learn: JSON data type
Microsoft Learn: Work with JSON data in SQL Server
Microsoft Learn: SQL Server 2025 Preview Features FAQ
Microsoft: GA of JSON data type and JSON aggregates
sqlfingers inc: Three T-SQL Tricks You May Not Be Using Yet

Wednesday, May 20, 2026

CISA Left 844 MB of Credentials on Public GitHub. Check Your Own Repos.

CISA is the United States federal agency that tells everyone else how to protect their passwords and credentials. For six months, it left 844 MB of its own — passwords, cloud keys, certificates — sitting in a public GitHub repository.

That is not a hack. Nobody broke in. A contractor put it there.

Disclaimer: Not a cybersecurity expert. Just a DBA reading the news.

What happened

On May 14, 2026, researcher Guillaume Valadon at GitGuardian found a public GitHub repository named 'Private-CISA'. Inside was 844 MB of plaintext passwords, AWS GovCloud credentials, API tokens, Entra ID SAML certificates, SSH keys, Kubernetes manifests, Terraform code, and internal documentation backups belonging to the Cybersecurity and Infrastructure Security Agency. The repository had been created on November 13, 2025 — public for six months before anyone outside flagged it.

Valadon reported it. CISA took the repository offline within 26 hours. Some of the exposed AWS keys stayed valid for another 48 hours after that.

The repository was maintained on a personal GitHub account by a contractor, who appears to have used it to sync files between a work machine and a home machine. The commit history shows GitHub's secret scanning was deliberately disabled.

Again — CISA is the agency that writes the rules.

CISA's statement: "Currently, there is no indication that any sensitive data was compromised as a result of this incident."

What was actually in it

The file names tell the whole story.

importantAWStokens — administrative credentials to three AWS GovCloud accounts.

AWS-Workspace-Firefox-Passwords.csv — plaintext usernames and passwords for dozens of internal CISA systems — the kind of list a browser builds when you let it save every login.

Nobody had to decrypt anything. A file called importantAWStokens held exactly that. No exploit. No zero-day. Just a contractor saving passwords to a CSV and pushing it to a public repo.

Valadon, the researcher who found it, put it this way:

"I honestly believed that it was all fake before analyzing the content deeper. This is indeed the worst leak that I've witnessed in my career. It is obviously an individual's mistake, but I believe that it might reveal internal practices."

This is not a federal story. It is a habits story.

It is easy to read this as a government story and move on. Do not. Every failure in this incident is something I find in SQL Server shops.

The CISA failure The SQL Server shop equivalent
Plaintext passwords in a CSV Connection strings with embedded sa passwords in .dtsConfig files, deployment scripts, and .publish.xml
Backups committed to Git .bak files and BCP exports sitting in a repo
Secret scanning disabled on purpose The 'make it stop warning me' move, in any repo your team owns
AWS keys valid 48 hours after takedown Can you actually rotate a SQL login or a linked server credential in minutes?
Personal repo used to sync work files Contractors and developers moving work between machines through personal GitHub

Read that last row again. Work files on a personal account. One of the worst things for any security team to see - and it's everywhere.

Find it in your own estate

You do not need a federal contractor to have this problem. SQL Server credentials hide in source control in predictable places.

SSIS configuration files
A .dtsConfig with a connection string carries whatever was in it, password included, unless the package protection level forced the credential out.

Deployment artifacts
.publish.xml profiles, post-deployment scripts, and .sqlproj files routinely carry connection details.

SQL Agent job scripts
Job step T-SQL scripted out to a repo can carry linked server definitions and the credentials behind them.

BCP exports and format files
A .bak or a BCP dump committed 'temporarily' is a database sitting in your Git history forever.

Two things to do very soon. First, secret scanning — a GitHub feature that watches commits for things that look like credentials, AWS keys, tokens, connection strings, and either warns you or blocks the push outright. Turn it on for every repo your team owns, and confirm nobody has turned it off. The CISA contractor's repo had it disabled on purpose. Then ask the harder question: if a credential leaked today, how fast could you rotate it? If the honest answer is days, that is its own finding — just like the 48-hour CISA window.

The bottom line

I have written a lot about AI governance — agents with elevated credentials, the absence of data discipline, the legal exposure forming around it. This is the same story without the AI. A shop moving fast, skipping the basics, one switched-off setting away from a very bad day.

This leak is an 844 MB reminder of how little it takes. No AI. No attacker. Just plaintext passwords in a repo and a scanning feature switched off on purpose.

Secret scanning on. Secrets out of Git. Revocation you can do in minutes, not days. The boring discipline is still where shops are bleeding.

CISA could not keep its credentials out of a public repository. Can you?

More to Read

Krebs on Security: CISA Admin Leaked AWS GovCloud Keys on Github
GitGuardian: How We Got a CISA GitHub Leak Taken Down in Under a Day
The Register: America's top cyber-defense agency left a GitHub repo open
sqlfingers inc: AI Agent. Nine Seconds. One Production Database. Gone.

Tuesday, May 19, 2026

Patch Tuesday May 2026: SQL Didn't Get Off Quite That Easy

Last Wednesday I told you SQL Server got off easy in the May Patch Tuesday release. No critical, no public PoC, no headline. Apply the GDR or CU and move on.

I missed something. There was a SQL Server CVE worth talking about, and it got by me because the bug isn't in SQL Server. It's in SSIS.

CVE-2026-40370

CVSS 8.8. Published May 12. Microsoft's one-line description: 'External control of file name or path in SQL Server allows an authorized attacker to execute code over a network.' That reads like a SQL Server engine bug. It isn't.

The actual fix, per KB5089899 and the sibling KBs covering every supported version back to 2016 SP3:

This fix addresses an XML external entity (XXE) vulnerability in the Web
Service Task that allows an attacker to read arbitrary files from the local
file system or cause a denial-of-service (DoS) attack.

Fix area: Integration Services. Component: Integration Services. Platform: Windows.

That is a SSIS bug shipped under a SQL Server CVE number. If you don't run SSIS, the headline does not apply to you. If you do run SSIS and have any package using the Web Service Task, this is yours to deal with.

What XXE is, briefly

XML External Entity is an old class of XML parser vulnerability. The XML spec lets a document declare external entities, like references to content fetched from a URL or a local file at parse time. If the parser resolves those references and the application returns the result, an attacker who controls the XML can read files off the server.

The classic payload looks like this:

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///c:/windows/win.ini">
]>
<foo>&xxe;</foo>

A vulnerable parser fetches c:\windows\win.ini and substitutes the contents into the response. Swap in any path the SQL Server service account can read -- config files, certificates, credential stores, anything -- and you have file disclosure. Point the entity at a non-responsive server and you have a DoS.

OWASP has covered this for over a decade. The fix in modern XML parsers is to disable DTD processing and external entity resolution by default. Microsoft's KB confirms that is what shipped on May 12 for the Web Service Task.

Who is actually affected

Three conditions have to be true at the same time for this CVE to matter to your environment:

Condition Detail
You run SSIS Integration Services installed on a Windows host running a supported SQL Server version.
At least one package uses Web Service Task The control flow task that calls a SOAP web service. If your packages do not include it, the bug is not reachable.
An attacker can influence the XML Microsoft classifies the attacker as 'authorized,' meaning they need permission to run packages or supply package input. Not unauthenticated.

The third condition is what keeps this off the wormable list, and it is why I let it slide last week. 'Authorized attacker' is not nothing though. A user with SSIS execute permissions, a compromised application account, a malicious insider — any of those qualifies. And depending on which environment we are talking about, that could be a dozen people or a hundred.

One thing the bug description does not say is whether the vulnerability is reachable through package design alone, or only through runtime input. Until that is clear, the safe read is: if you have packages with the Web Service Task and any non-DBA can affect what XML those packages process, patch.

The patch matrix

Microsoft shipped a fix for every supported SQL Server version on May 12. Builds and KBs verified against Microsoft Support:

Version Servicing path KB Build
SQL Server 2016 SP3 GDR KB5089271 13.0.6490.1
SQL Server 2017 GDR KB5090347 14.0.2110.2
SQL Server 2019 GDR KB5090408 15.0.2170.1
SQL Server 2022 GDR KB5091158 16.0.1180.1
SQL Server 2022 CU24 KB5089900 16.0.4252.3
SQL Server 2025 GDR KB5091223 17.0.1115.1
SQL Server 2025 CU4 KB5089899 17.0.4040.1

Sibling KBs for the CU paths on 2016, 2017, and 2019 also shipped the same day. Check the Microsoft Security Update Guide for the package that matches your current build. As I noted last week, GDR and CU are separate lanes. Pick the one you have been applying.

SQL Server 2014 and earlier are out of mainstream support and do not appear in the May release. If you are on one of those and running SSIS, you are not getting this fix.

What to do this week

1. Inventory your packages.

Find every package that uses the Web Service Task. If your packages live in SSISDB, the project XML is queryable. If they live on the file system or in msdb, a recursive grep for STOCK:WebServiceTask in the .dtsx files will find them. Most shops have never inventoried which SSIS tasks are in use where. This is a good reason to start.

2. Identify who can supply input to those packages.

Any account that can execute the package, modify its variables at runtime, or supply the XML input it processes. SSIS catalog permissions, SQL Agent proxy accounts, application accounts calling the package via T-SQL -- all of them count.

3. Patch.

Apply the GDR or CU that matches your servicing path. The fix is in the SSIS runtime on the server, not the package itself. You do not need to redeploy packages.

4. If you cannot patch this week, look at your packages.

The Web Service Task is one of several tasks in SSIS that consume XML. If a package is reading XML from a source an untrusted user can influence, that is the exposure window. Consider disabling those packages until the patch lands.

What I should have said last week

'SQL Server got off easy' was true for the engine. It was not true for SSIS. A CVSS 8.8 RCE that ships under a SQL Server CVE number, with a fix in every supported version, deserves more than a footnote -- even when the attack surface is narrower than the headline suggests.

This is the second month in a row that the SQL Server CVE worth caring about is not actually in the engine. March was the sysadmin escalation in linked servers. May is XXE in the SSIS Web Service Task. The pattern is worth noting. The headline CVE rating tells you the score. It does not tell you which component shipped the fix or whether the bug is reachable in your environment. For that, you have to read the KB

Apologies for the miss. Patch your SSIS hosts this week.

More to Read

Microsoft Security Response Center: CVE-2026-40370
Microsoft Security Update Guide: May 2026 Release Notes
KB5089899: SQL Server 2025 CU4 (canonical fix description)
OWASP: XML External Entity (XXE) Processing
sqlfingers inc: Patch Tuesday May 2026 — SQL Got Off Easy. Your Domain Didn't.

Wednesday, May 13, 2026

Patch Tuesday May 2026: SQL Got Off Easy. Your Domain Didn't.

Yesterday was Patch Tuesday. For the first time since June 2024, Microsoft shipped a release with zero zero-days. Nothing exploited in the wild. Nothing publicly disclosed before the fix landed. 118 CVEs, 16 critical, 102 important, and not a single one on fire.

SQL Server got off easy this month, too. We're in the patch list, but no critical SQL Server specific CVE drew a headline. A welcome change after March's CVE-2026-21262 sysadmin-escalation party.

Before you breathe out though, read my title again. The bullet missed SQL Server. It did not miss your domain controller.

The numbers

Metric May 2026
Total CVEs 118
Critical 16
Important 102
Zero-days exploited 0
Zero-days publicly disclosed 0
EoP share 48.3%
RCE share 24.6%

Quick refresher on what a zero-day actually is: a vulnerability the vendor hasn't patched yet — but attackers already know about it. Microsoft tracks two flavors separately. 'Exploited in the wild' means attackers are already using it. 'Publicly disclosed' means the technical details are out, so weaponization is hours away. The 'zero' counts the days defenders had to prepare. March's CVE-2026-21262 was publicly disclosed before the patch shipped — that's why we said 'stop reading and go patch.' This month we've got neither flavor. That is very interesting.

Why this matters to you

SQL Server didn't draw a critical CVE, but three flaws in the May release hit the stack around it. In priority order:

1. The domain controllers your SQL Servers live behind

CVE-2026-41089 is a stack-based buffer overflow in Windows Netlogon. CVSS 9.8. A remote, unauthenticated attacker sending a crafted packet to a server running as a domain controller can execute code on the box. Microsoft tagged it 'Exploitation Less Likely' — and ZDI called it 'wormable' in the same breath, noting that a compromised domain controller is a compromised domain.

'wormable'? All I can say is, do you remember SQL Slammer? January 2003. Took down half the internet in ten minutes. Oh man, the war stories...

Your SQL Server doesn't need to be vulnerable to be at risk. Most of us run Windows Authentication. Most of us trust the domain. If the DC is owned, your service accounts, your AG endpoints, your SPNs, your sa-equivalent Windows logins — all of it inherits the problem.

2. The Word document that owns the DBA laptop

Four critical RCEs in Microsoft Word landed this month — CVE-2026-40361, CVE-2026-40364, CVE-2026-40366, CVE-2026-40367. All CVSS 8.4. The first two are flagged 'Exploitation More Likely'. The detail that should make every DBA stop scrolling: Microsoft notes the Preview Pane is an attack vector. You don't have to open the file. You just have to see it listed in Outlook.

Now think about how a DBA's inbox actually looks. Vendor docs. Customer-supplied schemas in .docx. Audit checklists, Technical Reports and Logs, SOWs... Half of it arrives from an email you don't recognize with attachments too many will double-click before verifying. Patch your workstation this week.

3. Everything else that touches the database tier

CVE What CVSS Why you care
CVE-2026-41096 Windows DNS Client RCE 9.8 Heap overflow via malicious DNS response. Every Windows box runs the client.
CVE-2026-40415 Windows TCP/IP RCE 9.8 Use-after-free, unauthenticated, no user interaction. Wormable.
CVE-2026-42898 Dynamics 365 On-Prem RCE 9.9 Authenticated, with scope change. NAV/BC shops, this is yours.
CVE-2026-41103 SSO Plugin for Jira & Confluence EoP 9.1 Forged identity, bypasses Entra ID auth. 'Exploitation More Likely'.

SQL Server's patches this month

SQL Server appears in the May 2026 patch list. No critical, no public PoC, no headline. Just the routine cumulative-update plumbing we expect. I like boring. Boring is good. Apply the GDR or CU for your servicing path and move on.

One reminder while you're in there: GDR and CU are separate servicing paths. Once you apply a CU, you can't revert to GDR for that installation. Pick the lane that matches what you have already been applying. Microsoft's Security Update Guide is the canonical source for which KB matches your build.

What 'no zero-day' does not mean

It does not mean safe. It means defenders got a head start for a change. Every Patch Tuesday, attackers reverse-engineer the fixes to figure out what bug was patched and how to exploit it. That work is continually faster — AI tooling now does in hours what used to take a skilled researcher days. A 'no zero-day' month just means the exploit work starts after the patch ships instead of before. The exploits are still coming. You just have a little more time to prepare.

It also doesn't mean you can skip a cycle. A zero-day-free Patch Tuesday can still be a bad Patch Tuesday — wormable DC bugs, preview-pane RCEs, and TCP/IP unauthenticated RCEs all add up to a busy week.

The bottom line

SQL Server caught a break. Your domain didn't. Patch your DCs this week. Patch your laptop this week. Apply the SQL Server GDR or CU for your path while you're at it. We got lucky. Be glad May was quiet.

More to Read

Microsoft Security Update Guide: May 2026 Release Notes
Tenable: May 2026 Patch Tuesday Addresses 118 CVEs
Zero Day Initiative: The May 2026 Security Update Review
Help Net Security: Many fixes, but no zero-days
sqlfingers inc: Patch Tuesday — Your sysadmin Role Was Up for Grabs (March 2026)

Friday, May 8, 2026

June 1. Copilot Billing Changes. SSMS Has a Problem.

Three weeks. One billing model change. One SSMS bug that may already be costing you. On June 1, 2026, GitHub Copilot moves from fixed-fee request-based billing to token-based billing. The fallback to a cheaper model when credits run out option is gone. Why? Because an autonomous agent session can cost the same as a quick chat question, and GitHub says the flat rate model is no longer sustainable. And SSMS 22's Copilot integration had a documented infinite-loop bug that was silently exhausting user request quotas. Microsoft fixed it in March — but only if you patched.

Connect those dots.

Disclaimer: not a GitHub billing expert. Just a DBA reading the announcement.

What's changing on June 1

Per GitHub's April 27 announcement, premium request units (PRUs) are out. GitHub AI Credits are in. One credit equals one US cent. Plan prices are unchanged. What changes is how those dollars get consumed.

Plan Monthly cost Monthly AI Credits
Free $0 Limited allowance
Pro $10 $10 (1,000 credits)
Pro+ $39 $39 (3,900 credits)
Business $19/seat $19 (1,900 credits)
Enterprise $39/seat $39 (3,900 credits)

Code completions and Next Edit Suggestions remain free and unmetered. Everything else — chat, agent mode, code review — draws from the credit pool, priced per token by model. Per GitHub's published rate sheet, premium models like Anthropic's Claude Opus and OpenAI's GPT-5.5 cost dramatically more per token than default models.

Two changes worth pulling out of the announcement:

The fallback to a cheaper model is gone. Today, when you exhaust your PRUs, Copilot quietly downgrades you to a smaller model so things keep working. After June 1, when credits run out, that's it — you're done. Unless an admin has set up additional usage budget, in which case you are paying overage at published rates with no cap unless one is configured.

Annual plans are also getting squeezed. Annual Pro and Pro+ subscribers stay on the request-based model until renewal, but the model multipliers are increasing on June 1, and the refund-and-cancel option is only open until May 20. That is twelve days from today.

Why DBA work eats tokens faster than dev work

Token math, briefly. The general rule is roughly four characters per token for English prose. Code and markup run denser — around three to three-and-a-half characters per token — because of all the brackets, attributes, and namespace noise.

What does that mean in practice? Here is a rough input-token cost per paste, per artifact a DBA hands to Copilot every day:

Artifact Typical size Input tokens (approx)
Simple plan XML 10-50 KB 2,500 - 12,500
Complex plan XML ~100 KB 25,000 - 33,000
Big analytical plan 500 KB+ 125,000+
sp_BlitzCache top 50 50-200 KB 12,500 - 65,000
Deadlock graph XML 5-50 KB 1,250 - 12,500
Full schema (50 tables) 100 KB+ 25,000+

And that is per paste. The DBA workflow rarely stops at one. Paste plan, ask. Paste a different plan, ask again. Call sp_BlitzCache @AI = 1. Paste sp_WhoIsActive output, ask why blocking is occurring. Paste the deadlock graph, ask which transaction was the victim. Three to five iterations is normal for any non-trivial troubleshooting session, and each one carries the full prior context with it as input tokens.

Compare that to the developer workflow Copilot was originally built around. Inline code completions. Short prompts. Small inline edits. Inline completions remain free under the new model. The DBA pattern of pasting big diagnostic XML into chat is what now costs real money — and at premium model rates, the math gets real big real fast.

Multiply your typical paste size by three to five iterations, multiply that by the per-token rate of whichever model you are using, and you have your monthly Copilot exposure. Then go look at their pricing page and do the math with your actual workflow.

The SSMS 22 Copilot bug — fixed in 22.4.1 — if you patched

A user thread on the GitHub Community board, discussion #181818, documents Copilot in SSMS 22 entering an infinite loop on certain failed or stalled query executions, sending repeated API calls in the background while showing a loading screen, and burning through users' premium request quotas overnight. One user reported 1,202 Claude Sonnet 4.5 requests in a single day attributed to the SSMS integration alone. Multiple users in the thread confirmed identical behavior, often without even using the chat. The same Copilot integration in Visual Studio and VS Code did not exhibit this issue.

The PM for GitHub Copilot in SSMS responded directly in the thread: "We are aware of this problem and are working on a resolution." The issue was also tracked on Microsoft's SSMS Developer Community.

SSMS 22.4.1, released March 25, 2026 alongside the GA of GitHub Copilot in SSMS, includes improved handling of query executions that either return no results or fail completely. That is the fix. Some users have also reported success rolling back to SSMS 22.1.0 if 22.4.1 still misbehaves in their specific environment.

If you are running an earlier SSMS 22 build with Copilot enabled and the patched 22.4.1 is not deployed, you are carrying this risk straight into the per-token billing model change on June 1. Today the bug burns PRUs — you hit a usage cap, you get warnings, you investigate. After June 1, the same bug burns your credit card balance at API rates with no fallback option. For shops on Copilot Business or Enterprise, where credits are pooled across the organization, a single unpatched user with this bug active could drain their team's monthly allocation overnight.

Read that again. One unpatched SSMS 22 user with a bad API loop, can drain the pooled credit pool for the entire team in a single day.

What to do before June 1

  • Check the preview bill. GitHub launched a preview bill experience in early May, accessible from the Billing Overview page on github.com. It shows what your April usage would have cost under the new model. Look at it before June 1 surprises you.
  • Decide on the annual plan refund window. If you are on annual Pro or Pro+, the refund-with-cancel option closes May 20. After that, you are stuck on request-based billing with the new (worse) model multipliers until renewal.
  • Set spending budgets. Admins on Business and Enterprise can set per-user budgets. A $0 user budget is the kill switch — no credit consumption for that user. Use this on accounts that do not need Copilot or where you suspect bug exposure.
  • Confirm everyone is on SSMS 22.4.1 or later. The 22.4.1 release contains the Copilot infinite-loop fix. Anyone running an earlier SSMS 22 build with the AI Assistance workload installed is exposed. Verify the version with Help, About in SSMS. If you are already on 22.4.1 and the issue persists, rolling back to SSMS 22.1.0 has resolved it for some users.
  • Train the team off paste-the-whole-thing prompts. Pasting a 500KB execution plan into chat to ask 'why is this slow' is now an expensive habit. Targeted questions against scoped context — the specific operator you are concerned about, the specific predicate, the specific wait type — cost a fraction of the same answer at published per-token model rates. Another good reminder that we should be very mindful of what we give to AI.
  • Watch the SSMS feedback site. Per the PM in discussion #181818, the SSMS feedback site is where SSMS Copilot issues should be logged. Subscribe to the relevant items if you are running SSMS 22 in production.

The bigger picture

None of this should be a surprise. The flat-rate AI buffet was the customer-acquisition phase. This is the bill phase. It was always going to come.

I have been writing about the operational and legal risks of AI tooling in production for weeks now. An AI agent deleted a production database in nine seconds. 99% of US enterprises consider themselves AI-ready while 60% admit they cannot manage their data. P2SQL injection turns plain English into damaging, destructive commands. SQL Server Ledger gives us tamper-evident logging when an agent runs amok.

Now we add a new risk to the operational and legal pile: cost. The all-you-can-eat AI subscription was a transitional pricing model. GitHub said it themselves — the per-seat model is not sustainable when an autonomous agent session can cost the same as a quick chat question. Token-based billing is what AI economics actually look like at the API layer, and it's just making its way to Copilot itself.

The shops that survive the transition cleanly will be the ones whose DBAs treat tokens like CPU cycles — finite, measurable, and worth optimizing. Targeted prompts. Scoped context. Cheaper models for the routine stuff. Premium models reserved for the questions that actually need them.

The shops that do not will find out in July, when the first usage-based bill arrives.

More to Read

GitHub Blog: GitHub Copilot is moving to usage-based billing
GitHub Docs: About billing for individual GitHub Copilot plans
GitHub Docs: Models and pricing for GitHub Copilot
GitHub Community Discussion #181818: SSMS 22 Copilot integration usage spike
Microsoft Fabric Community: SSMS 22.4.1 and GitHub Copilot in SSMS (Generally Available)
Microsoft: SSMS Feedback Site

Wednesday, May 6, 2026

Three T-SQL Tricks You May Not Be Using Yet

Three T-SQL features that have shipped over the last few releases and quietly retired patterns many of us are still using out of habit. Each replaces a stale workaround with one line of code, and in two of three cases it runs much faster, too. Take a look, try them out.

1. APPROX_COUNT_DISTINCT -- when COUNT(DISTINCT) is too expensive

Available since SQL Server 2019. COUNT(DISTINCT col) on a high-cardinality column requires SQL Server to track every unique value it has seen, which means a memory grant proportional to cardinality and a hash aggregate that loves to spill to tempdb on big tables. APPROX_COUNT_DISTINCT uses HyperLogLog instead, which trades a small accuracy hit for dramatically lower memory and faster runs.

Haven't heard of HyperLogLog before? Short story: HyperLogLog (HLL) in SQL Server is used to provide extremely fast, approximate counts of unique values (cardinality) in massive datasets using very little memory. Implemented with the APPROX_COUNT_DISTINCT function, allowing analytics on billions of rows with a roughly 2% error margin, significantly faster than COUNT(DISTINCT). Pretty cool. I use this a lot in trading data.

Build a five-million-row table with a few hundred thousand distinct customers (takes a few seconds on a laptop):

WITH N AS
(
    SELECT TOP (5000000)
        OrderID = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_objects a
    CROSS JOIN sys.all_objects b
    CROSS JOIN sys.all_objects c
)
SELECT
    OrderID,
    CustomerID = ABS(CHECKSUM(NEWID())) % 250000,
    OrderDate = DATEADD(SECOND, OrderID, '2020-01-01'),
    Amount = CAST(OrderID % 1000 AS DECIMAL(10,2))
INTO dbo.OrdersDemo
FROM N;

CREATE CLUSTERED INDEX IX_OrdersDemo_OrderID ON dbo.OrdersDemo (OrderID);

The old way:

SET STATISTICS TIME, IO ON;

SELECT COUNT(DISTINCT CustomerID) AS UniqueCustomers
FROM dbo.OrdersDemo;

The new way:

SELECT APPROX_COUNT_DISTINCT(CustomerID) AS UniqueCustomers
FROM dbo.OrdersDemo;

SET STATISTICS TIME, IO OFF;

Run both, compare CPU time and elapsed time, or even pull the actual execution plan to compare the memory grant on the root operator. You can clearly see, APPROX_COUNT_DISTINCT finishes noticeably faster with a smaller grant.

And the memory grant...

Don't use it for anything that has to be exact -- billing, regulatory counts, reconciliation. Do use it for the dozens of dashboards and exploratory queries where 'about a million' is the same answer as 'exactly 1,003,847'.

2. STRING_AGG -- retire the FOR XML PATH('') hack

Available since SQL Server 2017. Many shops still use legacy code building delimited strings with FOR XML PATH('') and STUFF(). It worked, technically. It was also suboptimal performance.

Setup -- a tiny order schema with a few items per order:

CREATE TABLE dbo.Products
(
    ProductID    INT PRIMARY KEY,
    ProductName  NVARCHAR(50)
);

CREATE TABLE dbo.Orders
(
    OrderID      INT PRIMARY KEY,
    CustomerID   INT
);

CREATE TABLE dbo.OrderItems
(
    OrderID    INT,
    ProductID  INT
);

INSERT dbo.Products VALUES
(1, 'Coffee'), (2, 'Tea'), (3, 'Cookies'), (4, 'Cake'), (5, 'Sandwich');

INSERT dbo.Orders VALUES
(1001, 50), (1002, 50), (1003, 51), (1004, 52);

INSERT dbo.OrderItems VALUES
(1001, 1), (1001, 3),
(1002, 2), (1002, 4), (1002, 5),
(1003, 1), (1003, 2),
(1004, 5);

The old way:

SELECT
    o.OrderID,
    o.CustomerID,
    Products = STUFF((
        SELECT ', ' + p.ProductName
        FROM dbo.OrderItems oi JOIN dbo.Products p 
          ON oi.ProductID = p.ProductID
        WHERE  oi.OrderID = o.OrderID
        FOR XML PATH(''), TYPE
    ).value('.', 'NVARCHAR(MAX)'), 1, 2, '')
FROM dbo.Orders o
ORDER BY o.OrderID;

The new way:

SELECT
    o.OrderID,
    o.CustomerID,
    Products = STRING_AGG(p.ProductName, ', ')
                 WITHIN GROUP (ORDER BY p.ProductName)
FROM dbo.Orders o JOIN dbo.OrderItems oi 
  ON o.OrderID = oi.OrderID JOIN dbo.Products p
    ON oi.ProductID = p.ProductID
GROUP BY o.OrderID, o.CustomerID
ORDER BY o.OrderID;

Both produce the same resultset:

But look at that implicit conversion, larger plan cache and higher subtree cost in one vs the other:

3. GENERATE_SERIES -- kill the homemade tally CTE

This one was introduced in SQL Server 2022. Every DBA has copy-pasted some variant of a recursive CTE to generate a sequence of numbers or dates. The classic use case: a date-spine report that fills in zeros for days with no activity. GENERATE_SERIES replaces it with one function call.

Setup -- a sales table with deliberately sparse dates:

CREATE TABLE dbo.DailySales
(
    SaleDate DATE,
    Amount   DECIMAL(10,2)
);

INSERT dbo.DailySales VALUES
('2026-01-02', 1200.00),
('2026-01-03',  450.00),
('2026-01-05', 2300.00),
('2026-01-08',  890.00),
('2026-01-10', 1100.00);

The old way -- recursive CTE for the date spine:

DECLARE @StartDate DATE = '2026-01-01';
;WITH Days AS
(
    SELECT @StartDate AS TheDate, 0 AS n
    UNION ALL
    SELECT DATEADD(DAY, n + 1, @StartDate), n + 1
    FROM Days
    WHERE n < 9
)
SELECT
    d.TheDate,
    Amount = ISNULL(s.Amount, 0)
FROM Days d LEFT JOIN dbo.DailySales s 
  ON s.SaleDate = d.TheDate
ORDER BY d.TheDate
OPTION (MAXRECURSION 0);

The new way:

DECLARE @StartDate DATE = '2026-01-01';
SELECT
    TheDate = DATEADD(DAY, gs.value, @StartDate),
    Amount = ISNULL(s.Amount, 0)
FROM GENERATE_SERIES(0, 9) gs LEFT JOIN dbo.DailySales s
  ON s.SaleDate = DATEADD(DAY, gs.value, @StartDate)
ORDER BY TheDate;

Both produce the same ten-row date spine with zeros filled in for the empty days, but compare the two constructs. With GENERATE_SERIES we've got no CTE, no MAXRECURSION and no helper table. GENERATE_SERIES accepts (start, stop) or (start, stop, step), supports negative steps, and returns a single column called value. The Optimizer treats it as a normal table-valued function and can parallelize work that depends on it -- something the recursive CTE cannot do.

Why we are still writing the old versions

Full disclosure: this post happened because I caught myself reaching for a recursive CTE last night before remembering the GENERATE_SERIES improvement. Habits stick. But they can be broken easily.

The recursive tally CTE was the first 'clever' T-SQL many of us wrote. The FOR XML PATH trick was THE Stack Overflow answer for a decade. COUNT(DISTINCT) still gets typed reflexively when 'roughly how many' would do.

None of these replacements are new. STRING_AGG turned eight this year. APPROX_COUNT_DISTINCT shipped in 2019, and GENERATE_SERIES has been around since 2022. Time to check them out.

More to Read

Microsoft Learn: APPROX_COUNT_DISTINCT
Microsoft Learn: STRING_AGG
Microsoft Learn: GENERATE_SERIES

Monday, May 4, 2026

99% of US Enterprises Are AI-Ready. Their Lawyers Beg to Differ.

Two facts. The same week.

99% of US enterprises consider themselves AI-ready. 88% believe they are ahead of their competitors. 60% of those same organizations cite data management and governance as their number one challenge.

A federal court just denied Meta's motion to dismiss a class action that turned on this question: when a platform's generative AI tools "developed the ultimate content" of fraudulent ads, does Section 230 still protect the platform?

One of those is a confidence problem. The other is a liability problem. They are about to meet.

Disclaimer: not a lawyer, not legal advice. Just a DBA reading the docket.

The Confidence Gap

Semarchy released its 2026 AI Report on March 9. The full survey covered 1,000 C-level executives in the UK, US, and France, all at companies with $200 million-plus in annual turnover. The US-specific findings landed in Solutions Review on April 27, courtesy of Semarchy's CTO Craig Gravina:

Finding US enterprises
Consider themselves AI-ready 99%
Believe they are ahead of competitors 88%
Cite data management and governance as their single biggest challenge 60%

Read that again. The same set of organizations is telling pollsters two contradictory things in the same survey: 'we are ready' and 'we cannot manage our data'.

The supporting numbers do not improve the picture. These are from the global Semarchy report covering the UK, US, and France:

Finding Share of respondents
Implementing AI initiatives without Master Data Management foundations 51%
Not enforcing data quality standards 38%
Experienced AI project delays last year due to data quality concerns 22%
Reported operational inefficiencies from unreliable data 21%
CDOs viewed as holding a chief role in their organization's AI strategy 7%
CIOs viewed as holding a chief role in their organization's AI strategy 18%

In plain English: the people who actually know whether the data is ready are not in the room when the AI strategy is set. But the strategy is being set anyway.

This is what the customer base looks like right now, regardless of what the press releases say. Willing to bet that anyone reading this who runs SQL Server for a living is nodding.

The courts just drew a line

Now layer this against what the Northern District of California has been doing.

Three cases, all involving Meta. Stay with me — the technical detail is what makes the bridge to the data layer.

Case Citation Judge Outcome at dismissal
Forrest v. Meta Platforms, Inc. N.D. Cal., No. 22-cv-03699-PCP, opinion 6/17/24 P. Casey Pitts Motion to dismiss denied. Case survived.
Bouck v. Meta Platforms, Inc. N.D. Cal., No. 25-cv-05194-RS, opinion 3/24/26 Chief Judge Richard Seeborg Motion to dismiss denied. Case survived.
Suddeth v. Meta Platforms, Inc. N.D. Cal., No. 25-cv-08581-RS, opinion 3/24/26 Chief Judge Richard Seeborg Motion to dismiss granted. Case dismissed.

In Forrest, the plaintiffs alleged that Meta's ad tools "mix and match" images, videos, text, and audio, and use generative AI to optimize ads automatically. Under the Ninth Circuit's framework in Calise v. Meta, that active involvement created a genuine factual dispute over whether Meta materially contributed to the ads' illegality. The case survived dismissal.

In Bouck, the plaintiffs alleged that Meta's generative AI tools themselves "developed the ultimate content of the fraudulent ads", making Meta "a genuine co-conspirator in the creation of the offending content". Section 230 did not protect Meta at the dismissal stage.

In Suddeth, same judge, same day, different outcome. The plaintiffs alleged that Meta's machine-learning systems 'maximize reach, engagement, or downstream actions' and that algorithmic amplification was itself content development. Judge Seeborg dismissed the case. Meta's targeting tools, he wrote, are "content neutral on their own". Algorithmic amplification "is nothing more than an averment of facilitation".

A motion to dismiss is not a finding on the merits. Surviving one means the plaintiffs alleged enough to proceed — but not that they have won. The dividing line the courts are drawing matters either way: targeting an audience is protected distribution, transforming or generating ad content is not.

Where plaintiffs plausibly allege that generative AI itself authored the content — not just amplified it — Section 230 has now failed twice to make the case disappear at the pleadings stage. That sentence is doing a lot of work, pro and con. Maybe read it twice.

The shoe that hasn't dropped

Here is where I have to be careful, because what comes next is doctrinal commentary, not a court holding.

Seth Oranburg, a professor at the University of New Hampshire School of Law, published an analysis in Bloomberg Law on April 14 arguing that the Bouck/Forrest line of reasoning collides with a separate Supreme Court doctrine on securities fraud.

The Supreme Court held in Janus Capital Group v. First Derivative Traders that "the maker of a statement is the person or entity with ultimate authority over the statement, including its content and whether and how to communicate it". The same opinion noted that "merely hosting a document on a Web site does not indicate that the hosting entity adopts the document as its own statement or exercises control over its content".

Oranburg's argument: when a platform's generative AI exercises ultimate authority over the assembled content of a fraudulent investment solicitation, the platform may be the 'maker' under Rule 10b-5 of the Securities Exchange Act. Primary 10b-5 liability has no Section 230 analog.

I am quoting Oranburg directly here because the precision matters. He calls this "the argument no court has yet reached".

So to be clear about what is and isn't true today:

Question Status
Does Section 230 protect platforms when generative AI assembles ad content? Being tested. Two N.D. Cal. cases have survived motions to dismiss on this theory.
Can a platform whose AI has 'ultimate authority' over assembled fraudulent content be a 'maker' under Rule 10b-5? Not decided by any court. Currently a doctrinal argument from legal scholars, sitting on top of a Supreme Court precedent, watching for a case that brings the question forward.

The shoe is dangling. Whether and when it drops is not known today... but I am going to be watching this one for sure.

Why DBAs should care

The principle the courts are circling is broader than ad platforms. It is about who has ultimate authority over assembled content when AI is doing the assembling.

Sit with that for a second. Who has ultimate authority over assembled content when AI is doing the assembling?

Now overlay the Semarchy picture. Half of enterprises are running AI without MDM. A third don't enforce data quality. The data leaders are not in the AI strategy meetings. And on top of those data foundations — or in the absence of them — companies are deploying agents that summarize regulatory submissions, draft customer-facing financial output, generate reports that flow into SEC filings, and write rows that auditors will later have questions about.

If the doctrinal extension Oranburg outlines ever reaches a court holding, the test will be: who exercised ultimate authority over the assembled content? An organization that cannot say where its data lives, who can change it, what shape it takes when it leaves the database, or which AI agent touched it last is going to have a difficult time answering that question.

Two days ago I wrote that permissions are the only line agents cannot cross. The legal frame is starting to align with that observation. The DBA who can show what an agent touched, when, and under whose credentials is also the DBA whose company has an answer ready.

The closer

The numbers and the law are pointing the same direction. Confidence is up. Data discipline is down. Legal exposure is forming around the gap between them.

The boring work — Master Data Management, data quality, audit trails, scoped permissions, governance, knowing what your AI agents are touching and on whose authority — is now also a legal posture, not just an operational one. Those shops treating it as unnecessary overhead may be in the legal cases we're reading about next year.

99% of US enterprises think they are AI-ready. Is yours?

More to Read

Solutions Review (Craig Gravina, 4/27/26): Why Your AI Investments Keep Failing (And How to Fix It)
Semarchy press release (3/9/26): Data Management Overtakes Cost and Talent as Top AI Challenge
Bloomberg Law (Seth Oranburg, 4/14/26): Meta Cases Put Social Media Platforms at Securities Fraud Risk
Cornell LII: Rule 10b-5
sqlfingers inc: AI Agent. Nine Seconds. One Production Database. Gone.