sqlfingers.com: Your JSON Column Was Never a JSON Column

For roughly a decade, the way you stored a JSON document in SQL Server was to not store a JSON document at all. You stored a string. NVARCHAR(MAX), an ISJSON check constraint to keep the garbage out, and a quiet agreement with yourself that the column was JSON because you said so.

SQL Server 2025 ships an actual json data type. Binary storage, parsed once, queryable without re-parsing every row. This post is the demo: the old pain, the new behavior, and the one word in the Microsoft docs you need to read before you put this into production.

The old way

Here is the pattern every shop has somewhere. A string column, pretending.

USE tempdb;
GO

CREATE TABLE dbo.OrdersOld
(
    OrderID INT IDENTITY(1,1) PRIMARY KEY,
    Payload NVARCHAR(MAX)
        CONSTRAINT ck_OrdersOld_IsJson_Payload CHECK (ISJSON(Payload) = 1)
);
GO

INSERT dbo.OrdersOld (Payload) VALUES
(N'{"customer":"Acme","total":1200.00,"items":["Coffee","Tea"]}'),
(N'{"customer":"Globex","total":450.00,"items":["Cookies"]}');
GO

It works. It has always worked. But look at what it actually is:

What you wanted	What NVARCHAR(MAX) gave you
A JSON document	A string to parse
Validation	A check constraint you remembered to add
Fast reads	A full re-parse of the text on every JSON_VALUE call
Cheap updates	Rewrite the whole document to change one key

The check constraint is the giveaway. The engine does not know the column is JSON. You are bolting validation on from the outside, and if a table gets created without that constraint -- it happens -- the column happily accepts 'not json at all' and nobody finds out until a report breaks.

The new way

Same table, one keyword different.

CREATE TABLE dbo.OrdersNew
(
    OrderID INT IDENTITY(1,1) PRIMARY KEY,
    Payload JSON NOT NULL
);
GO

INSERT dbo.OrdersNew (Payload) VALUES
(N'{"customer":"Acme","total":1200.00,"items":["Coffee","Tea"]}'),
(N'{"customer":"Globex","total":450.00,"items":["Cookies"]}');
GO

No check constraint. The validation is in the data type now as it should be. The engine will not let an invalid document into a json column. Try it:

INSERT dbo.OrdersNew (Payload) VALUES (N'not json at all');

That is the difference, right there in the screenshot. The old table only caught bad data if you remembered the constraint. The new table catches it because the JSON data type itself will not accept anything that is not valid JSON.

It is stored parsed, not as text

This is the part that matters for performance, and it is invisible until you go looking. A json column is not stored as the characters you typed -- it is stored in a binary format, parsed once when the row is written.

That changes the math on reads. With the old method, NVARCHAR(MAX), every JSON_VALUE call parses the string from scratch: the engine walks it character by character, finds your key, hands back the value, throws the parse away, and does it all again on the next row. With the json type, that work is already done, and the reads navigate the binary structure directly.

The functions you already know will work against it as-is, no syntax changes:

SELECT
    OrderID,
    Customer = JSON_VALUE(Payload, '$.customer'),
    Total = JSON_VALUE(Payload, '$.total')
FROM dbo.OrdersNew;

Identical T-SQL to what you would write against the old NVARCHAR(MAX) column. That is deliberate -- Microsoft kept the surface area the same so your existing code keeps running. What changed is underneath it.

That is the claim. Here is the measurement. Build the same document set -- 200,000 rows of a heavier document -- in both an NVARCHAR(MAX) column and a json column:

CREATE TABLE dbo.JsonString (id INT PRIMARY KEY, Payload NVARCHAR(MAX));
CREATE TABLE dbo.JsonNative (id INT PRIMARY KEY, Payload JSON);
GO

INSERT dbo.JsonString (id, Payload)
SELECT TOP (200000)
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
    N'{"customer":"Acme Industries","region":"West","tier":"Gold",
"total":1200.00,"tax":96.00,"shipping":"expedited","priority":7,
"notes":"recurring quarterly order, net 30 terms on file",
"items":["Coffee","Tea","Cake","Cookies","Sandwich","Juice"],
"contact":{"name":"Jane Roe","email":"jane@acme.test","phone":"555-0100"}}'
FROM sys.all_objects a CROSS JOIN sys.all_objects b;

INSERT dbo.JsonNative (id, Payload)
SELECT id, Payload FROM dbo.JsonString;
GO

Now extract several values per row -- each JSON_VALUE call against the string column is another full parse of that document:

SET STATISTICS TIME ON;

-- The old way: re-parse the string on every call, every row
SELECT COUNT(*)
FROM dbo.JsonString
WHERE JSON_VALUE(Payload, '$.customer')      IS NOT NULL
  AND JSON_VALUE(Payload, '$.region')        IS NOT NULL
  AND JSON_VALUE(Payload, '$.tier')          IS NOT NULL
  AND JSON_VALUE(Payload, '$.contact.name')  IS NOT NULL
  AND JSON_VALUE(Payload, '$.contact.email') IS NOT NULL
  AND JSON_VALUE(Payload, '$.shipping')      IS NOT NULL;

-- The new way: document already parsed, navigated directly
SELECT COUNT(*)
FROM dbo.JsonNative
WHERE JSON_VALUE(Payload, '$.customer')      IS NOT NULL
  AND JSON_VALUE(Payload, '$.region')        IS NOT NULL
  AND JSON_VALUE(Payload, '$.tier')          IS NOT NULL
  AND JSON_VALUE(Payload, '$.contact.name')  IS NOT NULL
  AND JSON_VALUE(Payload, '$.contact.email') IS NOT NULL
  AND JSON_VALUE(Payload, '$.shipping')      IS NOT NULL;

SET STATISTICS TIME OFF;

On my box the string column ran 5,672 ms of CPU. The json column ran the identical query in 3,828 ms -- roughly a third faster, for changing one keyword in a CREATE TABLE. Same query, same data, same result. The only difference is that the string column re-parses the whole document on every one of those six JSON_VALUE calls, on every row, while the json column was parsed once on insert and just reads the structure. Your numbers will differ, but the shape will not: the more keys you pull and the bigger the document, the wider that gap gets.

A note on JSON_MODIFY

On a string column, changing one key means rewriting the entire document -- read it, modify the text, write the whole thing back. The json type adds a modify method that is built to do in-place updates where the new value fits, instead of rewriting the document in full. Very useful, and on the list of things I plan to lab, but don't miss the version note below. The modify method specifically is one of the pieces still marked preview.

Change existing column to json

You do not have to rebuild the table. ALTER TABLE ... ALTER COLUMN converts a character column in place:

-- The check constraint references Payload -- must be dropped first
ALTER TABLE dbo.OrdersOld
DROP CONSTRAINT ck_OrdersOld_IsJson_Payload;
GO

-- Now take the old string column to the real type
ALTER TABLE dbo.OrdersOld
ALTER COLUMN Payload JSON NOT NULL;
GO

One thing to know before you run that against a real table: the conversion validates every row. Any row holding text that is not well-formed JSON will fail the ALTER. That is not a bug -- it is the data type doing its job -- but it means the ALTER to json doubles as a data-quality check. On a column that has been collecting strings for years with a check constraint that may not have always been trusted, you can probably expect the first attempt to find something. That is not a bad thing.

The order of operations for a production column json change: add the new json column alongside the old one, backfill in batches, find and fix the rows that will not convert, then drop the old column when you know it's clean.

The one thing you should read first

Here is the honest version, because the headlines blur it.

The native JSON data type is generally available -- on Azure SQL Database and Azure SQL Managed Instance. For the boxed product, SQL Server 2025, Microsoft's own documentation still describes the json type as being in preview. The type ships in the box, it works under every database compatibility level, and you can build on it today in a dev environment. But 'in the box' and 'GA' are not the same thing, and the modify method and the JSON aggregate functions are explicitly called out as preview on SQL Server 2025.

Where	json type status
Azure SQL Database	Generally available
Azure SQL Managed Instance	GA (SQL Server 2025 / Always-up-to-date policy)
SQL Server 2025 (box)	In preview, per Microsoft Learn

What that means in practice: lab it, learn it, play with it. Do not bet a production schema on the preview-flagged pieces until Microsoft moves them to GA. Preview features can change between CUs, and you don't want to build on a preview feature that gets turned off later. Be sure to check the docs against your current build before you commit.

The bottom line

The NVARCHAR(MAX)-plus-ISJSON pattern was never wrong. It was just the best we had. Very much like the FOR XML PATH string trick was the best we had before STRING_AGG. SQL Server 2025 finally makes JSON a real data type instead of a string with a usage note attached: validated by the engine, stored parsed, cheaper to read and to modify.

It is not a license to turn your relational database into a document store. JSON is still for the variable-shaped data -- the flexible attribute bag, API payloads and unmodeled log records. Do not use it as a shortcut to avoid defining columns that should have been native columns from day one. But for the JSON you legitimately have, this is your upgrade. Just keep an eye on that 'preview' status -- it's a dev-box feature on SQL Server 2025 until Microsoft says otherwise.

SQL Server Consulting

Thursday, May 21, 2026

Your JSON Column Was Never a JSON Column

The old way

The new way

It is stored parsed, not as text

Change existing column to json

The one thing you should read first

The bottom line

More to Read

No comments:

Post a Comment