Three T-SQL features that have shipped over the last few releases and quietly retired patterns many of us are still using out of habit. Each replaces a stale workaround with one line of code, and in two of three cases it runs much faster, too. Take a look, try them out.
1. APPROX_COUNT_DISTINCT -- when COUNT(DISTINCT) is too expensive
Available since SQL Server 2019. COUNT(DISTINCT col) on a high-cardinality column requires SQL Server to track every unique value it has seen, which means a memory grant proportional to cardinality and a hash aggregate that loves to spill to tempdb on big tables. APPROX_COUNT_DISTINCT uses HyperLogLog instead, which trades a small accuracy hit for dramatically lower memory and faster runs.
Haven't heard of HyperLogLog before? Short story: HyperLogLog (HLL) in SQL Server is used to provide extremely fast, approximate counts of unique values (cardinality) in massive datasets using very little memory. Implemented with the APPROX_COUNT_DISTINCT function, allowing analytics on billions of rows with a roughly 2% error margin, significantly faster than COUNT(DISTINCT). Pretty cool. I use this a lot in trading data.
Build a five-million-row table with a few hundred thousand distinct customers (takes a few seconds on a laptop):
WITH N AS
(
SELECT TOP (5000000)
OrderID = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects a
CROSS JOIN sys.all_objects b
CROSS JOIN sys.all_objects c
)
SELECT
OrderID,
CustomerID = ABS(CHECKSUM(NEWID())) % 250000,
OrderDate = DATEADD(SECOND, OrderID, '2020-01-01'),
Amount = CAST(OrderID % 1000 AS DECIMAL(10,2))
INTO dbo.OrdersDemo
FROM N;
CREATE CLUSTERED INDEX IX_OrdersDemo_OrderID ON dbo.OrdersDemo (OrderID);
The old way:
SET STATISTICS TIME, IO ON; SELECT COUNT(DISTINCT CustomerID) AS UniqueCustomers FROM dbo.OrdersDemo;
The new way:
SELECT APPROX_COUNT_DISTINCT(CustomerID) AS UniqueCustomers FROM dbo.OrdersDemo; SET STATISTICS TIME, IO OFF;
Run both, compare CPU time and elapsed time, or even pull the actual execution plan to compare the memory grant on the root operator. You can clearly see, APPROX_COUNT_DISTINCT finishes noticeably faster with a smaller grant.
And the memory grant...
Don't use it for anything that has to be exact -- billing, regulatory counts, reconciliation. Do use it for the dozens of dashboards and exploratory queries where 'about a million' is the same answer as 'exactly 1,003,847'.
2. STRING_AGG -- retire the FOR XML PATH('') hack
Available since SQL Server 2017. Many shops still use legacy code building delimited strings with FOR XML PATH('') and STUFF(). It worked, technically. It was also suboptimal performance.
Setup -- a tiny order schema with a few items per order:
CREATE TABLE dbo.Products
(
ProductID INT PRIMARY KEY,
ProductName NVARCHAR(50)
);
CREATE TABLE dbo.Orders
(
OrderID INT PRIMARY KEY,
CustomerID INT
);
CREATE TABLE dbo.OrderItems
(
OrderID INT,
ProductID INT
);
INSERT dbo.Products VALUES
(1, 'Coffee'), (2, 'Tea'), (3, 'Cookies'), (4, 'Cake'), (5, 'Sandwich');
INSERT dbo.Orders VALUES
(1001, 50), (1002, 50), (1003, 51), (1004, 52);
INSERT dbo.OrderItems VALUES
(1001, 1), (1001, 3),
(1002, 2), (1002, 4), (1002, 5),
(1003, 1), (1003, 2),
(1004, 5);
The old way:
SELECT
o.OrderID,
o.CustomerID,
Products = STUFF((
SELECT ', ' + p.ProductName
FROM dbo.OrderItems oi JOIN dbo.Products p
ON oi.ProductID = p.ProductID
WHERE oi.OrderID = o.OrderID
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)'), 1, 2, '')
FROM dbo.Orders o
ORDER BY o.OrderID;
The new way:
SELECT
o.OrderID,
o.CustomerID,
Products = STRING_AGG(p.ProductName, ', ')
WITHIN GROUP (ORDER BY p.ProductName)
FROM dbo.Orders o JOIN dbo.OrderItems oi
ON o.OrderID = oi.OrderID JOIN dbo.Products p
ON oi.ProductID = p.ProductID
GROUP BY o.OrderID, o.CustomerID
ORDER BY o.OrderID;
Both produce the same resultset:
But look at that implicit conversion, larger plan cache and higher subtree cost in one vs the other:
3. GENERATE_SERIES -- kill the homemade tally CTE
This one was introduced in SQL Server 2022. Every DBA has copy-pasted some variant of a recursive CTE to generate a sequence of numbers or dates. The classic use case: a date-spine report that fills in zeros for days with no activity. GENERATE_SERIES replaces it with one function call.
Setup -- a sales table with deliberately sparse dates:
CREATE TABLE dbo.DailySales
(
SaleDate DATE,
Amount DECIMAL(10,2)
);
INSERT dbo.DailySales VALUES
('2026-01-02', 1200.00),
('2026-01-03', 450.00),
('2026-01-05', 2300.00),
('2026-01-08', 890.00),
('2026-01-10', 1100.00);
The old way -- recursive CTE for the date spine:
DECLARE @StartDate DATE = '2026-01-01';
;WITH Days AS
(
SELECT @StartDate AS TheDate, 0 AS n
UNION ALL
SELECT DATEADD(DAY, n + 1, @StartDate), n + 1
FROM Days
WHERE n < 9
)
SELECT
d.TheDate,
Amount = ISNULL(s.Amount, 0)
FROM Days d LEFT JOIN dbo.DailySales s
ON s.SaleDate = d.TheDate
ORDER BY d.TheDate
OPTION (MAXRECURSION 0);
The new way:
DECLARE @StartDate DATE = '2026-01-01';
SELECT
TheDate = DATEADD(DAY, gs.value, @StartDate),
Amount = ISNULL(s.Amount, 0)
FROM GENERATE_SERIES(0, 9) gs LEFT JOIN dbo.DailySales s
ON s.SaleDate = DATEADD(DAY, gs.value, @StartDate)
ORDER BY TheDate;
Both produce the same ten-row date spine with zeros filled in for the empty days, but compare the two constructs. With GENERATE_SERIES we've got no CTE, no MAXRECURSION and no helper table. GENERATE_SERIES accepts (start, stop) or (start, stop, step), supports negative steps, and returns a single column called value. The Optimizer treats it as a normal table-valued function and can parallelize work that depends on it -- something the recursive CTE cannot do.
Why we are still writing the old versions
Full disclosure: this post happened because I caught myself reaching for a recursive CTE last night before remembering the GENERATE_SERIES improvement. Habits stick. But they can be broken easily.
The recursive tally CTE was the first 'clever' T-SQL many of us wrote. The FOR XML PATH trick was THE Stack Overflow answer for a decade. COUNT(DISTINCT) still gets typed reflexively when 'roughly how many' would do.
None of these replacements are new. STRING_AGG turned eight this year. APPROX_COUNT_DISTINCT shipped in 2019, and GENERATE_SERIES has been around since 2022. Time to check them out.
More to Read
Microsoft Learn: APPROX_COUNT_DISTINCT
Microsoft Learn: STRING_AGG
Microsoft Learn: GENERATE_SERIES













