Friday, May 8, 2026

June 1. Copilot Billing Changes. SSMS Has a Problem.

Three weeks. One billing model change. One SSMS bug that may already be costing you. On June 1, 2026, GitHub Copilot moves from fixed-fee, request-based billing to token-based billing. The option to fall back to a cheaper model when credits run out is gone. Why? Because an autonomous agent session can cost the same as a quick chat question, and GitHub says the flat-rate model is no longer sustainable. Meanwhile, SSMS 22's Copilot integration had a documented infinite-loop bug that was silently exhausting user request quotas. Microsoft fixed it in March, but only if you patched.

Connect those dots.

Disclaimer: not a GitHub billing expert. Just a DBA reading the announcement.

What's changing on June 1

Per GitHub's April 27 announcement, premium request units (PRUs) are out. GitHub AI Credits are in. One credit equals one US cent. Plan prices are unchanged. What changes is how those dollars get consumed.

Plan         Monthly cost   Monthly AI Credits
Free         $0             Limited allowance
Pro          $10            $10 (1,000 credits)
Pro+         $39            $39 (3,900 credits)
Business     $19/seat       $19 (1,900 credits)
Enterprise   $39/seat       $39 (3,900 credits)

Code completions and Next Edit Suggestions remain free and unmetered. Everything else — chat, agent mode, code review — draws from the credit pool, priced per token by model. Per GitHub's published rate sheet, premium models like Anthropic's Claude Opus and OpenAI's GPT-5.5 cost dramatically more per token than default models.

Two changes worth pulling out of the announcement:

The fallback to a cheaper model is gone. Today, when you exhaust your PRUs, Copilot quietly downgrades you to a smaller model so things keep working. After June 1, when credits run out, that's it: you're done, unless an admin has configured an additional usage budget, in which case you pay overage at published rates, uncapped unless a budget limit is set.

Annual plans are also getting squeezed. Annual Pro and Pro+ subscribers stay on the request-based model until renewal, but the model multipliers are increasing on June 1, and the refund-and-cancel option is only open until May 20. That is twelve days from today.

Why DBA work eats tokens faster than dev work

Token math, briefly. The general rule is roughly four characters per token for English prose. Code and markup run denser — around three to three-and-a-half characters per token — because of all the brackets, attributes, and namespace noise.
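That rule of thumb is easy to turn into code. A minimal sketch, using only the character-per-token ratios quoted above (these are approximations, not any tokenizer's exact output):

```python
def estimate_tokens(size_bytes, chars_per_token=4.0):
    """Approximate input tokens for a paste of the given size.

    ~4 chars/token for English prose; pass ~3.0-3.5 for dense
    code or XML (brackets, attributes, namespace noise).
    """
    return int(size_bytes / chars_per_token)

# A 10 KB paste of prose-like text: ~2,560 tokens.
prose_tokens = estimate_tokens(10 * 1024)

# A ~100 KB execution plan at ~3.5 chars/token: ~29,000 tokens.
plan_tokens = estimate_tokens(100 * 1024, chars_per_token=3.5)
```

Real tokenizers vary by model, but for capacity planning this gets you within the right order of magnitude.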

What does that mean in practice? Here is a rough input-token cost per paste, per artifact a DBA hands to Copilot every day:

Artifact                  Typical size   Input tokens (approx.)
Simple plan XML           10-50 KB       2,500 - 12,500
Complex plan XML          ~100 KB        25,000 - 33,000
Big analytical plan       500 KB+        125,000+
sp_BlitzCache top 50      50-200 KB      12,500 - 65,000
Deadlock graph XML        5-50 KB        1,250 - 12,500
Full schema (50 tables)   100 KB+        25,000+

And that is per paste. The DBA workflow rarely stops at one. Paste plan, ask. Paste a different plan, ask again. Call sp_BlitzCache @AI = 1. Paste sp_WhoIsActive output, ask why blocking is occurring. Paste the deadlock graph, ask which transaction was the victim. Three to five iterations is normal for any non-trivial troubleshooting session, and each one carries the full prior context with it as input tokens.

Compare that to the developer workflow Copilot was originally built around. Inline code completions. Short prompts. Small inline edits. Inline completions remain free under the new model. The DBA pattern of pasting big diagnostic XML into chat is what now costs real money — and at premium model rates, the math gets real big real fast.

Multiply your typical paste size by three to five iterations, multiply that by the per-token rate of whichever model you are using, and you have your monthly Copilot exposure. Then go look at GitHub's pricing page and do the math with your actual workflow.
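That multiplication can be sketched in a few lines. The key detail from the workflow above is that each iteration resends the full prior context as input tokens; the per-million-token rate and the session counts below are illustrative assumptions, not GitHub's rate sheet:

```python
def session_input_tokens(paste_tokens, iterations):
    """Total input tokens for a chat session where each iteration
    adds a paste of roughly equal size and resends all prior context."""
    total = 0
    context = 0
    for _ in range(iterations):
        context += paste_tokens   # new paste joins the running context
        total += context          # the whole context is billed as input
    return total

def monthly_cost_usd(paste_tokens, iterations, sessions_per_month,
                     usd_per_million_input_tokens):
    """Rough monthly input-token exposure in dollars."""
    tokens = session_input_tokens(paste_tokens, iterations) * sessions_per_month
    return tokens * usd_per_million_input_tokens / 1_000_000

# Example: ~30,000-token plan pastes, 4 iterations per session,
# 20 sessions a month, at an assumed $3 per million input tokens:
exposure = monthly_cost_usd(30_000, 4, 20, 3.0)   # 18.0 USD
```

Note what the example shows: a modest troubleshooting cadence on an assumed mid-tier rate already exceeds the $10 of credits a Pro plan includes, before a single output token is counted.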

The SSMS 22 Copilot bug — fixed in 22.4.1 — if you patched

A user thread on the GitHub Community board, discussion #181818, documents Copilot in SSMS 22 entering an infinite loop on certain failed or stalled query executions, sending repeated API calls in the background while showing a loading screen, and burning through users' premium request quotas overnight. One user reported 1,202 Claude Sonnet 4.5 requests in a single day attributed to the SSMS integration alone. Multiple users in the thread confirmed identical behavior, often without even using the chat. The same Copilot integration in Visual Studio and VS Code did not exhibit this issue.

The PM for GitHub Copilot in SSMS responded directly in the thread: "We are aware of this problem and are working on a resolution." The issue was also tracked on Microsoft's SSMS Developer Community.

SSMS 22.4.1, released March 25, 2026 alongside the GA of GitHub Copilot in SSMS, includes improved handling of query executions that either return no results or fail completely. That is the fix. Some users have also reported success rolling back to SSMS 22.1.0 if 22.4.1 still misbehaves in their specific environment.

If you are running an earlier SSMS 22 build with Copilot enabled and the patched 22.4.1 is not deployed, you are carrying this risk straight into the per-token billing model change on June 1. Today the bug burns PRUs — you hit a usage cap, you get warnings, you investigate. After June 1, the same bug burns your credit card balance at API rates with no fallback option. For shops on Copilot Business or Enterprise, where credits are pooled across the organization, a single unpatched user with this bug active could drain their team's monthly allocation overnight.

Read that again. One unpatched SSMS 22 user with a runaway API loop can drain the entire team's pooled credits in a single day.

What to do before June 1

  • Check the preview bill. GitHub launched a preview bill experience in early May, accessible from the Billing Overview page on github.com. It shows what your April usage would have cost under the new model. Look at it before June 1 surprises you.
  • Decide on the annual plan refund window. If you are on annual Pro or Pro+, the refund-with-cancel option closes May 20. After that, you are stuck on request-based billing with the new (worse) model multipliers until renewal.
  • Set spending budgets. Admins on Business and Enterprise can set per-user budgets. A $0 user budget is the kill switch — no credit consumption for that user. Use this on accounts that do not need Copilot or where you suspect bug exposure.
  • Confirm everyone is on SSMS 22.4.1 or later. The 22.4.1 release contains the Copilot infinite-loop fix. Anyone running an earlier SSMS 22 build with the AI Assistance workload installed is exposed. Verify the version with Help, About in SSMS. If you are already on 22.4.1 and the issue persists, rolling back to SSMS 22.1.0 has resolved it for some users.
  • Train the team off paste-the-whole-thing prompts. Pasting a 500KB execution plan into chat to ask 'why is this slow' is now an expensive habit. Targeted questions against scoped context — the specific operator you are concerned about, the specific predicate, the specific wait type — cost a fraction of the same answer at published per-token model rates. Another good reminder that we should be very mindful of what we give to AI.
  • Watch the SSMS feedback site. Per the PM in discussion #181818, the SSMS feedback site is where SSMS Copilot issues should be logged. Subscribe to the relevant items if you are running SSMS 22 in production.
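The scoped-context bullet above is worth quantifying. A minimal sketch, using the article's chars-per-token rule of thumb; the 4 KB "scoped" size is an illustrative assumption for one operator plus its predicate and wait stats:

```python
def input_tokens(size_bytes, chars_per_token=4.0):
    """Approximate input tokens, per the ~4 chars/token rule."""
    return int(size_bytes / chars_per_token)

FULL_PLAN_BYTES = 500 * 1024   # big analytical plan, pasted whole
SCOPED_BYTES = 4 * 1024        # one operator + predicate + wait type

full = input_tokens(FULL_PLAN_BYTES)    # 128,000 tokens per paste
scoped = input_tokens(SCOPED_BYTES)     # 1,024 tokens per paste
savings = 1 - scoped / full             # ~99% fewer input tokens
```

The exact ratio depends on your plans and your models, but the shape of the result does not: scoping the question is where almost all of the savings live.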

The bigger picture

None of this should be a surprise. The flat-rate AI buffet was the customer-acquisition phase. This is the bill phase. It was always going to come.

I have been writing about the operational and legal risks of AI tooling in production for weeks now. An AI agent deleted a production database in nine seconds. 99% of US enterprises consider themselves AI-ready while 60% admit they cannot manage their data. P2SQL injection turns plain English into damaging, destructive commands. SQL Server Ledger gives us tamper-evident logging when an agent runs amok.

Now we add a new risk to the operational and legal pile: cost. The all-you-can-eat AI subscription was a transitional pricing model. GitHub said it themselves — the per-seat model is not sustainable when an autonomous agent session can cost the same as a quick chat question. Token-based billing is what AI economics actually look like at the API layer, and it's just making its way to Copilot itself.

The shops that survive the transition cleanly will be the ones whose DBAs treat tokens like CPU cycles — finite, measurable, and worth optimizing. Targeted prompts. Scoped context. Cheaper models for the routine stuff. Premium models reserved for the questions that actually need them.

The shops that do not will find out in July, when the first usage-based bill arrives.

More to Read

GitHub Blog: GitHub Copilot is moving to usage-based billing
GitHub Docs: About billing for individual GitHub Copilot plans
GitHub Docs: Models and pricing for GitHub Copilot
GitHub Community Discussion #181818: SSMS 22 Copilot integration usage spike
Microsoft Fabric Community: SSMS 22.4.1 and GitHub Copilot in SSMS (Generally Available)
Microsoft: SSMS Feedback Site
