AI Model Spending is a Black Hole
Might be a good time to find a way out before tokenmaxxing employees drive budgets up and margins down for vendors and MSPs. Plus: No really, quantum computing’s approaching fast.
There’s a lot we don’t know about AI.
Like how much time we have before it drowns us with zero days, how much of what it says is correct (and how many of us even care), how much we can trust the results when it aces a safety test, and whether or not it’s making us dumber.
Most of us don’t know what we’re spending on it either.
I mean, we know it’s a lot and growing fast. AI spending of all kinds will climb 44% this year to $2.5 trillion, according to Gartner, while spending specifically on AI models will soar nearly 83% to $26.4 billion. The growth is even sharper for specific AI workloads: average per-organization consumption of reasoning tokens at OpenAI, for example, rose 320x last year.
We also know that totals like that translate to smaller yet substantial equivalents at individual businesses, some 40% of which now spend $10 million or more a year on AI, according to cloud cost management vendor CloudZero. For perspective, that’s approaching the 47% of businesses that spend that much on cloud computing, and the gap’s closing fast.
Yet we really don’t know how to forecast AI outlays. AI spend projections were off by 50% or more, in fact, at 20% of the companies surveyed by CloudZero, which encapsulates the reason why nicely in a press release.
“Most companies use some combination of public cloud, private cloud, third-party GPU providers, and hosted LLM APIs. These providers have different billing frameworks, send bills at different intervals, and use different data formats, making it nearly impossible for companies to get a comprehensive overview of what they’re spending—much less what exactly they’re spending it on.”
Even businesses that get all their AI from one source, moreover, often struggle to anticipate future needs. “The volatility of AI workloads—from bursty training cycles to unpredictable inferencing spikes—means that static budgeting and quarterly forecasts can’t keep up,” writes Jevin Jensen, a research vice president for infrastructure and operations at IDC, in an excellent blog post about AI FinOps. “Every new experiment, every dataset added, every prompt creates a ripple in compute, storage, and energy consumption—often in exponential amounts.”
Is it any wonder that spending businesses can’t accurately forecast, and struggle merely to track, tends to be less than efficient? According to IDC, 41% of organizations currently waste more than 15% of their AI budget.
“If you’re talking about projects that are half a million dollars, a million dollars, and if you’re a big company, multimillion dollar projects, 15% is not a small amount of money you’re wasting,” Jensen observes.
And that’s before the global population of agents autonomously and often invisibly consuming tokens rises from 28.8 million last year, according to IDC, to a staggering 2.3 billion in 2030.
“You’re talking about having two or three agents per employee,” Jensen says. “All of a sudden you’ve got more agents than you have servers.”
Issues like this are a problem, needless to say, for end users (“I’ve run the numbers, boss, and our projected AI spend for the rest of the year is somewhere between our current run rate and infinity”). They’re even thornier ones for MSPs selling AI solutions and services with unpredictable costs to customers who want entirely predictable rates, not to mention vendors serving MSPs with similar demands.
“A big question becomes - how can you price your product such that you don’t torpedo your business into perpetual negative gross margin,” writes Jamin Ball of Altimeter Capital in a recent blog post.
Or merely declining gross margin. OpenAI itself saw margins drop from 40% to 33% last year, according to The Information, thanks to a 4x increase in inferencing expenses.
Imperfect answers
Jensen has seen this movie before. Cloud costs were devilishly hard to govern even before the Covid pandemic led businesses to move everything they had on-prem into the cloud.
“Then they got the bills one month, two months, three months later, and their CFO had a heart attack,” Jensen says.
Cloud FinOps solutions proliferated as a result. IDC, in fact, tracks 135 such systems at present, and while not all of them have AI FinOps features, those that do help businesses set spending guardrails, catch anomalies early, and design systems with cost in mind. Theoretically, they can also spare CFOs at companies with tokenmaxxing developers a fresh round of heart attacks.
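To show what that anomaly catching looks like in miniature, here’s a toy sketch. This is not any vendor’s actual method, just a simple statistical flag on daily spend, and every number in it is made up.

```python
from statistics import mean, stdev

def flag_anomalies(daily_spend: list[float], threshold: float = 2.0) -> list[int]:
    """Return indices of days whose spend exceeds mean + threshold * stdev
    of the series -- a crude version of the alerts FinOps tools send.
    (With short series, large thresholds rarely trigger, hence 2.0 here.)"""
    mu, sigma = mean(daily_spend), stdev(daily_spend)
    return [i for i, s in enumerate(daily_spend) if s > mu + threshold * sigma]

# Hypothetical week of AI spend in dollars, with one runaway training day.
spend = [120, 118, 130, 125, 122, 940, 128]
print(flag_anomalies(spend))  # → [5]
```

Real FinOps tools use far more sophisticated detection (seasonality, per-service baselines, forecast deltas), but the principle is the same: alert when spend departs sharply from its own history.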
“FinOps tools cannot eliminate variability, but they can make it manageable,” says Rem Baumann, head of product at AI FinOps vendor Vantage.
Same goes for pricing. “FinOps tools break down cost per API call, per workflow, or per tenant,” Baumann says. “That allows providers to align pricing with actual cost and protect margins as usage scales.”
Unfortunately for managed service providers, Vantage makes one of just a handful of AI FinOps solutions with multi-tenancy. MSPs that don’t want any of them, or can’t afford them, can potentially create something similar on their own, however, using the same AI that created their cost management headaches in the first place, according to Evan Rice, president and COO of managed services software maker Rev.io. Indeed, that’s exactly what he and his team have done.
“Our agents internally, including my own personal one every week, does an ROI dashboard for me that shows, ‘here’s the workflows I did. Here’s the token utilization around those workflows,’” says Rice on the latest episode of the podcast I co-host. Pull in some sales and accounting information too and you’ve got a pretty good sense of whether what you’re spending on AI adds up in light of what you’re making and saving with it.
“That’s how we framed it to our agents. You’re an employee, you’re a valued member of the team. Your salary and compensation are your token consumption. Calculate your costs and justify your continuing to do these things and show the value that you’re delivering,” Rice says. “They’ve done a pretty good job with it so far.”
I’m sure they have. Digital workers may be a lot less predictable than human ones, but they’re every bit as eager to avoid unemployment.
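For anyone tempted to build a similar rollup, a minimal sketch of the idea — converting raw token counts into cost per workflow and comparing against estimated value — might look like this. All workflow names, token figures, and per-token rates below are hypothetical, not Rev.io’s actual numbers or any provider’s real pricing.

```python
# Hypothetical per-workflow token cost rollup in the spirit of the
# agent "ROI dashboard" described above. All rates and figures are
# illustrative, not any vendor's actual pricing.

PRICE_PER_1M_TOKENS = {"input": 3.00, "output": 15.00}  # assumed USD rates

workflows = [
    # (workflow name, input tokens, output tokens, estimated value in USD)
    ("ticket-triage", 4_200_000, 900_000, 180.00),
    ("weekly-report", 1_100_000, 350_000, 40.00),
]

def token_cost(input_tokens: int, output_tokens: int) -> float:
    """Convert raw token counts into dollars using the assumed rates."""
    return (input_tokens / 1_000_000 * PRICE_PER_1M_TOKENS["input"]
            + output_tokens / 1_000_000 * PRICE_PER_1M_TOKENS["output"])

for name, tin, tout, value in workflows:
    cost = token_cost(tin, tout)
    roi = (value - cost) / cost * 100
    print(f"{name}: cost=${cost:.2f} value=${value:.2f} roi={roi:.0f}%")
```

The hard part in practice isn’t the arithmetic, it’s attributing token usage to workflows and putting a defensible dollar value on each one — which is exactly the framing Rice gives his agents.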
Speaking of Evan Rice…
You can hear him speak yourself, intelligently and at length, about how MSPs can own the agentic layer at SMBs on the latest episode of MSP Chat. Available right here, right now.
Ready or not (and you’re probably not), quantum computing’s nearing
So, as we’ve just established, there’s a lot we don’t know about AI. Same’s true of quantum computing, where as recently as a couple of years ago experts like David Mooter, a principal analyst at Forrester, weren’t even certain it would ever come off the drawing board in a meaningful way.
“The error rates were just way too high. The qubit counts were just too low,” he says.
And now?
“Now I feel like it will be a reality. I’m very confident in that,” Mooter says. “When is definitely more of an up-in-the-air question.”
And also an extremely important one, certainly for venture capital firms, which tripled their investments in quantum startups last year to $5.8 billion, and almost certainly for the rest of us too. Before I discuss why, I should make clear that quantum computers are not supercomputers that can just solve everything, because…well:
“One thing I have seen that drives me nuts is journalists who don’t know what they’re talking about will describe quantum computers as supercomputers that can just solve everything,” Mooter says. “The way I like to describe it is that quantum computers are not supercomputers. They’re different computers. They solve problems in a different way.”
That different way, without getting deep into the (quantum) mechanics of the matter, lends itself especially well to combinatorial challenges like the traveling salesman problem, in which the number of possible solutions is far larger than any human could ever sort through and more than even powerful classical computers can evaluate quickly. Mooter points to modeling molecular interactions for drug discovery as one example and optimizing investment portfolios as another. Optimizing supply chains, energy grids, and chemical manufacturing processes are among others frequently cited.
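To see why classical machines choke on problems like this, consider how fast the traveling salesman’s route count grows: for n cities, the number of distinct round trips is (n − 1)!/2. A few lines of Python make the explosion obvious — this is plain classical counting, nothing quantum about it.

```python
from math import factorial

def tsp_routes(n_cities: int) -> int:
    """Number of distinct round-trip routes through n cities:
    fix the starting city, then divide by 2 since direction doesn't matter."""
    return factorial(n_cities - 1) // 2

for n in (5, 10, 20):
    print(f"{n} cities: {tsp_routes(n):,} possible routes")
```

At 5 cities there are just 12 routes; at 20 the count already tops 10^16, which is why even modest instances overwhelm brute force and why the combinatorial problems Mooter lists are such tempting quantum targets.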
Good news for anyone who likes the sound of improving all those areas and more: Your wait continues to get shorter and shorter. In fact, quantum computing made such strides last year as “vendors moved beyond theoretical fault‑tolerant architectures into early engineering reality,” per a recent Forrester report, that practical commercial applications of the technology could begin arriving as soon as 2030, a good five years earlier than the firm considered likely just 15 months ago.
Developments like that have inspired pretty much all my past writing about quantum. Generative AI was supposed to be years away when it upended our entire industry late in 2022. I’m hoping we can avoid repeating that scenario this time, but also increasingly thinking we’re going to have to move a lot faster to do so.
To take one of a steadily arriving series of examples, research published days ago by Caltech and quantum startup Oratomic outlined a new approach to quantum error reduction that could cut the minimum number of qubits required to make a useful quantum computer from millions to as few as 10,000.
Consider as well the progress being made in hybrid quantum computing, which divides processing chores across classical and quantum machines. P&G used the technique last year in an experiment aimed at optimizing a manufacturing process with 10^114 possible arrangements of components and ingredients, more than the number of atoms in the universe.
Acting on its own, a regular computer needed six hours to come up with an answer. SAS’s quantum AI platform needed two minutes, but produced unreliable results. Working together, however, the conventional and quantum computers produced an accurate, actionable result in 12 minutes, or about 96.7% less than what the classical machine needed solo.
Time for partners to start thinking up quantum and hybrid quantum solutions then? Maybe not.
“Don’t view quantum as the next layer that will replace CPUs and GPUs,” said Omdia analyst Jay McBain in a conversation at the Canalys Forum late last year. “It won’t, and it’s going to be interesting for very few partners, maybe in the thousands total.”
But it will be very interesting, or more accurately terrifying, for every partner in the industry with respect to security, and sooner perhaps than any of us would like. Remember that 10,000 qubit breakthrough at Caltech? Well, these guys claim they’ve found a hybrid way to break RSA encryption with just 5,000.
Could be hot air, of course, but it gets to the reason we probably need to be thinking harder about quantum computing than most of us are. The same Forrester report that said commercial uses of quantum computing could start arriving by the end of the decade said that “Q-Day,” when quantum machines break current public-key cryptography, could arrive then too. Google is getting nervous about quantum security timelines as well.
And we’re not ready. Governments, financial institutions, utilities, and infrastructure providers are working toward adoption of post-quantum cryptography schemes capable of keeping data safe past Q-Day, says Sandy Carielli, a VP and principal analyst at Forrester. “Firms in other industries are still learning and coming up to speed and are not as far along.”
Meaning they’re late. “I’ve been telling my clients that they have to start now, because full migration will likely take them past that 2030 date,” says Carielli, who also recommends inventorying data and prioritizing it by importance and longevity, along with forming a cross-functional Q-Day team, as good steps toward partial readiness.
And partial readiness is a whole lot better than no readiness. The other similarity between AI and quantum computing, in addition to there being a lot we don’t know about both, is that both are progressing very quickly and appear likely to progress even faster in the near future. At present, only a relative handful of people have access to quantum computers, but that was once true of classical computers as well. Then universities and research institutions started deploying them, sparking a “Cambrian explosion of algorithms,” Mooter notes.
“That took decades,” he continues. “We’ll see the same thing with quantum computers, but it won’t be decades because we’re building on the foundation of classical computer science and we have a lot more funding rolling in than computers did back in the 1950s.”
So years more likely, and I suspect they’ll feel like short ones. Brace yourself now.
Over on the Business of Tech
Host Dave Sobel discusses Treeline, the a16z-backed, AI-native MSP I wrote about last week, with the author of that piece. As in me. I’ll be appearing on the show live this coming Wednesday here.
OpenText Cybersecurity takes a stab at closing the AI alignment problem
In a recent post, I shared data from GTIA indicating a modest but still worrying erosion of member satisfaction with vendor partner programs, as well as GTIA analyst Carolyn April’s hypothesis that misalignment between what partners want from channel programs in the age of AI and what those programs provide is probably to blame.
Mike DePalma, VP of business development at OpenText Cybersecurity and a member of GTIA’s channel executive council, saw that data too, and wasn’t surprised by it.
“That’s one of the biggest conversations that we’re all as vendors trying to figure out,” he says. “We’re all seeing a drop in satisfaction, and we know that we need to change, and it’s hard.”
Like April, he suspects the issue has a lot to do with vendors failing to meet partners where they are now versus where they once were. “The MSPs are different than they were 15 years ago,” DePalma says. “Partner programs have stayed the same.”
The alliance agreement OpenText announced with Hatz AI last week is DePalma’s first stab at a remedy. It’s also the first example I can think of since I attended that GTIA conference of a channel chief offering a new benefit to partners specifically in response to changing needs due to AI.
And to be clear, this is all about partner sat. There’s no back-end integration happening here. The deal is a referral program, pure and simple, and a one-way referral program at that, with leads flowing solely from OpenText to Hatz.
“There’s nothing coming back our way,” DePalma says. “We’re looking at it as what can we add to our partner program that they’re asking for that moves the needle.”
The introduction to Hatz was not a guess. OpenText asked partners in focus groups what would move the needle and got two answers. One was more enablement resources. The other was connections with other vendors. “They’re asking for vendors to work together, to form stronger alliances, whether they’re competitors or not,” DePalma says. The Hatz AI agreement satisfies both asks.
“We want people that are partners to feel that OpenText really cares about us. They don’t just care about what they’re spending with us, they care about helping us grow,” DePalma says.
DePalma and his colleagues at OpenText are exploring lots of ways to send that message at present as they race toward readying an all-new partner program for an official launch on July 1. What they come up with will be of intense interest to OpenText partners. It will also be of some interest to channel watchers like me and channel chiefs at other vendors curious to see what a blank slate partner program for the age of AI looks like.
Also worth noting
Three things not to miss at next week’s Channel Partners Conference & Expo in Las Vegas: This session I’m moderating, this other session I’m moderating, and the Alliance of Channel Women’s opening day networking event.
Modern Threat Protection, ConnectWise’s new security platform, combines managed EDR, SIEM services, and email security. More on this coming next week.
Titan has Silicon Valley AI engineers. Shield has Silicon Valley AI engineers. Treeline has Silicon Valley AI engineers. Looks like Kaseya wants a few too.
Gradient MSP has released a major upgrade to MSP Studio, its content marketing platform, aimed at solving the industry’s reliance on generic, “canned” content.
Nerdio and Nutanix have forged an alliance to bring Microsoft-based virtual desktops to hybrid cloud environments.
The new Acronis MDR by Acronis TRU offers 24/7/365 managed detection and response services tailored for MSPs.
WatchGuard has unveiled a new endpoint pricing model aimed at giving MSPs a competitive edge through simplified and cost-effective licensing.
Cynomi has introduced a Go-to-Market Academy designed to help MSPs grow cybersecurity revenue through structured enablement and training.
Also from Cynomi: new CISO Intelligence AI agents designed to help MSPs scale cybersecurity services and business growth.
NWN, which discussed AI governance with us recently, has launched a new cybersecurity offering designed to help organizations manage risk and improve security posture.
Speaking of AI governance, 51% of MSPs say data governance and compliance challenges are the main obstacle to AI adoption, according to AvePoint and Omdia.
30.3% of incidents tracked by Blackpoint Cyber in 2025 involved malicious use of legitimate RMM tools, according to a new study by the company.
Keeper Security has expanded its privileged access management and browser isolation capabilities to support advanced secure browsing workflows.
Dashlane’s new integration with KnowBe4 aims to convert employee security awareness training into real-time, proactive defense against credential-based threats.
Norton has added AI agent protection to Norton 360.
PRM vendor Channelscaler has launched Scailyn, an AI-powered channel agent designed to accelerate partner engagement and innovation.
Wasabi Technologies says it will acquire Seagate Lyve Cloud to expand its cloud storage business.
Cisco says it will acquire Galileo to enhance its observability and AI agent monitoring capabilities.
Channel Leaders, a peer group for channel sales professionals, is now a full-blown community called the Channel Sales Association.
Alex Hesterberg is the new CEO at Devicie, which you read about here earlier.
Lindsey Westbrook is the new VP of marketing at Coro.
Alysia Vetter is the new director of PR and communications at GTIA.
Sharon Florentine is now manager and research analyst at GTIA.