Could quantum tech be a game changer for AI?

  • Multiverse Computing is working on a new quantum-inspired model compression technology
  • The company's algorithm takes an AI model and makes it smaller without sacrificing too much performance
  • The six-year-old company has raised roughly $262 million

What if large language models were just as powerful but a little less…large? That’s what Multiverse Computing is aiming to achieve using quantum technology.

It might come across as buzzword soup – AI, LLM, quantum – but Multiverse’s CTO Sam Mugel told Fierce the technology is real.

“We developed an algorithm that takes an AI model and makes it smaller without sacrificing too much of the performance,” he explained.

But how, exactly? Mugel said it’s a three-step process analogous to model pruning.

Multiverse first runs the AI model through a GPU-based algorithm it developed that simulates a quantum environment. It then reorganizes the model to concentrate the complexity and information it contains on as few nodes as possible, discarding the parts of the model that aren't doing very much. Finally, there's a healing step, which is essentially a scaled-down version of model training.
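Multiverse's actual quantum-inspired algorithm is proprietary, but the prune-then-heal pattern it's analogous to can be illustrated with a standard technique. The toy sketch below (all names and parameters are hypothetical, and magnitude-based pruning stands in for Multiverse's reorganization step) discards low-importance weights from a linear layer, then "heals" the survivors with a short retraining pass:

```python
import numpy as np

def compress_layer(W, keep_fraction=0.5):
    """Analogue of the discard step: keep only the highest-magnitude
    weights, zeroing out parameters that aren't doing very much."""
    k = int(W.size * keep_fraction)
    threshold = np.sort(np.abs(W), axis=None)[-k]
    mask = np.abs(W) >= threshold
    return W * mask, mask

def heal(W, mask, X, Y, lr=0.01, steps=200):
    """Analogue of the healing step: a scaled-down retraining pass
    that nudges the surviving weights to recover accuracy lost to
    pruning (toy linear model, plain gradient descent)."""
    for _ in range(steps):
        grad = X.T @ (X @ W - Y) / len(X)
        W = (W - lr * grad) * mask  # pruned weights stay zero
    return W

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 32))
W_true = rng.normal(size=(32, 8))
Y = X @ W_true
W = W_true + 0.1 * rng.normal(size=W_true.shape)  # stand-in "trained" model

W_small, mask = compress_layer(W)      # half the weights removed
W_healed = heal(W_small, mask, X, Y)   # retrain the survivors
```

In this toy setting the healed model recovers much of the accuracy the pruning step destroyed while keeping only half the parameters; real compression pipelines apply the same idea layer by layer at vastly larger scale.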

Put it all together and you get a compressed model that can fit on an iPhone. That’s a big deal because it means the power of AI can be applied much more broadly. And as far as enterprises are concerned, it means they can run smaller AI models that are faster and cheaper than their larger counterparts.

“Everyone wants a piece of this,” Mugel said, noting Multiverse has been approached by customers from verticals including telecom, finance, defense, manufacturing, automotive and energy. The nice part, he added, is that everyone is essentially asking for the same things – chatbots, RAG systems, agents – so Multiverse doesn’t have to keep reinventing the wheel, so to speak.

There’s also some serious money behind what Multiverse is doing. To date the six-year-old company has raised roughly $262 million (€223 million), with a $215 million (€189 million) Series B investment round last month following its $27.1 million (€25 million) Series A in March 2024.


And for good reason. J.Gold Associates Founder and Principal Jack Gold told Fierce that model compression is something everyone wants and is trying to achieve.

“The biggest problem with LLMs today is that they’re huge and it’s really hard to run them on small devices,” he explained. But shrinking them without sacrificing performance isn’t easy. If it were, a company like OpenAI would have come out with a solution already, he pointed out.

He added that today everyone is talking about AI running on an iPhone, but there’s a big difference between using a connection to the cloud for that and running fully locally. The latter would certainly be a feat, but Gold said for now he’s skeptical.


“That’s the crux of the issue: how do I get it to a reasonable size to run fully locally,” Gold said. “The proof is in the pudding. Show it to me running on an iPhone and then I’ll believe you.”


For what it’s worth, Mugel said last month Multiverse met with one of its strategic investors to showcase its technology. That investor – an OEM – has been working on its own model compression technology, but found Multiverse’s to perform significantly better. They ended up investing four times as much as they’d initially planned, he said.

“That was a really great tech validation for us – the fact that a world-leading company in this had tried to do exactly the same thing we’re doing, failed at it and then decided to invest in us,” Mugel concluded.