Client Portfolio Manager Dan Block and Technology Sector Team Co-Lead Shaon Baqui discuss why reported breakthroughs by Chinese artificial intelligence (AI) startup DeepSeek should be viewed in the context of its open-source platform, and why any reduction in inference costs could accelerate AI adoption.
Block: Hi, I am Dan Block, Equity Client Portfolio Manager at Janus Henderson Investors. I have with me Shaon Baqui, who is an analyst and also the technology sector co-lead for our equity team.
Recently, there’s been a lot of news on artificial intelligence, from the announcement of Stargate, to Meta announcing significant AI capex, and more recently to the DeepSeek announcement out of China. And we wanted to dig deeper into DeepSeek and find out what the implications are for investors as we look forward.
So, with that, Shaon, can you give us a little bit of background on the difference between training and inference/reasoning?
Baqui: Sure. So, the idea of training is that you build up these large models using a set of parameters and data. That’s really the process of LLM (large language model) development, and it can cost anywhere from a billion to hundreds of billions of dollars. And that’s really where the hyperscalers are spending the bulk of their CapEx now and over the next couple of years: building up these large frontier models.
The nuance between training and inference is that inference is actually running the models, right? So, at some point, as inference becomes more mainstream and we see new techniques such as these test-time and reasoning models come to market, we do think the market size for inference could exceed training over time.
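To make that distinction concrete, here is a minimal sketch in Python (illustrative only, using a toy model as a stand-in; this is not any vendor’s actual stack): training repeatedly updates a model’s weights against data, while inference simply runs the frozen model on a query.

```python
# Minimal training-vs-inference sketch (toy model standing in for an LLM).
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # stand-in for a large language model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# --- Training: many forward/backward passes over a large corpus ---
for step in range(1000):            # real LLM runs take weeks on GPU clusters
    x = torch.randn(32, 10)         # placeholder batch of inputs
    y = torch.randint(0, 2, (32,))  # placeholder labels
    loss = loss_fn(model(x), y)
    optimizer.zero_grad()
    loss.backward()                 # gradient computation: the expensive part
    optimizer.step()                # weight update

# --- Inference: one forward pass per query, no gradients, weights frozen ---
model.eval()
with torch.no_grad():
    prediction = model(torch.randn(1, 10)).argmax()
```

The cost asymmetry discussed here follows from this structure: training is a one-time, compute-heavy investment, while inference cost scales with every query served.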
Block: What has happened with DeepSeek that really has everyone so excited?
Baqui: Absolutely. So, DeepSeek has really created this storm of anxiety in the markets, because what they’ve done is come up with new ways to train and run these models at a fraction of the cost, and, frankly, the compute, of some of the traditional models that the Western vendors have come up with, right?
So, you step back and look at the hundreds of billions of CapEx that’s being put in place, and that really throws into flux whether we need to be investing hundreds of billions of dollars of CapEx into things like compute, networking, power, cooling, and other kinds of infrastructure, right? So, while we haven’t really validated all of the claims that have come out of DeepSeek, we do think there are certainly some interesting aspects of what’s going on there on the ground.
Block: Yeah, the market reaction has been pretty swift and pretty severe. Where do we think the market has it right, and where may it be wrong?
Baqui: So, let’s step back and dissect what DeepSeek is doing in two ways. First, back in December, they announced a new model called V3. That’s their training model. And they claim they were able to produce V3 at a fraction of the cost of a Western model: roughly 2,000 lower-end GPUs, trained in 55 days, at a cost of less than $6 million. To put that in perspective, OpenAI’s GPT-4 cost around $100 million to train. And what they did was use a technique called distillation, condensing the model using an approach called mixture of experts, or MoE. And distillation is not a new thing; it’s been out for some time. We’ve seen OpenAI achieve similar 3x to 7x performance gains there.
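For readers unfamiliar with distillation, here is a minimal sketch of the student-teacher setup (our illustration of the general technique; DeepSeek has not published this exact code, and the model sizes below are arbitrary):

```python
# Minimal knowledge-distillation sketch: a small "student" model is trained
# to match a large, frozen "teacher" model's output distribution, inheriting
# much of its capability at far lower training and inference cost.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 1000))
student = nn.Linear(128, 1000)   # far fewer parameters than the teacher
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0                          # temperature softens the teacher's outputs

for step in range(100):
    x = torch.randn(64, 128)     # placeholder input batch
    with torch.no_grad():        # teacher is frozen; it only provides targets
        teacher_probs = F.softmax(teacher(x) / T, dim=-1)
    student_log_probs = F.log_softmax(student(x) / T, dim=-1)
    # KL divergence pulls the student's predictions toward the teacher's
    loss = F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * T * T
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Mixture of experts is a separate efficiency lever: it routes each token to a small subset of the network’s parameters, so only a fraction of the model is active for any given query.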
So that’s one area. That roughly $6 million figure is getting a lot of scrutiny right now. Let’s step back: How real is that $6 million number, right? And this is where we, as analysts, need to validate some of the claims they’ve made. So we’re still in the process of that. But I think it’s important to note there’s a big nuance here between that $6 million of CapEx versus a billion; we don’t think it’s entirely apples to apples.
So, second, they claim they use a technique called reinforcement learning. That’s a technique where you’re actually rewarding the model for providing the right answers, versus human-supervised learning. So, very efficient, very low cost. And that feedback loop is allowing it to compete quite well. Basically, every time you ask a question, it learns from that; it doesn’t need any human supervision, it just sort of teaches itself. So that’s resulted in an efficient model, one. And then two, they claim the pricing here is just over $2 per million tokens, versus what’s out there in the market today from OpenAI, which is $60 per million tokens.
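A heavily simplified sketch of that reward-driven idea follows (a toy REINFORCE loop of our own construction; DeepSeek’s actual pipeline is far more elaborate and is not public in code form). The point is that the only training signal is a reward for a correct answer, with no human-labeled examples:

```python
# Minimal reinforcement-learning sketch (REINFORCE on a toy task): the model
# is rewarded for correct answers rather than shown labeled examples.
import torch
import torch.nn as nn
import torch.nn.functional as F

policy = nn.Linear(4, 2)          # toy "model": picks one of two answers
optimizer = torch.optim.Adam(policy.parameters(), lr=0.01)

for step in range(500):
    question = torch.randn(1, 4)                  # placeholder query
    probs = F.softmax(policy(question), dim=-1)
    answer = torch.multinomial(probs, 1)          # model samples an answer
    correct = int(question.sum() > 0)             # toy ground truth
    reward = 1.0 if answer.item() == correct else 0.0  # no human labels
    # Policy gradient: raise the probability of answers that earned reward
    loss = -reward * torch.log(probs[0, answer.item()])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```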
So, this drastically brings down the cost of inferencing. We talked about inferencing at the beginning, running these models. So, when you ask, are we hitting that inflection point? Well, first of all, we don’t know how effective and how commercialized this R1 model will be, right? It’s a distilled model, so it runs faster, more efficiently, and at lower power than a traditional model. But it relies on the same data set and the same foundational models created by some of the larger frontier vendors, such as ChatGPT, Anthropic, etc. And in fact, I think what the market’s missing is, when you actually ask DeepSeek what it’s trained on, it will tell you it’s trained on OpenAI.
So, it’s not entirely apples to apples. You need these foundational models to exist in order to have these distilled models.
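To put those per-token prices in perspective, here is a quick back-of-the-envelope comparison using the figures quoted above (the monthly workload is a hypothetical assumption for illustration):

```python
# Inference-cost comparison using the per-million-token prices cited above.
deepseek_price = 2.00   # USD per million tokens ("just over $2", as quoted)
openai_price = 60.00    # USD per million tokens, as quoted

tokens_per_month = 500e6  # hypothetical workload: 500 million tokens/month

deepseek_cost = tokens_per_month / 1e6 * deepseek_price
openai_cost = tokens_per_month / 1e6 * openai_price

print(f"DeepSeek: ${deepseek_cost:,.0f}/month")              # $1,000
print(f"OpenAI:   ${openai_cost:,.0f}/month")                # $30,000
print(f"Price ratio: {openai_price / deepseek_price:.0f}x")  # 30x
```

At the quoted prices, the same workload is roughly 30 times cheaper to serve, which is why a cost reduction of this magnitude could accelerate AI adoption.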
Block: Sure. And so, software and kind of consumer internet are looking fairly strong on this, at least in the short term. Are we hitting that inflection point, where there’s been this focus on the return on spend? Are we at that point?
Baqui: If they’ve achieved this big breakthrough in reasoning models, it could be a huge boon to software companies and consumer internet companies. So, we see sort of the same dynamics potentially at play in software as well as consumer internet, and we think there are potential beneficiaries here as this plays out. As a team, we’ve been focused on these two areas for a long time and continue to look at them as potential beneficiaries of this breakthrough in inference. So, I think over time this could have some very positive ramifications for the AI ecosystem.
Block: Given that, and maybe taking a step back, what are some of the broader implications that we see going forward?
Baqui: Yeah, I think what’s important is not to lose sight of the big picture. Some of the smartest people in the world have committed billions and billions of dollars of CapEx to advancing AI, right? We’ve heard from some of the largest hyperscalers in the world: Microsoft spending $80 billion, Meta committing $60 billion, and Stargate could be up to $500 billion. So, I don’t think we need to lose sight of that just because we’ve seen some unvalidated claims out there. It’s important to see the forest for the trees here; CapEx could potentially go up, and we look forward to hearing from some of these companies as we move through earnings in the coming weeks.
Block: Fantastic. So, a lot of very interesting, very important events going on, changes happening. I think it does emphasize why it’s important to follow the data and use a balanced, forward-looking investment approach. So, Shaon, thank you for joining us. We appreciate your time. And thank you all for joining us.