Cloud Computing – Radio Free Mobile https://www.radiofreemobile.com – To entertain as well as inform

OpenAI – Binary decision https://www.radiofreemobile.com/openai-binary-decision/ Fri, 28 Mar 2025 – Another $40bn found down the back of the sofa.

  • Another $40bn to spend on compute means that OpenAI does not have to become efficient like its Chinese competitors, and so when the money runs out, it is likely to be caught out and forced to sell itself to a rich competitor or one of its backers.
  • OpenAI is raising $40bn at a valuation of around $300bn from which $10bn ($7.5bn from SoftBank and $2.5bn from an investor syndicate) is going in now.
  • The other $30bn will go in later this year with $22.5bn from SoftBank and $7.5bn from the investor syndicate.
  • Almost all of this is likely to be spent on compute as OpenAI continues to make its models bigger, train its models harder and use more data in its quest to create artificial general intelligence (AGI).
  • This is the point at which the machines become more intelligent than humans and take over more than 90% of all economically valuable tasks.
  • The assumption being made by OpenAI’s investors is that it will be able to reach AGI before anyone else and do so in a relatively short period of time.
  • This would give it exclusive access to arguably the most valuable asset ever created, and assuming that it monetises this asset, a fair valuation in the many trillions of dollars.
  • However, there are two caveats:
    • First, No AGI: RFM research and all of the available evidence suggest that a system based on statistical pattern recognition will never be truly intelligent.
    • AGI requires an understanding of causality and the ability to distinguish between relationships that are merely correlated by chance and those where one affects the other.
    • It is this shortcoming that prevents the machines from understanding the causal nature of the tasks that they have been asked to complete and is why they make things up, get things wrong and are generally unreliable.
    • As a result, RFM Research has concluded that the approach that OpenAI is taking will never produce AGI and when this realisation hits the market, the money will dry up and OpenAI will be forcibly acquired.
    • Second, Corporate Governance: OpenAI remains a hot mess, a for-profit entity governed by a board tasked with overseeing the creation of AGI and distributing it to everyone for the benefit of all mankind.
    • There is a large conflict of interest here that very nearly caused the company to implode and may well do so again.
    • Furthermore, the rapidly souring relationship between Microsoft and OpenAI means that if conflicts emerge, there is now a large and grumpy shareholder capable of making a lot of trouble.
  • The net result is that my view on OpenAI is very binary.
  • If one believes that OpenAI will create AGI and do so before anyone else, then SoftBank is making a truly great investment.
  • However, if one believes, as the evidence indicates, that statistics-based systems will never create true intelligence and that the market for models is commoditising, then the real value of OpenAI is far below $300bn and may even be $0.
  • The real winner from this investment is CoreWeave, whose difficult IPO sees its first day of trading today.
  • This is because it appears likely that OpenAI will spend a portion of the $40bn on CoreWeave compute services, taking up the slack that Microsoft has created by reducing its deal with CoreWeave.
  • This will be a badly needed shot in the arm as CoreWeave has had to downsize its offering as well as reduce the valuation from $32bn to $23bn.
  • This puts the company on roughly 15x EV / Revenues for 2024, given its net debt position of $6.6bn, which is pretty expensive for a company that provides infrastructure for AI rather than the AI itself.
  • However, sentiment has rapidly soured on CoreWeave meaning that if it trades badly on the opening, it could be worth having a look at.
  • Alternatively, Nebius, which is in a similar business at an earlier stage of its roll-out (2024 revenues of $118m) but has no debt and $2.5bn in net cash, may be a much better and safer option.
  • This (plus Nvidia, inference at the edge and nuclear power) is where I would be much more interested in looking rather than in the hugely valued providers of foundation models, which are showing every sign of turning into commodities.
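The CoreWeave multiple quoted above can be sanity-checked with some quick arithmetic; the implied revenue figure below is derived from the numbers in this note, not a separately reported figure.

```python
# Rough check of the CoreWeave multiple: EV = equity value + net debt.
# Inputs from the note: $23bn equity valuation, $6.6bn net debt, ~15x EV/Revenue.
equity_value_bn = 23.0
net_debt_bn = 6.6                                # net debt adds to enterprise value

ev_bn = equity_value_bn + net_debt_bn            # ~29.6
implied_2024_revenue_bn = ev_bn / 15             # revenue implied by a 15x multiple

print(f"EV: ${ev_bn:.1f}bn, implied 2024 revenue: ${implied_2024_revenue_bn:.2f}bn")
```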
Nvidia & Samsung – GTC Day 2 https://www.radiofreemobile.com/nvidia-samsung-gtc-day-2/ Thu, 20 Mar 2025 – Samsung’s fate is bound up with Nvidia

GTC Update: Nvidia Dynamo – the most critical launch of 2025. 

  • GTC 2025 is in full swing and as more details emerge about Dynamo, it is clear to me that this is by far Nvidia’s most important launch of 2025.
  • Dynamo is an operating system for a data centre that is producing tokens (inference) for a generative AI service.
  • Dynamo looks at the system of GPUs, memory and networking and works out the most efficient way to manage these resources based on the nature of the requests that are coming in.
  • It allows the data centre to produce as many tokens as possible from a given set of resources, which maximises revenue given that industry-standard pricing is $ per million tokens.
  • Dynamo makes more sense now because of the new family of “reasoning models” that improve performance by massively increasing the number of tokens that they produce per request.
  • This means that inference is going to quickly become the largest function of the data centre as RFM has been predicting for some time.
  • To be as dominant in inference, Nvidia needs more than CUDA, as I think that CUDA is much less of a control point in inference than it is in training.
  • Enter Dynamo which promises to do for inference what CUDA has done for training.
  • If it is as good as Nvidia claims (see here) then Dynamo users are going to become more competitive given the efficiency improvement.
  • However, Dynamo is only likely to work really well on Nvidia hardware and the fact that anyone would want to run it on competing silicon does not seem to have been considered.
  • “In theory, you could do that as we have made it available to open source” was the response to the question, but obviously, this is not what it has been built for.
  • Consequently, if Dynamo proves to be very popular with clients, it is likely to raise barriers for data centre owners who are considering running inference on competing silicon.
  • Given that CUDA has been around for nearly 20 years and Dynamo is brand new, Dynamo has a long way to go if it is going to replicate CUDA’s effect in training, but the seeds have been sown.
  • Competitors expecting to take a bite out of Nvidia when it comes to inference need to act quickly if they don’t want the slightly open door of inference to be quickly slammed in their faces.
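The revenue logic behind Dynamo can be sketched with toy numbers; every input below (token price, throughput, fleet size, uplift) is an assumption for illustration, not an Nvidia or market figure.

```python
# Illustrative token economics for an inference data centre (all inputs are
# made-up assumptions for the sake of the example).
price_per_million_tokens_usd = 2.00   # assumed $/1M tokens price
tokens_per_second_per_gpu = 5_000     # assumed per-GPU throughput
gpus = 1_000                          # assumed fleet size
seconds_per_hour = 3_600

tokens_per_hour = tokens_per_second_per_gpu * gpus * seconds_per_hour
revenue_per_hour = tokens_per_hour / 1_000_000 * price_per_million_tokens_usd
print(f"{revenue_per_hour:,.0f} USD/hour")

# A Dynamo-style efficiency gain shows up directly as revenue at fixed cost:
uplift = 1.3                          # assumed 30% throughput improvement
print(f"{revenue_per_hour * uplift:,.0f} USD/hour with a 30% throughput gain")
```

This is why an operating system that squeezes more tokens out of the same GPUs is so directly monetisable for the data centre owner.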

Samsung: Bet on HBM4. 

  • Samsung apologised yet again at its AGM for its lacklustre performance in memory and has promised to do better in 2025, setting the shares up for a substantial re-rating if it is successful.
  • As with the Note 7 that spontaneously exploded, Samsung’s memory problem is simple: its memory offering for AI was not good enough, allowing SK Hynix and Micron to humiliate it at its own game.
  • This has greatly damaged both Samsung’s performance and its reputation as memory for AI is one of the hottest areas in semiconductors right now.
  • If one takes Nvidia’s roadmap seriously, the importance of memory is only going to increase meaning that high-bandwidth memory (HBM) is going to continue to grow like wildfire.
  • The result of Samsung messing this up has been poor performance for some considerable time and a share price that fell by over 40%.
  • With a discount of this size, the market is assuming that Samsung is not coming back in HBM, but we have seen this sort of thing before.
  • When the Note 7 started catching fire in people’s pockets the view was that this would cost Samsung its leadership in smartphones, but the company buckled down, found its backbone and dug itself out of its hole.
  • There is every indication that this is exactly what is going to happen here as the problems it has had in HBM3 are fixable and I think that Samsung has the depth of character to do what it needs to get HBM4 right.
  • Hence, I don’t think that Samsung is going to get very far with its 12-layer HBM3E, but I expect it to qualify with Nvidia for the HBM4 that will go into the Rubin chip, which becomes available in H2 2026.
  • Samsung is currently on 13.2x 2025 and 9.7x 2026 PER and I suspect that the 2026 estimate is too low.
  • This sets the scene for a big rally triggered by the news of qualifying with Nvidia for HBM4 which I am hopeful will come this year.
  • I have a position in Samsung where I am looking for around $1,600 on the global depositary receipt (GDR) that trades on the London Stock Exchange.
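The multiples above embed a concrete expectation: at a constant share price, the move from 13.2x 2025 to 9.7x 2026 PER implies the EPS growth that the market is currently pricing in for 2026.

```python
# EPS growth implied by PER compression at a constant share price:
# price = PER_2025 * EPS_2025 = PER_2026 * EPS_2026, so
# EPS_2026 / EPS_2025 = PER_2025 / PER_2026.
per_2025 = 13.2
per_2026 = 9.7

implied_eps_growth = per_2025 / per_2026 - 1   # ~36%
print(f"Implied 2026 EPS growth: {implied_eps_growth:.1%}")
```

If the 2026 estimate is indeed too low, the realised growth would exceed this and the rating would look even less demanding.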
Nvidia GTC 2025 – Spread Betting https://www.radiofreemobile.com/nvidia-gtc-2025-spread-betting/ Wed, 19 Mar 2025 – Nvidia is spreading its bets wide and early.

  • Another confident keynote from Jensen Huang where technical slip-ups were part of the show and where the best was saved for last.
  • Nvidia is leveraging its dominant position in AI training to move quickly into nascent adjacent markets such that when they start to develop, Nvidia will already be the go-to provider.
  • This is how Nvidia can keep competition at bay and still earn very high margins on the chips that it develops and sells.
  • During the keynote, the main announcements were:
    • First, Blackwell Ultra & Rubin, where Blackwell is now in full production, Blackwell Ultra was announced and more details of the 2-year roadmap were given.
    • Blackwell Ultra is an update to Blackwell which offers 50% greater AI performance than the original and upgrades both memory size and memory speed.
    • Blackwell Ultra will be available in H2 2025 and I do not expect to see a repeat of the problems that we have seen with the ramp-up of Blackwell as this is an evolution rather than something brand new.
    • The brand-new item appears in 2026 with the launch of Rubin which promises another big jump over Blackwell, but smaller than the jump Blackwell made over Hopper.
    • Rubin is coming in H2 2026 and will offer a 3.3x AI performance gain over Blackwell Ultra as well as a doubling of memory bandwidth.
    • Rubin is two dies stuck together, but Rubin Ultra, coming in H2 2027, is four GPUs stuck together in a single chip, taking the improvement over Blackwell to 14x and 4x over Rubin (of which 2x comes simply from having twice as many GPUs).
    • These kinds of gains will certainly allow Nvidia to keep its leadership and Nvidia is following its usual strategy of sharing the gains it makes with the customers.
    • Hence, I would expect that Rubin will be double the price of Blackwell meaning that the customer should see a cost reduction relative to the compute output of 50% or so.
    • This is where the classic “the more you buy, the more you save” tagline comes from, and this looks like it will still be the main theme of the company for a few more years.
    • With data centre capex forecasted to increase to $800bn by 2028 (from $500bn in 2025) and to cross $1tn soon after, it looks like there will be plenty of money available to spend on Nvidia GPUs even as prices continue to rise.
    • Second, Nvidia Dynamo which is a software toolkit aimed at optimising “reasoning” models to run inference on Nvidia silicon.
    • This makes complete sense as “reasoning” is the latest trick to improve performance, but it involves massive increases in compute consumption for inference.
    • This is evident in the prices that OpenAI is charging for its Deep Research service and so it makes sense to offer something that can provide an improvement for this kind of inference.
    • Here Nvidia is claiming that Dynamo can increase the number of tokens generated by 30x when running DeepSeek R1.
    • I suspect that Dynamo is taking advantage of some of the techniques that DeepSeek has put into its model to achieve this level of improvement and so the improvement seen with other models won’t be as good as this.
    • However, there are signs everywhere that the industry is trying to reverse-engineer what DeepSeek has done, and so it is quite likely that other models will also see similar levels of gains in time.
    • Third, Robotics: which I think has the potential to be a huge opportunity, but is going to take much longer than anyone thinks.
    • This is what Nvidia refers to as Physical AI and the combination of Omniverse and Cosmos allows for robotic systems to be trained and tested virtually before they are ever built.
    • The first robots are autonomous vehicles which Nvidia thinks are imminent but where I am considerably more cautious.
    • To kick-start this market, new Cosmos models have been released which can be used together with the new blueprints for Omniverse to train robots and autonomous vehicles.
    • The real win in automotive however was the announcement that GM will be using Nvidia for almost all of its AI needs from digital twins of its factories to running its autonomous cars as well as its corporate AI needs.
    • Nvidia also announced the launch and availability to open-source of Isaac GROOT N1 which is a model that Nvidia says can be used to train all sorts of robots.
    • With this model, Nvidia claims that the age of general robotics is here but I am more sceptical.
    • Just as LLMs are not really general in that they can’t deal with situations they have not been trained for, robots have to be trained individually and then retrained if any changes are made.
    • Fixing this problem is one of the big issues for robotics and so I am somewhat sceptical that GROOT N1 is the fix for this sticky problem.
    • However, what we are seeing is Nvidia moving early and aggressively to cover this nascent space so that when everyone else arrives as the segment takes off, it is already the industry standard.
    • Fourth, DGX Boxes: the launch of DGX Spark (Mac Mini-sized) and DGX Station (desktop PC-sized) devices brings Blackwell out of the data centre.
    • DGX Spark can deliver 1 PetaFLOP while DGX Station can do 20 PetaFLOPs, running the same code that is used in the data centre.
    • This allows developers to fine-tune their AI services at the edge before deploying them to the cloud or wherever they intend to run them.
    • The big winner here is MediaTek which helped with the design of the DGX Spark and got several mentions during the keynote.
    • This combined with the collaboration in automotive represents a huge profile boost for MediaTek outside of Taiwan which will help it compete in Europe and North America particularly.
  • The net result is that while 2024 was all about reaching a new pinnacle in performance, 2025 is all about taking that pinnacle and leveraging it as widely as possible across different industries.
  • We are witnessing an extension of the CUDA strategy from the silicon development platform to many other software platforms and tools that make it easy to develop AI for all industries on Nvidia hardware.
  • This means that competitors need to match both Nvidia’s hardware cadence and its software offering, which is where most of the competition is currently falling over.
  • Nvidia is not standing still and is hoovering up as many partners as it can, and it is the likes of Accenture, Deloitte, EY and Cisco who will help cement Nvidia’s AI platforms with enterprise customers.
  • Nvidia is showing no sign of slowing down meaning that it is quickly expanding into any area where AI will be relevant with the strategy to become the industry standard before its competitors get out of bed.
  • This will help the company keep its market position but with 85%+ market share in datacentre GPUs, it will remain a hostage to end demand.
  • This means that there will be a few tough quarters when the inevitable correction comes, but there is still no sign of this as spending growth in the data centre remains rampant.
  • This combined with its reasonable valuation is why it is the only direct AI company that I would touch from an investment standpoint, but I continue to prefer the adjacencies of inference at the edge and nuclear power as the way to invest in the AI boom.
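The Rubin price/performance argument above can be put into numbers; the 2x price ratio is my expectation from the note rather than an announced figure, and the resulting saving lands in the same ballpark as the "50% or so" cost reduction mentioned.

```python
# "The more you buy, the more you save", sketched: if Rubin delivers ~3.3x the
# AI performance of Blackwell Ultra at roughly 2x the price (the price ratio is
# an assumption, not an announced figure), the customer's cost per unit of
# compute falls substantially even as the sticker price doubles.
perf_gain = 3.3     # Rubin vs Blackwell Ultra, from the keynote roadmap
price_ratio = 2.0   # assumed price of Rubin relative to Blackwell

relative_cost_per_compute = price_ratio / perf_gain   # ~0.61
saving = 1 - relative_cost_per_compute                # ~0.39
print(f"Cost per unit of compute: {relative_cost_per_compute:.0%} of Blackwell Ultra "
      f"(a {saving:.0%} saving)")
```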
Nvidia & Meta – Safe for Now https://www.radiofreemobile.com/nvidia-meta-safe-for-now/ Wed, 12 Mar 2025 – Nvidia is not close to danger yet.

  • The worst-kept secret in tech is that Nvidia’s customers are all trying to reduce their dependence on Nvidia by building their own silicon, but it’s a slow process and with Nvidia’s product cadence, I don’t see it being in danger anytime soon.
  • Meta is working with TSMC (and I presume Arm (see here)) to develop an in-house chip that it will use for all of its AI activities including training and inference of regular machine learning and generative AI.
  • If successful, this would reduce or remove its dependence on Nvidia which given the size of Meta as a customer, would have significant and negative implications for Nvidia.
  • For a large client of Nvidia to switch to its silicon is much easier than it is for a small company as the large company has its in-house captive market to drive the economics.
  • It also does not have to worry about the dominance of Nvidia’s silicon development platform, CUDA, as it can make its systems vertically integrated and use its own development tools.
  • It does, however, have to make the platform as good as CUDA and its in-house silicon as economically viable as the latest and greatest from Nvidia.
  • Hence, as always, the devil is in the details as:
    • First, product cadence: which I have long argued is one of Nvidia’s key differentiators.
    • Here, the latest product from Nvidia (currently Blackwell) is always at least one generation ahead of everyone else meaning that it will be the most cost-effective to operate even with Nvidia’s 70%+ gross margins.
    • This is the classic build vs. buy dilemma that any company has to weigh up and, at the moment, everyone else is far enough behind to make it more cost-effective to buy Nvidia.
    • Second, developers: where anyone who wants to have 3rd party developers using their silicon has to solve the development platform problem.
    • Developers already know how to use CUDA and as it is the most mature in the industry, it remains a key control point and the reason why developers prefer Nvidia.
    • Consequently, with 3rd parties, the CUDA problem needs to be overcome and, given how far ahead it is, I think it unlikely that anyone will succeed in this generation.
    • However, RFM Research has long argued that the developer market will move from developing on silicon to developing on foundation models as models become increasingly commoditised.
    • The big foundation model providers are likely to ensure that their models can be trained optimally on Nvidia, their own silicon or anyone else’s, as greater competition means purchasing the silicon for their data centres will cost less.
    • This is how the CUDA control point may weaken, and I think that it is not until then, that we will see any real pressure on Nvidia’s business model.
  • Hence, I think that while Meta will have some success with its in-house silicon for its own use, when it comes to 3rd parties, the market is going to be stuck with Nvidia for some time.
  • RFM Research has also concluded that it will take a while for developers to shift towards foundation models meaning that for a few years yet, Nvidia’s market share is unlikely to change much.
  • Consequently, Nvidia remains subject to the whims of demand which remains higher than it can deal with.
  • Hence, revenues are likely to be a function of how much capacity it has booked at TSMC for the coming 12 months as opposed to how much customers want to buy.
  • The net result is that Nvidia’s short to medium visibility remains pretty good and so I do not expect any surprises in the next few earnings reports.
  • However, this also means that the scope for a further large run-up in the share price is limited meaning that the share price is likely to remain in line with revenue and profit growth.
  • Nvidia’s valuation is still relatively undemanding for the growth that it is likely to see in the next year or two, and so if I were forced to hold a direct AI investment, this would be it.
  • However, I still prefer the adjacencies of AI inference at the edge of the network and nuclear power to solve the energy shortage both of which remain pretty cheap and underinvested.
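The build-vs-buy dilemma described above largely comes down to amortising a one-off design cost over a captive volume; every number in this sketch is a made-up assumption for illustration, not a Meta or Nvidia figure.

```python
# Toy build-vs-buy comparison of the kind a hyperscaler might run.
# All inputs below are illustrative assumptions.
buy_price_per_gpu = 40_000     # assumed Nvidia price, incl. its 70%+ gross margin
build_cost_per_chip = 15_000   # assumed in-house manufacturing cost per chip
build_nre = 500_000_000        # assumed one-off design/tape-out cost
perf_penalty = 0.6             # assumed: in-house chip at 60% of Nvidia performance

def cost_per_unit_perf(volume: int) -> tuple:
    """Return (buy, build) cost per unit of performance at a given captive volume."""
    buy = buy_price_per_gpu  # Nvidia performance normalised to 1.0
    build = (build_cost_per_chip + build_nre / volume) / perf_penalty
    return buy, build

for volume in (10_000, 50_000, 200_000):
    buy, build = cost_per_unit_perf(volume)
    print(f"{volume:>7} chips: buy ${buy:,.0f} vs build ${build:,.0f} per unit of perf")
```

Under these assumptions, build only beats buy at very large captive volumes, which is exactly why this route is open to Meta but not to small companies.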
Alexa+ & GPT-4.5 – Mad Scramble https://www.radiofreemobile.com/alexa-gpt4-5-mad-scramble/ Fri, 28 Feb 2025 – Competition heats up even more.

  • Amazon has finally announced the new version of Alexa but at the same time, OpenAI has released its new model GPT-4.5 to compete with the rapidly growing field of competitors which signals that the top of the market is close, if not already here.
  • Amazon has launched Alexa Plus which promises to offer a “conversational experience” with the new chatbot but with almost all of the 600m deployed devices incapable of running even a small language model locally, there are going to be user experience issues.
  • Amazon showed a series of use cases (just like OpenAI and Meta did) and the AI was very quick to respond giving the impression of speaking to a person.
  • However, in all of these instances, the device that the demonstrator spoke to was hard-wired into the cloud to ensure that latency was as low as possible.
  • The 600m devices that Amazon has deployed will listen to the request and send it via WiFi and a fixed connection to a cloud data centre miles away, where the request will be processed and the answer returned to the device to deliver to the user.
  • This means that the latency will be several seconds at best from request to answer, which will kill the user experience relative to what Amazon demonstrated.
  • This is why RFM Research has concluded that if agents are going to deliver a good user experience, a good portion of their intelligence needs to be implemented on the device itself.
  • This configuration provides the best economics for the service provider and the best experience for the user and could be where the majority of inference ends up happening in my opinion.
  • This is why I suspect that the next batch of Alexa devices that Amazon launches will have voice processing and at least some intelligence in the device itself.
  • This is something that Google has done on smartphones for some time and its experience is all the better for it, especially when it comes to translation.
  • This is one reason why I like the inference at the edge adjacency (see below).
  • Meanwhile, keen to be seen to be maintaining its lead, OpenAI launched its latest model GPT-4.5 which is bigger than ever (i.e. >2tn parameters) and has been designed to be more personable and easier to use rather than being a math or coding whizz.
  • OpenAI used the fact that GPT-4.5 is not a reasoning model to explain why it does not beat older models in benchmarks as it does marginally better in some and worse in others.
  • The real reason I suspect is that scaling in terms of making models bigger with more data to get better performance has hit a wall as pumping in more resources demonstrably produces smaller and smaller improvements.
  • GPT-4.5 does not use the latest inference tricks such as “thinking” for longer or generating multiple answers but instead is targeted at a general chatbot use case with a better user experience.
  • The answers that it gives are not meaningfully better than GPT-4o’s, but they are easier to read and understand.
  • From the layman’s perspective, this will make some difference and is precisely what the new version of Amazon Alexa is targeting.
  • From the consumer perspective, this is where the AI ecosystem war will be fought and with 600m devices already deployed globally, Amazon is not starting from scratch.
  • This demonstrates that when it comes to generative AI, there is no moat other than the existing digital ecosystems which means that the real asset that OpenAI has is the ChatGPT user base and its global name recognition.
  • That does not always translate into global dominance of a mature industry and given that alongside Amazon’s 600m, Google has 2bn+, Apple 1.4bn+, Tencent 1bn+, Meta 3bn+ and so on OpenAI has a mountain to climb.
  • However, in the interim, I suspect that the real action will be in the enterprise where OpenAI’s early lead is much more of an advantage.
  • However, there is still plenty of competition here also and OpenAI is competing against others whose models are just as good and, in some cases, (Meta, Mistral & DeepSeek) make them available for free.
  • OpenAI has also yet to do anything about DeepSeek’s claims, as the new model is bigger and consumed more resources than ever before, both in terms of training and inference.
  • OpenAI admits as much both in its commentary during the launch and in Sam Altman’s post on X, which is why the model is only going to Pro users until more GPUs come online, allowing a launch to Plus users next week.
  • Hence, there is little sign of OpenAI becoming more efficient, especially with the possibility of SoftBank now bankrolling it to the tune of tens of billions meaning that it might get caught with its pants down when the correction comes.
  • The net result is that these launches continue to show competition in this space heating up which means prices coming down or more stuff being made available for free (Microsoft most recently (see here)).
  • This will eventually trigger a correction because the returns on investment will be much lower than promised given the price erosion.
  • The only real winners here are the infrastructure vendors like Nvidia, Astera Labs, Supermicro and so on who will continue to benefit as competition heats up and generative AI vendors continue to invest to stay ahead.
  • They are also where almost all of the money is going right now meaning that their valuations are much more reasonable than the likes of OpenAI, Mistral, Safe Superintelligence and so on.
  • Other adjacencies such as inference at the edge and nuclear power will also continue to benefit, but given that these have one degree of separation from the generative AI craze they will see much slower but far more sustainable value appreciation over time.
  • I very much prefer this to the rollercoaster that we see elsewhere and so I remain very happy to sit tight in those adjacencies.
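The latency argument above can be made concrete with a rough round-trip budget; every figure here is an illustrative assumption, not a measurement of Alexa.

```python
# Rough latency budget for a fully cloud-based voice assistant round trip
# (all figures are illustrative assumptions in milliseconds).
budget_ms = {
    "wake word + audio capture": 300,
    "WiFi + fixed-line transit to cloud": 80,
    "speech-to-text": 200,
    "LLM inference (full answer)": 2500,
    "text-to-speech + return transit": 400,
}

total_s = sum(budget_ms.values()) / 1000
print(f"Cloud round trip: ~{total_s:.1f}s")   # several seconds, as argued above
```

Moving the wake word, speech processing and a small model onto the device would remove most of these stages, which is the edge-inference case in a nutshell.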
Nvidia FQ4 25 – On the Nose https://www.radiofreemobile.com/nvidia-fq4-26-on-the-nose/ Thu, 27 Feb 2025 – Not too little, not too much, just right.

  • Nvidia reported results that confirmed that the AI spending spree remains on track, but this growth is now captured in the estimates meaning that surprises will be hard to come by.
  • The valuation of Nvidia remains undemanding relative to the growth that it is still experiencing meaning that while the really big gains are over, this still represents growth at a reasonable price.
  • FQ4 25 revenue / EPS were $39.3bn / $0.89 up 12% QoQ and 78% YoY which was slightly ahead of estimates of $38.1bn / $0.85 but below the top of the estimates range.
  • Guidance for the FQ1 26 was also comfortably within the range with revenues of $42.1bn – $43.4bn with gross margins expected to be 70.6% – 71.0%.
  • This represents slowing but still very good growth, but the key take-home of these numbers is the failure to guide above the top of the estimated range.
  • For companies on very high multiples of earnings, this would represent a significant problem as one would expect to see a collapse in the share price of 25% – 30%.
  • However, Nvidia barely moved, ending the after-hours trading session just 1.5% below where it closed just before the report.
  • This is precisely why Nvidia remains the safest direct investment in the AI boom, and it is the company’s ability to feed the growth straight to the bottom line that has kept the PER ratio at reasonable levels.
  • This, in turn, means that the rating does not need to correct when the company reports earnings in line with expectations which is precisely what we see here.
  • At the same time, I think that the company’s current high visibility on its earnings due to ongoing strong demand means that fiscal 2026 is going to be one of good, but not surprising growth.
  • Nvidia took the opportunity to address the DeepSeek question, with greater efficiency being offset by higher demand, and it pointed out that the reasoning models which are currently all the rage consume far more compute power than their predecessors.
  • This makes complete sense because these models work by “reasoning” for longer or by computing several answers with a separate algorithm and then choosing the best answer.
  • This is what Nvidia means by a new level of model scaling and while I would argue that parameter and data scaling are close to the limits of what they can deliver, inference time has further to go.
  • I suspect that this too, will hit a limit of what it can deliver but at the moment, many players will be looking at how they can improve performance through increasing inference.
  • Furthermore, RFM Research has concluded that DeepSeek may not be that much more efficient than OpenAI when it comes to inferencing meaning that fears of an immediate collapse in demand are probably overstated.
  • This is a tailwind for Nvidia and the data centre capex plans for FY2026 have not been cut meaning that I do not think that DeepSeek is about to clobber demand for Nvidia’s data centre chips.
  • Hence, expectations for FY2026 look to be about right which combined with the good visibility that I think the company has means that there will be few surprises this year.
  • The net result is that the valuation of the company should remain steady meaning that share price appreciation should be roughly in line with profit growth.
  • This means that the share price should do reasonably well this year, but the days of blow-outs and huge price appreciation are clearly over.
  • This is why I think that Nvidia remains a pretty safe direct AI play, but I continue to prefer the adjacencies of inference at the edge and nuclear power both of which look like they now have further to travel than Nvidia where the story is well understood and priced in.
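The growth rates quoted above can be used to back out the comparison quarters; the figures below are derived from the percentages in this note, not separately reported numbers.

```python
# Backing out the comparison quarters from the reported growth rates:
# revenue_now = revenue_then * (1 + growth), so revenue_then = revenue_now / (1 + growth).
fq4_25_revenue_bn = 39.3
qoq_growth = 0.12
yoy_growth = 0.78

fq3_25_revenue_bn = fq4_25_revenue_bn / (1 + qoq_growth)   # ~$35.1bn
fq4_24_revenue_bn = fq4_25_revenue_bn / (1 + yoy_growth)   # ~$22.1bn
print(f"FQ3 25 ~ ${fq3_25_revenue_bn:.1f}bn, FQ4 24 ~ ${fq4_24_revenue_bn:.1f}bn")
```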
OpenAI – Deep Research https://www.radiofreemobile.com/openai-deep-research/ Mon, 03 Feb 2025 – OpenAI tries to put RFM out of business.

  • OpenAI’s new Deep Research takes the grunt work out of conducting research, but without insight, the ability to be contrarian and data that sits behind a paywall, I think RFM and Counterpoint Research are safe for a while yet.
  • Deep Research is a new tool that will take a request and scrape the internet for all the information that it can find and then assemble a report along the lines requested by the user that answers the question.
  • The query can take anything from 5 to 30 minutes which sounds very compute intensive, but I suspect that there is a lot of load management going on here also.
  • Setting expectations for a query to take a long time means that the research can be conducted when the servers have some latent capacity giving OpenAI more efficient use of its servers.
  • This in turn will improve the economics of OpenAI which by its own admission are not working particularly well as its highest-tier product at $200 a month is currently losing money.
  • Deep Research will also have the ability to include data proprietary to the user but when this will be available is not clear at this time.
  • The demonstration (see here) is impressive and I can see how this is going to commoditise certain aspects of the research business, but it has a number of weaknesses that it will struggle to overcome.
    • First, insight: which is one of the most valuable areas of the market, industry and finance research businesses.
    • Deep Research will be very good at gathering all of the bits of data together and mining the long tail, but it is likely to really struggle when it comes to working out what it all means.
    • This is because, like all of its predecessors, it is based on a large language model (LLM), and LLMs are based on statistical pattern recognition.
    • This means that it has no understanding of causality and will not be able to distinguish between coincidental factors and those where one causes another.
    • This is fundamental to being able to distinguish reality from fantasy, and all LLMs that exist today still have this limitation.
    • Second, out of the box: where real discovery lies in doing something completely different to what has happened in the past or going against an established opinion or widely-held view.
    • Being based on statistics, LLMs are always going to go for the most likely scenario meaning that they can never be truly creative or come up with something that no one has thought of before.
    • Third, non-public data: where Deep Research only has access to publicly available data.
    • Much of the really valuable factual information sits behind paywalls and is derived from non-public sources meaning that Deep Research will not have access to these datasets.
    • OpenAI has said that in time, users will be able to use their own databases (i.e. research data subscriptions) as part of the inputs but even then, there will still be much of the knowledge base that it will not be able to access.
    • To be fair, humans also have the same problem, but humans are much better at thinking of creative ways to get around these problems and making educated guesses as opposed to just randomly making stuff up as LLMs are prone to do.
  • This new product sounds great, but OpenAI was careful to caveat the results by saying that users should check all of the sources to weed out the hallucinations.
  • Furthermore, as Deep Research will draw conclusions based on data that it does not know is real or fake, all of the conclusions that the system creates will immediately be suspect.
  • The net result is that this is an early release of a product that was already in the works to put the spotlight back on OpenAI at a time when it is seeking to raise even more money at double the valuation of 2024 ($300bn).
  • The fact that a lot of this money is expected to come from SoftBank explains why this new service is being launched from Japan; OpenAI admitted in its launch video that the trip was arranged at short notice.
  • I think that Deep Research and its inevitable competitors could impact the lower end of the market research business meaning that junior analysts are going to be expected to produce more as well as move slightly up the value chain.
  • The real threat though is to outsource research businesses that currently use humans to do these sorts of tasks and they may find that demand for their services falls off a cliff.
  • In the absence of LLMs being able to understand cause and effect and therefore properly reason, businesses like RFM (which sells insight and opinion) and those that have proprietary data (like RFM’s partner Counterpoint Research) should remain largely unaffected.
  • All of OpenAI’s competitors are also likely to launch similar products (Google already has, although its version is not yet publicly available), and we are certain to have to incorporate these services into our workflow to remain competitive.
  • This means a change to the way in which we do business but RFM is not retiring just yet.
]]>
Microsoft & Meta – Headline With AI https://www.radiofreemobile.com/microsoft-meta-headline-with-ai/ Thu, 30 Jan 2025 07:41:43 +0000 http://www.radiofreemobile.com/?p=10655 Both relatively immune from the DeepSeek effect.

Microsoft FQ2 25 – DeepSeek on the menu

  • Microsoft disappointed the market with the growth of its cloud division, but as the reason for this was supply constraints rather than demand, this is hardly an issue to get too worried about.
  • FQ2 25 revenue / EPS came in at $69.6bn / $3.23 broadly in line with consensus estimates of $68.9bn / $3.12.
  • AI was unsurprisingly the engine of growth, with Azure AI services growing 157% YoY (a large part of which was OpenAI) to an annualised run rate of $13bn, or around $3.25bn in actual quarterly revenues.
  • However, in the non-AI services, problems occurred with customers who buy Microsoft products through partners where Microsoft’s visibility on the scale and nature of their demand is not particularly good.
  • This small aberration was the cause of the concern, and the shares fell 4% in after-hours trading, which I think will soon be corrected.
  • Translating Ms Hood’s corporate speak into plain English leads me to interpret the problem as Microsoft not correctly anticipating where the demand would occur and investing in the wrong place.
  • This is not a big deal in my opinion and given Microsoft’s recent history in execution, I think this problem will rapidly be solved.
  • Microsoft is addressing the DeepSeek issue by offering the model on Azure for those who want it and highlighting that cheaper AI means more demand as one would reasonably expect.
  • I continue to think that the CCP will not be keen to simply give away the fruits of DeepSeek’s labours and would not be surprised to see that the huge cost reductions prove difficult to replicate outside of DeepSeek’s infrastructure.
  • This is why I continue to view DeepSeek’s innovations in terms of cost reduction and efficiencies with a healthy dose of scepticism and I await independent verification.
  • From Microsoft’s perspective, cheaper AI represents more adoption and in the long run, higher revenues and profits which is why we have not seen the shares spooked by DeepSeek’s products and claims.
  • However, this leaves the shares on a pretty full valuation of FY2025 PER of 33.9x, meaning that there is better value to be had elsewhere.
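The Azure AI figures above can be sanity-checked with some quick arithmetic (a sketch using only the $13bn run rate and 157% growth quoted in the bullets; the variable names are mine):

```python
# Sanity check on the Azure AI figures quoted above.
annual_run_rate_bn = 13.0                  # annualised Azure AI run rate ($bn)
quarterly_revenue_bn = annual_run_rate_bn / 4

# 157% YoY growth implies the year-ago quarter was this quarter divided by 2.57.
yoy_growth = 1.57
year_ago_quarter_bn = quarterly_revenue_bn / (1 + yoy_growth)

print(f"Quarterly Azure AI revenue: ${quarterly_revenue_bn:.2f}bn")  # $3.25bn
print(f"Implied year-ago quarter:  ${year_ago_quarter_bn:.2f}bn")
```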

Meta Q4 24 – Profit powerhouse.

  • Meta has finally put the AI bugbear behind it, as it is clear that it is now able to use AI to run its own operations more efficiently, and for a company already present in the open-source community, DeepSeek is more of an opportunity than a threat.
  • FQ4 revenues / Adj-EPS were $48.4bn / $8.02 nicely ahead of estimates of $47.0bn / $6.76 as cost reductions in legal accruals and restructuring more than offset headcount growth in R&D and infrastructure expenses.
  • These reductions have now largely come to an end and Meta is likely to invest in line with revenue growth going forward.
  • An operating margin of 48%, despite a colossal $5.0bn loss from Reality Labs (a -458% EBIT margin), demonstrates just how profitable the core business is (a 59.9% EBIT margin), which puts Meta in a superb position to invest.
  • This is a testament to Meta’s efforts in AI over the last 5 years, which have seen it evolve from a laggard into one of the leaders.
  • The threat to Meta from DeepSeek is real in that Meta is currently the standard for open-source AI and if DeepSeek proves much cheaper to train and run than Llama, then that position may be meaningfully challenged.
  • However, the geopolitical environment means that trust in China as a good actor is at rock bottom meaning that many developers outside of China will be wary of using DeepSeek as a foundation and becoming dependent on it.
  • The small print also indicates that any data that is run through R1 will end up in China which will increase concerns still further.
  • This does not mean that open-source copies of the model that are downloaded and fine-tuned will send data to China, but with 671bn parameters in the model, there are lots of potential hiding places for backdoors and other covert functions.
  • If Meta can replicate DeepSeek’s innovations (big if), then it stands to benefit as it will be able to offer its services and run its business at a lower cost than previously.
  • Hence, I think that the threat to Meta from DeepSeek is low and Meta may well be able to turn DeepSeek’s innovations to its advantage.
  • At 28.9x 2025 PER, the value story of Meta has long since been priced in, meaning that future performance is likely to be a function of fundamentals rather than a rerating by the market.
  • Hence, this is another one where the easy money has been made and as such, I remain fairly indifferent to the shares with much better value available elsewhere.
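The margin figures in the bullets above are internally consistent, as a quick back-of-the-envelope check shows (a sketch; the Reality Labs revenue is implied from the -458% EBIT margin rather than stated in the post):

```python
# Consistency check on Meta's Q4 24 margins quoted above.
total_revenue_bn = 48.4
rl_loss_bn = 5.0
rl_ebit_margin = -4.58                          # -458% EBIT margin

# Implied Reality Labs revenue: loss divided by the margin, roughly $1.1bn.
rl_revenue_bn = rl_loss_bn / -rl_ebit_margin

core_revenue_bn = total_revenue_bn - rl_revenue_bn
core_ebit_bn = core_revenue_bn * 0.599          # 59.9% core EBIT margin
group_margin = (core_ebit_bn - rl_loss_bn) / total_revenue_bn

print(f"Implied group operating margin: {group_margin:.1%}")  # ≈ 48%
```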
]]>
Stargate & AI – Magic Money Tree pt. III https://www.radiofreemobile.com/stargate-ai-magic-money-tree-pt-iii/ Thu, 23 Jan 2025 07:09:51 +0000 http://www.radiofreemobile.com/?p=10639 A project for which hardly anyone has the money.

  • Stargate is an ambitious project to build up to $500bn of AI cloud capacity, but it does not take into account the fact that none of the current players except SoftBank and MGX have the money to complete even the first, $100bn phase.
  • Stargate is a joint venture formed between OpenAI, SoftBank and Oracle that intends to build a series of data centres as well as the supporting infrastructure to enable the anticipated mass adoption of AI.
  • The initial phase is $100bn and it seems that OpenAI and SoftBank will each put in $19bn with Oracle and MGX putting in $7bn each.
  • This leaves $48bn still to find for the first phase, but given the glitz and sparkle being attached to this venture, I suspect the rest will be forthcoming without too much difficulty.
  • The best way to think of Stargate is as a private equity fund with OpenAI and SoftBank as the main partners which will invest in projects that build and operate AI infrastructure.
  • The remit is wider than just data centres and I suspect that there will also be money for investing in electricity generation which is rapidly becoming a key bottleneck for AI.
  • Arm, Nvidia and Microsoft are also involved as technology providers but there is no doubt that this signals the end of OpenAI’s exclusive relationship with Microsoft.
  • Oracle’s involvement makes it very clear that OpenAI will be running on Oracle infrastructure in some of the financed projects in a sign that Microsoft has had enough of coughing up vast amounts of cash for OpenAI.
  • Microsoft also made a statement relating to this project where it confirmed its continued partnership but also stated that its relationship with OpenAI had moved from an exclusive partnership to one where it has the right of first refusal (see here).
  • Consequently, I suspect it was asked if it wanted to part with another $20bn+ to which it replied, “No thanks”.
  • Microsoft has also added support for Mistral and Anthropic in its data centres, but I think this is more about giving clients choice as I don’t think that its own offering has followed suit.
  • The biggest question I have is where the money is going to come from, as SoftBank has just $25bn of cash and cash equivalents, OpenAI has less than nothing given that it is burning cash like there is no tomorrow, and Oracle has just over $10bn of cash on its balance sheet.
  • Where the other $400bn (roughly 3x total hyperscaler capex for 2025) will come from also remains a mystery.
  • The other question is demand as CES 2025 clearly demonstrated that generative AI is not ready for the consumer and all of the inference that was supposed to materialise in 2024 failed to do so.
  • Furthermore, I think that in the long run, most inference will be conducted at the edge of the network and not in the cloud meaning that most of the built capacity will be for training rather than inference.
  • Hence, there is a good chance that there is a correction in expectations and valuations and the $400bn takes far longer to materialise than anyone currently expects.
  • In the meantime, this makes great geopolitical theatre cementing the USA’s leadership in AI and putting China and DeepSeek back in their box.
  • Hence, I expect to see a lot more talk as well as some initial investments but once reality reasserts itself, Stargate will be a smaller, but probably better and more efficient investor.
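The phase-one funding arithmetic above works out as follows (a sketch using the commitments quoted in the bullets; the structure is my own):

```python
# Stargate phase-one funding gap, per the commitments quoted above.
phase_one_bn = 100
commitments_bn = {"OpenAI": 19, "SoftBank": 19, "Oracle": 7, "MGX": 7}

committed = sum(commitments_bn.values())   # $52bn pledged so far
gap = phase_one_bn - committed             # $48bn left to find

print(f"Committed: ${committed}bn, remaining to raise: ${gap}bn")
```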
]]>
Open AI – Christmas Deception https://www.radiofreemobile.com/open-ai-christmas-deception/ Wed, 11 Dec 2024 01:45:21 +0000 http://www.radiofreemobile.com/?p=10587 Sora is no gift and is obscuring other issues.

  • OpenAI’s 12 days of Christmas headlines with Sora, but all the fuss and buzz is creating a smokescreen for a number of more serious issues that point to ongoing structural weaknesses at OpenAI, which could once again endanger its existence.
  • OpenAI kicked off 12 days of launches and announcements on December 5th, and so far, the biggest release is the general availability of Sora.
  • Sora is the video generation algorithm that produces incredible-looking footage but clearly demonstrates that it has no understanding of physics or of the scenes that it is generating.
  • This has now been made available to paying users but even for testers, it is proving to be incredibly expensive.
  • For example, to generate a 20-second 1080p clip (the maximum duration), one needs to be a member of ChatGPT Pro, which costs $200 per month.
  • The $200 gives the user 10,000 video generation credits, but these will not last long, as a 16:9 1080p 20-second video costs 2,000 credits, meaning that the user can create just 5 per month.
  • Lower resolutions are far cheaper, but the algorithm constantly refuses to recreate anything that it thinks has even a remote possibility of getting OpenAI into trouble.
  • This is a classic example of the control problem that I have pointed out many times where because these models are almost impossible to control, they are prevented from undertaking many tasks that would make them much more useful.
  • Furthermore, although the video quality remains best in class, Sora demonstrates its lack of understanding of causality with levitating objects, strange artefacts and objects not performing the tasks with which they have been commanded.
  • At 1080p resolution, Sora remains the best but others like Google’s Veo, Runway and Pika are becoming available, and I suspect that prices are going to fall hard and fast.
  • While Sora is generating all of the attention, other events are occurring that are being overlooked, which I think underlines just how unstable OpenAI is and how much scope there is for strife, infighting and a potential collapse.
  • The latest change (that critics will see as more broken promises) is OpenAI’s removal of the AGI (artificial general intelligence) clause in the agreement that governs its relationship with Microsoft.
  • Under the current agreement, if and when OpenAI creates AGI, (defined as the point at which machines can “outperform humans at most economically valuable work”), then Microsoft loses access to this technology.
  • In practice, this means that all of the profits from AGI go to the non-profit for the benefit of all mankind and Microsoft gets nothing beyond that point.
  • OpenAI has reiterated this many times, and it is also in its charter, creating the potential for yet more internal conflict and strife.
  • The problem is that OpenAI subscribes to the “bigger is better” philosophy of AI meaning that it is an endless bonfire of resource consumption which regularly needs huge amounts of cash to keep it going.
  • However, as the provider of the fuel for the fire, Microsoft has some leverage and so now according to the FT, there are discussions underway to remove that clause (see here).
  • This raises the likelihood that AGI will no longer be for the public good but exploited for corporate gain.
  • One can argue the merits of this in either direction, but what matters here is the conflict that this will create and the fact that one cannot trust anything that OpenAI says.
  • Furthermore, the definition of AGI will be determined by OpenAI’s board which creates a massive conflict of interest and is certain to trigger a bitter and contentious lawsuit with Microsoft should the board ever make that determination.
  • It is almost certain that the reason for ditching this provision would be to ensure further investment from Microsoft which would naturally be very reluctant to invest further while having its return capped.
  • The good news is that I don’t think that this provision is going to be triggered any time soon as I think that we are no closer to AGI than we were 10 years ago but there are other issues that this slew of releases is distracting the media from.
  • At the beginning of 2024, OpenAI watered down its prohibition on using its technology in military and defence applications, replacing it with a prohibition on using OpenAI to “harm yourself or others”.
  • During the year, OpenAI said that it would work with the US Pentagon on cybersecurity but not weapons, and in an October blog post it watered the provision down again to “help protect people”.
  • On 4th December, OpenAI announced that it is partnering with Anduril, a defence company that makes a range of hardware and software products designed for use on the battlefield.
  • Once again, the ethics and morality of these changes can be argued in both directions but it is yet another sign that the company is railing against the non-profit shackles that prevent it from becoming a gigantic global corporation.
  • The take-home message here is that the path that OpenAI is on remains completely clear in that it will become a proper for-profit corporation that is likely to end up being acquired by Microsoft.
  • This acquisition is likely to take place when the AI bubble finally bursts which could happen at any time although there is still precious little sign of it.
  • This is why I have little desire to be anywhere near this sector and prefer the far more reliable and reasonably priced adjacencies of inference at the edge and nuclear power.
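For reference, the Sora credit arithmetic earlier in this post works out like this (a sketch using only the figures quoted above; the per-clip dollar cost is my own derived number):

```python
# Sora credit budget on the $200/month ChatGPT Pro tier, per the figures above.
monthly_fee_usd = 200
monthly_credits = 10_000
cost_1080p_20s = 2_000    # credits for a 16:9 1080p 20-second clip

clips_per_month = monthly_credits // cost_1080p_20s   # 5 clips
cost_per_clip_usd = monthly_fee_usd / clips_per_month # $40 per clip

print(f"{clips_per_month} max-quality clips/month at ${cost_per_clip_usd:.0f} each")
```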
]]>