Eco-systems – Radio Free Mobile (https://www.radiofreemobile.com) – To entertain as well as inform

Artificial Intelligence – Argument to Authority
https://www.radiofreemobile.com/artificial-intelligence-argument-to-authority/
Thu, 24 Apr 2025 05:58:38 +0000
A deeply misleading industry practice.

  • OpenAI’s latest model has been shown to fall short once again, highlighting that the practice of dressing up product launch press releases as scientific papers is deeply misleading and leads the public to think that these models are far more capable than they actually are.
  • In the Artificial Intelligence industry, the days of normal press releases are long gone, and instead, a document full of complex graphs and technical terms that look just like a scientific paper is released.
  • Scientific papers are held to a high standard: when properly produced, they are peer-reviewed and written to a standard such that their findings are reproducible.
  • This is why a properly produced scientific paper that has been published in a scientific journal has a high degree of credibility and should be treated with respect.
  • Scientific papers also follow a specific format of an abstract, an introduction, materials and methods, results and a discussion.
  • Loosely translated, this results in: a summary, why we did the experiment, how and what we did, what we found and what it means.
  • This is the fundamental method of verifying and communicating scientific progress, and when executed properly, it has been highly effective for decades.
  • Consequently, a scientific paper immediately attracts a higher level of credibility, which the artificial intelligence industry has hijacked and is in the process of corrupting.
  • 10 years ago, artificial intelligence research was very open and largely academic, and peer-reviewed, properly produced scientific papers were a viable method of communication, but ChatGPT’s viral success changed everything.
  • We are currently in an arms race to produce the best foundation model, and with billions or trillions of dollars at stake, no one has time for peer review or, in many cases, even doing the experiments properly.
  • Well-known AGI sceptic Gary Marcus highlights the latest example of this in his most recent blog (see here), where claims made by OpenAI about its latest model were found to be not reproducible (a grave scientific sin).
  • When OpenAI launched o3, it claimed that it could score 75% on a difficult benchmark called ARC, but others have been unable to repeat this finding, with the best score they could find being 56%.
  • This is not unique to OpenAI: other model makers are also less than forthcoming about how they trained their models and disclose only the data that paints their models in the most favourable light.
  • The problem here is that these findings are then dressed up as a scientific paper, where most people will assume that this means that they have been subjected to the same scrutiny and are therefore an accurate representation of reality.
  • Google, Anthropic, Mistral, Meta, DeepSeek, Alibaba and so on are all guilty of this practice, meaning that anything that comes out of these companies should be treated as a marketing press release and nothing more.
  • Only when they appear in a peer-reviewed scientific journal should they be given a higher level of credibility, but even there, there are signs of standards slipping.
  • Hence, when these companies announce new models, they should be treated with the same level of scepticism that Apple is subjected to every time it mentions the phrase “Apple Intelligence”.
  • This practice is unfortunate because, combined with the hyperbolic commentary that accompanies a model launch, it is leading the public and the markets to believe that super-intelligent machines are just around the corner.
  • If this were true, the entire industry would be greatly undervalued, but unfortunately, all of the empirical evidence points in precisely the opposite direction.
  • This means that while there are still very large opportunities for AI to generate revenues and profits, the time when machines take over 90% of human-related tasks and become more intelligent than humans is as far away today as it ever was.
  • While this is bad news for the valuations of the AI companies, it is good news for the human race, as the machines remain way too stupid to decide that humans are a threat and eliminate us all.
  • A loss for science but a win for humans.
TSMC Q1 25 – No Wobbles
https://www.radiofreemobile.com/tsmc-q1-25-no-wobbles/
Fri, 18 Apr 2025 06:38:28 +0000
AI freight train is still rolling.

  • The threat of tariffs, trade war, China restrictions and stock market volatility has been unable to dent TSMC, which reported good results and underlined that AI will continue to drive its revenues in 2025.
  • Q1 25 revenues / EPS were NT839bn / NT13.94, broadly in line with estimates of NT837bn / NT13.62.
  • TSMC confirmed that its 25% YoY growth guidance in 2025 remained intact and that it would still spend $38bn – $42bn in capex.
  • A meaningful part of this is due to the chips it is making for AI datacentres, where revenues are expected to double again in 2025, confirming that the mad scramble to build datacentre capacity remains on track.
  • Although this represents no change to estimates, the shares rose nearly 4%, clearly reflecting some alleviation of fears of an impact from all of the chaos caused by US attempts to rewrite the rules of global trade.
  • This underpins my view from yesterday (see here), where the weakness that ASML saw in bookings was not related to a sudden drop in AI-related demand but was more a reflection of tariff nervousness and a decline in demand coming from China due to tightening restrictions.
  • Hence, I expect that when Nvidia, AMD and so on report in a few weeks, they will confirm this trend, meaning that the outlook for 2025 remains good.
  • Consequently, there is also likely to be no change in capex estimates from Google, Amazon and Microsoft when they report their calendar Q1 25 results.
  • While this is good news for the short term, the stage is still being set for a correction as the attitude of the big cloud providers is that overbuilding is better than underbuilding in this environment.
  • Here, I disagree because this is precisely what the telecom operators said in 1999 and 2000 when asked if they were overbuilding fibre optic networks to support the internet.
  • This view was rapidly turned on its head when it turned out the internet was not fast and mature enough to handle all of the use cases that were postulated at the time and that we take for granted today.
  • The problem was that between 2000 and 2004, there was a dip where you could not give fibre optic capacity away, and I suspect that in AI infrastructure, something similar could easily occur.
  • However, I have no doubt that when the AI correction comes, it will be smaller and far less painful than the internet bubble 25 years ago.
  • This is because, despite its hallucinations and shortcomings, AI can deliver real services and real value to users and enterprises now.
  • By contrast, the internet was barely able to deliver a slow, frustrating web browsing experience 25 years ago, and so it took far longer for all the use cases that we take for granted today to materialise.
  • Consequently, even with its problems, AI can deliver meaningful revenues now, and so while it will not meet the lofty expectations being set by its creators, it will deliver far more than the Internet could in 2000.
  • The net result is that when the correction comes, AI capacity will rapidly decrease in price, meaning that it is likely to end up being cheaper to buy in the dip as opposed to building it now.
  • The problem is that the industry is so infested with FOMO (Fear Of Missing Out) that this strategy is currently inconceivable.
  • I suspect that almost no one will be willing to wait and then purchase later, but this is how the best return on investment is likely to be made.
  • Hence, the picks and shovels of AI are going to continue doing well in the short term, but what I am really looking for is the player with the spine to hold off for now and buy when the dip comes.
Tech Newsround – ASML, AMD & Global Foundries
https://www.radiofreemobile.com/tech-newsround-asml-amd-global-foundries/
Thu, 17 Apr 2025 06:28:45 +0000
ASML Q1 25: The tariff effect.

  • ASML reported reasonable Q1 25 results, but its order book was way adrift of expectations, which I take as a sign of weakness in China and uncertainty around tariffs as opposed to a sign that the AI freight train is slowing down.
  • Q1 25 revenues / EPS were E7.7bn / E6.00, broadly in line with expectations of E7.8bn / E5.74.
  • Guidance also remained unchanged with Q2 25 revenues expected to be E7.2bn – E7.7bn (E7.45bn) and FY 2025 at E30bn – E35bn.
  • However, the order book fell well short, coming in at E3.9bn compared to forecasts of E4.8bn.
  • This caused some consternation, which, combined with new restrictions hitting Nvidia (see here) and AMD (see below), triggered another correction in the semiconductor sector.
  • ASML remained tight-lipped about China, even though its contribution to sales and orders is falling as Chinese customers become more concerned about the long-term viability of using ASML equipment.
  • I expect this to continue as the Department of Commerce is showing every intention of further increasing restrictions on what can be exported to China.
  • Tariff uncertainty has also been a contributing factor that has caused customers to delay orders, meaning that once there is visibility on how global trade is going to be conducted, this should quickly correct.
  • ASML remains the sole supplier of equipment capable of manufacturing advanced chips for AI in the cloud and edge, and so it is being used as a gauge for AI demand.
  • The weak order book looks to me to be more about China and tariff uncertainty than it is pointing to a sudden drop in AI demand, and so I do not expect that we will see related weakness in its customers and the customers of its customers.

AMD: Same game, same pain.

  • AMD has said that it will take up to $800m in provisions as a precaution for the chips that it will now no longer be able to sell in China, which again indicates that the long-term effect of these restrictions will be economic rather than technological.
  • The MI308 product that AMD has been selling in China now requires a license from the Department of Commerce, which, with a presumption of denial, effectively means that shipments will now cease and not resume.
  • The stated intent of these restrictions is to prevent China from developing advanced AI that could be used to damage US interests, an aim which, at a high level, has already badly failed.
  • This is because DeepSeek, Alibaba and others have already produced AI that competes with the leaders and are likely to continue to do so regardless of the restrictions that are placed upon China.
  • However, the lack of advanced silicon will mean that Chinese AI costs more to produce and run, which will greatly undermine the proposition to use China’s AI outside of its borders.
  • I have serious doubts whether the US Department of Commerce has thought this far ahead, and I expect that rendering China’s AI uncompetitive in markets outside its borders is going to be the main benefit of these restrictions when it comes to containing China’s rise.

Global Foundries – Bathwater Baby.

  • With tariffs being all the rage, one baby that has been thrown out with the bathwater is Global Foundries, whose fabs, while not leading edge, are all situated far from China’s backyard with a hefty presence in the USA.
  • Hence, the USA fabs in New York have suddenly become more attractive, and I suspect that there has been an upswell of inquiries about using Global Foundries to manufacture in the USA.
  • This makes Global Foundries an interesting one to consider as a “tariff” trade, and given that it is now trading on 18.6x 2025 PER, it has also become much more attractive on a fundamental basis.
  • I don’t own Global Foundries, but it increasingly looks worth a closer look.
China vs. USA – Art of the Deal?
https://www.radiofreemobile.com/china-vs-usa-art-of-the-deal/
Wed, 16 Apr 2025 06:37:17 +0000
The screw is given another turn.

  • While it is not very difficult to make an argument for banning the sale of advanced chips into China, giving Nvidia no warning and costing it $5.5bn appears to be a bit of an own goal in the short term, but in the long term, this is likely to hurt China’s competitiveness overseas.
  • The US Department of Commerce has announced that Nvidia’s H20, AMD’s MI308 and other chips like it will now require a license to be exported to China with immediate effect.
  • With a presumption that license applications will be denied, this effectively halts all shipments of affected products.
  • The net result is that $5.5bn of Nvidia inventories and commitments to the H20 may now have to be scrapped unless Nvidia can sell or repurpose the chips elsewhere.
  • In previous instances where the rules suddenly changed, Nvidia was given enough warning and it was able to re-use the capacity it had earmarked for China elsewhere.
  • I suspect the same would have happened here, except that the H20 is materially different from the H100 (upon which it is based); by contrast, the H800 was the same as the H100 except that its memory bandwidth was capped at 300GB/s rather than 600GB/s.
  • Hence, when it transpired that it would no longer be able to sell the H100 to China, the H100s could easily be sold elsewhere but the H20 is very different.
  • The H20’s performance is capped at an estimated 14% of the compute throughput of the H100, the goal being to slow and limit China’s ability to produce cutting-edge AI.
  • This has been nothing short of a complete failure, as DeepSeek R1 and Alibaba’s most recent models demonstrate that they can perform as well as (and in some cases better than) the current crop of models from the global leaders.
  • Furthermore, I have doubts that hitting the H20 will prevent China from producing these models as silicon does not really determine the ability to make models but more the speed and economics of their production and use.
  • This is why Alavan Independent and RFM Research have long argued that the Chinese are not behind in AI but that they will have great trouble competing economically.
  • This is crucial because it is already clear that DeepSeek does not have a sustainable competitive edge when it comes to training and so pretty soon the USA and China will be competing head to head when it comes to the economics of AI.
  • One only has to look at Llama 4 (see here) for evidence of this, which uses the Mixture of Experts and quantisation techniques that RFM Research has calculated contributed the lion’s share of the efficiency improvements claimed by DeepSeek in January 2025.
  • It is at this point that the silicon advantage will become apparent.
  • While the latest systems from Nvidia are extremely expensive, they offer a significant gain when it comes to the cost of tokens for both inference and training.
  • By comparison, China will soon be 2 generations or more behind, meaning that its models, while competitive on performance, will cost more to train and run.
  • This means that when it comes to expanding outside of its borders, Chinese AI is going to be more expensive than competing solutions with the added complication of being tied to the Chinese state.
  • This is why the latest restrictions are going to do very little in the short term to limit China’s AI capability but will handicap its competitiveness, meaning that it will be predominantly non-Chinese AI that is selected by non-Chinese countries and companies.
  • Given how long-term the Chinese think, this could be a factor in bringing China to the negotiating table to hammer out a trade deal that both parties can be satisfied with, as opposed to continuing rounds of tit-for-tat measures and rhetoric.
  • Nvidia is unlikely to be able to offload the H20 chips it has already made or has in production, and even if it can, it will be at prices below what China would have paid for them.
  • Hence, while there is likely to be some write-back of this provision, I suspect a good portion of it will be used and is lost to shareholders.
  • The good news is that this is a small bump in the road for Nvidia where demand for its products remains strong and it continues to sweep its competitors contemptuously to one side.
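The caps described above can be put into a rough back-of-envelope sketch. The 300GB/s vs 600GB/s and 14% figures are the estimates quoted in this piece, not official Nvidia specifications, and the parity-pricing assumption is purely illustrative:

```python
# Back-of-envelope sketch of the export-control caps discussed above.
# All figures are this article's estimates, not official specifications.

h100 = {"compute": 1.00, "mem_bw_gbs": 600}   # uncapped baseline
h800 = {"compute": 1.00, "mem_bw_gbs": 300}   # memory bandwidth halved vs H100
h20  = {"compute": 0.14, "mem_bw_gbs": 600}   # compute capped at an estimated 14%

def relative_cost_per_unit_compute(chip, price_ratio=1.0):
    """If a capped chip sells near the uncapped chip's price, the effective
    cost per unit of compute scales inversely with the compute cap."""
    return price_ratio / chip["compute"]

# At parity pricing, the H20 delivers roughly 1/7th of the H100's compute
# per dollar, which is the economic handicap argued for in the text.
print(round(relative_cost_per_unit_compute(h20), 1))  # → 7.1
```

The same arithmetic is why the piece argues the restrictions bite on economics rather than capability: the capped chips still work, they just cost several times more per unit of useful compute.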
Tech Newsround – Nvidia and Apple
https://www.radiofreemobile.com/tech-newsround-nvidia-and-apple/
Tue, 15 Apr 2025 06:25:36 +0000
Nvidia – Window dressing

  • Nvidia is signing up to make products in the USA, a move I suspect it was already executing, but it will have done itself no harm by being seen to fall in with the patriotic agenda.
  • Following a catch-up with the President of the United States, Nvidia has said that it will produce up to $500bn of products in the US including full AI systems in addition to the chips that TSMC is making in Arizona.
  • There does not look to be very much new here, as Nvidia had already committed to making chips in Arizona and is already building facilities in Houston and Dallas with its partners.
  • However other partners like Amkor and SPIL are also increasing their commitment to the USA and this is where I think Nvidia is increasing its commitment.
  • The real reason why Nvidia is already diversifying away from Taiwan is to mitigate the risk of China interfering in the semiconductor supply chain which has been rising for some time.
  • This risk will only continue to increase as the USA and its allies increase trade and technological pressure on China.
  • Hence, we are likely to see continued diversification by all of TSMC’s customers away from Taiwan even if a trade deal is struck with China and the tariff-related chaos dies down.
  • Although China has retaliated with reciprocal tariffs, it has yet to take serious action against the increasing pressure being placed upon it.
  • This could take the form of a ban on Apple or Qualcomm shipments into China or a blockade of Taiwan which would cause real consternation.
  • However, these sorts of moves are very risky for China and could do as much, if not more, damage to the Chinese economy than to the US meaning that these sorts of measures can only be a last resort.
  • Hence, I think that a deal of some description will get done as this remains in everyone’s best interest (especially China’s) given how weak and troublesome its economy has been since the pandemic.

Apple – Privacy? What privacy?  

  • Apple is in real trouble when it comes to AI as it is now resorting to using its users’ data to train its AI to try and catch up in a field where it remains woefully adrift.
  • This is also an admission that synthetic data is never as good as the real thing, and now that it is in trouble, it has been forced to compromise its long-standing position on the privacy of its users.
  • In a blog (see here), Apple details how it is extending its differential privacy technique so that it can improve the quality of its AI algorithms and try to close the yawning gap to its rivals.
  • Differential privacy is a technique where data is injected with random noise such that it is meaningless when viewed in isolation, but when aggregated, the random elements cancel each other out and the real aggregate signal remains.
  • This protects user privacy, but training only on the aggregates means that the model will never be as good as one trained on the individual pieces of data.
  • This has been acceptable to date but now that AI is becoming more important, users are beginning to notice just how bad Apple is at AI which has been exacerbated by the less-than-successful roll-out of Apple Intelligence.
  • Apple has been training its AI with synthetic data, but I remain sceptical about the value of synthetic data because it depends on being a realistic simulation of reality which it almost always is not.
  • Hence, one quickly arrives at a garbage-in, garbage-out scenario which is where Apple has found itself.
  • I think that this is why it has had to come up with this convoluted way of using the real data of its users while still being able to claim that it has not violated its privacy standards.
  • One can argue whether or not its desperation has forced it to compromise its privacy ideals, but what is clear is that Apple is in a very difficult position when it comes to AI and it is not getting any better.
  • Fortunately, for the moment, this is not going to compromise the sale of iPhones, but if competing devices start sporting AI agents that everyone loves to use (big if), then Apple’s market position will come under much greater threat.
  • This is why Apple needs to do something but I suspect that this will not go nearly far enough to fix the issue.
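The noise-cancellation idea behind differential privacy, as described above, can be illustrated with a minimal sketch. This is a toy version of the principle only; the noise distribution, scale and the `privatise` function are hypothetical stand-ins, not Apple's actual mechanism:

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def privatise(value, scale=10.0):
    """Add zero-mean noise so that a single report reveals little on its own."""
    return value + random.uniform(-scale, scale)

# Hypothetical population: 30% of users exhibit some behaviour (1.0 vs 0.0).
true_values = [1.0 if random.random() < 0.3 else 0.0 for _ in range(100_000)]
reports = [privatise(v) for v in true_values]

true_mean = sum(true_values) / len(true_values)
reported_mean = sum(reports) / len(reports)

# Each individual report is dominated by noise, but because the noise is
# zero-mean, it cancels out in the aggregate and the population mean survives.
print(round(true_mean, 3), round(reported_mean, 3))
```

This also illustrates the text's point about quality: the aggregator recovers averages, not individual examples, so a model trained this way sees far less signal than one trained on the raw per-user data.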
Autonomous Driving – Causality Debate
https://www.radiofreemobile.com/autonomous-driving-causality-debate/
Fri, 11 Apr 2025 06:53:20 +0000
I don’t think Wayve is going to make it.

  • Wayve’s deal with Nissan is a big shot in the arm for the “brute force” approach to autonomous driving, but Nissan has made no promises to use it beyond level 2, leading me to think that a somewhat reluctant Nissan has been coaxed into giving it a try by SoftBank.
  • Wayve is a UK-based autonomous driving start-up that uses a single large end-to-end model to drive the vehicle.
  • This means that sensor data goes in one end and driving instructions to the vehicle pop out the other.
  • The advantage of this is that if one can get it to work, then there is no need to limit where the vehicle can go, which also means that no HD map will be needed.
  • The dream of autonomous driving is to have software that can drive a vehicle more safely than humans under any conditions and be able to deal with situations for which it has not been explicitly trained.
  • This is exactly how humans do it and as long as one is prepared to exchange a large neural network for a human brain, then all should be well.
  • However, this is a bridge too far for me which brings us right back to RFM Research’s old chestnut of causality.
  • Humans can drive a vehicle safely because they understand the cause and effect of the road, while the large model merely matches inputs to statistical characteristics and estimates what the output should be in the given situation.
  • For example, no human would ever mistake a large restaurant sign with red, yellow and green circles for a traffic light, but unless the machine has been explicitly taught about that sign, it will.
  • This means that in situations where the dataset is both stable and finite (i.e. all outcomes can be predicted and trained for), a neural network can perform really well.
  • However, the road is neither finite nor stable, which makes a large neural network a suboptimal choice for solving this problem.
  • This is where opinion in artificial intelligence diverges.
  • On the one hand, you have Elon Musk, OpenAI, SoftBank, Anthropic and so on who claim that with a big enough model, enough data and enough compute, magically, machine superintelligence will pop out at the other end.
  • This is the argument that keeps the money pouring in and the valuations at very high levels.
  • On the other hand, there are the sceptics and gadflies like Gary Marcus, RFM Research and many others who think that until a statistical-based system can truly reason, we will be as far away from superintelligent machines as we were 10 years ago.
  • In my opinion, the “reasoning” models are not actually reasoning but simply offering up a very good simulation of it.
  • This is because while the models can ace PhD-level maths, they fail to reason that if A=B, then it follows that B=A.
  • This is the classic paradox that has plagued AI for decades in that machines can be taught to do very difficult things but fall to bits when asked to do the simple stuff.
  • Until this issue begins to be solved, I do not think the Wayve approach to autonomous driving has a chance of working in a truly commercial setting.
  • One can see this in how Nissan will be using Wayve’s technology starting in 2027 where it will be used for level 2 only at the outset (see here).
  • Level 2 is hands-on ADAS where the human is still piloting the vehicle and does not go much beyond staying in lane and adaptive cruise control.
  • I take this to signal a “let’s see” approach, and given that SoftBank is a major investor in Wayve and is championing a collaboration between OEMs to share data and resources to achieve full autonomy with an end-to-end system, I suspect it had some influence on Nissan’s decision to take software from Wayve.
  • Nissan has made no commitment that I can see to take this beyond level 2, and so I do not take this as a sign that the end-to-end large model approach is the right one.
  • In fact, I think this approach will end up falling short and an approach that uses a combination of rules-based software and machine learning will be the one that wins out at the end of the day.
  • This also means that autonomous driving components such as an HD map, lidar, radar and cameras will all be needed to help reduce the volatility of the dataset of the road as well as produce redundancy that can make the cars safer than humans.
  • With all of the hype and excitement around LLMs, this approach is currently not in favour, and so I suspect that it will be later rather than sooner that we begin to see fully autonomous vehicles on the road.
  • Hence, I think that Wayve and Deeproute.ai (which also uses this approach) will never become going concerns in their own right and will end up being acquired.
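The architectural split discussed above can be sketched roughly as follows. Both policy functions are hypothetical stand-ins (no real vendor code or model), intended only to show where the learned component sits in each approach:

```python
from typing import NamedTuple

class Controls(NamedTuple):
    steering: float   # radians
    throttle: float   # 0..1
    brake: float      # 0..1

# 1) End-to-end (the Wayve-style approach): a single learned model maps
#    raw sensor data directly to driving commands. Everything, including
#    traffic-light handling, is implicit in the network's weights.
def end_to_end_policy(sensor_frame: dict) -> Controls:
    # stand-in for one large neural network's output
    return Controls(steering=0.0, throttle=0.2, brake=0.0)

# 2) Modular (rules + machine learning): perception feeds an explicit
#    world model, which a rules-based planner reasons over, with HD map,
#    lidar and radar adding redundancy that reduces the volatility of the
#    input data.
def modular_policy(sensor_frame: dict, hd_map: dict) -> Controls:
    objects = {"traffic_light": "red"}           # stand-in perception output
    if objects.get("traffic_light") == "red":    # explicit, auditable rule
        return Controls(steering=0.0, throttle=0.0, brake=1.0)
    return Controls(steering=0.0, throttle=0.2, brake=0.0)
```

The design trade-off mirrors the argument in the bullets: the end-to-end model has no explicit rule to inspect or fix when it confuses a restaurant sign for a traffic light, whereas the modular stack fails (and is debugged) one auditable component at a time.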
China vs. USA – Yield Debate
https://www.radiofreemobile.com/china-vs-usa-yield-debate/
Thu, 10 Apr 2025 06:01:07 +0000
Yield is everything.

  • Relative newcomer to the semiconductor game, SiCarrier made a splash at Semicon China 2025 by launching a lot of new equipment and claiming that non-optical methods may enable China to start producing 5nm chips on homemade equipment.
  • This has underpinned a lot of chatter, with some of the commentariat now confidently predicting that China will produce 5nm chips as soon as 2026 (see here), but I remain highly sceptical.
  • Mr Du Lijun, President of SiCarrier (founded in 2021) stated that homemade tools could be employed to make 5nm chips where its non-optical machines could help deal with the lithography issues.
  • Mr Du is referring to a technique called quadruple patterning, which uses multiple passes with a laser to create the very narrow lines needed to make a 5nm chip.
  • The problem is simple: China is trying to draw 5nm-wide lines with a laser whose wavelength is 193nm, which is very difficult to do.
  • This is where the multi-patterning technique comes in and TSMC had some success with this at 7nm but soon switched to EUV because the yields were not high enough.
  • The EUV machines operate at a wavelength of 13.5nm, making it much easier to draw the narrow lines required at 7nm and below, once one has overcome the massive technical issues of creating a reliable light source at that wavelength.
  • ASML is the only company in the world to have cracked this problem, and because ASML cannot sell its EUV products to China, China has had to resort to taking the multi-patterning approach that TSMC developed and advancing it.
  • While China has had some success at producing chips at 7nm (like TSMC did), I don’t think that it has ever been able to produce these chips economically because the yield has been too low.
  • Typically, yields above 90% are needed to make a decent return, and after several years, SMIC and Huawei are still way behind that with their 7nm process.
  • This means that these chips have negative gross margins making them economically unsustainable.
  • This is something that China really cannot afford right now given the precarious state of its economy which will only get worse when China tries to take this technique to 5nm.
  • Multi-patterning at 5nm is something that, to the best of my knowledge, nobody has tried, and it will require at least 20% more steps than 7nm.
  • Hence, I suspect that it will take China far longer than the commentariat claims to get this to work at 5nm and it is unlikely to ever be economically viable.
  • China has demonstrated that it is adept at making do with limited resources (as it did with DeepSeek) and so I think it may find a way to make a 5nm chip.
  • However, I do not think that it will ever get the yields up to a point where it can make these chips economically viable.
  • If this were possible, I think we would have seen TSMC stick with multi-patterning at 7nm and migrate it to 5nm.
  • Consequently, I think China will remain limited in terms of what it can produce below 20nm, as it can no longer buy or service new equipment, although I think the home-grown industry may be able to fill this gap in time.
  • However, at the leading edge, I think China will not go beyond 14nm economically and think that even making a chip at 5nm is a stretch.
  • Hence, I continue to think that for the foreseeable future, China will struggle to make advanced semiconductors at high yield and low cost.
  • This means that the US restrictions will continue to be quite effective with very little impact coming from the availability of these new machines.
  • This continued rivalry means that the days of global standards are numbered and, going forward, we are likely to see one standard outside of China and another, competing, and non-compatible standard inside China.
  • This is bad news for everyone as two incompatible networks will generate much less value than one global network.
  • Consequently, long-term growth for the entire technology sector over the next 10 to 20 years will be lower than it otherwise would have been.
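The yield argument above reduces to simple arithmetic: scrapped dies still consume the full wafer cost, so the cost of each good die scales with 1/yield. The wafer cost and die count below are hypothetical round numbers; only the 90% threshold echoes the text:

```python
def cost_per_good_die(wafer_cost, dies_per_wafer, yield_rate):
    """Every die on the wafer is paid for, but only the good ones can be
    sold, so cost per sellable die is wafer cost / (dies * yield)."""
    good_dies = dies_per_wafer * yield_rate
    return wafer_cost / good_dies

wafer_cost, dies = 10_000, 100            # hypothetical round numbers

healthy = cost_per_good_die(wafer_cost, dies, 0.90)     # mature process
struggling = cost_per_good_die(wafer_cost, dies, 0.30)  # immature process

# At 30% yield, each good die costs 3x what it does at 90% yield, which is
# how a process ends up selling chips at negative gross margins.
print(round(struggling / healthy, 2))  # → 3.0
```

Adding the text's "at least 20% more steps" at 5nm compounds this: more steps raise the wafer cost and, since each step introduces defects, tend to lower the yield at the same time.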
Tech Newsround – Samsung & Microsoft
https://www.radiofreemobile.com/tech-newsround-samsung-microsoft/
Tue, 08 Apr 2025 10:12:24 +0000
Samsung Q1 25 – Welcome relief

  • Samsung reported good results, as smartphones and legacy DRAM fared better than expected, but the key catalyst for recovery, which is the qualification of Samsung’s HBM4 at Nvidia, remains a hurdle yet to be cleared.
  • Preliminary Q1 25 revenues and operating profits came in at KRW79tn / KRW6.6tn ahead of estimates of KRW76.6tn / KRW5.7tn.
  • This was largely unexpected as demand for its new flagship smartphone seems to have been somewhat better than forecast while Chinese customers have been stockpiling commodity DRAM due to the geopolitical uncertainty.
  • This pertains to both the potential for further restrictions on technology imports from the US Department of Commerce and the uncertainty around the new tariff regimes.
  • This serves as a solid foundation for the recovery but the elephant in the room remains unaddressed.
  • This is Samsung’s ability to compete in high bandwidth memory (HBM) that is needed in AI data centres to support both training and inference.
  • Memory is often the bottleneck in the data centre rather than GPU speed or capacity, and so meeting or exceeding the required criteria is crucial to be a supplier.
  • This is where Samsung has gone badly wrong as its HBM3e product has not been good enough to qualify with Nvidia as a supplier.
  • Samsung has effectively given up on HBM3e and is putting all of its efforts into HBM4, which will start to take over from HBM3e towards the end of 2025 with volume in 2026.
  • This failure has cost Samsung 40% of its market capitalisation, creating an opportunity, as its history has shown that it has the depth of corporate character to recover from substantial setbacks.
  • This is why I own the shares: I think that over the next 12 to 24 months Samsung will qualify with Nvidia, take share in HBM4 and see the shares return to around KRW85,000.
  • I own a position in Samsung Electronics and I am looking for qualification at Nvidia as the catalyst for a rally in the share price.
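The memory-bottleneck point can be sketched with a back-of-the-envelope roofline check. The compute and bandwidth figures below are rough ballpark assumptions for a current AI accelerator, not any vendor's specification:

```python
# Rough roofline sketch: is token-by-token LLM inference compute-bound
# or bandwidth-bound? Figures are ballpark assumptions, not a datasheet.
compute_flops = 1000e12   # FLOP/s of FP16 compute (assumed)
hbm_bandwidth = 3.35e12   # bytes/s delivered by the HBM stack (assumed)

# Machine balance: FLOPs available per byte moved from memory.
machine_balance = compute_flops / hbm_bandwidth  # ≈ 300 FLOP/byte

# Decoding one token does roughly one multiply-add (2 FLOPs) per weight,
# and each FP16 weight is 2 bytes, so arithmetic intensity is ~1 FLOP/byte.
decode_intensity = 2 / 2

print(f"machine balance: {machine_balance:.0f} FLOP/byte")
print(f"decode arithmetic intensity: {decode_intensity:.0f} FLOP/byte")
# Decode intensity is far below the machine balance, so the compute
# units idle while waiting on HBM:
print("bandwidth-bound" if decode_intensity < machine_balance else "compute-bound")
```

With intensity two orders of magnitude below the machine balance, faster or wider HBM translates almost directly into faster inference, which is why qualification with Nvidia matters so much.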

Microsoft Copilot – Memory marketing.

  • Microsoft celebrated its 50th birthday by launching the next version of Copilot, which it hopes will take Copilot from an oddity found mostly on the new long-battery-life Arm Windows laptops to something of greater use to both consumers and enterprises.
    • First, Consumer: where a slew of new features have been launched as well as the ability to customise the look and feel of the agent.
    • Copilot will now be able to see what is on the screen, interact with supported apps (like Photoshop) to make them easier to use, and help with planning trips, shopping and so on.
    • Microsoft is also emulating OpenAI (and everyone else) by launching a Deep Research function of its own, bringing it up to date with its rivals.
    • Microsoft also demonstrated its agent actually doing things like filling in online forms with more abilities promised in future updates.
    • These agents will now have (with user permission) the ability to become more customised to understand the user’s preferences.
    • Microsoft refers to this as memory which is a neat marketing trick as short-term memory is a known problem that AI agents struggle with.
    • Microsoft has not solved this problem with the new Copilot, as all it is really doing is using the knowledge graph of the user’s profile to condition the responses of the model that is being used.
    • Second, Enterprise: where the launches are aimed at developers who will be developing custom agents for companies.
    • This is Microsoft enhancing its play for the ecosystem and at the same time deepening its divorce from OpenAI.
    • Microsoft no longer believes it needs to be at the cutting edge of AI performance and appears to me to be embracing some of the realities that its competitors continue to deny.
    • The idea now is not to have one agent that does everything, but many agents, each of which is trained to do one task to a high standard.
    • There is no reason why these cannot work together, which is what the new multi-agent framework is about.
    • Using this, developers can put together specific use cases that use numerous agents that together can complete a more complex task.
    • This is very similar to Nvidia Inference Microservices (NIMs) and Nvidia’s AI foundry, with the main difference being that Microsoft’s will be agnostic to the silicon it runs on.
  • These launches are indicative of Microsoft’s evolving philosophy towards AI where it has publicly said that AI models are commoditising and that it does not need to be at the bleeding edge.
  • I think that this is a result of its rapidly souring relationship with OpenAI as well as the realisation that the kind of dependency on OpenAI it was espousing was creating significant risk, especially given the precarious nature of OpenAI’s governance (see here).
  • Although the case for agents is yet to be proven, Microsoft is in a good position to leverage its high share of personal computing and enterprise software to be one of the major players in the AI ecosystem.
  • There remains everything to play for, and I suspect that 2025 will be all about wooing developers into building on these platforms, which will determine the winners and losers of the AI ecosystem.
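The many-narrow-agents idea can be sketched in a few lines. Everything below (the agent names and the pipeline helper) is a hypothetical illustration of the pattern, not Microsoft's actual multi-agent framework API:

```python
# A minimal sketch of composing several narrow agents, each trained for
# one task, into a workflow that completes a more complex task.
# All names here are hypothetical illustrations of the pattern.
from typing import Callable

Agent = Callable[[str], str]

def research_agent(task: str) -> str:
    # In a real system this would call a model specialised for retrieval.
    return f"facts gathered for: {task}"

def drafting_agent(facts: str) -> str:
    # A second, separately specialised model turns facts into prose.
    return f"draft based on ({facts})"

def review_agent(draft: str) -> str:
    # A third agent checks the draft before it reaches the user.
    return f"approved: {draft}"

def pipeline(task: str, agents: list[Agent]) -> str:
    """Chain narrow agents so each handles one step of a complex task."""
    result = task
    for agent in agents:
        result = agent(result)
    return result

print(pipeline("plan a product launch",
               [research_agent, drafting_agent, review_agent]))
# → approved: draft based on (facts gathered for: plan a product launch)
```

The design choice is that each stage can be a small, cheap, task-tuned model rather than one frontier model doing everything, which is consistent with Microsoft's stated view that models are commoditising.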
]]>
https://www.radiofreemobile.com/tech-newsround-samsung-microsoft/feed/ 0
Meta Platforms – Llama-Seek https://www.radiofreemobile.com/meta-platforms-llama-seek/ https://www.radiofreemobile.com/meta-platforms-llama-seek/#respond Mon, 07 Apr 2025 07:04:04 +0000 http://www.radiofreemobile.com/?p=10766 The fightback begins.

  • Meta is demonstrating what RFM Research and Alavan Independent have suspected: DeepSeek’s methods are not that hard to replicate, and we are seeing the start of a race to the bottom in open source that will soon spread elsewhere.
  • Meta has released Llama 4 which does not show any sudden jump in performance but comes in new shapes and sizes and with new structures and techniques that make it much more efficient to train and operate.
  • Llama 4 (see here) is currently available in three flavours, all of which use the Mixture of Experts architecture that DeepSeek R1 popularised (but did not invent) in January 2025.
  • I suspect that more Llama 4 variants will be released over time, with some certain to come at LlamaCon on April 29th, which looks like the replacement for the old F8 developer conference, last held in 2021.
    • First, Llama 4 Behemoth: which is 2tn parameters in size with 16 experts of which 288bn are active when one of the experts is engaged.
    • This is primarily a distillation model meaning that it is designed to be used to help run post-training on smaller models and make them more performant.
    • Second Llama 4 Maverick: which is a 400bn parameter model with 128 experts of which 17bn parameters will be active when one of the experts is engaged.
    • This looks like the evolution of the Llama 3.1 405bn model and has been designed to be the flagship in terms of being used to answer requests.
    • Third, Llama 4 Scout: which is currently the smallest at 109bn with 16 experts of which 17bn parameters will be active when one of the experts is engaged.
    • This model has a massive 10m token context window, which is 5x bigger than Gemini’s and, as far as I can tell, by far the largest available.
    • This would translate to roughly 27,000 pages or about 75 novels.
    • Clearly, the main use case for this variant is to upload massive amounts of data and then be able to answer questions about it.
  • The performance of these models is unremarkable in that they are in line or just ahead of their relevant peers, but what Meta is really highlighting is that they are cheaper to use.
  • Meta claims that Maverick (400bn) costs $0.19-$0.49 per $1m input tokens compared to DeepSeek v.3.1 at $0.48 and GPT-4o at $4.38.
  • Digging into the footnotes, I would estimate that the real cost will be closer to DeepSeek’s, leaving Gemini as the cost leader, although there are too many unknown parameters to have a really clear view of how this will shake out.
  • It looks to me like the focus of Llama 4 is on competing in the open-source community where Meta has adopted a number of the techniques that were used by DeepSeek to produce R1 in January 2025.
  • Specifically, these would be Mixture of Experts and quantisation where Meta is now quantising in the cloud down to FP8 and even showed some data on its smallest model running at FP4.
  • RFM and Alavan Independent have written about these techniques in depth and concluded that these were the two that produced most of the savings claimed by DeepSeek (see here).
  • These techniques are already spreading quickly across China, and I expect that the other open-source model providers will adopt these techniques almost as quickly as Meta has.
  • Google claims (without a shred of evidence) that it is already running more efficiently than DeepSeek R1, but I have my doubts, given the huge pile of cash that it has to invest in AI and the tendency of Western model makers to shoot for superintelligent machines by making models bigger with more compute rather than focusing on efficiency.
  • I think it may now look to adopt these techniques as it comes under pricing pressure and so the scene is set for a race to the bottom in terms of pricing.
  • This is yet another sign that the world of foundation models is commoditising fast meaning that pricing and developer traction will be key to building the next digital ecosystem.
  • There is no doubt that OpenAI currently leads this race but both Meta and Google have billions of users that they can migrate to their services meaning that OpenAI will have to fight hard to keep up.
  • The net result is that this brings a correction closer as returns from investment fall due to pricing pressure and service providers start to miss their targets.
  • I am not in a hurry to pick up any of these names which are now a bit cheaper thanks to tariff and trade war uncertainty which looks set to continue for at least a few more sessions.
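The parameter counts above explain the cost claim: with Mixture of Experts, per-token compute scales with the active parameters, not the total. The figures below come from the bullets above; the cost logic itself is my illustration:

```python
# Why a 400bn-parameter MoE model is cheap to run: only the routed
# experts' weights are touched for each token. Parameter counts are
# taken from Meta's announcement; the comparison is illustrative.
models = {
    # name: (total parameters, active parameters per token)
    "Llama 4 Behemoth": (2_000e9, 288e9),
    "Llama 4 Maverick": (400e9, 17e9),
    "Llama 4 Scout":    (109e9, 17e9),
}

for name, (total, active) in models.items():
    # Per-token compute is ~2 FLOPs per ACTIVE parameter, so the
    # dense-equivalent cost is roughly the active fraction of the total.
    print(f"{name}: {active / total:.1%} of weights active per token")
```

Maverick touches only about 4% of its weights per token, which is why it can undercut a dense model of similar quality on serving cost even before quantisation to FP8 is counted.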
]]>
https://www.radiofreemobile.com/meta-platforms-llama-seek/feed/ 0
Autonomous Autos – Hard Economics https://www.radiofreemobile.com/autonomous-autos-hard-economics/ https://www.radiofreemobile.com/autonomous-autos-hard-economics/#respond Mon, 31 Mar 2025 08:08:38 +0000 http://www.radiofreemobile.com/?p=10760 Economics will decide this market.

  • WeRide thinks that it is mostly governments and regulations that will impact its path to profitability, but I suspect that it will be the market which decides who wins and who loses.
  • The world of autonomous driving is chugging along with slow progress from the pure plays, while the vehicle makers have effectively given up and focused their efforts on advanced driver assistance systems.
  • Some members of the automotive industry refer to these advanced assistance systems as autonomous driving, but not until I can be fast asleep in the back seat without fearing for my life will I consider the autonomy problem solved.
  • This is what is referred to as level 4 and level 5 while the automotive industry is focused on level 3 which means that for certain roads and certain conditions, the vehicle can take over but must remain supervised.
  • The automotive industry is right to focus on this as this is a product that it can sell now and is much cheaper to implement than a fully autonomous vehicle.
  • Meanwhile, the pure plays are pushing ahead but it is very slow going.
  • Waymo is the undisputed leader with Cruise having pulled out and most of the other offerings still using safety drivers, but it is a close race.
  • This is because it is far more difficult to go from 95% to 96% than it is to go from 0% to 90% and this trend of increasing difficulty is continuing.
  • This is why progress has slowed down considerably and allowed the laggards to catch up, meaning there is not much difference between the different offerings.
  • Hence, when the different solutions are finally good enough, they are likely to come onto the market roughly at the same time and be pretty much equivalent to one another.
  • This is what puts a large hole in Tesla’s business case for its robotaxi offering because it is not going to have the market to itself.
  • The result will be a bloodbath of cutthroat competition, with prices ending up at something like $0.40 per mile rather than the $1.00 per mile that Tesla is currently predicting.
  • This is the difference between 80% gross margin and 25% gross margin on 40% of the revenue base, which makes a gigantic difference to both the profitability and the fundamental value of the autonomous driving industry.
  • Hence, I continue to think that it is economics as opposed to regulation that will determine the fate of this industry and both PonyAI and WeRide are in for a rough time in the harsh glare of being publicly listed.
  • Both of these companies are experiencing very low growth and are very unprofitable as a result of not having enough revenues to cover their operating expenditure.
  • I think the eventual winners will be the few companies that can achieve scale and are efficient enough to be able to generate a profit despite the inevitability of low gross margins.
  • This will probably mean deep pockets and so the pure plays like WeRide and PonyAI are going to have to raise a lot more money or seek to be acquired.
  • Neither of these businesses is mature enough to warrant being listed and so it is possible that the public market was the only option left to them to raise more money.
  • I think that the winners from all of this are the OEMs who don’t need to spend billions developing their own solutions but instead can wait until the technology is mature and the robotaxi industry is beginning to emerge.
  • By that time, I suspect that the brutal economics of the Robotaxi industry will mean that there are several perfectly good solutions available for acquisition or licensing at very reasonable prices.
  • Timing remains very uncertain but if Nvidia is to be believed, it is just around the corner.
  • I take a more conservative view and stick with my long-held target of 2028 for commercial autonomy.
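The gross-profit arithmetic behind this argument can be laid out explicitly, using the illustrative per-mile figures above (these are scenario numbers, not forecasts):

```python
# Gross profit per mile in the two scenarios discussed above:
# a monopoly robotaxi market at ~$1.00/mile and 80% gross margin vs
# a commoditised market at ~$0.40/mile and 25% gross margin.
monopoly = {"price_per_mile": 1.00, "gross_margin": 0.80}
competitive = {"price_per_mile": 0.40, "gross_margin": 0.25}

def gross_profit(scenario: dict) -> float:
    """Gross profit earned per mile driven in a given scenario."""
    return scenario["price_per_mile"] * scenario["gross_margin"]

gp_mono = gross_profit(monopoly)     # $0.80 per mile
gp_comp = gross_profit(competitive)  # $0.10 per mile

print(f"monopoly gross profit:    ${gp_mono:.2f}/mile")
print(f"competitive gross profit: ${gp_comp:.2f}/mile")
print(f"profit per mile shrinks {gp_mono / gp_comp:.0f}x")  # 8x
```

An 8x drop in gross profit per mile is what separates the bull-case valuations from the bloodbath scenario, which is why economics rather than regulation looks like the deciding factor.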
]]>
https://www.radiofreemobile.com/autonomous-autos-hard-economics/feed/ 0