China – Radio Free Mobile (https://www.radiofreemobile.com) – To entertain as well as inform

Tech Newsround – ASML, AMD & Global Foundries (17 Apr 2025)
ASML Q1 25: The tariff effect.

  • ASML reported reasonable Q1 25 results, but its order book was way adrift of expectations, which I take as a sign of weakness in China and uncertainty around tariffs as opposed to a sign that the AI freight train is slowing down.
  • Q1 25 revenues / EPS were E7.7bn / E6.00, broadly in line with expectations of E7.8bn / E5.74.
  • Guidance also remained unchanged with Q2 25 revenues expected to be E7.2bn – E7.7bn (E7.45bn) and FY 2025 at E30bn – E35bn.
  • However, the order book fell well short, coming in at E3.9bn compared to forecasts of E4.8bn.
  • This caused some consternation, which combined with new restrictions hitting Nvidia (see here) and AMD (see below) triggered another correction in the semiconductor sector.
  • ASML remained tight-lipped about China, even though its contribution to sales and orders is falling as Chinese customers become more concerned about the long-term viability of using ASML equipment.
  • I expect this to continue as the Department of Commerce is showing every intention of further increasing restrictions on what can be exported to China.
  • Tariff uncertainty has also been a contributing factor that has caused customers to delay orders, meaning that once there is visibility on how global trade is going to be conducted, this should quickly correct.
  • ASML remains the sole supplier of equipment capable of manufacturing advanced chips for AI in the cloud and edge, and so it is being used as a gauge for AI demand.
  • The weak order book looks to me to be more about China and tariff uncertainty than it is pointing to a sudden drop in AI demand, and so I do not expect that we will see related weakness in its customers and the customers of its customers.

AMD: Same game, same pain.

  • AMD has said that it will take up to $800m in provisions as a precaution for the chips that it will now no longer be able to sell in China, which again indicates that the long-term effect of these restrictions will be economic rather than technological.
  • The MI308 product that AMD has been selling in China now requires a license from the Department of Commerce, which, with a presumption of denial, effectively means that shipments will now cease and not resume.
  • The stated intent of these restrictions is to prevent China from developing advanced AI that could be used to damage US interests, an aim which, at a high level, has already failed badly.
  • This is because DeepSeek, Alibaba and others have already produced AI that competes with the leaders and are likely to continue to do so regardless of the restrictions that are placed upon China.
  • However, the lack of advanced silicon will mean that Chinese AI costs more to produce and run, which will greatly undermine the proposition to use China’s AI outside of its borders.
  • I have serious doubts whether the US Department of Commerce has thought this far ahead, and I expect that rendering China uncompetitive when it comes to other countries is going to be the main benefit of these restrictions when it comes to containing China’s rise.

Global Foundries – Bathwater Baby.

  • With tariffs being all the rage, one baby that has been thrown out with the bathwater is Global Foundries, whose fabs, while not leading edge, are all situated far from China’s backyard with a hefty presence in the USA.
  • Hence, the USA fabs in New York have suddenly become more attractive, and I suspect that there has been a surge of inquiries about using Global Foundries to manufacture in the USA.
  • This makes Global Foundries an interesting one to consider as a “tariff” trade, and given that it is now trading on 18.6x 2025 PER, it has also become much more attractive on a fundamental basis.
  • I don’t own Global Foundries, but it increasingly looks worth a closer look.
China vs. USA – Art of the Deal? (16 Apr 2025)
The screw is given another turn.

  • While it is not very difficult to make an argument for banning the sales of advanced chips into China, giving Nvidia no warning and costing it $5.5bn appears to be a bit of an own goal in the short term, but in the long term, this is likely to hurt China’s competitiveness overseas.
  • The US Department of Commerce has announced that Nvidia’s H20, AMD’s MI308 and other chips like it will now require a license to be exported to China with immediate effect.
  • With a presumption of denial of licence, this effectively halts all shipments of the affected products.
  • The net result is that $5.5bn of Nvidia inventories and commitments to the H20 may now have to be scrapped unless Nvidia can sell or repurpose the chips elsewhere.
  • In previous instances where the rules suddenly changed, Nvidia was given enough warning and it was able to re-use the capacity it had earmarked for China elsewhere.
  • I suspect that this would have been the same, except that the H20 is materially different from the H100 (upon which it is based), whereas the H800 is identical to the H100 except that its memory bandwidth is capped at 300 GB/s rather than 600 GB/s.
  • Hence, when it transpired that Nvidia could no longer sell the H800 to China, that capacity could easily be sold elsewhere as H100s, but the H20 is a very different proposition.
  • The H20’s performance is deliberately limited, with compute throughput estimated to be capped at just 14% of the H100’s, the goal being to slow and limit China’s ability to produce cutting-edge AI.
  • This has been nothing short of a complete failure, as DeepSeek R1 and Alibaba’s most recent models demonstrate that they can perform just as well as (and in some cases better than) the current crop of models from the global leaders.
  • Furthermore, I have doubts that hitting the H20 will prevent China from producing these models as silicon does not really determine the ability to make models but more the speed and economics of their production and use.
  • This is why Alavan Independent and RFM Research have long argued that the Chinese are not behind in AI but that they will have great trouble competing economically.
  • This is crucial because it is already clear that DeepSeek does not have a sustainable competitive edge when it comes to training and so pretty soon the USA and China will be competing head to head when it comes to the economics of AI.
  • One only has to look at Llama 4 (see here) for evidence of this which uses the Mixture of Experts and quantisation techniques which RFM research has calculated contributed the lion’s share of the efficiency improvements claimed by DeepSeek in January 2025.
  • It is at this point that the silicon advantage will become apparent.
  • While the latest systems from Nvidia are extremely expensive, they offer a significant gain when it comes to the cost of tokens for both inference and training.
  • By comparison, China will soon be two generations or more behind, meaning that its models, while competitive on performance, will cost more to train and run.
  • This means that when it comes to expanding outside of its borders, Chinese AI is going to be more expensive than competing solutions with the added complication of being tied to the Chinese state.
  • This is why the latest restrictions will do very little in the short term to limit China’s AI capability but will handicap its competitiveness, meaning that non-Chinese countries and companies will predominantly select non-Chinese AI.
  • Given how long-term the Chinese tend to think, this could be a factor in bringing China to the negotiating table to hammer out a trade deal that both parties can be satisfied with, as opposed to continuing rounds of tit-for-tat measures and rhetoric.
  • For Nvidia, it is unlikely to be able to lay off the H20 chips it has already made and those that are in production, and even if it can, it will be at prices below what China would have paid for them.
  • Hence, while there is likely to be some write-back of this provision, I suspect a good portion of it will be used and is lost to shareholders.
  • The good news is that this is a small bump in the road for Nvidia where demand for its products remains strong and it continues to sweep its competitors contemptuously to one side.
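
A quick back-of-the-envelope check puts the scale of these caps in context. The TFLOPS figures below are commonly cited dense BF16 spec-sheet numbers, not figures from the post, so treat them as assumptions:

```python
# Rough sanity check of the performance caps discussed above.
# TFLOPS figures are assumed spec-sheet values (dense BF16).
h100_tflops = 989.0   # H100 (assumed)
h20_tflops = 148.0    # H20 (assumed)
compute_ratio = h20_tflops / h100_tflops

# Bandwidth cap on the H800 as quoted above: 300 GB/s vs 600 GB/s.
bandwidth_ratio = 300 / 600

print(f"H20 compute vs H100: {compute_ratio:.0%}")      # ~15%, near the ~14% cap cited
print(f"H800 bandwidth vs H100: {bandwidth_ratio:.0%}")
```

On these assumed figures, the H20 delivers roughly a seventh of the H100’s compute, which is consistent with the ~14% estimate above.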
Tech Newsround – Nvidia and Apple (15 Apr 2025)
Nvidia – Window dressing

  • Nvidia is signing up to make products in the USA, a move I suspect it was already executing, but it will have done itself no harm by being seen to fall in with the patriotic agenda.
  • Following a catch-up with the President of the United States, Nvidia has said that it will produce up to $500bn of products in the US including full AI systems in addition to the chips that TSMC is making in Arizona.
  • There does not seem to be very much new here, as Nvidia had already committed to making chips in Arizona and is already building facilities in Houston and Dallas with its partners.
  • However, other partners like Amkor and SPIL are also increasing their commitments to the USA, and this is where I think Nvidia’s increased commitment lies.
  • The real reason why Nvidia is already diversifying away from Taiwan is to mitigate the risk of China interfering in the semiconductor supply chain which has been rising for some time.
  • This risk will only continue to increase as the USA and its allies increase trade and technological pressure on China.
  • Hence, we are likely to see continued diversification by all of TSMC’s customers away from Taiwan even if a trade deal is struck with China and the tariff-related chaos dies down.
  • Although China has retaliated with reciprocal tariffs, it has yet to take serious action against the increasing pressure being placed upon it.
  • This could take the form of a ban on Apple or Qualcomm shipments into China or a blockade of Taiwan which would cause real consternation.
  • However, these sorts of moves are very risky for China and could do as much, if not more, damage to the Chinese economy than to the US meaning that these sorts of measures can only be a last resort.
  • Hence, I think that a deal of some description will get done as this remains in everyone’s best interest (especially China’s) given how weak and troublesome its economy has been since the pandemic.

Apple – Privacy? What privacy?  

  • Apple is in real trouble when it comes to AI as it is now resorting to using its users’ data to train its AI to try and catch up in a field where it remains woefully adrift.
  • This is also an admission that synthetic data is never as good as the real thing, and now that it is in trouble, it has been forced to compromise its long-standing position on the privacy of its users.
  • In a blog (see here) Apple details how it is extending its differential privacy technique so that it can improve the quality of its AI algorithms and try to close the yawning gap to its rivals.
  • Differential privacy is a technique whereby data is injected with random noise such that it is meaningless when viewed in isolation, but when aggregated, the random pieces cancel each other out and the real aggregate signal remains.
  • This protects user privacy, but training only on these aggregates means that the resulting model will never be as good as one trained on the individual pieces of data.
  • This has been acceptable to date but now that AI is becoming more important, users are beginning to notice just how bad Apple is at AI which has been exacerbated by the less-than-successful roll-out of Apple Intelligence.
  • Apple has been training its AI with synthetic data, but I remain sceptical about the value of synthetic data because it depends on being a realistic simulation of reality which it almost always is not.
  • Hence, one quickly arrives at a garbage-in, garbage-out scenario which is where Apple has found itself.
  • I think that this is why it has had to come up with this convoluted way of using the real data of its users while still being able to claim that it has not violated its privacy standards.
  • One can argue whether or not its desperation has forced it to compromise its privacy ideals, but what is clear is that Apple is in a very difficult position when it comes to AI and it is not getting any better.
  • Fortunately, for the moment, this is not going to compromise the sale of iPhones, but if competing devices start sporting AI agents that everyone loves to use (big if), then Apple’s market position will come under much greater threat.
  • This is why Apple needs to do something but I suspect that this will not go nearly far enough to fix the issue.
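
The noise-cancelling idea behind differential privacy can be sketched in a few lines. This is an illustrative simulation only, not Apple’s actual implementation, and the values and noise scale are made up:

```python
import random

rng = random.Random(42)

# Each "device" holds a private value in [0, 10]; before reporting it,
# it adds zero-mean Gaussian noise (sigma = 50, dwarfing the value
# itself) so that any single report reveals essentially nothing.
true_values = [rng.uniform(0.0, 10.0) for _ in range(100_000)]
reports = [v + rng.gauss(0.0, 50.0) for v in true_values]

true_mean = sum(true_values) / len(true_values)
noisy_mean = sum(reports) / len(reports)

# The noise is zero-mean, so it cancels in aggregate and the server
# recovers a close estimate of the true average without ever seeing
# any individual's real value.
print(f"true mean {true_mean:.2f}, estimate from noisy reports {noisy_mean:.2f}")
```

This is exactly the trade-off described above: the server only ever learns averages, which is good for privacy but leaves the model training on a blurrier picture than the raw individual data would give.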
China vs. USA – Yield Debate (10 Apr 2025)
Yield is everything.

  • Relative newcomer to the semiconductor game SiCarrier made a splash at Semicon China 2025 by launching a lot of new equipment and claiming that non-optical methods may enable China to start producing 5nm chips on homemade equipment.
  • This has underpinned a lot of chatter with some of the commentariat now confidently predicting that China will produce 5nm chips as soon as 2026 (see here) but I remain highly sceptical.
  • Mr Du Lijun, President of SiCarrier (founded in 2021) stated that homemade tools could be employed to make 5nm chips where its non-optical machines could help deal with the lithography issues.
  • Mr Du is referring to a technique called quadruple patterning, which uses multiple exposure passes to create the very narrow lines needed to make a 5nm chip.
  • The problem is simple: China is trying to draw 5nm-wide lines with light at a wavelength of 193nm, which is very difficult to do.
  • This is where the multi-patterning technique comes in and TSMC had some success with this at 7nm but soon switched to EUV because the yields were not high enough.
  • The EUV machines operate at a wavelength of 13.5nm making it much easier to draw the narrow lines required at 7nm and below once one has overcome the massive technical issues of creating a reliable light source at that wavelength.
  • ASML is the only company in the world to have cracked this problem, and given that ASML cannot sell its EUV products to China, China has had to resort to taking the multi-patterning approach that TSMC developed and advancing it.
  • While China has had some success at producing chips at 7nm (like TSMC did), I don’t think that it has ever been able to produce these chips economically because the yield has been too low.
  • Typically yields above 90% are needed to make a decent return and after several years SMIC and Huawei are still way behind that with their 7nm process.
  • This means that these chips have negative gross margins making them economically unsustainable.
  • This is something that China really cannot afford right now given the precarious state of its economy which will only get worse when China tries to take this technique to 5nm.
  • Multi-patterning at 5nm is something that, to the best of my knowledge, nobody has tried, and it will require at least 20% more steps than 7nm.
  • Hence, I suspect that it will take China far longer than the commentariat claims to get this to work at 5nm and it is unlikely to ever be economically viable.
  • China has demonstrated that it is adept at making do with limited resources (as it did with DeepSeek) and so I think it may find a way to make a 5nm chip.
  • However, I do not think that it will ever get the yields up to a point where it can make these chips economically viable.
  • If this were possible, I think we would have seen TSMC stick with multi-patterning at 7nm and migrate it to 5nm.
  • Consequently, I think China will remain limited in terms of what it can produce below 20nm, as it can no longer buy or service new equipment, although I think the home-grown industry may be able to fill this gap in time.
  • However, at the leading edge, I think China will not go beyond 14nm economically and think that even making a chip at 5nm is a stretch.
  • Hence, I continue to think that for the foreseeable future, China will struggle to make advanced semiconductors at high yield and low cost.
  • This means that the US restrictions will continue to be quite effective with very little impact coming from the availability of these new machines.
  • This continued rivalry means that the days of global standards are numbered and, going forward, we are likely to see one standard outside of China and another, competing, and non-compatible standard inside China.
  • This is bad news for everyone as two incompatible networks will generate much less value than one global network.
  • Consequently, long-term growth for the entire technology sector over the next 10 to 20 years will be lower than it otherwise would have been.
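
The yield argument can be made concrete: per-step yields compound multiplicatively across the hundreds of steps in a leading-edge process flow, so “at least 20% more steps” at 5nm hits both yield and cost per good die. The step counts, per-step yield, and wafer economics below are illustrative assumptions, not real fab data:

```python
def line_yield(per_step_yield: float, steps: int) -> float:
    # Every extra process step is another chance to kill the die,
    # so overall yield falls geometrically with step count.
    return per_step_yield ** steps

def cost_per_good_die(wafer_cost: float, dies_per_wafer: int, y: float) -> float:
    # Bad dies still consume wafer cost, so cost per *good* die scales with 1/yield.
    return wafer_cost / (dies_per_wafer * y)

steps_7nm = 500                    # illustrative step count, not real fab data
steps_5nm = int(steps_7nm * 1.2)   # "at least 20% more steps" at 5nm
per_step = 0.999                   # illustrative (optimistic) per-step yield

y7 = line_yield(per_step, steps_7nm)   # ~0.61
y5 = line_yield(per_step, steps_5nm)   # ~0.55

c7 = cost_per_good_die(10_000, 300, y7)   # assumed $10k wafer, 300 dies
c5 = cost_per_good_die(10_000, 300, y5)
print(f"7nm yield {y7:.1%}, 5nm yield {y5:.1%}, 5nm die cost premium {c5 / c7 - 1:.1%}")
```

Even at an optimistic 99.9% per step, simply adding steps erodes yield and raises cost per good die, and when per-step yield is lower (as the piece argues is the case for SMIC at 7nm), the compounding is far more punishing.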
Meta Platforms – Llama-Seek (7 Apr 2025)
The fightback begins.

  • Meta is demonstrating what RFM Research and Alavan Independent have suspected: that DeepSeek’s methods are not that hard to replicate and that we are seeing the start of a race to the bottom in open source that will soon be replicated elsewhere.
  • Meta has released Llama 4 which does not show any sudden jump in performance but comes in new shapes and sizes and with new structures and techniques that make it much more efficient to train and operate.
  • Llama 4 (see here) is currently available in three flavours, all of which use the Mixture of Experts architecture that DeepSeek R1 popularised (but did not invent) in January 2025.
  • I suspect that more Llama 4 variants will be released over time, with some certain to come at LlamaCon on April 29th, which looks like the replacement for the old F8 developer conference, last held in 2021.
    • First, Llama 4 Behemoth: which is 2tn parameters in size with 16 experts, of which 288bn parameters are active when one of the experts is engaged.
    • This is primarily a distillation model meaning that it is designed to be used to help run post-training on smaller models and make them more performant.
    • Second Llama 4 Maverick: which is a 400bn parameter model with 128 experts of which 17bn parameters will be active when one of the experts is engaged.
    • This looks like the evolution of Llama 3.1 405bn and has been designed to be the flagship in terms of being used to answer requests.
    • Third, Llama 4 Scout: which is currently the smallest at 109bn with 16 experts of which 17bn parameters will be active when one of the experts is engaged.
    • This model has a massive 10m token context window which is 5x bigger than Gemini and is as far as I can tell by far the largest available.
    • This would translate to roughly 27,000 pages or about 75 novels.
    • Clearly, the main use case for this variant is to upload massive amounts of data and then be able to answer questions about it.
  • The performance of these models is unremarkable in that they are in line or just ahead of their relevant peers, but what Meta is really highlighting is that they are cheaper to use.
  • Meta claims that Maverick (400bn) costs $0.19 – $0.49 per 1m tokens compared to DeepSeek v3.1 at $0.48 and GPT-4o at $4.38.
  • Digging into the footnotes, I would estimate that the real cost will be something closer to DeepSeek leaving Gemini as the cost leader although there are too many unknown parameters to have a really clear view of how this will shake out.
  • It looks to me like the focus of Llama 4 is on competing in the open-source community where Meta has adopted a number of the techniques that were used by DeepSeek to produce R1 in January 2025.
  • Specifically, these would be Mixture of Experts and quantisation where Meta is now quantising in the cloud down to FP8 and even showed some data on its smallest model running at FP4.
  • RFM and Alavan Independent have written about these techniques in depth and concluded that these were the two that produced most of the savings claimed by DeepSeek (see here).
  • These techniques are already spreading quickly across China, and I expect that the other open-source model providers will adopt these techniques almost as quickly as Meta has.
  • Google claims (without a shred of evidence) that it is already running more efficiently than DeepSeek R1, but I have my doubts given the huge pile of cash that it has to invest in AI and the tendency of Western model makers to shoot for superintelligent machines by making models bigger with more compute rather than focusing on efficiency.
  • I think it may now look to adopt these techniques as it comes under pricing pressure and so the scene is set for a race to the bottom in terms of pricing.
  • This is yet another sign that the world of foundation models is commoditising fast meaning that pricing and developer traction will be key to building the next digital ecosystem.
  • There is no doubt that OpenAI currently leads this race but both Meta and Google have billions of users that they can migrate to their services meaning that OpenAI will have to fight hard to keep up.
  • The net result is that this brings a correction closer as returns from investment fall due to pricing pressure and service providers start to miss their targets.
  • I am not in a hurry to pick up any of these names which are now a bit cheaper thanks to tariff and trade war uncertainty which looks set to continue for at least a few more sessions.
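
The parameter figures quoted above show why the Mixture of Experts architecture cuts serving cost: only a small slice of the total parameters does any work for a given token. A quick calculation from the numbers in the post:

```python
# Parameter counts (in billions) as quoted above for the Llama 4 variants.
llama4 = {
    "Behemoth": {"total": 2000, "active": 288},
    "Maverick": {"total": 400, "active": 17},
    "Scout": {"total": 109, "active": 17},
}

for name, p in llama4.items():
    # Fraction of weights that actually participate per token: this,
    # rather than total size, is what drives compute cost at inference.
    frac = p["active"] / p["total"]
    print(f"{name}: {frac:.1%} of parameters active per token")
```

Maverick touches barely 4% of its 400bn weights per token, which, combined with quantising down to FP8 (and experimentally FP4), is where the claimed cost advantage over dense models comes from.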
China vs. USA – Foundation Collapse (26 Mar 2025)
China will only accelerate what is already in progress.

  • It appears that it is now Chinese government policy to flood the open-source community with AI models which to me looks like an attempt to turn foundation models into solar panels which China can then dominate by making them cheaper than anyone else.
  • In solar panels, China undercut everyone else on price, gained a lot of share, and then consolidated the entire ecosystem in China such that no one else would ever be able to compete.
  • A similar strategy may now be underway in AI, as it appears that flooding the open-source community with good-quality models for all sorts of uses is now the official policy of the Chinese foreign ministry (see here).
  • I place a high level of credibility on this opinion as the National Security Law of China requires that important technology such as AI models receives a license before it can be exported meaning that contribution to the open source community must have explicit state approval.
  • This strategy has worked extremely well in solar panels and is underway in the automotive market, but I suspect that this is not going to work in AI.
  • There are four reasons for this.
    • First, already commoditising: meaning that the Chinese contributions may end up simply accelerating a trend that was already in place rather than creating something new.
    • This is happening because even as more and more resources are being pumped into training, the performance of models is only increasing incrementally.
    • This is why when one looks at the benchmark charts, all of the models look pretty much the same.
    • If all the models perform to the same level, then developers will be increasingly indifferent as to which model they use, meaning that the price ($ / 1m tokens) to use a model will come under pressure.
    • This has been underway for some time meaning that a flood of Chinese models is likely to do nothing more than accelerate the process meaning no real change to the eventual outcome.
    • Second, distrust: where there is widespread distrust of Chinese software in companies that are not Chinese.
    • This means that even if the models are incredibly cheap to operate, there will be a price-insensitive wariness about putting a Chinese model inside a private corporate cloud.
    • The fact that the models are open-source means that the code and weights are available and can be screened for backdoors and loopholes, but I am not sure that this is going to make much difference.
    • This is because at tens or hundreds of billions of parameters in size, finding a tiny back door that sends data to China will be like trying to find a needle in a haystack.
    • Third, moving control point: which means that foundation models are not going to be very important as the emerging AI industry matures.
    • As a result, even if the Chinese were to take control of foundation models and everyone began using them as their starting point, it would not allow the Chinese to take control of the AI industry as it has in solar panels.
    • This is because, by that time, foundation models would simply be the operating system of AI with all of the value creation happening in the services created on top of them or in a combination of different services for specific use cases.
    • Fourth, not better or cheaper: the solar panel industry ended up in China because China produced cheaper panels, and once it had control of the industry, it was able to make them better.
    • This is not dissimilar to what is going on in electric vehicle batteries where there is real innovation in reducing the charging time required for lithium batteries.
    • However, Chinese AI models in the open source are no cheaper than their Western counterparts (free) and tests indicate their performance is similar and not better.
    • Furthermore, I have doubts about whether they will be cheaper to train and run as the methods pioneered by DeepSeek seem to be spreading like wildfire in China meaning that they are not that hard to replicate.
    • Hence, I think Western companies will soon be able to replicate these methods, meaning that open-source models from Meta, Mistral, and so on will be equivalent or better in terms of running costs.
  • The net result is that a rinse and repeat of the solar panel strategy is not going to work, although it is likely to accelerate the commoditisation that is already underway, put pressure on pricing, and perhaps bring forward the time when there is a correction in AI company valuations.
  • I think that many companies will be very wary of using Chinese-created software in their networks and for their business operations.
  • There will also be very little pressure to do so as Western variants will be just as good or better and cost the same or perhaps less due to their ability to access cutting-edge silicon.
  • Hence, if rinse and repeat is the strategy of the Chinese state, I think that in this case, it is not going to work.
  • However, I continue to think that self-sufficiency is a policy that is going to continue to be aggressively pursued meaning that the Balkanisation of the Internet of which RFM and Alavan Independent have written often is still very much on the cards.
  • Two incompatible networks generate much less value than one large one and so the long-term growth of the entirety of the technology sector will be lower for everyone.
  • This is best avoided if possible but, at the moment, with no rapprochement in sight or likely, this remains little more than a pipe dream.
China AI – The Efficiency Game (24 Mar 2025)
China plucks the low-hanging fruit

  • Efficiency is spreading like wildfire through the Chinese AI industry indicating that the edge that the US has enjoyed is not as large as many thought, but also indicating that the techniques being employed may be easier than I thought to replicate.
  • This is a theme RFM has been keeping an eye on for a couple of years since it became clear that models would be required to run on devices at the edge of the network.
  • Ant Group (see here) and Tencent (see here) have released new models that they claimed are as good as the best of what the West has to offer, but tend to be smaller and trained with a fraction of the resources.
  • This is important as it indicates that the Chinese are a force to be reckoned with in AI and have not been limited as much as expected by their inability to source the latest and greatest chips from Nvidia and AMD.
    • First, Ant Group Ling-Plus and Ling-Lite: which are 290bn parameters and 16.8bn parameters respectively and are available on Hugging Face (see here) for anyone to download and use.
    • Ling-Plus was compared against DeepSeek V2 (not the latest), Llama 3.1 70B (not the latest or largest), Qwen2.5-72B (latest version) and GPT-4o (not the latest).
    • This is a fairly random set of peers which could easily have been selected for their ability to produce a decent chart as opposed to an objective comparison.
    • However, the fact that these models have been made available to open source means that anyone can download and test them and so I suspect that the test data is real.
    • These models are relevant as they represent further efforts by the Chinese AI ecosystem to produce AI that is globally competitive despite not having access to the latest and greatest silicon chips.
    • Ant Group lists a series of techniques that it has used to train its models many of which sound similar to what DeepSeek outlined in its release in January.
    • Second, Tencent Hunyuan T1: which is a “reasoning” model based on Tencent’s in-house foundation model Hunyuan Turbo 5.
    • Tencent claims that T1 performs as well as DeepSeek R1 or OpenAI’s GPT-4.5 and o1 and puts up the usual set of benchmark comparisons (see here).
    • Tencent has yet to make any of its models available to open source which given that they are increasingly powering its ecosystem, is not a big surprise.
    • Consequently, there is no way to test Tencent’s claims but this should change somewhat as Tencent did say that it would make some of its models available sometime this year.
  • The net result is that when it comes to efficiency in AI, the Chinese are without a doubt leading the world.
  • How much of a lead it is, and how sustainable it is, are open to debate, but for the moment China holds the edge in this area.
  • I do not think that this is by design but has been caused by the fact that Chinese companies are unable to buy leading-edge silicon chips leaving them with little choice but to do more with less.
  • This is why China has developed this niche first but the speed with which this is spreading throughout China implies that the efficiency improvements pioneered by DeepSeek are not that difficult to replicate.
  • I also do not think that the Chinese AI companies are cooperating with each other outside of what they are contributing to open source as they also compete aggressively on price for the different services that they offer.
  • Hence, I think that the Western peers should be able to reverse engineer many of the savings that the Chinese are making should they feel inclined to do so.
  • RFM Research has been forecasting that there are plenty of savings to be had in terms of training and inferencing LLMs more efficiently, but that many companies outside of China have not had to bother.
  • This is because there has been a large oversupply of money into the sector meaning that no one has really had to worry very much about efficiency.
  • Instead, Western companies have pursued the dream of super-intelligent machines which I have long argued is unlikely to bear fruit any time soon.
  • Hence, there is likely to be a correction at some point where the focus in the West shifts from pipe dreams to commercial reality which, even without AGI, is a large and lucrative opportunity.
  • This is why I don’t think the correction will be nearly as bad as the Internet bubble of 1999 and 2000, but companies that can’t make money with $300bn+ valuations are going to take a big hit.
  • This is why Nvidia is the only direct AI company I would own as its valuation is still in the realm of sanity as it is making money and generating cash from AI right now.
  • However, I still prefer the adjacencies of inference at the edge and nuclear power where valuations are even cheaper and sentiment has not affected them yet.
]]>
Baidu AI – Keeping Up Appearances https://www.radiofreemobile.com/baidu-ai-keeping-up-appearances/ Tue, 18 Mar 2025 02:37:50 +0000 http://www.radiofreemobile.com/?p=10738 Baidu joins the efficiency game. 

  • Keen to catch up in the China AI game, Baidu has launched two newer and cheaper models, but it is not clear whether Baidu is offering further innovation on efficiency or just a cheaper price to stem market share loss to its rivals.
  • Baidu has released the latest version of its foundation model ERNIE and a “reasoning” model built upon it called ERNIE X1.
    • First, ERNIE 4.5: which Baidu claims is fully multimodal, able to handle videos, photos and text, and beats GPT-4o on selected benchmarks.
    • Baidu produced data which showed ERNIE 4.5 beating GPT-4o on multimodal benchmarks such as CCBench, ChartQA, and DocVQA but losing on MMMU.
    • Over its selected benchmarks it beat GPT-4o with an average score of 77.8 vs. 73.9.
    • On text capability, it also scored very well against DeepSeek R1, GPT-4.5 and so on.
    • Baidu has also moved into line with its Chinese peers and will be making its model available to open source for the first time.
    • This is an accelerating trend in China and I expect that this will soon become standard procedure for Chinese models as it helps drive adoption by developers which is what winning the AI ecosystem is all about.
    • ERNIE 4.5 is priced to sell at $0.55 per million input tokens and $2.20 per million output tokens, which is 136x and 68x cheaper than OpenAI’s current price for GPT-4.5.
    • I suspect this is more a factor of OpenAI being ridiculously overpriced as opposed to ERNIE 4.5 being very cheap.
    • Second, ERNIE X1: which is based on ERNIE 4.5 but has been fine-tuned to “reason” with more inference time dedicated to producing multiple answers and then distilling a more detailed answer.
    • Baidu claims good performance for X1 but has not provided any benchmark data; I suspect that it will measure up reasonably well when it is properly tested.
    • Here the target is clearly DeepSeek R1, and Baidu has again priced its model to sell at $0.28 per million input tokens and $1.10 per million output tokens, which is roughly half what DeepSeek is currently charging for access to R1.
  • Baidu is pricing its AI services to attract users, but this is no indication of what gains in efficiency (if any) it has made to be able to offer service at this price point.
  • Instead, Baidu may have decided that it will lose money to ensure that the likes of Alibaba or DeepSeek do not steal the fledgling AI ecosystem that it is building.
  • Price cuts are already common in China and are fuelling a brutal price war in which I suspect only a few will be able to make money.
  • This is made doubly difficult by the fact that China remains cut off from the advanced silicon that would allow cost per token to fall meaning that Chinese companies will need to find other ways to become more efficient.
  • This is precisely what is going on and I suspect that the Chinese are going to continue to lead when it comes to finding increasingly efficient ways of running AI while maintaining leading-edge performance.
  • The most recent release was Alibaba’s QwQ-32B model, which is very small in size but quite mighty in performance, according to Alibaba (see here).
  • The net result is that despite the deluge of unsubstantiated claims, I think that the Chinese are ahead when it comes to training and inferencing AI efficiently and it is likely to stay that way.
  • That means that when the correction comes and everyone is forced to do more with less, China will find itself with a significant advantage.
  • I continue to think that this advantage lies in the methods and techniques that Chinese companies use to train and run these models which are not part of the models and weights that they are publishing to the open source.
  • This is how China can grow its standing in AI, see its models get adopted outside of China and at the same time retain its IP and maintain a lead.
  • I continue to have a position in Alibaba as it has pivoted towards AI, which has had a good effect on the valuation, and my position should continue to reduce its losses.
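The pricing multiples quoted above can be sanity-checked with a little arithmetic. The sketch below uses the ERNIE prices stated in this piece; the GPT-4.5 and DeepSeek R1 figures are my assumptions, back-derived from the quoted multiples rather than confirmed list prices:

```python
# Back-of-the-envelope check of the pricing multiples quoted above.
# Prices are USD per million tokens; the GPT-4.5 and DeepSeek R1 figures
# are assumptions inferred from the multiples in the text.

PRICES = {
    "ERNIE 4.5":   {"input": 0.55,  "output": 2.20},
    "ERNIE X1":    {"input": 0.28,  "output": 1.10},
    "GPT-4.5":     {"input": 75.00, "output": 150.00},  # assumed
    "DeepSeek R1": {"input": 0.55,  "output": 2.19},    # assumed
}

def cost_per_request(model: str, in_tokens: int, out_tokens: int) -> float:
    """Cost in USD for one request with the given token counts."""
    p = PRICES[model]
    return (in_tokens * p["input"] + out_tokens * p["output"]) / 1_000_000

# The multiples quoted in the text: ~136x on input, ~68x on output.
ratio_in = PRICES["GPT-4.5"]["input"] / PRICES["ERNIE 4.5"]["input"]
ratio_out = PRICES["GPT-4.5"]["output"] / PRICES["ERNIE 4.5"]["output"]
print(f"input {ratio_in:.0f}x, output {ratio_out:.0f}x")
```

The per-request helper makes the commercial point concrete: at these rates, a million typical requests on ERNIE 4.5 would cost a small fraction of the same workload on GPT-4.5, which is why pricing, not benchmarks, may be the real battleground.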
]]>
Manus AI – Music Maker https://www.radiofreemobile.com/manus-ai-music-maker/ Mon, 10 Mar 2025 07:42:41 +0000 http://www.radiofreemobile.com/?p=10721 Monica demonstrates an advance in orchestration, not AGI.

  • Manus AI is the latest AI story out of China but once the hype has died down, I suspect the real story will be how Monica has made an advance in teaching agents to execute tasks rather than making them more intelligent.
  • Manus AI is the latest product from a Chinese AI company called Monica based in Shenzhen that until recently has focused on providing a range of generative AI services using foundation models from OpenAI, Anthropic, DeepSeek, Google, Mistral and so on.
  • However, on 6th March it previewed Manus AI which is similar to the Deep Research products that everyone is launching, but with the supposed ability to take independent action to complete tasks more thoroughly.
  • Once again, if this is an accurate description of Manus’ capabilities, then it represents a big step forward going from assisting humans to complete tasks to completing the task without human involvement.
  • Manus appears to have access to its own cloud-based computer that is running Linux with access to the command line meaning that as a superuser, it can execute tasks like a human would rather than make a series of recommendations.
  • This looks to me to be a big part of the innovation in taking the AI agent from recommending to doing.
  • In the launch video, the model is seen screening resumes, conducting real estate research and stock analysis.
  • The best example of the augmented functionality is real estate research, where the model screens areas of New York on criteria such as schools and crime, then writes a program to calculate affordability, ending up with a list of viable options.
  • Other examples on the website (see here) include creating a report on the film industry and analysing US naval tactics during World War II.
  • Manus has also been tested on the GAIA Benchmark which is designed to test an AI’s capability to do real-world tasks.
  • Here, it comfortably beats OpenAI’s DeepResearch on all levels of difficulty.
  • It is important to note that unlike DeepSeek R1 and Alibaba’s QwQ-32B, Manus is not available in open source meaning that none of these claims can be easily verified.
  • The architecture of Manus is also not very clear, but it appears that, like DeepSeek, it is using a mixture of experts structure but whether it has been able to make any resource savings like DeepSeek is claiming is unclear at this stage.
  • Yichao “Peak” Ji, Co-Founder and Chief Scientist, states that Manus would not have been possible without the open source community, meaning that Llama, Mistral and others have probably been heavily used and distilled in the creation of Manus.
  • Peak has also committed to giving back to the open-source community, but it will be a little more complex than just uploading the model.
  • As a mixture of experts model, Manus uses several distinct models that Monica has fine-tuned to enable the service that Manus provides.
  • It is some of these sub-models that have been “fine-tuned for Manus” that Monica will be contributing to open source as opposed to the whole Manus system itself.
  • The idea here is to drive the adoption of Manus as an AI assistant rather than to allow anyone to download the whole system, tinker with it and recreate an instance of Manus on a separate system.
  • This could be for both business and geopolitical reasons: as with DeepSeek, I suspect that the real IP in Manus is in how it was created and trained, meaning that uploading some of the sub-models gives very little away.
  • Monica will not be making anything available to open source just yet, which is probably a sign that the company is still in the process of getting a license to export the sub-models.
  • This also means that 100% of Manus is running in China and everyone who uses Manus and submits data to it for analysis or execution of a task is sending data to China.
  • China’s national security laws mean that the Chinese state has access to all of this data whenever it wants which I suspect is going to lead to rapid bans on Manus by companies whose employees are already using AI to aid their daily tasks.
  • Until a few days ago, Monica was a small Chinese company that hardly anyone had heard of meaning that, like DeepSeek, it has been operating on a tiny fraction of the resources that OpenAI, Google, Anthropic and so on have access to.
  • While the internet is going wild for this latest AI release from China, problems are cropping up.
  • There are reports of crashes and failure to complete tasks, but this could be due to the server being under massive strain.
  • However, it has also failed to book flights, reserve tables at restaurants and was not as good at writing code as some of its Western competitors.
  • This tells me that Monica has created and trained Manus for a specific series of task categories but the minute that one tries to take it out of its comfort zone, it gets into trouble.
  • This means that it is not a step towards super-intelligent machines but is an excellent implementation of software that makes the AI we already have more useful.
  • RFM Research has concluded that one of the difficulties in making AI agents is all of the messy software plumbing that will allow the agents to access the apps, APIs, websites and so on to carry out tasks on the user’s behalf.
  • With the use of a Linux superuser command prompt, Monica appears to have made an advance in working out a good way to deal with this difficult task, but this is not a step towards superintelligent machines.
  • It is, however, another PR coup for China that is doing a very good job of proving that it is a contender in AI and it is here I think the struggle will be fought for the next couple of years.
  • The USA will find it harder to contain China in AI than it has in semiconductors, although the edge in semiconductors is going to be an issue when China wants to ramp to real scale.
  • This is another impressive development from China but with no real progress towards super-intelligent machines, the stage is still set for some form of correction in terms of expectations and valuations across the whole AI industry.
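The "agent with a shell" pattern described above (plan a command, execute it, observe the output, repeat) can be sketched in a few lines. This is purely illustrative: Manus's actual architecture is not public, and the planner here is a hypothetical stub standing in for an LLM call:

```python
# Minimal sketch of the plan -> execute -> observe loop that turns an AI
# from "recommending" into "doing". The planner is a stub; a real agent
# would call a model with the goal and execution history as context.
import shlex
import subprocess

def run_command(cmd: str, timeout: int = 30) -> str:
    """Execute one shell command and capture stdout + stderr."""
    result = subprocess.run(
        shlex.split(cmd), capture_output=True, text=True, timeout=timeout
    )
    return result.stdout + result.stderr

def stub_planner(goal: str, history: list) -> str:
    """Stand-in for the LLM planner: returns the next shell command,
    or an empty string when the (hard-coded) plan is exhausted."""
    plan = ["echo researching", "echo done"]
    return plan[len(history)] if len(history) < len(plan) else ""

def agent_loop(goal: str, max_steps: int = 10) -> list:
    """Core loop of a 'doing' agent: each step runs a command and
    records (command, output) so the planner can see what happened."""
    history = []
    for _ in range(max_steps):
        cmd = stub_planner(goal, history)
        if not cmd:
            break
        history.append((cmd, run_command(cmd)))
    return history

steps = agent_loop("research NY real estate")
for cmd, out in steps:
    print(cmd, "->", out.strip())
```

The design point is that the shell gives the agent one uniform interface to files, programs and the network, which is exactly the "messy plumbing" problem this approach sidesteps; it is also why the security implications discussed above are so serious.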
]]>
Alibaba QwQ – 32bn Questions https://www.radiofreemobile.com/alibaba-qwq-32bn-questions/ Fri, 07 Mar 2025 07:46:13 +0000 http://www.radiofreemobile.com/?p=10719 More questions than answers.

  • Alibaba has produced a model that performs pretty well, but it is so small that China is once again challenging the Western “bigger is better” philosophy.
  • However, there are so few details about how this was created that it is impossible to know whether this represents another step forward for China or if it is merely a public relations exercise.
  • The new model is called QwQ-32B (see here), which has 32bn parameters, and it performs extremely well against DeepSeek R1 (671bn) and o1-mini (estimated 100bn) on a number of the usual tests.
  • These tests have been carried out by Alibaba and so would have been chosen and run to make QwQ-32B look as good as possible, but at face value, it looks like an impressive achievement.
  • This flies directly in the face of the Western approach to generative AI which is that the bigger the models are made and the more data that is pumped through them and the more compute they expend in inference, the better they perform.
  • I have long thought that this approach, which has worked well for a few years, has reached the end of its usefulness and that a new approach is needed.
  • Furthermore, I have also been of the opinion that when it comes to innovations around AI efficiency, the Chinese would get there first (see here).
  • This is not because Chinese engineers are more brilliant than Western ones, but merely because export restrictions and capital limitations have forced them to do more with less.
  • By contrast, the Western players are flush with cash, have access to ever more powerful GPUs and have been able to focus solely on pushing the boundaries of what AI can deliver.
  • QwQ-32B looks important because if this kind of performance can be delivered with a 32bn parameter model, then pretty soon we will see it running on smartphones and laptops.
  • However, there are a lot of questions which Alibaba has not answered.
    • First: Inference, where we have no idea how long QwQ-32B runs inference before it delivers its answers or how this compares to everyone else.
    • Increasing compute time for inference is a key strategy to improve models that are solving reasoning tasks.
    • Hence, QwQ-32B may be able to perform particularly well by scaling up its reasoning time, which means that it is not really as efficient as Alibaba would have us believe.
    • QwQ-32B is available in the open source and so 3rd party independent testing should be able to answer this question in time.
    • Second: training data, where we have no idea how much data was pumped through QwQ-32B to get it to perform at its current level.
    • Experiments by DeepMind (Chinchilla) some time ago demonstrated that by training models with more data, smaller models could be made to perform better than models 4 times their size.
    • If Alibaba has used this technique, then what it gains in model size it has lost in terms of training data meaning that it was not cheaper to train than its larger counterparts.
    • However, it will be cheaper to run inference given its smaller size and this is where a real saving could be found.
    • Third: Pruning which I have often referred to as the nuclear fusion of AI.
    • This is a technique where one can remove 90% of a model and see no degradation in its performance.
    • The problem with this is that it is so time-consuming to work out which parts make up the 90% of the model to remove that it is not worth the effort.
    • However, once again if one is purely focused on the efficiency of inference, this is a technique worth considering.
  • Alibaba has not said whether it has used any of these techniques and has made no claim about how much it costs to train and how much inference it consumes and so this development needs to be taken with a dose of caution.
  • The model is available on Hugging Face meaning that anyone can download and test it but the same issue as there is with DeepSeek also applies here.
  • This is because the National Security Law of China requires that AI technology of this nature is granted a licence from the CCP before it is allowed to be exported.
  • Alibaba will have had to obtain a license from the CCP to export QwQ-32B, immediately raising the question of why the CCP would allow important Chinese IP to fall into the hands of its rivals.
  • If independent testing verifies that QwQ-32B performs as well as its much larger rivals, this will be another feather in the cap of China’s reputation as an AI powerhouse which RFM Research and Alavan Independent suspect may have been the main reason to grant Alibaba an export license.
  • The net result is that if QwQ-32B performs as well as promised, it will accelerate the trend of deploying models on edge devices rather than in the cloud.
  • It is far cheaper for a service provider like OpenAI to deploy its models on edge devices because then it does not have to pay the cost of running inference.
  • This is by far the biggest draw for inference at the edge and QwQ-32B promises to take performance on edge devices to another level.
  • The main beneficiaries here are China (whose reputation is boosted once again), Alibaba (which is proving its AI chops versus DeepSeek), and Qualcomm, MediaTek, Arm and Broadcom, which are all selling chips that can run inference on edge devices.
  • I continue to like this theme as a way to invest in the current AI craze and Qualcomm is the stock that I own.
  • I also own Alibaba, which is becoming less and less of a drag on my portfolio, and I remain happy to stick with it now that the recovery finally seems to be here.
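The simplest form of the pruning technique discussed in the third point above is magnitude pruning: zero out the smallest-magnitude weights and keep the rest. The sketch below shows only that core operation; real pipelines (and the 90% figure) involve iterative prune-and-retrain cycles, and this is not how any of the models named here are known to have been built:

```python
# Magnitude pruning sketch: remove the smallest `sparsity` fraction of
# weights by absolute value. Finding *which* weights can safely go is
# the expensive part in practice -- this is the cheap, naive version.
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Return a copy of `weights` with the smallest `sparsity` fraction
    (by absolute value) set to zero."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
pruned = magnitude_prune(w, 0.9)
print(f"sparsity achieved: {np.mean(pruned == 0):.2f}")
```

The attraction for inference at the edge is that a 90%-sparse weight matrix can, with the right kernels and storage format, cut memory and compute dramatically, which is exactly the efficiency lever the article argues Chinese labs are being forced to pull first.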
]]>