Sunday, March 22, 2026

MAR.(1st Half) 2026

MARCH 1, 2026

On February 26, I wrote that Broadcom had begun supplying its 3.5D XDSiP technology. Since then, many more articles have appeared, so I would like to summarize the situation a bit further. XDSiP stands for eXtreme Dimension System in Package, and it appears to be the name of a technology that combines three-dimensional stacked chips with 2.5D packaging using a silicon interposer. The first product to adopt this approach seems to have been FUJITSU-MONAKA, currently under development by Fujitsu. Broadcom reportedly has additional customers beyond Fujitsu, with a total of six designs underway and a target of shipping one million XDSiP chips.

Various pieces of information about FUJITSU-MONAKA have also emerged, but some clarification may be necessary. It has been officially disclosed that FUJITSU-MONAKA is a data center CPU. Since Fujitsu has also been awarded the contract for the next-generation supercomputer “Fugaku NEXT,” some reports seem to assume that FUJITSU-MONAKA will serve as its processor. However, that does not appear to be the case. Fugaku NEXT is still quite far in the future, and FUJITSU-MONAKA may be somewhat underpowered by then. The CPU for Fugaku NEXT is expected to be FUJITSU-MONAKA-X, and it has been disclosed that its computing die will be manufactured on a 1.4 nm-class process. The GPU portion of Fugaku NEXT will reportedly be handled by NVIDIA, likely connected via NVLINK-Fusion. I recall seeing reports about this around October last year.

Changing topics, there is news from Qualcomm. Ericsson and Qualcomm have reportedly formed a partnership toward 6G development. AI computing is expected to be integrated into the network itself in the 6G era. I believe SoftBank and Rakuten Mobile have also been studying the fusion of AI and RAN. Companies such as Qualcomm and SoftBank, which operate across both communications and computing domains, seem to be in a favorable position.

Another Qualcomm-related topic: at GDC (Game Developers Conference), scheduled to begin on March 9, Qualcomm had been expected to unveil a Windows handheld device. However, Qualcomm has reportedly withdrawn, and no announcement will be made. Severe memory shortages appear to be casting a shadow over the effort. This catastrophic RAM shortage has reportedly been dubbed “RAMageddon.”

Finally, in quantum computing news, IBM has reportedly installed a quantum computer at the Indian Institute of Technology (IIT) Kanpur in Uttar Pradesh. IBM, IIT, and the state government are expected to collaborate on technical initiatives in the field of quantum computing.


MARCH 2, 2026

At MWC (Mobile World Congress) 2026, currently being held in Barcelona, Intel has reportedly announced Xeon 6+, codenamed Clearwater Forest. It features 288 E-cores. Around this time last year, it had been expected that the previous generation Xeon 6E (Sierra Forest, manufactured on the Intel 3 process) would introduce a top-tier 288-core model. However, adoption of the mid-range 144-core SKU by data center customers did not progress as strongly as anticipated, and the 288-core flagship never reached the market. Now, Clearwater Forest — manufactured on the Intel 18A process — has been introduced with 288 cores as Xeon 6+.

The E-core architecture has also evolved, moving from Crestmont in Sierra Forest to Darkmont in Clearwater Forest. Darkmont is likewise used for the E-cores in Panther Lake.

The compute portion of Clearwater Forest consists of twelve 24-core chiplets, totaling 288 cores. Internally, each 24-core block appears to be composed of core modules containing four cores and 4 MB of L2 cache. Six such modules form a single core die. These twelve core dies are grouped into three sets of four and mounted on three base dies. The base dies function as L3 cache memory and are manufactured on the Intel 3 process. Each base die is elongated in shape, and the bridge connecting four core dies likely houses the DDR memory controller. The I/O dies are produced on the Intel 7 process and are positioned at both ends of the row of three base dies. Since they support PCIe Gen5, they may share similarities with Sierra Forest.
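Just to sanity-check the arithmetic for myself, here is a minimal sketch that treats the module, die, and base-die counts above as reported assumptions rather than confirmed specifications:

```python
# Rough arithmetic for the reported Clearwater Forest topology.
# All counts below come from the description above and are
# report-based assumptions, not confirmed Intel specifications.
cores_per_module = 4          # 4 cores sharing 4 MB of L2
modules_per_core_die = 6      # six modules per core die
core_dies = 12                # twelve core dies in total
core_dies_per_base_die = 4    # grouped four per base die
base_dies = core_dies // core_dies_per_base_die

cores_per_die = cores_per_module * modules_per_core_die   # 24
total_cores = cores_per_die * core_dies                    # 288
l2_total_mb = 4 * modules_per_core_die * core_dies         # 4 MB per module

print(f"{base_dies} base dies, {cores_per_die} cores/die, "
      f"{total_cores} cores, {l2_total_mb} MB L2 in total")
# -> 3 base dies, 24 cores/die, 288 cores, 288 MB L2 in total
```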

While Sierra Forest integrated 144 cores into a single large die, Clearwater Forest adopts a base-die plus 3D packaging approach, appearing to represent significant progress in chiplet architecture. It is unclear whether Intel plans to produce a 144-core variant of Clearwater Forest, but in principle this could be achieved by keeping the same base dies and using only six core dies instead of twelve. It is also worth noting that Xeon 6E and Xeon 6+E are socket-compatible.


MARCH 3, 2026

Following on from yesterday, topics from MWC (Mobile World Congress) continue. Qualcomm has reportedly announced the Snapdragon Wear Elite for wearable devices. Although designed for wearables, it is equipped with an NPU. The integrated Hexagon NPU is said to support models with up to 2 billion parameters and deliver performance of around 10 tokens per second (TPS). It is expected to be used as an edge AI device.

The Fujitsu FUJITSU-MONAKA processor (which I will simply call MONAKA from here on), a recurring topic in recent days, was apparently showcased at MWC in the form of a wafer. MONAKA is a chiplet-based design consisting of three dies: a computing die (TSMC 2 nm), a cache die (TSMC 5 nm), and an I/O die (TSMC 5 nm). When I first saw the photo, I wondered which die the wafer represented. It turns out that it was a wafer in which computing dies had been directly bonded onto cache dies. IBM showed a 2 nm GAAFET wafer several years ago, and Intel displayed an 18A wafer at CES in January. So perhaps this was intended to demonstrate a wafer with 2 nm chips mounted onto a 5 nm wafer. I found that somewhat surprising.

There were also several topics related to silicon photonics. NVIDIA has reportedly invested $2 billion each in Lumentum and Coherent. Communication power consumption inside AI factories is expected to exceed that of traditional data centers. To address this, NVIDIA has been developing its Spectrum Ethernet switch chips. It will be interesting to see what kind of progress results from these investments.

In another development, Ayar Labs — a startup working on co-packaged optics (CPO) — has reportedly raised $500 million in funding, with support from NVIDIA and AMD. This seems to reflect the growing view that CPO will be essential for reducing communication power consumption within AI data centers. One member of Ayar Labs’ board is former Intel CEO Pat Gelsinger. The company is also said to be conducting joint development efforts with Intel and TSMC.

Another Intel-related topic: Intel has reportedly formed a partnership with India’s Infosys. India-related developments have been frequent lately, and Intel now appears to be part of that trend. The collaboration involves building enterprise AI solutions based on Infosys’ “Infosys Topaz” platform, using Intel Xeon processors, Gaudi accelerators, and AI PCs. This seems aimed at enabling AI to support corporate operations. Personally, I imagine that instead of simply streamlining expense reporting for business trips, AI might eventually handle trip planning and arrangements themselves — at least that is my hope.


MARCH 4, 2026

Meanwhile, the ongoing conflict involving the United States, Israel, and Iran is casting a significant shadow over data centers and cloud services. AWS facilities at three locations in the UAE and Bahrain have reportedly suffered damage, leading to degraded service in one region. Articles over the past few days have suggested that data centers were hit by drone attacks. If regional power grids are also affected, cloud services in those areas could become unusable. AWS is said to be advising customers to prepare failover strategies to alternative regions.

Historically, the internet is often described as having originated from U.S. military technology. Its precursor networks were designed so that even if one location were attacked, traffic could survive via alternative communication paths, and the “World Wide Web,” represented by the familiar “www” at the start of URLs, was later built on top of that infrastructure. Today’s internet underpins everyday life, so attacks on networks or data centers effectively amount to attacks on daily living itself. It is no longer sufficient that “some part survives.” Ideally, connectivity and communication would help create a world in which such attacks do not occur at all.

It is well known that Intel operates manufacturing facilities in Israel. The CPU division that developed Centrino (the Banias core) in the 2000s is also located there. Mellanox, which NVIDIA acquired around 2020, was likewise an Israeli company and likely continues to exist as a division within NVIDIA.

Returning to the usual technology topics, Akamai — the world’s largest content delivery network (CDN) provider — has reportedly introduced several thousand NVIDIA Blackwell GPUs. These are deployed in NVIDIA RTX PRO servers equipped with BlueField-3 DPUs. Rather than concentrating them in a single location, Akamai plans to distribute them across network points worldwide to enable decentralized cloud computing. The company is said to be promoting a vision of a globally distributed AI computing grid. I am not entirely sure, but it reminds me somewhat of Cisco’s earlier concept of fog computing.

Next, an AMD-related development. According to CEO Lisa Su, demand for CPUs has surged beyond expectations. She reportedly mentioned this during a meeting with analysts, noting that the proportion of inference workloads is increasing and driving unexpected CPU demand. It is unclear whether this refers specifically to Ryzen or EPYC CPUs, or to APUs combining CPUs and NPUs. As agent-based AI becomes more prevalent, CPU demand is likely to grow. There was also a report that AMD has partnered with U.S.-based EMS provider Flex to assemble GPU platforms, with Flex handling assembly of the Instinct platform. Given that GPU demand is already strong, this likely reflects steady expansion of manufacturing capacity.

Another AMD-related note: a report indicated that MiTAC’s AMD Instinct MI350X servers use diamond-based cooling. Specifically, a sheet containing synthetic diamond material is placed between the GPU and the heatsink. While graphene sheets — also carbon-based — already exist and offer higher thermal conductivity than diamond, they conduct electricity, posing short-circuit risks if introduced into electronic circuits. Graphene powders may also present health concerns, making them difficult to handle as materials. Semiconductor and electronics manufacturing typically evaluates materials under ESH (Environment, Safety, and Health) criteria to ensure compliance with environmental, safety, and health standards.

Finally, an Intel governance update: Intel’s chairman is expected to step down at the shareholder meeting in May. The incoming chairman reportedly has prior executive experience at Qualcomm and Google. Under the current chairman, Intel separated its manufacturing operations to establish Intel Foundry. The new chairman is expected to maintain Intel’s position as an integrated device manufacturer.


MARCH 5, 2026

It appears that NVIDIA has halted production of the H200 GPUs that were intended for export to China and has instead redirected manufacturing capacity toward Vera Rubin. Reports suggest that around 250,000 H200 chips are currently sitting in inventory. Although export approvals exist between the United States and China, various restrictions remain in place, including a reported cap limiting Chinese companies to purchasing up to 75,000 chips.

The H200 is based on a TSMC 5 nm process paired with HBM3 memory, while Vera Rubin is expected to use a 3 nm process with HBM4. This means both manufacturing lines and component supply chains differ. If production is taking place at TSMC Fab 18, which supports both 5 nm and 3 nm nodes, adjustments to manufacturing plans might be relatively manageable. However, it is unclear which fab is actually being used.

In another NVIDIA-related development, reports indicate increased production of the RTX 3060. As part of the Ampere generation, it is believed to be manufactured using Samsung’s 8 nm process, whereas Ampere A100 GPUs produced at TSMC used a 7 nm node. Memory shortages appear to have reduced the availability of consumer GPUs in the market, and the RTX 3060 is reportedly being shipped together with memory as part of installation kits.

Turning to Intel, the Intel 18A process has been regarded primarily as an internal node for products such as Panther Lake, Xeon 6+ (Clearwater Forest), and next-generation Xeon 7. However, reports suggest that Intel has begun reconsidering external foundry expansion for this node. Toward the end of last year, there were indications that NVIDIA GPUs would not adopt 18A, reportedly due to low yields. It was also suggested that Intel Foundry would begin offering external customers access starting from the 14A node. Now, there are reports that an enhanced variant called 18A-P will be made available, implying that manufacturing progress has improved both performance and yield.

From a broader market perspective, Broadcom has begun offering XDSiP services, starting with FUJITSU-MONAKA. Chiplet technologies have previously been used mainly in chips designed for internal deployment or self-sales by companies such as Intel, AMD, and NVIDIA, or for internal use by hyperscalers like AWS. Broadcom’s move to provide 3.5D chiplet-based ASIC development as a commercial service represents a significant shift. Intel Foundry may likewise be accelerating efforts to capture demand for ASICs built with 3D stacking and chiplet configurations using EMIB and Foveros.


MARCH 6, 2026

It seems that MWC concluded today (March 5 local time). Reports suggest that Samsung may release smart glasses within this year. The mention of smart glasses brings to mind the early days of Google Glass. The initial consumer version was discontinued in 2015, and although industrial versions continued afterward, they were reportedly phased out around 2023. Now Samsung appears ready to reenter the space. Samsung’s smart glasses are said to recognize what the wearer is looking at and display related information on a connected smartphone screen. One reason Google Glass was discontinued was that its built-in camera raised privacy concerns among the public. Ten years later, in an era dominated by viral content on Instagram, X (formerly Twitter), and TikTok, one wonders whether public attitudes toward privacy have changed.

Such a system — capturing visual context via smart glasses and transferring it to smartphones for information retrieval — will likely increase network traffic. In the opening keynote at MWC, Qualcomm CEO Cristiano Amon reportedly predicted that as AI agents become widespread, global network traffic could increase by three to seven times by 2034, with roughly 30 percent attributable to AI-driven communications. Communication standards will need to evolve toward higher speed and capacity, and Qualcomm-led proposals for 6G and Wi-Fi 8 were reportedly highlighted at MWC. Communications and computing are like the two wheels of a vehicle — both must advance together.

Returning to NVIDIA topics, Texas Instruments and NVIDIA have reportedly formed a partnership in robotics. TI will provide millimeter-wave sensing technology, while NVIDIA will contribute AI capabilities via Jetson Thor (built on TSMC’s 4 nm process). Integration will be achieved through the NVIDIA Holoscan Sensor Bridge system. Since the target applications involve robotics operating in environments such as indoor facilities, warehouses, or outdoor fields, this appears to be an initiative in physical AI focused on environmental sensing. In terms of the commonly described five-layer AI stack, TI sensors would represent the infrastructure layer, Jetson Thor the hardware layer, data aggregation would occur in the cloud layer, training enhancements would take place using Omniverse-based models, and the ultimate goal would be remote operation or autonomous robotics. Jetson Thor has reportedly already been adopted by companies such as Boston Dynamics, so combining drivetrain expertise with new sensing capabilities could help advance autonomy.

Another NVIDIA-related development is that the company reportedly intends not to further increase investments in AI startups such as OpenAI and Anthropic. This suggests a reassessment of financial relationships with such startups, possibly because the scale of previous investments has already become sufficiently large. Startups eventually grow beyond the “startup” phase.

Turning to AI, data centers, and regulation, SambaNova — which Intel had previously considered acquiring before opting for partnership — has reportedly introduced the SN50, positioned as well suited for building agent-based AI systems. While the semiconductor process has not been disclosed, the previous-generation SN40L used TSMC’s N5 node, leading to speculation that the SN50 may use a 3 nm process. Power consumption is said to reach a maximum of 30 kW per rack, but can be reduced below 20 kW with slightly lower performance, enabling air-cooled deployments. SoftBank is reportedly a customer, with plans to install the systems in Japanese data centers.

The U.S. government is also expected to tighten export controls on AI chips, likely covering GPUs from NVIDIA and AMD. Details remain unclear, but for example, purchasing 1,000 units of a hypothetical GB300 system might involve relatively lighter screening, whereas large-scale procurements could require prior approval from U.S. authorities. Performance-based thresholds similar to those used in the Wassenaar Arrangement may be introduced. Previously about 40 countries were targeted, but reports suggest the scope could effectively expand worldwide. Approval processes may involve explanations of intended use as well as acceptance of inspections. Companies such as Broadcom and Qualcomm may also be affected. While U.S. vendors could face constrained business opportunities, some observers believe this could create openings for Chinese firms such as Huawei.

Around the end of last month, Anthropic reportedly declined a request from the U.S. Department of Defense to allow unrestricted use of its Claude AI system. As a result, the U.S. government is said to have designated Anthropic as a “supply chain risk,” effectively issuing a warning regarding its use. AWS, which provides access to Claude, reportedly stated that usage unrelated to defense applications would not be affected, and that alternative solutions would be offered if necessary. It is also understood that the Department of Defense had already shifted certain contracts from Anthropic to OpenAI.

Another AWS-related development concerns the earlier drone attacks on data center facilities in the Middle East (UAE and Bahrain). Iran has reportedly issued a statement acknowledging the attacks as intentional, claiming the objective was to assess whether the facilities were supporting U.S. military operations. Reports also suggest that the United States retaliated by striking two data centers in Tehran.

Finally, AI cloud provider IREN has reportedly decided to deploy 50,000 NVIDIA B300 GPUs in locations including British Columbia in Canada and Texas in the United States. The company previously focused on Bitcoin mining and will continue mining at a reduced scale while shifting toward AI cloud services. Although both involve GPUs, the industries are fundamentally different, so such a strategic pivot is notable. During the NVIDIA Pascal and Volta generations, GPU mining activity was widespread, and one occasionally wonders where all those GPUs eventually ended up.


MARCH 8, 2026

It seems that die photos of Intel’s Panther Lake have surfaced. The chip appears to have been physically removed from the CPU package, and the wiring layers were then stripped away starting from the bump side. The purpose of such removal likely varies from person to person — for example, to verify whether the chip structure matches publicly disclosed specifications, or to infer undisclosed design details.

Since this type of teardown falls under reverse engineering, it is often prohibited by contract. However, if there are no contractual restrictions, the act of disassembly itself is apparently not illegal. If the chip is removed layer by layer from the bump side, the lowest layer would eventually reach the transistors. Given the nanometer-scale fabrication, observing transistors directly would require an electron microscope. Even without going that far, viewing the upper interconnect layers should provide a rough sense of functional blocks such as cores and memory.

In the past, power wiring was typically wider than signal wiring, which likely served as a visual cue when identifying block boundaries. However, Panther Lake’s computing tile is built on Intel 18A with BSPD (backside power delivery), so functional boundaries may be harder to distinguish than before.

Panther Lake is said to consist of three chiplets and one base die, but six chiplet variants and three base die variants are reportedly being manufactured. Specifically, there are two types of computing tiles (16-core and 8-core, both on Intel 18A), two types of graphics tiles (12 Xe and 4 Xe, on TSMC N3E and Intel 3 respectively), and two types of platform control tiles (PCIe configurations with 20 lanes or 12 lanes, on TSMC N6). With these combinations, three final product SKUs are expected, resulting in three corresponding base tiles.

Although each of the three chiplet types has two variants — theoretically enabling eight combinations — only three products will actually be offered. Even in the monolithic era up through Raptor Lake, Intel created different dies depending on core count, so this approach may simply reflect a similar product segmentation strategy.
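Out of curiosity, the combinatorics can be written out as follows; the tile options are the ones reported above, and which three combinations actually ship is not something I can confirm:

```python
from itertools import product

# Reported tile options (taken from the articles above, not confirmed).
compute_tiles  = ["16-core (Intel 18A)", "8-core (Intel 18A)"]
graphics_tiles = ["12 Xe (TSMC N3E)", "4 Xe (Intel 3)"]
platform_tiles = ["20-lane PCIe (TSMC N6)", "12-lane PCIe (TSMC N6)"]

combos = list(product(compute_tiles, graphics_tiles, platform_tiles))
print(len(combos), "theoretical combinations")   # -> 8
for c in combos:
    print(" + ".join(c))
# Only three of these combinations are reportedly offered as products,
# each paired with its own base tile.
```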

Turning briefly to social topics, the February U.S. employment report released on Friday delivered a surprising result. While forecasts had pointed to job growth of around 54,000, the actual figure showed a decline of 92,000 jobs. One contributing factor cited was labor disputes involving healthcare workers in California. More broadly, the growing adoption of AI is also seen as having an impact. From January to February, layoffs appear to have increased year-over-year in fields such as computing, data processing, and electronic components.

I have written several times recently that companies may eventually expand by effectively using AI and thereby create more employment opportunities. However, it will likely take time for such a cycle to emerge, and in the meantime there will probably be fluctuations in employment due to job transitions.

This leads me to consider how efficient it actually is for AI to replace human labor. Current AI data center deployments are being discussed in gigawatt-scale terms. Meanwhile, the human brain is often said to consume about 20 W of power. This figure has been cited for roughly 15 years and is likely based on the brain’s daily caloric consumption, typically estimated at 350–450 kcal. This probably varies more with body size than intelligence. Since 1 kcal corresponds to about 1.16 Wh, assuming a daily brain consumption of 413 kcal translates to about 480 Wh per day. Dividing by 24 hours yields an average of approximately 20 W.

However, this represents an average. Energy consumption must differ between sleep and wakefulness, and even during waking hours between rest and work. Suppose we assume 10 W during the eight hours of sleep and 25 W during the sixteen waking hours, which still totals about 480 Wh per day. If we further assume that eight of those waking hours of ordinary activity correspond to 20 W, then the remaining eight hours of focused work would be around 30 W. In other words, this model suggests that the human brain’s sustained working power might be approximated as 30 W over an eight-hour workday. This makes comparison with electronic systems easier.
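Writing the arithmetic out, with the kcal figure and the sleep/wake split above treated as stated assumptions:

```python
KCAL_TO_WH = 1.162            # 1 kcal is roughly 1.162 Wh

daily_brain_kcal = 413                        # assumed daily brain energy use
daily_wh = daily_brain_kcal * KCAL_TO_WH      # ~480 Wh per day
avg_power_w = daily_wh / 24                   # ~20 W average

# Assumed split: 8 h of sleep at 10 W plus 16 h awake at 25 W gives the
# same ~480 Wh/day; the waking hours then split into 8 h of ordinary
# activity at 20 W and 8 h of focused work.
awake_wh  = 16 * 25
focused_w = (awake_wh - 8 * 20) / 8           # -> 30 W while working

print(f"{daily_wh:.0f} Wh/day, {avg_power_w:.1f} W average, "
      f"{focused_w:.0f} W during focused work")
```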

If a human brain operates at roughly 30 W, then an AI workload consuming 1 kW corresponds to the energy consumption of about 33 people. This does not imply equivalent productivity — it is purely an energy comparison. Furthermore, since the human limit of 30 W applies to roughly eight hours, an AI workload running continuously at 1 kW for 24 hours would correspond to about 100 person-equivalents in terms of energy consumption. Would an AI system running at 1 kW truly perform the work of 100 people? One might hope so, but the answer feels somewhat uncertain.

This raises the question of whether it is possible to measure both workload and power consumption in a comparable way. Around the middle of last month, I wrote that running OpenAI Codex on Cerebras’ WSE-3 at around 1,000 tokens per second (TPS) produced responsiveness comparable to human pair programming. The WSE-3 reportedly consumes about 18 kW (or around 23 kW for the full CS-3 system). Operating for eight hours would correspond to the energy consumption of roughly 600 people, or about 1,800 people over 24 hours. Could such a system realistically perform the equivalent pair-programming work of 600 or even 1,800 individuals? In terms of concurrent user accounts, numbers of that scale might be feasible.
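Continuing the same rough bookkeeping in code (purely an energy comparison, not a productivity claim, with the roughly 18 kW WSE-3 figure taken from the report above):

```python
BRAIN_WORK_W = 30      # assumed sustained "working" power of one brain
WORK_HOURS   = 8       # hours in one person-day of focused work

def person_equivalents(load_w: float, hours: float) -> float:
    """Energy used by an AI load over `hours`, expressed in units of one
    person's 8-hour, 30 W working day (energy only, not productivity)."""
    person_day_wh = BRAIN_WORK_W * WORK_HOURS        # 240 Wh
    return load_w * hours / person_day_wh

for load_w, hours in [(1_000, 8), (1_000, 24), (18_000, 8), (18_000, 24)]:
    print(f"{load_w/1000:g} kW for {hours:>2} h  ~ "
          f"{person_equivalents(load_w, hours):,.0f} person-days of energy")
# -> ~33, 100, 600, and 1,800 person-equivalents respectively
```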

From this perspective, current AI workloads appear to consume significantly more energy than humans, and would need to deliver productivity equivalent to many individuals in order to justify their cost. As generations advance, the gap may narrow, eventually producing a more intuitive sense of practicality. For example, if AI workloads could be measured in tens of watts while performing work equivalent to several people, they might be considered roughly comparable to human capability.

Today, token cost per GPU is a common metric. However, if AI workload power consumption and productivity could be compared directly with human labor, then alongside progress in 3D scaling as defined by IRDS, something resembling a new version of Moore’s Law might emerge. Perhaps this is overly optimistic, but it is an intriguing possibility.

Finally, from a different angle: a film about TSMC is scheduled to be screened in Japan. It was reportedly shown in Taiwan last year. The original title is “造山者 世紀的賭注” (literally “Mountain Builders: A Century’s Gamble”), while the Japanese release is titled Chip Odyssey: Taiwan’s Gamble. Screenings are planned for March in Tokyo and April in Kumamoto.


MARCH 9, 2026

Speaking of which, NVIDIA’s GTC will take place this month, and Samsung is reportedly planning to showcase SOCAMM2 memory. Each memory module is said to support up to 192 GB. SOCAMM stands for Small Outline Compression Attached Memory Module. A Compression Attached Memory Module (CAMM) refers to a memory module that is attached using a compression-based mounting mechanism. In other words, while DIMMs are inserted into slots, CAMMs are fastened with screws.

DIMMs are constrained in the number of contacts they can provide by the length of their edge connector, whereas CAMMs can increase contact density across a surface, potentially offering an advantage in terms of area efficiency — although installation may be more cumbersome. CAMMs come in various shapes and sizes, but SOCAMM is NVIDIA’s standardized format, unified into a rectangular form factor measuring 14 mm by 90 mm. With SOCAMM2, support is expected to move from LPDDR5X to LPDDR6, increasing capacity from 128 GB to 192 GB.

A size of 14 mm by 90 mm is somewhat longer and narrower than an M.2 2280 module (22 mm by 80 mm). Incidentally, a stick of chewing gum is roughly 21 mm by 76 mm.

Continuing with NVIDIA-related topics, there was an article stating that NVIDIA has invested in Nscale, a UK-based AI startup. Nscale is reportedly a spin-out from a cryptocurrency mining operator. As a provider of compute resources, the company is said to be building multiple data centers across Europe. While the amount of NVIDIA’s investment has not been disclosed, Nscale itself appears to have raised around $2 billion. I recall reading somewhere that NVIDIA is becoming something like a bank — and in a sense, that may not be far from the truth.

Earlier this month, there were also reports that NVIDIA is increasing its investments in silicon photonics. Meanwhile, Google is said to have placed large orders with Taiwan’s Innolux for optical interconnect technologies. Google uses optical switches known as Apollo OCS (Optical Circuit Switch) in its backbone networks. Apollo employs MEMS-actuated mirrors to switch optical paths. Although MEMS-based switching is not extremely fast, backbone networks do not require frequent switching, making such technology viable. These switches are reportedly used in systems from Google TPUv4 through TPUv7.

Companies such as Lumentum and Coherent, which NVIDIA is said to be investing in, also handle similar MEMS-based optical switching technologies. NVIDIA currently uses co-packaged optics (CPO) in its Spectrum-X Ethernet and Quantum-X InfiniBand platforms, but it may also be considering the use of OCS for backbone switching. Packet switching in the optical domain could reduce power consumption. Lower communication power requirements appear to be one of the enablers for scaling AI data centers.

On a somewhat nostalgic note, Microsoft CEO Satya Nadella reportedly commented that Windows would not have been born without Intel, and Office would not have been born without Apple. This was said during a presentation at a Morgan Stanley conference held last week. IBM’s personal computer, which adopted Intel’s 8088 processor (a close variant of the 8086), used an OEM version of MS-DOS known as PC-DOS. As IBM PC clones spread worldwide, MS-DOS effectively became a de facto standard. Around the time the processor lineage reached the 80386, Windows became practically usable. These developments feel quite nostalgic.

As for Office, Excel was originally something people used on the Mac. I personally did not own a Mac, but there was a Mac in the office at the company where I later worked, dedicated to running Excel. I seem to recall that calculations were done in Excel and graphs were generated using CricketGraph. My memory is somewhat vague, but before joining the company I used a DOS-based spreadsheet called Multiplan. Lotus 1-2-3 was also available on the market at the time. Eventually, Excel became available on Windows, and it was bundled into the MS-Office package. That, too, is now a nostalgic story.

Since last week, articles covering the Morgan Stanley conference have been appearing almost daily. In addition to the points mentioned earlier — such as AMD CEO Lisa Su’s remarks about surging CPU demand and Intel’s interest in offering Intel 18A-P to external foundry customers — NVIDIA CEO Jensen Huang reportedly said that constraints such as RAM shortages are actually welcome, since customers tend to make more deliberate purchasing decisions under such conditions. Many industry leaders appear to have spoken at the conference.


MARCH 10, 2026

There was also an article stating that NVIDIA has partnered with a startup called Thinking Machines Lab. The company appears to be a spin-out from OpenAI and is said to focus on developing AI systems designed for collaboration with humans rather than full autonomy. Although the financial details have not been disclosed, the startup has reportedly received investment from NVIDIA. Plans are said to include building a 1-GW Vera Rubin data center, though details remain unclear.

In another NVIDIA-related development, Sustainability Magazine reportedly hosted a meeting involving corporate executives together with AWS and NVIDIA. Discussions focused on using AI models to predict the impact of climate change on infrastructure and supply chains in order to mitigate business risks. The assumption seems to be that AWS would provide the cloud platform while NVIDIA’s Earth-2 models would be used for simulation. I believe Earth-2 began rolling out in late January.

Predicting how climate change may affect regional economies — and enabling companies to make individual decisions to avoid such risks — appears to be the objective. Pure weather forecasting tends to remain in the public domain and may lack business expansion potential. However, if risk prediction, mitigation strategy design, and impact analysis on corporate performance can be computed in an integrated manner, this could clearly address corporate needs. One might describe this as a kind of enterprise-level cyber-physical system.

Continuing with AWS-related news, there were reports that AWS has introduced an external key management system (KMS) from Eviden in Europe. This is said to be one of the features of AWS’s European Sovereign Cloud. Eviden is a subsidiary of Atos. Atos is a French IT company that also handles supercomputers — I believe its HPC division used to be known as Bull. By avoiding storage of master keys in the cloud itself, the system is said to enhance security.

Finally, regarding Intel: an open-source project related to Intel’s Optane memory has reportedly been archived. Intel had already withdrawn from Optane hardware around 2022, while Micron, its co-developer, had exited the collaboration even earlier. Optane used phase-change materials based on chalcogenides as its storage element, and the term PCM (Phase Change Memory) appeared frequently in articles at the time.

The memory structure — forming storage elements at the intersections of orthogonal wiring layers — was relatively easy to conceptualize. Intel and Micron had positioned Optane as a technology to bridge the memory hierarchy gap between DRAM and flash. Around that period, similar structures were discussed under names such as RRAM and MRAM, but industrially speaking, MRAM (which can be manufactured at TSMC) may be the only one that has remained viable.


MARCH 11, 2026

It seems that IBM has once again partnered with Lam Research to develop sub-1 nm transistors. About five years ago, when IBM announced its so-called 2 nm GAA (gate-all-around) transistor, the work was also conducted under a joint development framework with Lam Research.

Current EUV patterning relies on wet-processed (wet-developed) resists, but wet processing appears to make it difficult to adapt to High-NA EUV. By combining Lam Research’s dry resist technology with High-NA EUV, the goal is reportedly to enable sub-1 nm patterning without sacrificing throughput. The research facility involved is said to be the IBM Research NY Creates Albany NanoTech Complex in Albany, New York State. If I recall correctly, engineers from Japan’s Rapidus also traveled to IBM in Albany for EUV training.

As a side note, 1 nm corresponds to 10 angstroms. Recently, node naming such as Intel’s 18A or TSMC’s A14 has shifted to angstrom-based terminology. The newly discussed sub-1 nm range would therefore imply single-digit angstroms. However, silicon’s crystal lattice constant is 5.431 Å. Could channel lengths (minimum feature sizes) shorter than the lattice constant really exist? That seems unlikely.

The reason single-digit angstrom node names are possible is that node naming has diverged from actual channel length or minimum line width. I described that historical background in my blog entry from February 8.

Turning to today’s NVIDIA news: there are reports that NVIDIA has invested $2 billion in the AI startup Nebius. The company plans to deploy AI data centers exceeding 5 GW by 2030. This suggests that NVIDIA’s business model is no longer simply to develop GPUs and sell them, but to develop them, provide investment, ensure their utilization, enable customers to generate profits, and then recover returns from that ecosystem. In this sense, GPUs and racks may be little more than tools for expanding the AI ecosystem. That may explain why NVIDIA continues searching for startups to fund. Perhaps no one will be able to catch up anymore — although AMD is making strong efforts.

Another side note: I once heard a metaphor that the god of opportunity has thick hair in the front but is bald in the back. If you fail to grab the hair as the opportunity approaches from the front, turning around afterward will not help — your hand will simply slip off the smooth back of the head as the opportunity disappears. It is a metaphor suggesting that those who recognize opportunities first gain the greatest rewards, while late imitators often lose out. Perhaps NVIDIA has managed to seize the god of opportunity by the hair.


MARCH 12, 2026

There are reports that autonomous driving trials in central Tokyo will begin, led by Nuro, a company backed by investors including NVIDIA, Toyota, and Uber. Nuro already operates robotaxi services in places such as San Francisco. Tokyo presents unique challenges, however: its streets are narrow, and traffic runs on the left. As a result, training data and experience from overseas deployments may prove only partially applicable. The goal is said to be Level-4 autonomous driving.

In another development related to autonomous driving, Qualcomm has reportedly partnered with the UK startup Wayve on ADAS (advanced driver assistance systems). The collaboration combines Wayve’s AI Driver with Qualcomm’s Snapdragon Ride platform. The idea is to provide a pre-integrated system that allows automakers to simplify the implementation of AI-based driving functions across multiple vehicle models. This integration is expected to enable progression from hands-off driving toward eyes-off capabilities, corresponding to Level-3 autonomy. Qualcomm appears to be placing increased emphasis this year on robotics and autonomous driving, moving in a direction similar to NVIDIA’s.

In a somewhat unusual move, Meta has announced a new series of chips for both training and inference. The lineup includes MTIA 300, 400, 450, and 500, with the newly revealed chip believed to be the MTIA 400. The MTIA 300 is reportedly already in mass production. The ASIC design work is being handled by Broadcom, with plans to release successive generations every six months starting from the 400 series.

The MTIA 400 appears to consist of two main compute dies placed side by side. Each die has two HBM stacks above and below it, for a total of eight HBM stacks. Judging from the apparent dimensions — each compute die looks roughly equivalent in area to six HBM stacks — the die size may be around 600 mm². The two dies are arranged through parallel translation rather than rotational symmetry. The left edge seems to connect to a PCIe chiplet, while the right edge connects to two network chiplets.
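For what it is worth, the die-size guess can be reproduced with a crude footprint estimate; the HBM footprint I plug in (roughly 11 mm by 10 mm per stack) is my own assumption, not something from the reports:

```python
# Crude die-size estimate from the observation that one compute die looks
# roughly as large as six HBM stacks. The HBM footprint is an assumed
# round number, not a figure from the reports.
hbm_w_mm, hbm_h_mm = 11.0, 10.0           # assumed footprint of one stack
hbm_area_mm2 = hbm_w_mm * hbm_h_mm        # ~110 mm² per stack

compute_die_mm2 = 6 * hbm_area_mm2        # "about six HBM stacks' worth"
reticle_limit_mm2 = 858                   # ~26 mm x 33 mm exposure field

print(f"~{compute_die_mm2:.0f} mm² per compute die "
      f"(reticle limit ~{reticle_limit_mm2} mm²)")
# -> ~660 mm², in the same ballpark as the ~600 mm² guess above
```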

The naming beginning at 300 likely reflects earlier internal generations — MTIA1 (built on TSMC 7 nm) and MTIA2i (on TSMC 5 nm), corresponding to the 100 and 200 series. From the 300 generation onward, TSMC 3 nm technology may be in use.

Turning to AMD, there are reports that CEO Lisa Su will visit South Korea. She is expected to discuss memory supply with Samsung. The MI400X GPU will likely require twelve HBM4 stacks. She is also scheduled to visit Naver, where discussions related to AI computing infrastructure may take place. NVIDIA GPUs, if I recall correctly, already have deployment plans reaching around 260,000 units by 2030. AMD appears to be following NVIDIA’s lead.

In the quantum computing space, IBM has announced a reference architecture for QCSC (Quantum-Centric Supercomputing). This architecture consists of four layers: applications, application middleware, system orchestration, and hardware infrastructure. For example, within conventional CPU- and GPU-based quantum workflows, tasks requiring genuinely quantum execution results could be offloaded to quantum processing units (QPUs). The architectural validity of this approach is said to have been demonstrated through joint research involving IBM’s Quantum Heron processor and Japan’s Fugaku supercomputer, as well as other international collaborations.
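A very rough sketch of how that offloading might look from the classical side; the function names here are invented placeholders, not IBM's actual middleware APIs:

```python
# Hypothetical sketch of a quantum-centric supercomputing workflow:
# classical CPU/GPU code runs the outer workflow and offloads only the
# genuinely quantum sub-problem to a QPU. All function names below are
# invented placeholders, not IBM's real orchestration or middleware API.

def classical_preprocess(problem):
    # e.g. reduce the problem to a small circuit description (middleware layer)
    return {"num_qubits": 4, "depth": 10, "params": [0.1, 0.2]}

def run_on_qpu(circuit_spec):
    # stand-in for a call into the quantum hardware layer;
    # here we just fake a measurement distribution
    return {"0000": 0.7, "1111": 0.3}

def classical_postprocess(counts):
    # e.g. feed results back into an HPC simulation running on CPUs/GPUs
    return max(counts, key=counts.get)

problem = "some chemistry or optimization instance"
spec = classical_preprocess(problem)      # application / middleware layers
counts = run_on_qpu(spec)                 # orchestration hands off to the QPU
print(classical_postprocess(counts))      # result returns to the classical side
```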

Finally, regarding semiconductors and geopolitics: supply reductions of helium gas and bromine — both used in semiconductor manufacturing — are reportedly occurring due to the ongoing conflict involving the United States, Israel, and Iran. Helium is exported from Qatar, while bromine is exported from Israel. South Korea and Taiwan are said to be affected to some degree but have secured sufficient supply for the time being. However, prolonged conflict remains a concern. Semiconductor manufacturing depends on global supply chains, so disruptions in one region can easily create ripple effects elsewhere. Multi-sourcing strategies provide some resilience, but they also have their limits.


MARCH 13, 2026

By the way, NVIDIA’s GTC 2026 will begin next week. I wonder if the “five-layer cake theory of the AI industry” will appear again this year. On the GPU front, the next-generation architecture known as Feynman is expected to be announced.

What is drawing particular attention is inference chips, especially those based on Groq technology, which NVIDIA acquired last year. It is said that NVIDIA’s $20 billion acquisition of Groq has dramatically increased the valuation of startups developing NPUs for inference. When Intel acquired Nervana and Habana Labs, the deals were around $400 million (2016) and $2 billion (2019), respectively. In that sense, after the Groq acquisition, inference-focused NPU startups may no longer be easily acquired.

Another point to watch is whether a successor to the Vera CPU will be announced. Since Grace and Vera CPUs are expected to be shipped as standalone products to Meta, demand for Arm-based CPUs could increase going forward.

Another NVIDIA-related development: NVIDIA has reportedly partnered with Palantir Technologies. Palantir provides data analytics tools and has been involved in aggregating data for the U.S. government. The partnership appears to involve using NVIDIA’s Blackwell GPUs for the infrastructure layer of Palantir’s AIOS-RA (AI OS Reference Architecture). This will likely be formally announced at GTC.

Palantir has built services deeply integrated into government operations, and the emerging structure seems to be one in which AI capabilities are embedded into those service platforms, while NVIDIA provides the underlying AI infrastructure. This could enable rapid AI-assisted data analysis and decision support.

Turning to Intel, there are reports that the company has commented on CPU shortages. Production appears to be shifting toward high-end and mid-range products for hyperscalers and AI data centers. In addition to cutting-edge CPUs built on Intel 3 and Intel 18A processes, supply constraints may also affect Raptor Lake generation products manufactured on Intel 7. Entry-level CPUs for consumer boxed products, notebooks, and Chromebooks are likely to be impacted.

There were articles in early February stating that both Intel and AMD had informed Chinese manufacturers of CPU delivery delays. It seems that situation is continuing. While I do not have exact figures, Intel may have been exporting a significant volume of low-end CPUs for mini-PCs. The shift toward higher-end products may have reduced supply availability for China.

In silicon photonics news, a group called OCI MSA (Optical Compute Interconnect Multi-Source Agreement) has reportedly been established. Participants are said to include AMD, Broadcom, Microsoft, Meta, NVIDIA, and OpenAI. The group appears to consist of chip vendors deploying co-packaged optics (CPO), such as NVIDIA’s Spectrum-X and Broadcom’s Tomahawk, along with hyperscalers and AI cloud providers building AI data centers with those technologies.

Standardizing specifications and enabling multi-sourcing at this stage seems important. Currently, optical links operate at around 50 Gbps × 4 wavelengths, delivering 200 Gbps per fiber (one direction). The roadmap aims to increase this to 800 Gbps and ultimately to 3.2 Tbps per fiber. The scope is expected to cover not only CPO but also pluggable optical modules.

A brief note on Apple: the company will celebrate its 50th anniversary on April 1, and numerous related articles have appeared. Apple has also reportedly extended its agreement with Qualcomm regarding the Snapdragon 5G modem-RF system. Some reports suggest delays in Apple’s efforts to develop its own 5G modem.

Finally, AWS news: AWS and Cerebras have reportedly formed a partnership. Cerebras has announced that its CS-3 system will become available through AWS Bedrock. The arrangement appears to involve role separation between AWS Trainium and Cerebras’s WSE-3 to enable high-speed inference.

Inference workloads typically involve two stages: prefill (“understanding the prompt”) and decode (“generating the response”). Prefill is primarily compute-bound and will be handled by Trainium3. Decode is memory-bandwidth-bound, leveraging the exceptional bandwidth of the WSE-3, which stores model weights in SRAM to generate tokens. While GPUs may handle hundreds of tokens per second, this architecture is said to achieve several thousand TPS, enabling high-speed inference.
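To make the prefill/decode distinction concrete, here is a toy autoregressive loop; it is pure illustration, with nothing specific to Trainium or the WSE-3:

```python
import numpy as np

# Toy model: a single weight matrix stands in for "all the model weights".
d = 512
W = np.random.randn(d, d).astype(np.float32)

def prefill(prompt_tokens):
    """Process the whole prompt in one batched pass: one large matmul,
    so throughput is limited mainly by compute (FLOPs)."""
    X = np.random.randn(len(prompt_tokens), d).astype(np.float32)
    return X @ W                          # (prompt_len, d) in a single pass

def decode(state, n_new_tokens):
    """Generate tokens one at a time: every step touches the full weight
    matrix for a single token, so throughput is limited mainly by how
    fast weights can be streamed from memory."""
    x = state[-1]
    out = []
    for _ in range(n_new_tokens):
        x = x @ W                         # one token's worth of compute,
        out.append(int(np.argmax(x)))     # but a full weight read per step
    return out

state = prefill(list(range(128)))         # compute-bound phase
print(decode(state, 5))                   # memory-bandwidth-bound phase
```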

OpenAI partnered with Cerebras around mid-January, and its appearance on AWS roughly two months later suggests that the “era of Cerebras” may finally be arriving.


MARCH 14, 2026

There are reports that ByteDance, the Chinese parent company of TikTok, will gain access to an AI cluster using NVIDIA’s Blackwell GPUs. ByteDance is said to have partnered with Aolani Cloud, a cloud provider in Malaysia. The cluster reportedly consists of about 36,000 B200 GPUs, likely configured as around 500 GB200 racks.

Export restrictions have limited imports of H200 GPUs into mainland China, so securing access to NVIDIA GPUs via Malaysia — where such restrictions may not apply — appears to be the strategy.

Turning to AMD and semiconductor manufacturing, there were reports that AMD has settled a lawsuit with Adeia. Adeia is said to hold a portfolio of patents related to hybrid bonding technology and had alleged that AMD infringed seven to ten patents. Hybrid bonding is a relatively recent technology and has been widely used in flash memory (SSDs).

In 2024, AMD’s 3D V-Cache was alleged to infringe certain patents, leading to litigation beginning around last autumn. The settlement has now been reached. Although this concerns manufacturing technology, it is somewhat curious that AMD — rather than TSMC, the manufacturer — was the party involved in the lawsuit. Details have not been disclosed, but AMD may have needed to secure licensing rights.

Before hybrid bonding, stacking chips required pads and micro-bumps between dies. Pad diameters and micro-bump diameters were typically around 30 µm, with similar spacing required. With the advent of face-to-face hybrid bonding, direct Cu-to-Cu connections became possible, eliminating the need for pads and bumps and enabling connection pitches as tight as 9 µm, as seen in technologies such as Intel’s Foveros Direct.
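A quick sense of what that pitch change buys, using the figures above (a 30 µm micro-bump pitch versus a 9 µm hybrid-bonding pitch):

```python
# Connections per unit area scale with 1/pitch².
bump_pitch_um   = 30.0    # pads/micro-bumps, per the figures above
hybrid_pitch_um = 9.0     # Cu-to-Cu hybrid bonding

per_mm2_bump   = (1000 / bump_pitch_um) ** 2      # ~1,100 per mm²
per_mm2_hybrid = (1000 / hybrid_pitch_um) ** 2    # ~12,300 per mm²

print(f"{per_mm2_bump:.0f} vs {per_mm2_hybrid:.0f} connections per mm² "
      f"(~{per_mm2_hybrid / per_mm2_bump:.0f}x denser)")
```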

Previously, fillers were required to bridge the height gaps created by bumps, and thermal conductivity considerations were also significant. These requirements have largely disappeared. Hybrid bonding appears to be an essential technology for 3D integration.


MARCH 15, 2026

It seems that Elon Musk may be entering semiconductor manufacturing. Reports indicate that the “Terafab Project” will be launched on March 21. The term tera is one thousand times larger than giga, and gigafab has often been used to refer to TSMC’s 12-inch wafer fabrication plants. Perhaps this reflects a competitive stance toward TSMC.

According to Musk, it may not be necessary to maintain cleanliness across an entire room; instead, isolating a clean environment only around the wafer might suffice. This suggests maintaining cleanliness within wafer transport boxes and process chambers. Perhaps the docking of these boxes and chambers is conceptually similar to the docking of Starship with a space station — securely connecting and disconnecting sealed clean zones.

However, in semiconductor equipment, maintenance work would still seem to require cleanroom environments. I do not know the detailed feasibility of this approach.

AMD has also announced the OpenClaw framework. This appears to be a concept for running generative AI locally on PCs. It may be seen as a response to Intel’s AI PC vision, although current AI PC implementations — such as Copilot+ Windows 11 — still assume network connectivity to data centers. An “agent PC” that operates locally could represent a more forward-looking idea.

The proposed hardware configurations include RyzenClaw and RadeonClaw. The OpenClaw stack consists of WSL2 (Windows Subsystem for Linux) and LM Studio, allowing users to run generative AI models of their choice. RyzenClaw is said to feature 128 GB of system memory, while RadeonClaw includes 32 GB of VRAM — suggesting a relatively high-end price point.

Looking at the OpenClaw configuration, the idea may be to perform reinforcement learning on WSL2 while handling inference through LM Studio, thereby constructing a personal AI agent entirely in a local environment. I also have LM Studio installed on my PC, but once model sizes exceed 8B parameters and reach 12B or 20B, even 128 GB of memory begins to feel constrained.

If one wants to perform reinforcement learning, an iGPU is likely insufficient, and a discrete GPU with around 16 GB of VRAM — along with WSL2 — would be required. (So I have not yet attempted reinforcement learning myself.) Until the practical benefits become clearer, the hardware requirements may feel somewhat excessive.
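As a rough sanity check on why larger models feel constrained, the usual back-of-the-envelope estimate for weight memory looks like this; the bytes-per-parameter values are approximate rules of thumb, and KV cache and runtime overhead come on top:

```python
# Approximate memory needed just for model weights, ignoring KV cache
# and runtime overhead. Bytes-per-parameter values are rough rules of
# thumb for common formats, not exact figures for any specific model.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "q4 (4-bit)": 0.55}

for params_b in (8, 12, 20):                     # model size in billions
    line = [f"{params_b:>2}B:"]
    for fmt, bpp in BYTES_PER_PARAM.items():
        gb = params_b * 1e9 * bpp / 2**30
        line.append(f"{fmt} ~ {gb:,.1f} GB")
    print("  ".join(line))
# e.g. a 20B model is roughly 37 GB in fp16 but about 10 GB at 4-bit,
# which is why quantization matters so much for local inference.
```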


This blog text was translated by AI from the Japanese source blog.

Saturday, February 28, 2026

FEB.(2nd Half) 2026

 February 15, 2026

  1. Speaking of which, due to soaring DRAM prices, there are concerns that computer-related manufacturers may struggle with earnings this year. I also saw an article saying that even boxed CPUs are no longer selling well on Amazon’s online store. It seems that DIY PC users are paying the price for data centers absorbing massive amounts of memory. GPU cards have finally started to return in terms of product variety, but minimum prices have risen.

  2. A little while ago, I mentioned in a “mini update” that installing the January Windows Update — “2026-01 Security Patch (KB5074109) (26200.7623)” — could cause frame-rate drops on NVIDIA graphics cards. Apparently, applying KB5077181, which will be distributed in the February Windows Update, resolves the issue. It also includes fixes for other bugs, including—surprisingly—a “Notepad vulnerability.” I’ll just wait for it to roll out automatically.

  3. The day before yesterday, I wrote that AWS’s CEO commented that the market and industry are overreacting to AI writing code. Supporting that view, there was an article noting that about ten years ago, when AWS captured a large share of the software market, similar “doomsday” narratives emerged—but in the end, no software companies went bankrupt because of it. Thinking back, there were many transitions: freeware during the PC-communication era, GPL (now generally called OSS) in the UNIX world, Linux being released for free, and later the rise of SaaS on the web. Despite all these shifts, the software industry has not disappeared.


February 16, 2026 

  1. By the way, the India AI Impact Summit is being held this week. According to an interview with Intel’s Vice President for the India region, India generates 20% of the world’s data but owns only 2% of global server capacity. In terms of data center power capacity, the figures are said to be: the United States at 53 GW, China at 20 GW, Europe at 13 GW, and India at 1.6 GW (with an additional 1.7 GW planned by 2027). There is clearly significant room for growth.

  2. There was also a press release stating that AMD will deliver Helios systems to India. The scale is said to be 200 MW. Since one Helios rack reportedly consumes about 0.18 MW, even rounding to 0.2 MW suggests roughly 1,000 racks will be deployed. With 72 GPUs per rack, that would amount to 72,000 GPUs across 1,000 racks. These will be introduced as an AI platform by HyperVault AI Data Center Limited, a subsidiary of Tata Consultancy Services (TCS). I recall reading an article around mid-last month about AMD forming a partnership with TCS.

    Another AMD-related item: SoftBank issued a press release stating that it has jointly developed with AMD an orchestrator for partitioned use of AMD GPUs. AMD Instinct GPUs already support partitioning features. A demonstration is planned at MWC in Barcelona next month. In a similar vein, I’ve heard of Fujitsu’s AI Computing Broker, which I believe provides a mechanism to partition NVIDIA GPUs. NVIDIA offers features such as MIG (Multi-Instance GPU) and vGPU. While partitioning can help address GPU shortages, it also seems like a necessary mechanism to increase utilization of GPU servers and accelerate return on investment.


February 17, 2026

  1. At the India AI Impact Summit 2026, which began yesterday, Qualcomm reportedly exhibited a humanoid robot. Apparently, Qualcomm has a robotics and automotive division and is building a general-purpose robotics architecture covering not only humanoids but also everything from home-use robots to industrial autonomous mobile robots. I don’t recall seeing much coverage of Qualcomm beyond smartphones, PCs, and communications chips.

    Another Qualcomm-related topic: because Qualcomm holds essential patents related to the 5G communications standard, smartphone manufacturers must obtain patent licenses from Qualcomm. Naturally, these are not free, and the license fees are added to smartphone prices as royalties. In the UK, there had been a class-action lawsuit alleging that Qualcomm unfairly charged royalties and raised smartphone prices. However, since the royalties are not unreasonable per se, the lawsuit has now reportedly been withdrawn. The royalty for a 5G device is said to be around $16.

  2. Speaking of smartphones, there are reports that Arm, which provides CPU architectures, plans to establish an AI semiconductor research facility at its Austin, Texas site, with support from the Texas state government. Until recently, Arm’s business centered on selling Neoverse N/V architectures and related IP, but it has shifted toward selling CSS (Compute Subsystems). Last year, I believe Arm also announced that it would develop its own chips. Given that Cortex and other designs have long been manufactured at TSMC and Samsung, producing chips in-house should be feasible. As semiconductor manufacturing moves away from being smartphone-centric and toward HPC and AI as the main drivers, it will be interesting to see what kinds of chips Arm brings to market.

  3. On the AI front, NVIDIA issued press releases regarding GB200 and GB300. Compared to Hopper-based platforms, the cost per one million tokens is reportedly reduced to one-tenth with GB200 NVL72 and to one-thirty-fifth with GB300 NVL72. Furthermore, GB300 NVL72 is said to achieve up to 50× throughput per megawatt. GB300 NVL72 is being adopted by Microsoft, CoreWeave, and OCI (Oracle). CoreWeave had already announced at the end of last month that it would adopt Vera Rubin. With Vera Rubin, compared to Blackwell, the cost per million tokens for MoE (Mixture of Experts) inference is said to be one-tenth, and throughput per megawatt is 10× higher.

    Another AI-related development: Fujitsu announced an “AI-driven software development platform.” It is said to address three challenges: labor shortages, replacement of maintenance work, and moving away from the traditional man-month calculation model. Understanding the Japanese role of SE (system engineer) may be helpful here. By AI-automating practical tasks such as system construction, coding, and maintenance in response to customer IT department requirements, the expectation is that SEs—previously overwhelmed with routine work—will be able to spend more time in in-depth discussions with customers. The core AI is reportedly Takane, a Japanese-language LLM developed jointly by Fujitsu and Cohere. As I wrote a few days ago, it seems to be Fujitsu’s proprietary LLM created in collaboration with Cohere. Customer-created specifications often contain local terminology and implicit assumptions; from the description, it appears that the system is designed to handle such nuances effectively in Japanese. This may represent one concrete way AI is entering real-world practice, and it is worth watching how it develops.


February 18, 2026

By the way, it appears that NVIDIA has sold all of its shares in Arm. If I recall correctly, the discussion about NVIDIA acquiring Arm dates back to around 2020. That acquisition ultimately failed, but since Arm was relisted on NASDAQ in 2023, this complete divestment means that Arm’s future stock price movements will no longer have any direct impact on NVIDIA.

I don’t think this means that Arm-architecture CPUs such as Grace or Vera will disappear, but given NVIDIA’s investment relationship with Intel, how things unfold from here is something to watch closely.

Another NVIDIA-related item: it seems NVIDIA has signed a GPU supply agreement with Meta. The deliveries will start with the current GB300, followed by Vera Rubin. Meta plans to deploy these in its multi-gigawatt AI factory “Prometheus” in 2026, and later in the 5-gigawatt AI factory “Hyperion” planned a few years down the road. In terms of GPU count, this would amount to several million units—almost impossible to visualize. Jensen Huang’s efforts at the end of January to coordinate Taiwan-wide production of GPU racks may well have been in preparation for this.


February 20, 2026  Time of Arm Server (topic.1)

  1. Speaking of which, the partnership between NVIDIA and Meta that was reported the day before yesterday appears to be having a broader impact. With NVIDIA making a major move into the data center space, there are concerns about the implications for Intel and AMD. Meta is already reportedly NVIDIA’s second-largest customer by volume, and beyond adopting GB300 and Vera Rubin for AI workloads, Meta is also introducing Arm CPUs for its data centers—raising concerns about the impact on x86. NVIDIA’s move has multiple dimensions.

    I’d like to go into a bit more detail, since I took a break from blogging yesterday.

    One aspect is NVIDIA’s positioning as a data center supplier. NVIDIA has been calling this an “AI factory,” but in reality it looks very much like a full-scale mega data center. What started as DGX SuperPODs for AI training has expanded to include inference, driving down per-token costs to promote wider adoption. By supplying systems at the rack level and even offering standalone Arm CPUs, NVIDIA has effectively become a supplier of mega-scale AI data centers.

    Looking at CPUs, Grace has 72 cores and Vera has 88 cores, both Arm CPUs. In the Arm world, AWS’s Graviton5 has 196 cores, but Vera supports two threads per core, yielding 176 threads from its 88 cores. On the x86 side, Xeon 6E offers 144 cores per socket, with two sockets totaling 288 cores. In a superchip configuration, Vera reportedly mounts two superchips per tray; counting that as a dual-socket setup, it amounts to 352 threads. AMD’s Turin (Zen 5c) reaches 768 threads in a dual-socket setup, so there’s always something larger to compare against—but Vera appears competitive as a data center CPU for the current generation.
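
    As a quick tally of the figures quoted above (a minimal sketch in Python; the core, thread, and socket counts are simply the ones cited in this entry, not official specifications):

        # Back-of-envelope thread counts for the CPUs quoted above.
        # Figures are the ones cited in this entry and may differ from official specs.
        cpus = {
            "NVIDIA Vera":        {"cores": 88,  "smt": 2, "sockets": 2},  # two superchips per tray
            "AWS Graviton5":      {"cores": 196, "smt": 1, "sockets": 1},
            "Intel Xeon 6E":      {"cores": 144, "smt": 1, "sockets": 2},  # E-cores, no SMT
            "AMD Turin (Zen 5c)": {"cores": 192, "smt": 2, "sockets": 2},
        }
        for name, c in cpus.items():
            threads = c["cores"] * c["smt"] * c["sockets"]
            print(f"{name:20}: {threads} threads across {c['sockets']} socket(s)")
        # Vera: 88 x 2 x 2 = 352, Turin: 192 x 2 x 2 = 768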

    As for GPUs, in supercomputers, GPUs already outnumber CPUs. When IBM and NVIDIA built ORNL’s Summit in 2018, the ratio was two IBM POWER9 CPUs to six Volta GPUs. This trend continues not only with GB300 and Vera Rubin but also with AMD’s Helios, where GPU counts exceed CPU counts. Seen this way, AI data centers are becoming structurally similar to supercomputers.

    Perhaps the most significant point is the server ecosystem shift. Fujitsu demonstrated that Arm CPUs could be used in supercomputers by integrating SVE into the A64FX used in Fugaku, which led to SVE2 being included starting with the Armv9 architecture. This boosted Arm’s presence as a server CPU beyond its traditional role in smartphones. While Arm-based PCs such as Apple’s M series and Qualcomm’s Elite have gained visibility, Arm servers using Fujitsu’s A64FX or Ampere Computing’s Altra Max and AmpereOne never achieved major market presence. NVIDIA’s move this time likely means that a vendor—NVIDIA—is supplying Arm CPUs in large volumes specifically for AI data centers. This is clearly different from merely selling architecture licenses.

    Meta (formerly Facebook) founded the Open Compute Project (OCP), a standardization body for data center hardware, together with Google and Microsoft. HPE and Supermicro are suppliers of OCP-compliant servers. In October last year, Arm, NVIDIA, and AMD all joined OCP as board members. AMD’s Helios rack is also believed to be OCP-compliant. In this context, Meta’s purchase of Arm CPUs from NVIDIA is likely intended to build OCP-based servers.

    That concludes the discussion on NVIDIA and Meta.

  2. Below are several investment-related topics. The fate of NVIDIA’s previously rumored $100 billion investment in OpenAI had become unclear, but it now appears that NVIDIA is participating in OpenAI’s funding round with a $30 billion investment. This is reportedly unrelated to the earlier $100 billion figure. In South Korea, there is also a plan to introduce 260,000 NVIDIA GPUs by 2030, following 13,000 units last year and an additional 15,000 units this year.

    With the India AI Impact Summit underway, NVIDIA has also announced investments in India. Data centers totaling 70 MW are reportedly under construction in Mumbai and Chennai, with NVIDIA systems expected to be deployed there. A few days ago, AMD announced a partnership with Tata Consultancy Services (TCS), and now OpenAI is also said to have partnered with TCS, starting with a 100 MW AI data center and eventually expanding to 1 GW. Beyond investments, Qualcomm—which exhibited a humanoid robot—has reportedly partnered with Tata Electronics on automotive semiconductor manufacturing, with an OSAT facility located in Assam.

    While not an investment or partnership, there was also an article about Fujitsu CEO Takahito Tokita giving a speech. He reportedly stated that excessive reliance on AI for specialized skills is risky, and that AI should instead support richer ideas and creativity by facilitating communication between experts and non-experts, including general employees and customers. With this philosophy at its core, industries can grow further and create more jobs. NVIDIA’s Jensen Huang and AWS’s Matt Garman appear to be expressing similar views. It increasingly feels like a picture is emerging in which AI creates more work, rather than eliminating it.

    There was also an article stating that AMD has become a guarantor for financing raised by Crusoe, a data center operator in Ohio. The loan is reportedly from Goldman Sachs, with AMD’s chips themselves used as collateral. Neither AMD nor Goldman Sachs has confirmed this.

    Google is reportedly offering a $100 million investment to Fluidstack, a cloud computing startup, likely in search of deployment opportunities for its TPUs. While Anthropic has adopted TPUs, NVIDIA still dominates overall AI compute power, and Google appears eager to expand its influence. Perhaps we are moving from an era of “build and sell” to one of “invest and get others to use.”

  3. Finally, we get to chips. There are reports that Ryzen CPUs based on AMD’s Zen 6 architecture may be delayed until 2027. Zen 6 is expected to use TSMC’s 2 nm process. Internal testing reportedly achieved 6.5 GHz operation, suggesting that 2 nm prototype chips already exist. Zen 6–based EPYC Venice is slated for use in Helios and already has manufacturing and shipping plans, so AMD is likely prioritizing that. On the consumer side, there are also concerns about severe memory shortages, making it unlikely that conditions will be in place for a Zen 6 Ryzen debut within 2026.

    Another AMD-related note: the Radeon RX 9060 XT reportedly reached 4.769 GHz under enhanced cooling conditions, apparently setting a world record for GPU overclocking.

  4. A brief software-related item: there was an article noting that Linux kernel 7.0 will remove the driver for Intel’s 440BX chipset. On a personal note, I believe the ASUS motherboard in the PC I built in 2000 used the 440BX. That brings back memories. The CPU was a Pentium III (Coppermine, 0.18 µm), and despite having no real software to use it with, I built it as a dual-CPU system. Even now, I’m not entirely sure what I wanted to do with it.

    Open-PGL, hosted on GitHub and rumored to have been discontinued by Intel, is reportedly being archived by the Academy Software Foundation (ASWF). ASWF is one of the projects hosted by the Linux Foundation.


February 21, 2026  Human & AI Error (topic.2)

  1. There are also rumors that Intel’s Nova Lake-S will launch in 2027. It had been expected in late 2026, but appears to slip into the following year. I just read an article yesterday suggesting that AMD’s Zen 6–based Ryzen would also arrive in 2027, and it seems Intel may be on a similar timeline. In AMD’s case, the delay is likely due to prioritizing Zen 6 EPYC, but for Nova Lake, it may be that Intel’s 18A production lines are fully booked. They are probably being used for Xeon 6+E Clearwater Forest or Xeon 7 Diamond Rapids, both aimed at data center or HPC markets. It does seem that the consumer market will face a tough environment this year.

    Another Intel-related topic: there have reportedly been layoffs of around 6,000 people at Intel’s Oregon sites, raising some concerns about corporate stability. Among researchers, Oregon (OR) is well known as Intel’s U.S. R&D base—many Intel research papers list “Intel Corp., Hillsboro, OR.” Seeing layoffs even at such a core R&D site is rather disheartening.

  2. Over the past couple of days, many articles have appeared claiming that AWS’s coding AI caused service outages. AWS’s code-generation AI tool, called Kiro, apparently also performs operational tasks. The outages reportedly occurred twice last December, but I don’t recall hearing about them at the time—apparently because they occurred in China. One outage lasted 13 hours, while the other was not visible to customers. The information seems to have come from an internal AWS leak, while the official explanation cites human error. From what I’ve read, Kiro chose to delete and rebuild systems and executed this without human intervention. The users operating Kiro had overly broad privileges, leading to widespread impact. Whether this was Kiro’s fault or the users’ fault is debatable, but there are a few troubling points: execution without human involvement, and referring to humans with broad administrative authority simply as “users.” These “users” may be Kiro users, but within AWS they are likely employees with operational responsibility.

    Executing actions without human oversight should simply not be done, precisely to prevent irreversible accidents. This is not because it “blurs responsibility,” but because automation without proper fail-safe and foolproof mechanisms has long been recognized as dangerous—ever since the days of Thomas Edison. This lesson is written in blood in the history books. AI may be intelligent, but it does not bleed, so caution is essential. AWS also announced layoffs of 16,000 employees last month, with internal emails reportedly sent before the public announcement. The fact that such internal stories are leaking may be related to the wave of workforce reductions.

    One more AWS-related item, finally a technical one: AWS has released the Hpc8a instance. Compared to the previous Hpc7a generation, memory bandwidth is reportedly improved by 42%. The CPU is listed as a 192-core AMD Zen 5–based EPYC, which likely means Zen 5c (Turin), manufactured on TSMC’s 3 nm process. Both Zen 5 and Zen 5c support two threads per core, but SMT is disabled here, resulting in 192 threads. In HPC workloads, where data is processed sequentially and synchronously, single-thread performance is critical and SMT is often less valued. In data centers, concerns such as noisy neighbors and security also make SMT less desirable. While SMT is effective for increasing throughput with fewer cores, it may no longer be essential in the era of many-core CPUs with strong single-thread performance.

  3. Finally, a political topic. A U.S. court has ruled that the Trump-era tariffs are illegal. It remains to be seen how this will affect TSMC’s massive planned investments in the United States, but the ruling does not address whether already-paid tariffs will be refunded or how such refunds would be handled. There also appear to be multiple alternative legal bases for tariffs, so the situation may not change much. Moreover, if a presidential order is issued and approved by Congress, it would become law, which seems quite possible. There is little point in reacting emotionally either way. At a press conference, President Trump reportedly suggested that Taiwan had “stolen” semiconductor technology. In the DRAM era of the 1980s, Japan would almost certainly have been named instead. I can’t help but feel a bit sorry for Taiwan.


February 22, 2026

  1. There was an article stating that NVIDIA has partnered with Idaho National Laboratory (INL) in Idaho on the development of nuclear AI applications. NVIDIA’s technology is expected to be applied to the design, manufacturing, construction, and operation of reactors developed at INL. This appears to be part of the Genesis project and a component of U.S. energy policy. Since digital twin technology is mentioned, NVIDIA Omniverse will likely be used. One wonders whether AI training models for nuclear reactions inside reactors might eventually be developed and even released as open source—though concerns about nuclear weapons applications make that unlikely.

    Another NVIDIA-related note: for several years now, there have been occasional reports of GPU power connectors melting or burning. Possibly in response, there was an article noting that some Dell PCs use screw-fastened connectors for GPU auxiliary power. This likely helps prevent fluctuations in contact resistance caused by cable movement. High-end desktops can require power supplies exceeding 1,000 W. When large currents flow, even small increases in contact resistance can cause significant voltage drops and substantial heat generation via Joule heating. For example, at 400 W and 12 V, about 33 A flows; if contact resistance increases by just 10 mΩ due to heating, voltage drops by 0.33 V. If sensors detect this and raise voltage to compensate, current increases further, causing even greater voltage drop—leading to a runaway process that can result in burning. The trigger is variation in resistance.
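
    To make that feedback loop concrete, here is a minimal sketch (the numbers are purely illustrative and the thermal coefficient is an assumption, not a model of any real connector): a constant-power load behind a resistive contact, where the heat dissipated at the contact in turn raises its resistance.

        # Illustrative runaway loop: constant-power load behind a heating contact.
        # All coefficients are made up for illustration, not measured values.
        P_LOAD = 400.0     # W drawn by the GPU
        V_SUPPLY = 12.0    # V on the PSU side
        R0 = 0.010         # ohm, initial contact resistance (10 mOhm)
        K_THERMAL = 0.002  # ohm of added resistance per watt dissipated at the contact (assumed)

        r = R0
        for step in range(6):
            # Constant power at the load: solve I * (V_SUPPLY - I*r) = P_LOAD (smaller root).
            disc = V_SUPPLY**2 - 4 * r * P_LOAD
            if disc < 0:
                print(f"step {step}: no steady state -> thermal runaway")
                break
            i = (V_SUPPLY - disc**0.5) / (2 * r)
            p_contact = i**2 * r
            print(f"step {step}: R={r*1000:5.1f} mOhm  I={i:5.1f} A  contact loss={p_contact:5.1f} W")
            r = R0 + K_THERMAL * p_contact  # heating raises contact resistance

    In this toy loop the operating point drifts upward from the roughly 33 A starting current until, after a couple of iterations, no steady state exists at all, which is exactly the runaway described above.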

  2. There was also an article about a Toronto-based startup called Taalas developing an AI chip named HC1, manufactured on TSMC’s 6 nm process. It hardwires the generative AI model Llama 3.1 8B and reportedly achieves 17,000 tokens per second (TPS)—a figure that made me do a double take. Just last week I wrote that Llama 3.1 70B achieved 2,100 TPS on Cerebras; although the parameter counts differ (8B vs. 70B), this is an order of magnitude higher. According to the company, HC1 is two orders of magnitude faster than Blackwell, with one-twentieth the token cost, and can be air-cooled. While it’s common knowledge that specialized hardware can deliver higher performance and lower power consumption, the hardwired nature means it cannot support other LLMs. I also wonder whether reinforcement learning is possible on this chip alone. 

    The company says that changing the LLM can be handled by modifying wiring layers, but this reportedly takes two months. The number of layers modified and their position in the stack matter: the finest masks are those for the transistors and the metal layers immediately above them, while upper layers follow “reverse scaling,” with wider pitch and lower mask costs. However, wider pitch reduces routing density, potentially requiring more layers and higher mask costs, and there are limits to how many layers can be added. The chip size is reportedly 815 mm²—about 28 mm × 29 mm—very close to the practical manufacturing limit. Moving to 3 nm would reduce size somewhat, but wafer costs would then come into play. Still, as an exploration of how much faster LLMs can be when hardwired, this is a very interesting approach. A hybrid system—hardwiring a core LLM and supplementing reinforcement learning with GPUs—could be compelling in terms of both performance and power efficiency.

  3. SoftBank Group CEO Masayoshi Son has reportedly proposed the “Arizona AI Mega Project,” a plan to build an AI and robotics industrial park in Arizona. The R&D hub would include semiconductor manufacturing units, housing for engineers, and a smart grid—effectively a technology development city. If robotics is included, this could serve as a hub for physical AI. SoftBank participated in the Stargate Project last year and invested in OpenAI, and through its ownership of Arm, it has decided to develop processors in-house, acquiring Ampere Computing. This may indicate that SoftBank is beginning to solidify a concrete exit strategy.


February 23, 2026

  1. By the way, it appears that Intel may be designing a unified core that integrates its P-cores and E-cores. More precisely, Intel is reportedly planning to hire project members for a unified-core initiative, and it is not entirely clear whether this means a full integration of P and E cores.

    In the past, there was a project called “Royal Core,” reportedly conceived by Jim Keller (with a CPU codename that may have been “Beast Lake”), and that design was said not to have separated P-cores and E-cores.

    In reality, the decision to split P and E cores in the 12th-generation Core i series (Alder Lake) may have been a strategy to reduce power consumption while remaining on the 10 nm node, by bringing in Atom-based single-thread cores. Alder Lake’s Intel 7 process was essentially an enhanced version of Intel’s 10 nm SuperFin technology. Since the minimum feature size remained the same, packing in more circuitry would naturally increase power consumption, which may have led Intel to adopt separate P-core and E-core architectures.

    As I wrote the other day, once single-thread performance improves and core counts increase, SMT may become less important. Starting with the second generation of Core Ultra (Lunar Lake / Arrow Lake, manufactured on TSMC 3 nm), the Lion Cove P-core has already moved to single-thread operation. Assuming process scaling continues smoothly, it may indeed become feasible to unify P-cores and E-cores again.

    Another Intel-related item: there are reports about Intel’s 3000 W power supply units for servers and data centers. With a 12 V output, that corresponds to 250 A. These units are water-cooled rather than air-cooled. GaN and SiC are used as power devices, and the units have obtained 80 PLUS Platinum certification. In data centers, cooling water lines would typically be integrated into the infrastructure, but for standalone servers, would they attach an AIO (All-in-One) radiator unit? It is an interesting detail to consider.

  2. On a slightly different note, some quantum computing news: a contract for an IBM quantum computer to be delivered to the Campania region of Italy has reportedly been suspended. The issue does not appear to lie with IBM’s proposal itself, but rather with questions raised about the bidding process. There is a project underway to establish a “Quantum Valley” in southern Italy, and this suspension may delay its progress.

February 24, 2026

  1. It seems AMD has also entered into a GPU supply partnership with Meta. Just last week, Meta finalized a GPU supply agreement with NVIDIA, and now it appears to be AMD’s turn. Meta is reportedly planning to install 6 GW worth of Helios rack systems equipped with AMD’s Instinct MI400-series GPUs. There is also information suggesting that these GPUs will be tuned specifically for Meta, raising speculation that they may be customized versions of the MI400.

    The CPU is EPYC, based on Zen 6 Venice, with Zen 7 Verano also under consideration. Assuming one Helios rack consumes about 0.2 MW, 6 GW would correspond to 30,000 racks. With 72 GPUs per rack, that comes out to a simple calculation of roughly 2.16 million GPUs. At that volume, producing customized GPUs would still make economic sense. Naturally, this cannot be completed in a single year, so the deal appears to be a multi-year contract.
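
    The same back-of-envelope arithmetic as a sketch (0.2 MW per rack is the assumption used above, not a confirmed figure):

        # Rough GPU count implied by a 6 GW Helios deployment.
        TOTAL_POWER_GW = 6.0
        RACK_POWER_MW = 0.2   # assumed per-rack power
        GPUS_PER_RACK = 72

        racks = TOTAL_POWER_GW * 1000 / RACK_POWER_MW   # 30,000 racks
        gpus = racks * GPUS_PER_RACK                     # ~2.16 million GPUs
        print(f"{racks:,.0f} racks -> {gpus:,.0f} GPUs")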

    As with the OpenAI deal, this agreement also includes warrants for 160 million shares of AMD common stock. The mechanism seems to be that as Helios deployments progress, Meta receives AMD shares accordingly. Issuing all 160 million shares at once would dilute AMD’s stock, so the allocation is apparently staged. There are also conditions tied not only to GPU volume but to AMD’s stock price at each stage. As Meta purchases GPUs and AMD’s performance improves, corresponding shares are allocated. During the OpenAI deal, this structure was criticized as potentially circular trading. AMD’s aim is presumably to catch up to NVIDIA in terms of market capitalization—though at present AMD’s market cap is still less than 10% of NVIDIA’s.  

  2. Several more NVIDIA-related topics. There are renewed reports suggesting that NVIDIA may be preparing to enter the notebook PC market, with announcements of N1/N1X possibly imminent. The N1/N1X CPU+GPU itself is already on the market as GB10. Despite the name, the CPU appears to be a hybrid of Arm Cortex-X925 and Cortex-A725, rather than Grace. The C1 used in Grace was likely a successor to the X925. The GPU uses the Blackwell architecture, though its performance is presumably scaled down.

    NVIDIA’s target seems to be the market for CPUs with integrated GPUs (iGPUs). Rumors of a launch in the first half of 2026 have not yet disappeared, although there was no announcement at CES in January. Whether anything will be announced at the March GTC is still unclear. In any case, the consumer market environment this year looks unfavorable.

    Another NVIDIA item: the company has partnered with Singtel to promote AI adoption in Singapore. Singtel is the country’s major integrated telecom operator. Together, they have established an applied AI Center of Excellence for both public and private sectors. At last week’s India AI Impact Summit, although CEO Jensen Huang was absent, NVIDIA demonstrated efforts such as building a digital twin of Tata Motors’ automotive factory using the Omniverse platform and launching initiatives for robot learning with logistics companies. NVIDIA appears to be actively pushing physical AI in India.

  3. Next, an NPU-related topic. Intel reportedly considered acquiring SambaNova but has since abandoned the plan. However, Intel did participate in part of SambaNova’s funding, so the relationship has settled into an investment partnership. The two companies will also engage in technical collaboration over multiple years. They are expected to build systems combining SambaNova’s SN50 chip with Xeon processors. Intel’s consumer CPUs already include NPUs derived from its acquisition of Movidius, but SambaNova’s technology seems likely to be paired primarily with Xeon. 

  4. Finally, a software-related topic. Anthropic reportedly published a blog claiming that its Claude Code can rapidly rewrite COBOL code, traditionally used on mainframes. Following this, IBM’s stock price reportedly dropped by around 13%. Rewriting COBOL has been discussed for decades, and IBM itself has long pursued initiatives to migrate COBOL systems to Java. There have been many such efforts in the past, but I don’t recall stock prices being affected before.   
This time, Anthropic’s blog seems to have had an impact because the software industry itself has become highly sensitive to AI coding developments. It may not be entirely realistic, but if AI can just as easily rewrite code into COBOL as out of it, one could imagine a future less constrained by programming languages. Whether that is truly the case remains an open question. 

February 25, 2026

  1. By the way, Apple reportedly plans to manufacture the Mac mini in the United States. The company will open an Advanced Manufacturing Center in Houston, Texas, intended as a domestic production base for AI servers. “Sovereignty” has become a key buzzword recently. This facility is likely envisioned as a massive complex where manufacturing lines, data centers, Apple Intelligence, and human training all coexist.

    Further west, Arizona is home to TSMC’s Fab 21, which is expected to supply the semiconductors for the Mac mini. There are also reports that Apple has signed a contract for the supply of 100 million chips. Within Texas itself, Austin hosts key companies such as Arm, TI, and Broadcom.

  2. Speaking of Arm, its recent quarterly results were strong, with particularly strong royalty revenue from the data center segment. AWS released Graviton 5 in December, and instances have since become available. It seems likely that Arm began charging AWS royalties for these chips around that time. If royalties are charged per core, Graviton 5’s 196 cores would represent a substantial amount per chip. NVIDIA’s Vera, by comparison, has two threads per core, with 88 cores and 176 threads. Whether royalties are based strictly on core count is an interesting question.

    Another Arm-related topic: Arm has partnered with the Indonesian government on engineer education, aiming to train 15,000 chip design engineers. There was recently news that Qualcomm completed the design of a 2 nm Snapdragon chip in India, and it appears that chip design engineers are in short supply globally. Designing chips requires IP macros such as basic logic gates, arithmetic units like adders and multipliers, on-chip memories like caches, and controllers for DDR and PCIe interfaces. Shortages of engineers capable of designing such IP have been discussed for several years. By training engineers familiar with the Arm architecture and Arm’s IP, this initiative aims both to alleviate talent shortages and to generate foreign revenue.

  3. ASML in the Netherlands has reportedly succeeded in boosting the power of EUV lithography light sources, increasing output from the current 600 W to 1,000 W. Market introduction is targeted around 2030, with 1,500 W and even 2,000 W also in view. It seems the industry is entering the kilowatt era. The impact on semiconductor manufacturing would be significant: wafer throughput is expected to rise from 220 wafers per hour to 330, a 50% increase.
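
    Putting the quoted numbers side by side (a small sketch; the closing comment about overheads is my own assumption, not something stated by ASML):

        # Ratios implied by the quoted EUV figures.
        power_now, power_next = 600, 1000   # W, light source output
        wph_now, wph_next = 220, 330        # wafers per hour

        print(f"source power: x{power_next / power_now:.2f}")  # ~1.67x
        print(f"throughput  : x{wph_next / wph_now:.2f}")      # 1.50x
        # Throughput gains lag the raw power gain, presumably because stage moves,
        # wafer exchange and other overheads do not shrink with a brighter source.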

The key question is where these systems will be deployed. Likely candidates include TSMC, Intel, Samsung (and GlobalFoundries), SK Hynix, and Japan’s Rapidus. EUV tools are now being installed in an increasing number of regions: Taiwan (TSMC), the U.S. (Arizona for TSMC; Arizona, Oregon, and Ohio for Intel; Texas for Samsung; New York for IBM and others), Ireland (Intel), Korea (Samsung and SK Hynix), Hokkaido (Rapidus), and eventually Kumamoto for TSMC. Strengthened light sources imply applicability to advanced nodes, making Taiwan and Ohio particularly realistic destinations—though this is, of course, speculation.


February 26, 2026

  1. By the way, today (February 25 in the U.S.) NVIDIA announced its earnings. Both fourth-quarter and full-year results were released. Full-year revenue reportedly reached $215.9 billion, which translates to approximately ¥32.4 trillion at an exchange rate of ¥150 to the US dollar. This represents 65% year-on-year growth. Fourth-quarter revenue alone was $68.1 billion, up 20% quarter-on-quarter and 70% year-on-year, exceeding the annual growth rate. This suggests particularly strong growth in the latter half of the fiscal year. With 91% of revenue coming from data centers, these numbers underscore NVIDIA’s central role in the recent AI semiconductor boom.
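
    A quick sanity check on those figures (a minimal sketch using only the numbers quoted above):

        # Sanity check on the quoted NVIDIA earnings figures.
        fy_revenue_busd = 215.9   # full-year revenue, billions of USD
        q4_revenue_busd = 68.1    # Q4 revenue, billions of USD
        fx_jpy_per_usd = 150

        print(f"full year: ~{fy_revenue_busd * fx_jpy_per_usd / 1000:.1f} trillion yen")  # ~32.4
        print(f"Q4 share : {q4_revenue_busd / fy_revenue_busd:.0%} of the year")          # ~32%
        # Q4 alone being ~32% of the year (vs. 25% if every quarter were equal) is
        # consistent with quarterly growth (70% YoY) outpacing full-year growth (65%).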

    Despite these stellar results, NVIDIA’s stock price has not risen as much as one might expect. This does not necessarily mean there is no room for growth. Massive shipments of GB300 and Vera Rubin for data centers are planned, and while memory supply constraints exist, demand for RTX 6090 GPU cards remains, and N1/N1X products for notebooks are also waiting in the wings.

    Investor concerns seem to revolve around two opposing risks: caution toward a potential AI bubble, and fears that AI might fundamentally disrupt the software industry. Investors are caught between the possibility that AI investment becomes circular trading that eventually collapses, and the possibility that AI succeeds too well and destroys existing software business models. In either case—whether AI is “real” or “fake”—pessimism currently prevails.

    Recently, CEO Jensen Huang and AWS CEO Matt Garman have repeatedly stated that the software industry will not be destroyed and that the future is bright. Yet when news spread that Anthropic could rewrite COBOL, even IBM’s stock saw reported drops ranging from 13% to as much as 27%. Some are now predicting the extinction of SaaS, coining the term “SaaS-pocalypse.” Leaders like ServiceNow CEO Bill McDermott and Salesforce CEO Marc Benioff have pushed back against this narrative. While customer support staffing has been reduced, AI agents have made SaaS more useful, and the number of companies using SaaS is reportedly increasing. Jensen Huang has also suggested that the market is wrong.

    AI agents will leverage SaaS and other existing software to support more companies than before. Existing software will still be needed, as will staff who support businesses in new ways. Executives likely already see a model in which AI actually increases the amount of work. When stock prices begin to rise again, that may be the point at which markets accept this view and concerns about circular trading subside.

    Another NVIDIA-related note: last month, the U.S. government reportedly approved exports of H200 GPUs to China, and Chinese companies appeared open to importing them. However, it seems that H200s have not yet entered China. The export approval is likely conditional, with strict security monitoring, and Chinese firms may be hesitant to proceed under such constraints.

  2. One CPU-related topic: Broadcom has reportedly shipped Fujitsu CPUs. The product is described as a “3.5D Face-to-Face Computing SoC,” which appears to refer to Fujitsu’s FUJITSU-MONAKA CPU under development. Beyond Fujitsu’s own materials, overviews have also been presented at TSMC symposia and Arm events.

“3.5D” likely refers to a configuration where some chiplets mounted on a silicon interposer (commonly called 2.5D integration) use 3D stacked LSIs. The computing die is manufactured using TSMC’s 2 nm process, while the cache memory die uses a 5 nm process. These are bonded face-to-face, with the cache underneath. In addition, an I/O die (5 nm) is included. AMD’s 3D V-Cache also uses face-to-face bonding, but in AMD’s case the CCD is on the bottom and the cache on top.

It feels as though Fujitsu is now the only company continuing to develop CPUs in Japan. Its previous CPU, A64FX for the Fugaku supercomputer, included HBM, but this new chip targets data centers and does not include HBM. Mass production is expected to begin in the latter half of this year. Broadcom appears eager to deploy this 3.5D SoC technology across a variety of chips.


February 27, 2026

  1. By the way, in OpenAI’s recent funding round, NVIDIA was said to be investing $30 billion, but it now appears that SoftBank invested $30 billion and Amazon invested $50 billion, bringing the total raised to $110 billion. NVIDIA’s contribution reportedly takes the form of 3 GW of dedicated inference capacity and 2 GW of training capacity.

    Amazon’s participation seems like a new development. As a result, OpenAI has secured 2 GW of capacity on AWS using Trainium 3 and, in the future, Trainium 4. On the AWS side, Amazon is set to become OpenAI’s exclusive third-party cloud distribution provider. This apparently means that when OpenAI Frontier—the OpenAI subsidiary that operates AI agent services—offers those services, they will be provided exclusively through AWS (Amazon Bedrock). The 2 GW of Trainium capacity is likely allocated for this purpose.

    Microsoft remains an existing shareholder of OpenAI and continues to collaborate with it, but OpenAI’s AI agent services will not be offered through Azure. Existing OpenAI services delivered via Azure will continue. This suggests a division of roles: stateful APIs such as AI agents on AWS, and stateless APIs on Azure.

    Another AWS-related item: there was news that AWS plans to build a data center in Louisiana. The investment is reported to be around $12 billion, though the power capacity was not specified. Infrastructure development is underway, including a power supply network incorporating 200 MW of solar generation, as well as water systems for cooling. About 87% of the cooling reportedly relies on outside air. While water cooling is used, heat exchange through radiators normally relies on ambient air. Construction is expected to create around 1,500 jobs, with approximately 540 full-time positions for data center operations. Supercomputer construction has long been considered a form of public works, and it seems data center construction has now entered the same category.

  2. Turning to NVIDIA-related topics, the ripple effects of NVIDIA’s earnings continue to spread. NVIDIA’s contribution to TSMC’s revenue is reportedly $23 billion, accounting for 19% of the total—up 7 percentage points from 12% in 2024. By contrast, Apple’s share of TSMC revenue is $20.5 billion (17%), down 5 points from 22% last year. As a result, Apple has ceded its position as TSMC’s largest customer to NVIDIA. It had been said during TSMC’s earnings call last month that NVIDIA had overtaken Apple, but these figures make it clear.
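
    The two customer figures can also be cross-checked against each other (a small sketch; the implied total is my own back-calculation from the quoted numbers, not a figure from the reports):

        # Back-solving TSMC's implied total revenue from the two quoted customer shares.
        nvidia_busd, nvidia_share = 23.0, 0.19
        apple_busd, apple_share = 20.5, 0.17

        print(f"implied total via NVIDIA: ~{nvidia_busd / nvidia_share:.0f} B USD")  # ~121
        print(f"implied total via Apple : ~{apple_busd / apple_share:.0f} B USD")    # ~121
        # The two back-calculations agree to within about a billion dollars,
        # so the quoted shares and amounts look internally consistent.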

    A supercomputer using NVIDIA GPUs at Eli Lilly has reportedly gone into operation. Called “LillyPOD,” it is a DGX SuperPOD equipped with 1,016 Blackwell GPUs, construction of which began in October 2024. It is said to be the most powerful system in the pharmaceutical industry. With 72 GPUs per rack, 1,016 GPUs would amount to 14 racks plus 8 GPUs. Since each computing tray contains four GPUs, that leaves an extra two trays’ worth of GPUs—presumably installed somewhere.
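
    The rack arithmetic above, as a sketch (72 GPUs per rack and 4 GPUs per compute tray are the figures used in this entry):

        # How 1,016 GPUs break down into 72-GPU racks and 4-GPU trays.
        TOTAL_GPUS = 1016
        GPUS_PER_RACK = 72
        GPUS_PER_TRAY = 4

        full_racks, leftover_gpus = divmod(TOTAL_GPUS, GPUS_PER_RACK)
        leftover_trays = leftover_gpus // GPUS_PER_TRAY
        print(f"{full_racks} full racks + {leftover_gpus} GPUs ({leftover_trays} extra trays)")
        # -> 14 full racks + 8 GPUs, i.e. two extra trays housed somewhere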

    Regarding Meta: after announcing that it would receive several million GPUs from NVIDIA, and then—just two days before NVIDIA’s earnings announcement—declaring that it would also receive 6 GW worth of GPUs from AMD, Meta has now reportedly announced that it will procure TPUs from Google as well. This may be a form of diversification.

  3. Next, some AMD-related items. AMD CEO Lisa Su reportedly stated that the data center market will reach $1 trillion by 2030. According to Japan’s Ministry of Internal Affairs and Communications, the market size in 2024 was $416.1 billion. Reaching $1 trillion by 2030 would require annual growth of roughly 16%. There is no doubt that data centers are a growth industry. Terms like “mega data centers” and “hyperscalers” were already in use around 2020, but the wave of AI adoption has dramatically accelerated investment.
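
    The growth rate follows directly from the two endpoints (a minimal sketch):

        # CAGR needed to go from the 2024 market size to $1 trillion in 2030.
        size_2024_busd = 416.1
        target_busd = 1000.0
        years = 2030 - 2024   # 6 years

        cagr = (target_busd / size_2024_busd) ** (1 / years) - 1
        print(f"required annual growth: {cagr:.1%}")   # ~15.7%, i.e. roughly 16%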

    As I mentioned briefly during the India AI Impact Summit two weeks ago, current data center capacity by region is approximately 53 GW in the U.S., 20 GW in China, 13 GW in Europe, and 1.6 GW in India. Japan stood at 1.37 GW as of 2024.

    AMD has also partnered with Nutanix. The deal includes a $150 million investment from AMD and technical collaboration, with $100 million allocated specifically for joint technical initiatives. Nutanix provides services that enable hybrid clouds spanning private and public cloud environments. This partnership is likely aimed at delivering a full-stack solution—including AI—on the private cloud side. Nutanix had previously supported NVIDIA GPUs, and with this partnership, AMD GPUs will now also be supported. From a confidentiality perspective, many companies avoid storing sensitive information on public clouds like AWS. It seems we are entering an era in which AI agents will also be deployed in private clouds. That said, Nutanix reportedly lowered its earnings outlook due to constraints in CPU and memory supply.

    AMD also announced Sorano, a Zen 5–based processor in the 8xx5 series. It is the successor to Siena, with core counts increased from 64 to 84. It is not yet clear whether it uses Zen 5 or Zen 5c. If it is Zen 5, it would require twelve 7-core CCDs (with one core disabled per CCD). If Zen 5c, it would require six 14-core CCDs (with two cores disabled each). Using twelve Zen 5 CCDs would physically resemble the 96-core-class EPYC 96x5 configuration. If the processor is to fit into the smaller SP6 socket rather than SP5, a configuration using six Zen 5c CCDs seems more realistic—but that remains speculative.
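
    The two candidate layouts can be written out explicitly (a sketch of the arithmetic only; which configuration AMD actually chose is, as noted above, speculation):

        # Two ways to reach the 84 cores attributed to Sorano above.
        TARGET_CORES = 84
        options = [
            # (label, number of CCDs, physical cores per CCD)
            ("Zen 5  CCDs (8-core die) ", 12, 8),
            ("Zen 5c CCDs (16-core die)", 6, 16),
        ]
        for label, ccds, cores_per_ccd in options:
            active = TARGET_CORES // ccds        # active cores per CCD
            disabled = cores_per_ccd - active    # cores fused off per CCD
            print(f"{label}: {ccds} CCDs x {active} active cores ({disabled} disabled each)")
        # -> 12 x 7 (one disabled each) or 6 x 14 (two disabled each)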

  4. One Intel-related item: a senior vice president at Intel Foundry has reportedly moved to Qualcomm. Just last month, it seemed that an engineer who had long led GPU development at Qualcomm moved to Intel. Such exchanges of personnel are not uncommon.

  5. Finally, two political topics. Anthropic and the U.S. Department of Defense—commonly known as the Pentagon, and now also styled the Department of War—appear to be in some dispute over an AI usage contract. The Department of Defense reportedly wants unrestricted use of Anthropic’s Claude AI for any lawful purpose. The contract value is said to be $200 million, and Anthropic is reportedly hesitant.

DeepSeek is preparing to release its next v4 model, but NVIDIA and AMD have reportedly been excluded from the GPU vendors’ early access list. Chinese domestic vendors such as Huawei are said to be given priority for several weeks. This appears to be a move to reduce dependence on the U.S. and protect domestic vendors.


February 28, 2026  From GPU to NPU (topic.2)

  1. There are reports regarding AMD’s Zen 7 “Grimlock Ridge.” Although this is a desktop CPU, its CCDs are expected to be used in EPYC as well. Zen 6 has not yet reached the market but uses TSMC’s 2 nm process, while Zen 7 is expected to use the A14 (1.4 nm) process. The CCD is said to have 16 cores with an area of 98 mm². This 16-core CCD is reportedly codenamed “Silverton.” Separately, there appears to be an 8-core CCD called “Silverking,” with an area of 56 mm². For reference, Zen 6 CCDs have 12 cores and an area of 76 mm².

    According to TSMC materials, the density improvement from N2 (2 nm) to A14 is about 1.2×, implying an area reduction to roughly 83%. While moving from 2 nm to 1.4 nm represents a full-node generational shift, it does not result in halving the area for the same core count as in earlier eras. Keeping 12 cores would yield an area of roughly 63 mm² (76 × 0.83). A 24-core CCD would be around 126 mm², slightly larger than previous CCDs. It appears AMD may have opted for two variants: a 16-core die (1.5× the core count) and a smaller 8-core die at roughly half that size. Powers of two like 16 and 8 may also be more convenient from a design standpoint.
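
    The area scaling above can be put in one place (a sketch; the 1.2× density figure is the one quoted from TSMC materials, and the per-core extrapolation is deliberately naive):

        # Naive area estimates when moving a Zen 6-style CCD from N2 to A14.
        ZEN6_CCD_MM2 = 76.0   # 12-core Zen 6 CCD on N2, as quoted above
        ZEN6_CORES = 12
        DENSITY_GAIN = 1.2    # N2 -> A14, per the TSMC materials quoted above

        area_per_core_a14 = (ZEN6_CCD_MM2 / ZEN6_CORES) / DENSITY_GAIN
        for cores in (8, 12, 16, 24):
            print(f"{cores:2d}-core CCD on A14: ~{cores * area_per_core_a14:5.1f} mm^2")
        # Reported: 8-core "Silverking" ~56 mm^2 and 16-core "Silverton" ~98 mm^2, so the
        # real dies evidently carry area beyond this naive per-core scaling (cache, fabric, etc.).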

    The article also mentioned APUs. In Grimlock Point/Halo, CCDs are reportedly stacked on top of the I/O die. In Zen 5’s Strix Point and Zen 6’s Medusa Point (not yet released), CCDs and I/O dies are connected using FOEB (Fan-Out Embedded Bridge). Moving to a 3D structure dramatically shortens wiring distances, improving both latency and power efficiency. Incidentally, Zen 6–based EPYC Venice also uses FOEB, and its appearance differs slightly from Zen 5–based Turin and earlier packages.

  2. Regarding yesterday’s OpenAI funding round, it appears that the 3 GW of inference-only capacity provided by NVIDIA as part of its $30 billion investment will be based on Groq, with which NVIDIA partnered last year. The partnership with Groq is reportedly a non-exclusive licensing agreement, but NVIDIA has hired Groq’s founder and built an internal development team. The extent of internalization is unclear, but announcements are expected at GTC in March.

    Groq’s LPU is expected to use 3D stacking with cache memory, and LPUs will be rack-mounted and connected using NVLink Fusion. The GPU used alongside them is reportedly “Feynman,” manufactured on TSMC’s A16 process with BSPD (backside power delivery), which TSMC refers to as Super Power Rail. Groq refers to its processor as an LPU—Language Processing Unit—adding yet another letter to the expanding “XPU” family.

    NVIDIA has reportedly begun shipping samples of Vera Rubin, with full-scale shipments expected in the second half of 2026. This is just a personal impression, but both NVIDIA and AMD seem to be rushing GPU sales. It appears there is a growing expectation that as various NPUs like Groq’s emerge and CPU+NPU configurations become the mainstay for low-power inference, GPUs may no longer be indispensable.

    With sufficiently trained AI models, systems may not necessarily require GPU-heavy configurations. For companies that do not perform reinforcement learning frequently, the need for GPUs may be intermittent. In such cases, GPU+CPU deployments could become cost-disadvantageous.

    Personally, I find it hard to justify buying a high-end GPU costing several hundred thousand yen just for occasional reinforcement learning. Without a clear path to monetization, it is not an attractive investment. (That said, business models already exist—and will likely grow—in which companies offer reinforcement learning as a service.) GPUs will still be needed for supercomputing applications such as drug discovery and weather prediction, so demand will not disappear. However, the explosive growth in AI semiconductor demand since last year, and the massive multi-gigawatt data center contracts being signed over multi-year periods, may reflect an anticipation that a future dominated by inference will eventually lead to saturation in training demand. From that perspective, it is not unreasonable to think that vendors feel compelled to sell GPUs aggressively now.

  3. Finally, some TSMC-related topics. TSMC’s 2 nm process capacity is reportedly fully booked for the next two years. Mass production began late last year. Publicly announced 2 nm users include AMD’s Zen 6 CCDs and Fujitsu’s FUJITSU-MONAKA computing die, both of which are likely already coming off the line. Ongoing production likely includes Apple silicon, and NVIDIA has announced plans to use 2 nm for Rubin Ultra. MediaTek is also expected to develop products on 2 nm.

Regarding fabs, TSMC reportedly plans to build a new 2 nm manufacturing site in Tainan. Procedures are expected to be completed within 2026, with production targeted to begin in 2028.

There was also news about TSMC’s withdrawal from the GaN business. TSMC announced last summer that it would exit the GaN foundry business for power devices by July 2027. After withdrawal, Rohm—TSMC’s partner in the GaN business—is expected to take over. Gallium has long been used as a compound-semiconductor material and is classified as one of the rare metals.

One more TSMC-related note, or rather Taiwan-related: reports say that the U.S. government, via the CIA, has warned U.S. tech companies about the risk of a Chinese invasion of Taiwan by 2027. There seems to be a growing number of articles expressing concern about the concentration of semiconductor supply in Taiwan. The February 24 edition of The New York Times reportedly ran a feature on risks surrounding Taiwan.


 This Blog text was translated by AI from Japanese Source Blog.
