MARCH 1, 2026
On February 26, I wrote that Broadcom had begun supplying its 3.5D XDSiP technology. Since then, many more articles have appeared, so I would like to summarize the situation a bit further. XDSiP stands for eXtreme Dimension System in Package, and it appears to be the name of a technology that combines three-dimensional stacked chips with 2.5D packaging using a silicon interposer. The first product to adopt this approach seems to have been FUJITSU-MONAKA, currently under development by Fujitsu. Broadcom reportedly has additional customers beyond Fujitsu, with a total of six designs underway and a target of shipping one million XDSiP chips.
Various pieces of information about FUJITSU-MONAKA have also emerged, but some clarification may be necessary. It has been officially disclosed that FUJITSU-MONAKA is a data center CPU. Since Fujitsu has also been awarded the contract for the next-generation supercomputer “Fugaku NEXT,” some reports seem to assume that FUJITSU-MONAKA will serve as its processor. However, that does not appear to be the case. Fugaku NEXT is still quite far in the future, and FUJITSU-MONAKA may be somewhat underpowered by then. The CPU for Fugaku NEXT is expected to be FUJITSU-MONAKA-X, and it has been disclosed that its computing die will be manufactured on a 1.4 nm-class process. The GPU portion of Fugaku NEXT will reportedly be handled by NVIDIA, likely connected via NVLINK-Fusion. I recall seeing reports about this around October last year.
Changing topics, there is news from Qualcomm. Ericsson and Qualcomm have reportedly formed a partnership toward 6G development. AI computing is expected to be integrated into the network itself in the 6G era. I believe SoftBank and Rakuten Mobile have also been studying the fusion of AI and RAN. Companies such as Qualcomm and SoftBank, which operate across both communications and computing domains, seem to be in a favorable position.
Another Qualcomm-related topic: at GDC (Game Developers Conference), scheduled to begin on March 9, Qualcomm had been expected to unveil a Windows handheld device. However, Qualcomm has reportedly withdrawn, and no announcement will be made. Severe memory shortages appear to be casting a shadow over the effort. This catastrophic RAM shortage has reportedly been dubbed “RAMageddon.”
Finally, in quantum computing news, IBM has reportedly installed a quantum computer at the Indian Institute of Technology (IIT) Kanpur in Uttar Pradesh. IBM, IIT, and the state government are expected to collaborate on technical initiatives in the field of quantum computing.
MARCH 2, 2026
At MWC (Mobile World Congress) 2026, currently being held in Barcelona, Intel has reportedly announced Xeon 6+, codenamed Clearwater Forest. It features 288 E-cores. Around this time last year, it had been expected that the previous generation Xeon 6E (Sierra Forest, manufactured on the Intel 3 process) would introduce a top-tier 288-core model. However, adoption of the mid-range 144-core SKU by data center customers did not progress as strongly as anticipated, and the 288-core flagship never reached the market. Now, Clearwater Forest — manufactured on the Intel 18A process — has been introduced with 288 cores as Xeon 6+.
The E-core architecture has also evolved, moving from Crestmont in Sierra Forest to Darkmont in Clearwater Forest. Darkmont is likewise used for the E-cores in Panther Lake.
The compute portion of Clearwater Forest consists of twelve 24-core chiplets, for a total of 288 cores. Each 24-core chiplet appears to be composed of core modules containing four cores and 4 MB of L2 cache, and six such modules form a single core die. The twelve core dies are grouped into three sets of four and mounted on three base dies. The base dies function as L3 cache memory and are manufactured on the Intel 3 process. Each base die is elongated in shape, and the region bridging its four core dies likely houses the DDR memory controller. The I/O dies are produced on the Intel 7 process and sit at the two ends of the row of three base dies. Since they support PCIe Gen5, they may share similarities with Sierra Forest.
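To keep the reported hierarchy straight, here is a minimal sketch that reproduces the core and cache totals from the counts described above; the figures come from these reports rather than an official Intel breakdown, so treat them as assumptions.

```python
# Reported Clearwater Forest compute hierarchy (assumed figures, per the
# reports summarized above, not an official Intel breakdown).
CORES_PER_MODULE = 4        # four cores share 4 MB of L2 per module
L2_MB_PER_MODULE = 4
MODULES_PER_CORE_DIE = 6    # six modules make one 24-core core die
CORE_DIES_PER_BASE_DIE = 4  # four core dies are stacked on each base die
BASE_DIES = 3               # three base dies, flanked by two I/O dies

cores_per_die = CORES_PER_MODULE * MODULES_PER_CORE_DIE            # 24
total_core_dies = CORE_DIES_PER_BASE_DIE * BASE_DIES               # 12
total_cores = cores_per_die * total_core_dies                      # 288
total_l2_mb = L2_MB_PER_MODULE * MODULES_PER_CORE_DIE * total_core_dies

print(f"{total_core_dies} core dies x {cores_per_die} cores = {total_cores} cores")
print(f"Aggregate L2 cache: {total_l2_mb} MB")
```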
While Sierra Forest integrated 144 cores into a single large die, Clearwater Forest adopts a base-die plus 3D packaging approach, which appears to represent significant progress in chiplet architecture. It is unclear whether Intel plans to produce a 144-core variant of Clearwater Forest, but in principle this could be achieved by keeping the same base dies and populating them with only six core dies instead of twelve. It is also worth noting that Xeon 6E and Xeon 6+ are socket-compatible.
MARCH 3, 2026
Following on from yesterday, topics from MWC (Mobile World Congress) continue. Qualcomm has reportedly announced the Snapdragon Wear Elite for wearable devices. Although designed for wearables, it is equipped with an NPU. The integrated Hexagon NPU is said to support models with up to 2 billion parameters and deliver performance of around 10 tokens per second (TPS). It is expected to be used as an edge AI device.
The Fujitsu FUJITSU-MONAKA processor (which I will simply call MONAKA from here on), a recurring topic in recent days, was apparently showcased at MWC in the form of a wafer. MONAKA is a chiplet-based design consisting of three dies: a computing die (TSMC 2 nm), a cache die (TSMC 5 nm), and an I/O die (TSMC 5 nm). When I first saw the photo, I wondered which die the wafer represented. It turns out that it was a wafer in which computing dies had been directly bonded onto cache dies. IBM showed a 2 nm GAAFET wafer several years ago, and Intel displayed an 18A wafer at CES in January. So perhaps this was intended to demonstrate a wafer with 2 nm chips mounted onto a 5 nm wafer. I found that somewhat surprising.
There were also several topics related to silicon photonics. NVIDIA has reportedly invested $2 billion each in Lumentum and Coherent. Communication power consumption inside AI factories is expected to exceed that of traditional data centers. To address this, NVIDIA has been developing its Spectrum Ethernet switch chips. It will be interesting to see what kind of progress results from these investments.
In another development, Ayar Labs — a startup working on co-packaged optics (CPO) — has reportedly raised $500 million in funding, with support from NVIDIA and AMD. This seems to reflect the growing view that CPO will be essential for reducing communication power consumption within AI data centers. One member of Ayar Labs’ board is former Intel CEO Pat Gelsinger. The company is also said to be conducting joint development efforts with Intel and TSMC.
Another Intel-related topic: Intel has reportedly formed a partnership with India’s Infosys. India-related developments have been frequent lately, and Intel now appears to be part of that trend. The collaboration involves building enterprise AI solutions based on Infosys’ “Infosys Topaz” platform, using Intel Xeon processors, Gaudi accelerators, and AI PCs. This seems aimed at enabling AI to support corporate operations. Personally, I imagine that instead of simply streamlining expense reporting for business trips, AI might eventually handle trip planning and arrangements themselves — at least that is my hope.
MARCH 4, 2026
Meanwhile, the ongoing conflict involving the United States, Israel, and Iran is casting a significant shadow over data centers and cloud services. AWS facilities at three locations in the UAE and Bahrain have reportedly suffered damage, leading to degraded service in one region. Articles over the past few days have suggested that data centers were hit by drone attacks. If regional power grids are also affected, cloud services in those areas could become unusable. AWS is said to be advising customers to prepare failover strategies to alternative regions.
Historically, the internet is often described as having originated from U.S. military technology. Its predecessor, ARPANET, was designed as a packet-switched network so that even if one location were attacked, communication could survive via alternative paths; the “World Wide Web” represented by the familiar “www” at the start of URLs came later, as a service built on top of that network. Today’s internet underpins everyday life, so attacks on networks or data centers effectively amount to attacks on daily living itself. It is no longer sufficient that “some part survives.” Ideally, connectivity and communication would help create a world in which such attacks do not occur at all.
It is well known that Intel operates manufacturing facilities in Israel. The CPU division that developed Centrino (the Banias core) in the 2000s is also located there. Mellanox, which NVIDIA acquired around 2020, was likewise an Israeli company and likely continues to exist as a division within NVIDIA.
Returning to the usual technology topics, Akamai — the world’s largest content delivery network (CDN) provider — has reportedly introduced several thousand NVIDIA Blackwell GPUs. These are deployed in NVIDIA RTX PRO servers equipped with BlueField-3 DPUs. Rather than concentrating them in a single location, Akamai plans to distribute them across network points worldwide to enable decentralized cloud computing. The company is said to be promoting a vision of a globally distributed AI computing grid. I am not entirely sure, but it reminds me somewhat of Cisco’s earlier concept of fog computing.
Next, an AMD-related development. According to CEO Lisa Su, demand for CPUs has surged beyond expectations. She reportedly mentioned this during a meeting with analysts, noting that the proportion of inference workloads is increasing and driving unexpected CPU demand. It is unclear whether this refers specifically to Ryzen or EPYC CPUs, or to APUs combining CPUs and NPUs. As agent-based AI becomes more prevalent, CPU demand is likely to grow. There was also a report that AMD has partnered with U.S.-based EMS provider Flex to assemble GPU platforms, with Flex handling assembly of the Instinct platform. Given that GPU demand is already strong, this likely reflects steady expansion of manufacturing capacity.
Another AMD-related note: a report indicated that MiTAC’s AMD Instinct MI350X servers use diamond-based cooling. Specifically, a sheet containing synthetic diamond material is placed between the GPU and the heatsink. While graphene sheets — also carbon-based — already exist and offer higher thermal conductivity than diamond, they conduct electricity, posing short-circuit risks if introduced into electronic circuits. Graphene powders may also present health concerns, making them difficult to handle as materials. Semiconductor and electronics manufacturing typically evaluates materials under ESH (Environment, Safety, and Health) criteria to ensure compliance with environmental, safety, and health standards.
Finally, an Intel governance update: Intel’s chairman is expected to step down at the shareholder meeting in May. The incoming chairman reportedly has prior executive experience at Qualcomm and Google. Under the current chairman, Intel separated its manufacturing operations to establish Intel Foundry. The new chairman is expected to maintain Intel’s position as an integrated device manufacturer.
MARCH 5, 2026
It appears that NVIDIA has halted production of the H200 GPUs that were intended for export to China and has instead redirected manufacturing capacity toward Vera Rubin. Reports suggest that around 250,000 H200 chips are currently sitting in inventory. Although export approvals exist between the United States and China, various restrictions remain in place, including a reported cap limiting Chinese companies to purchasing up to 75,000 chips.
The H200 is based on a TSMC 5 nm process paired with HBM3 memory, while Vera Rubin is expected to use a 3 nm process with HBM4. This means both manufacturing lines and component supply chains differ. If production is taking place at TSMC Fab 18, which supports both 5 nm and 3 nm nodes, adjustments to manufacturing plans might be relatively manageable. However, it is unclear which fab is actually being used.
In another NVIDIA-related development, reports indicate increased production of the RTX 3060. As part of the Ampere generation, it is believed to be manufactured using Samsung’s 8 nm process, whereas Ampere A100 GPUs produced at TSMC used a 7 nm node. Memory shortages appear to have reduced the availability of consumer GPUs in the market, and the RTX 3060 is reportedly being shipped together with memory as part of installation kits.
Turning to Intel, the Intel 18A process has been regarded primarily as an internal node for products such as Panther Lake, Xeon 6+ (Clearwater Forest), and the next-generation Xeon 7. However, reports suggest that Intel has begun reconsidering external foundry expansion for this node. Toward the end of last year, there were indications that NVIDIA GPUs would not adopt 18A, reportedly due to low yields. It was also suggested that Intel Foundry would begin offering external customers access starting from the 14A node. Now, there are reports that an enhanced variant called 18A-P will be made available, implying that manufacturing progress has improved both performance and yield.
From a broader market perspective, Broadcom has begun offering XDSiP services, starting with FUJITSU-MONAKA. Chiplet technologies have previously been used mainly in chips designed for internal deployment or self-sales by companies such as Intel, AMD, and NVIDIA, or for internal use by hyperscalers like AWS. Broadcom’s move to provide 3.5D chiplet-based ASIC development as a commercial service represents a significant shift. Intel Foundry may likewise be accelerating efforts to capture demand for ASICs built with 3D stacking and chiplet configurations using EMIB and Foveros.
MARCH 6, 2026
It seems that MWC concluded today (March 5 local time). Reports suggest that Samsung may release smart glasses within this year. The mention of smart glasses brings to mind the early days of Google Glass. The initial consumer version was discontinued in 2015, and although industrial versions continued afterward, they were reportedly phased out around 2023. Now Samsung appears ready to reenter the space. Samsung’s smart glasses are said to recognize what the wearer is looking at and display related information on a connected smartphone screen. One reason Google Glass was discontinued was that its built-in camera raised privacy concerns among the public. Ten years later, in an era dominated by viral content on Instagram, X (formerly Twitter), and TikTok, one wonders whether public attitudes toward privacy have changed.
Such a system — capturing visual context via smart glasses and transferring it to smartphones for information retrieval — will likely increase network traffic. In the opening keynote at MWC, Qualcomm CEO Cristiano Amon reportedly predicted that as AI agents become widespread, global network traffic could increase by three to seven times by 2034, with roughly 30 percent attributable to AI-driven communications. Communication standards will need to evolve toward higher speed and capacity, and Qualcomm-led proposals for 6G and Wi-Fi 8 were reportedly highlighted at MWC. Communications and computing are like the two wheels of a vehicle — both must advance together.
Returning to NVIDIA topics, Texas Instruments and NVIDIA have reportedly formed a partnership in robotics. TI will provide millimeter-wave sensing technology, while NVIDIA will contribute AI capabilities via Jetson Thor (built on TSMC’s 4 nm process). Integration will be achieved through the NVIDIA Holoscan Sensor Bridge system. Since the target applications involve robotics operating in environments such as indoor facilities, warehouses, or outdoor fields, this appears to be an initiative in physical AI focused on environmental sensing. In terms of the commonly described five-layer AI stack, TI sensors would represent the infrastructure layer, Jetson Thor the hardware layer, data aggregation would occur in the cloud layer, training enhancements would take place using Omniverse-based models, and the ultimate goal would be remote operation or autonomous robotics. Jetson Thor has reportedly already been adopted by companies such as Boston Dynamics, so combining drivetrain expertise with new sensing capabilities could help advance autonomy.
Another NVIDIA-related development is that the company reportedly intends not to further increase investments in AI startups such as OpenAI and Anthropic. This suggests a reassessment of financial relationships with such startups, possibly because the scale of previous investments has already become sufficiently large. Startups eventually grow beyond the “startup” phase.
Turning to AI, data centers, and regulation, SambaNova — which Intel had previously considered acquiring before opting for partnership — has reportedly introduced the SN50, positioned as well suited for building agent-based AI systems. While the semiconductor process has not been disclosed, the previous-generation SN40L used TSMC’s N5 node, leading to speculation that the SN50 may use a 3 nm process. Power consumption is said to reach a maximum of 30 kW per rack, but can be reduced below 20 kW with slightly lower performance, enabling air-cooled deployments. SoftBank is reportedly a customer, with plans to install the systems in Japanese data centers.
The U.S. government is also expected to tighten export controls on AI chips, likely covering GPUs from NVIDIA and AMD. Details remain unclear, but, for example, purchasing 1,000 units of something like a GB300 system might involve relatively light screening, whereas large-scale procurements could require prior approval from U.S. authorities. Performance-based thresholds similar to those used in the Wassenaar Arrangement may be introduced. Previously about 40 countries were targeted, but reports suggest the scope could effectively expand worldwide. Approval processes may involve explanations of intended use as well as acceptance of inspections. Companies such as Broadcom and Qualcomm may also be affected. While U.S. vendors could face constrained business opportunities, some observers believe this could create openings for Chinese firms such as Huawei.
Around the end of last month, Anthropic reportedly declined a request from the U.S. Department of Defense to allow unrestricted use of its Claude AI system. As a result, the U.S. government is said to have designated Anthropic as a “supply chain risk,” effectively issuing a warning regarding its use. AWS, which provides access to Claude, reportedly stated that usage unrelated to defense applications would not be affected, and that alternative solutions would be offered if necessary. It is also understood that the Department of Defense had already shifted certain contracts from Anthropic to OpenAI.
Another AWS-related development concerns the earlier drone attacks on data center facilities in the Middle East (UAE and Bahrain). Iran has reportedly issued a statement acknowledging the attacks as intentional, claiming the objective was to assess whether the facilities were supporting U.S. military operations. Reports also suggest that the United States retaliated by striking two data centers in Tehran.
Finally, AI cloud provider IREN has reportedly decided to deploy 50,000 NVIDIA B300 GPUs in locations including British Columbia in Canada and Texas in the United States. The company previously focused on Bitcoin mining and will continue mining at a reduced scale while shifting toward AI cloud services. Although both involve GPUs, the industries are fundamentally different, so such a strategic pivot is notable. During the NVIDIA Pascal and Volta generations, GPU mining activity was widespread, and one occasionally wonders where all those GPUs eventually ended up.
MARCH 8, 2026
It seems that die photos of Intel’s Panther Lake have surfaced. The chip appears to have been physically removed from the CPU package, and the wiring layers were then stripped away starting from the bump side. The purpose of such removal likely varies from person to person — for example, to verify whether the chip structure matches publicly disclosed specifications, or to infer undisclosed design details.
Since this type of teardown falls under reverse engineering, it is often prohibited by contract. However, if there are no contractual restrictions, the act of disassembly itself is apparently not illegal. If the chip is removed layer by layer from the bump side, the lowest layer would eventually reach the transistors. Given the nanometer-scale fabrication, observing transistors directly would require an electron microscope. Even without going that far, viewing the upper interconnect layers should provide a rough sense of functional blocks such as cores and memory.
In the past, power wiring was typically wider than signal wiring, which likely served as a visual cue when identifying block boundaries. However, Panther Lake’s computing tile is built on Intel 18A with backside power delivery (Intel’s PowerVia), so functional boundaries may be harder to distinguish than before.
Panther Lake is said to consist of three chiplets and one base die, but six chiplet variants and three base die variants are reportedly being manufactured. Specifically, there are two types of computing tiles (16-core and 8-core, both on Intel 18A), two types of graphics tiles (12 Xe and 4 Xe, on TSMC N3E and Intel 3 respectively), and two types of platform control tiles (PCIe configurations with 20 lanes or 12 lanes, on TSMC N6). With these combinations, three final product SKUs are expected, resulting in three corresponding base tiles.
Although the three chiplets each have two variants — theoretically enabling eight combinations — only three products will actually be offered. Even in the monolithic era up through Raptor Lake, Intel created different dies depending on core count, so this approach may simply reflect a similar product segmentation strategy.
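As a quick illustration of that combinatorics, the snippet below enumerates the eight theoretical tile combinations; which three become actual SKUs has not been confirmed, so the variant labels simply mirror the reported configurations above rather than official SKU names.

```python
from itertools import product

# Tile variants as reported for Panther Lake (assumed labels, not official SKU names).
compute_tiles  = ["16-core (Intel 18A)", "8-core (Intel 18A)"]
graphics_tiles = ["12 Xe (TSMC N3E)", "4 Xe (Intel 3)"]
platform_tiles = ["20-lane PCIe (TSMC N6)", "12-lane PCIe (TSMC N6)"]

combinations = list(product(compute_tiles, graphics_tiles, platform_tiles))
print(f"Theoretical combinations: {len(combinations)}")  # 2 x 2 x 2 = 8
for combo in combinations:
    print("  " + " + ".join(combo))
# Of these eight, reportedly only three are productized, each paired with its own base tile.
```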
Turning briefly to social topics, the February U.S. employment report released on Friday delivered a surprising result. While forecasts suggested an increase of around 54,000 unemployed individuals, the actual figure showed a decline of 92,000 workers. One contributing factor cited was labor disputes involving healthcare workers in California. More broadly, the growing adoption of AI is also seen as having an impact. From January to February, layoffs appear to have increased year-over-year in fields such as computing, data processing, and electronic components.
I have written several times recently that companies may eventually expand by effectively using AI and thereby create more employment opportunities. However, it will likely take time for such a cycle to emerge, and in the meantime there will probably be fluctuations in employment due to job transitions.
This leads me to consider how efficient it actually is for AI to replace human labor. Current AI data center deployments are being discussed in gigawatt-scale terms. Meanwhile, the human brain is often said to consume about 20 W of power. This figure has been cited for roughly 15 years and is likely based on the brain’s daily caloric consumption, typically estimated at 350–450 kcal. This probably varies more with body size than with intelligence. Since 1 kcal corresponds to about 1.16 Wh, a daily brain consumption of 413 kcal translates to about 480 Wh per day, and dividing by 24 hours yields an average of approximately 20 W.
However, this represents an average. Energy consumption must differ between sleep and wakefulness, and even during waking hours between rest and work. Suppose the brain runs at 10 W during eight hours of sleep and at 25 W during the sixteen waking hours. If we further assume that eight of those waking hours are ordinary activity at 20 W, then the remaining eight hours of focused work come out to around 30 W. This model suggests that the human brain’s sustained working capacity might be approximated as 30 W over an eight-hour period, which makes comparison with electronic systems easier.
If a human brain operates at roughly 30 W, then an AI workload consuming 1 kW corresponds to the energy consumption of about 33 people. This does not imply equivalent productivity — it is purely an energy comparison. Furthermore, since the human limit of 30 W applies to roughly eight hours, an AI workload running continuously at 1 kW for 24 hours would correspond to about 100 person-equivalents in terms of energy consumption. Would an AI system running at 1 kW truly perform the work of 100 people? One might hope so, but the answer feels somewhat uncertain.
This raises the question of whether it is possible to measure both workload and power consumption in a comparable way. Around the middle of last month, I wrote that running OpenAI Codex on Cerebras’ WSE-3 at around 1,000 tokens per second (TPS) produced responsiveness comparable to human pair programming. The WSE-3 reportedly consumes about 18 kW (or around 23 kW for the full CS-3 system). Operating for eight hours would correspond to the energy consumption of roughly 600 people, or about 1,800 people over 24 hours. Could such a system realistically perform the equivalent pair-programming work of 600 or even 1,800 individuals? In terms of concurrent user accounts, numbers of that scale might be feasible.
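As a sanity check, here is a back-of-the-envelope script that reproduces the arithmetic above; the 413 kcal, 30 W, and 18 kW figures are the rough assumptions stated in the text, not measured values.

```python
# Reproduces the brain-versus-AI energy comparison above.
# All numbers are rough assumptions from the text, not measurements.

KCAL_TO_WH = 1.16
brain_kcal_per_day = 413
brain_wh_per_day = brain_kcal_per_day * KCAL_TO_WH       # ~480 Wh/day
brain_avg_w = brain_wh_per_day / 24                       # ~20 W average

WORKING_BRAIN_W = 30          # assumed power during 8 h of focused work
WORK_HOURS = 8
person_wh_per_workday = WORKING_BRAIN_W * WORK_HOURS      # 240 Wh per working day

def person_equivalents(load_w, hours):
    """Energy-only comparison: how many 8-hour 'working brains' match this load."""
    return load_w * hours / person_wh_per_workday

print(f"Brain average power: {brain_avg_w:.1f} W")
print(f"1 kW for 8 h  -> {person_equivalents(1_000, 8):.0f} person-equivalents")
print(f"1 kW for 24 h -> {person_equivalents(1_000, 24):.0f} person-equivalents")
print(f"WSE-3 (~18 kW) for 8 h  -> {person_equivalents(18_000, 8):.0f} person-equivalents")
print(f"WSE-3 (~18 kW) for 24 h -> {person_equivalents(18_000, 24):.0f} person-equivalents")
```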
From this perspective, current AI workloads appear to consume significantly more energy than humans, and would need to deliver productivity equivalent to many individuals in order to justify their cost. As generations advance, the gap may narrow, eventually producing a more intuitive sense of practicality. For example, if AI workloads could be measured in tens of watts while performing work equivalent to several people, they might be considered roughly comparable to human capability.
Today, token cost per GPU is a common metric. However, if AI workload power consumption and productivity could be compared directly with human labor, then alongside progress in 3D scaling as defined by IRDS, something resembling a new version of Moore’s Law might emerge. Perhaps this is overly optimistic, but it is an intriguing possibility.
Finally, from a different angle: a film about TSMC is scheduled to be screened in Japan. It was reportedly shown in Taiwan last year. The original title is “造山者 世紀的賭注” (literally “Mountain Builders: A Century’s Gamble”), while the Japanese release is titled Chip Odyssey: Taiwan’s Gamble. Screenings are planned for March in Tokyo and April in Kumamoto.
MARCH 9, 2026
Speaking of which, NVIDIA’s GTC will take place this month, and Samsung is reportedly planning to showcase SOCAMM2 memory. Each memory module is said to support up to 192 GB. SOCAMM stands for Small Outline Compression Attached Memory Module. A Compression Attached Memory Module (CAMM) refers to a memory module that is attached using a compression-based mounting mechanism. In other words, while DIMMs are inserted into slots, CAMMs are fastened with screws.
DIMMs are constrained in the number of contacts they can provide by the length of their edge connector, whereas CAMMs can increase contact density across a surface, potentially offering an advantage in terms of area efficiency — although installation may be more cumbersome. CAMMs come in various shapes and sizes, but SOCAMM is NVIDIA’s standardized format, unified into a rectangular form factor measuring 14 mm by 90 mm. With SOCAMM2, support is expected to move from LPDDR5X to LPDDR6, increasing capacity from 128 GB to 192 GB.
A size of 14 mm by 90 mm is somewhat longer and narrower than an M.2 2280 module (22 mm by 80 mm). Incidentally, a stick of chewing gum is roughly 21 mm by 76 mm.
Continuing with NVIDIA-related topics, there was an article stating that NVIDIA has invested in Nscale, a UK-based AI startup. Nscale is reportedly a spin-out from a cryptocurrency mining operator. As a provider of compute resources, the company is said to be building multiple data centers across Europe. While the amount of NVIDIA’s investment has not been disclosed, Nscale itself appears to have raised around $2 billion. I recall reading somewhere that NVIDIA is becoming something like a bank — and in a sense, that may not be far from the truth.
Earlier this month, there were also reports that NVIDIA is increasing its investments in silicon photonics. Meanwhile, Google is said to have placed large orders with Taiwan’s Innolux for optical interconnect technologies. Google uses optical switches known as Apollo OCS (Optical Circuit Switch) in its backbone networks. Apollo employs MEMS to switch silicon mirrors. Although MEMS-based switching is not extremely fast, backbone networks do not require frequent switching, making such technology viable. These switches are reportedly used in systems from Google TPUv4 through TPUv7.
Companies such as Lumentum and Coherent, which NVIDIA is said to be investing in, also handle similar MEMS-based optical switching technologies. NVIDIA currently uses co-packaged optics (CPO) in its Spectrum-X Ethernet and Quantum-X InfiniBand platforms, but it may also be considering the use of OCS for backbone switching. Packet switching in the optical domain could reduce power consumption. Lower communication power requirements appear to be one of the enablers for scaling AI data centers.
On a somewhat nostalgic note, Microsoft CEO Satya Nadella reportedly commented that Windows would not have been born without Intel, and Office would not have been born without Apple. This was said during a presentation at a Morgan Stanley conference held last week. IBM’s personal computer, which adopted Intel’s 8088 processor (a variant of the 8086), used an OEM version of MS-DOS known as PC-DOS. As IBM PC clones spread worldwide, MS-DOS effectively became a de facto standard. Around the time the processor lineage reached the 80386, Windows became practically usable. These developments feel quite nostalgic.
As for Office, Excel was originally something people used on the Mac. I personally did not own a Mac, but there was a Mac in the office at the company where I later worked, dedicated to running Excel. I seem to recall that calculations were done in Excel and graphs were generated using CricketGraph. My memory is somewhat vague, but before joining the company I used Microsoft’s DOS-based spreadsheet Multiplan. Lotus 1-2-3, from Lotus Development (a company IBM acquired much later), was also on the market at the time. Eventually, Excel became available on Windows, and it was bundled into the MS-Office package. That, too, is now a nostalgic story.
Since last week, articles covering the Morgan Stanley conference have been appearing almost daily. In addition to the points mentioned earlier — such as AMD CEO Lisa Su’s remarks about surging CPU demand and Intel’s interest in offering Intel 18A-P to external foundry customers — NVIDIA CEO Jensen Huang reportedly said that constraints such as RAM shortages are actually welcome, since customers tend to make more deliberate purchasing decisions under such conditions. Many industry leaders appear to have spoken at the conference.
MARCH 10, 2026
There was also an article stating that NVIDIA has partnered with a startup called Thinking Machines Lab. The company appears to be a spin-out from OpenAI and is said to focus on developing AI systems designed for collaboration with humans rather than full autonomy. Although the financial details have not been disclosed, the startup has reportedly received investment from NVIDIA. Plans are said to include building a 1-GW Vera Rubin data center, though details remain unclear.
In another NVIDIA-related development, Sustainability Magazine reportedly hosted a meeting involving corporate executives together with AWS and NVIDIA. Discussions focused on using AI models to predict the impact of climate change on infrastructure and supply chains in order to mitigate business risks. The assumption seems to be that AWS would provide the cloud platform while NVIDIA’s Earth-2 models would be used for simulation. I believe Earth-2 began rolling out in late January.
Predicting how climate change may affect regional economies — and enabling companies to make individual decisions to avoid such risks — appears to be the objective. Pure weather forecasting tends to remain in the public domain and may lack business expansion potential. However, if risk prediction, mitigation strategy design, and impact analysis on corporate performance can be computed in an integrated manner, this could clearly address corporate needs. One might describe this as a kind of enterprise-level cyber-physical system.
Continuing with AWS-related news, there were reports that AWS has introduced an external key management system (KMS) from Eviden in Europe. This is said to be one of the features of AWS’s European Sovereign Cloud. Eviden is a subsidiary of Atos. Atos is a French IT company that also handles supercomputers — I believe its HPC division used to be known as Bull. By avoiding storage of master keys in the cloud itself, the system is said to enhance security.
Finally, regarding Intel: an open-source project related to Intel’s Optane memory has reportedly been archived. Intel had already withdrawn from Optane hardware around 2022, while Micron, its co-developer, had exited the collaboration even earlier. Optane used phase-change materials based on chalcogenides as its storage element, and the term PCM (Phase Change Memory) appeared frequently in articles at the time.
The memory structure — forming storage elements at the intersections of orthogonal wiring layers — was relatively easy to conceptualize. Intel and Micron had positioned Optane as a technology to bridge the memory hierarchy gap between DRAM and flash. Around that period, similar structures were discussed under names such as RRAM and MRAM, but industrially speaking, MRAM (which can be manufactured at TSMC) may be the only one that has remained viable.
MARCH 11, 2026
It seems that IBM has once again partnered with Lam Research to develop sub-1 nm transistors. About five years ago, when IBM announced its so-called 2 nm GAA (gate-all-around) transistor, the work was also conducted under a joint development framework with Lam Research.
Current EUV patterning relies on wet-processed, spin-coated resists, but such wet processes appear difficult to adapt to High-NA EUV. By combining Lam Research’s dry resist technology with High-NA EUV, the goal is reportedly to enable sub-1 nm patterning without sacrificing throughput. The research facility involved is said to be the NY CREATES Albany NanoTech Complex in Albany, New York, where IBM Research operates. If I recall correctly, engineers from Japan’s Rapidus also traveled to IBM in Albany for EUV training.
As a side note, 1 nm corresponds to 10 angstroms. Recently, node naming such as Intel’s 18A or TSMC’s A14 has shifted to angstrom-based terminology. The newly discussed sub-1 nm range would therefore imply single-digit angstroms. However, silicon’s crystal lattice constant is 5.431 Å. Could channel lengths (minimum feature sizes) shorter than the lattice constant really exist? That seems unlikely.
The reason single-digit angstrom node names are possible is that node naming has diverged from actual channel length or minimum line width. I described that historical background in my blog entry from February 8.
Turning to today’s NVIDIA news: there are reports that NVIDIA has invested $2 billion in the AI startup Nebius. The company plans to deploy AI data centers exceeding 5 GW by 2030. This suggests that NVIDIA’s business model is no longer simply to develop GPUs and sell them, but to develop them, provide investment, ensure their utilization, enable customers to generate profits, and then recover returns from that ecosystem. In this sense, GPUs and racks may be little more than tools for expanding the AI ecosystem. That may explain why NVIDIA continues searching for startups to fund. Perhaps no one will be able to catch up anymore — although AMD is making strong efforts.
Another side note: I once heard a metaphor that the god of opportunity has thick hair in the front but is bald in the back. If you fail to grab the hair as the opportunity approaches from the front, turning around afterward will not help — your hand will simply slip off the smooth back of the head as the opportunity disappears. It is a metaphor suggesting that those who recognize opportunities first gain the greatest rewards, while late imitators often lose out. Perhaps NVIDIA has managed to seize the god of opportunity by the hair.
MARCH 12, 2026
There are reports that autonomous driving trials in central Tokyo will begin, led by Nuro, a company backed by investors including NVIDIA, Toyota, and Uber. Nuro already operates robotaxi services in places such as San Francisco. Tokyo presents unique challenges, however: its streets are narrow, and traffic runs on the left. As a result, training data and experience from overseas deployments may prove only partially applicable. The goal is said to be Level-4 autonomous driving.
In another development related to autonomous driving, Qualcomm has reportedly partnered with the UK startup Wayve on ADAS (advanced driver assistance systems). The collaboration combines Wayve’s AI Driver with Qualcomm’s Snapdragon Ride platform. The idea is to provide a pre-integrated system that allows automakers to simplify the implementation of AI-based driving functions across multiple vehicle models. This integration is expected to enable progression from hands-off driving toward eyes-off capabilities, corresponding to Level-3 autonomy. Qualcomm appears to be placing increased emphasis this year on robotics and autonomous driving, moving in a direction similar to NVIDIA’s.
In a somewhat unusual move, Meta has announced a new series of chips for both training and inference. The lineup includes MTIA 300, 400, 450, and 500, with the newly revealed chip believed to be the MTIA 400. The MTIA 300 is reportedly already in mass production. The ASIC design work is being handled by Broadcom, with plans to release successive generations every six months starting from the 400 series.
The MTIA 400 appears to consist of two main compute dies placed side by side. Each die has two HBM stacks above and below it, for a total of eight HBM stacks. Judging from the apparent dimensions — each compute die looks roughly equivalent in area to six HBM stacks — the die size may be around 600 mm². The two dies are arranged through parallel translation rather than rotational symmetry. The left edge seems to connect to a PCIe chiplet, while the right edge connects to two network chiplets.
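For what it is worth, the die-size guess can be reproduced with one line of arithmetic; the roughly 11 mm by 10 mm HBM footprint is my own assumption, and the six-stacks-per-die ratio comes from eyeballing the photo.

```python
# Rough check of the MTIA 400 die-size estimate. The HBM footprint is an
# assumed value (~11 mm x 10 mm for a current-generation stack); the 6x ratio
# is read off the published photo, so treat the result as an order-of-magnitude guess.
hbm_width_mm, hbm_height_mm = 11, 10
hbm_area_mm2 = hbm_width_mm * hbm_height_mm          # ~110 mm^2 per stack
compute_die_area_mm2 = 6 * hbm_area_mm2              # ~660 mm^2, consistent with "around 600 mm^2"
print(f"Estimated compute die area: ~{compute_die_area_mm2} mm^2")
```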
The naming beginning at 300 likely reflects earlier internal generations — MTIA1 (built on TSMC 7 nm) and MTIA2i (on TSMC 5 nm), corresponding to the 100 and 200 series. From the 300 generation onward, TSMC 3 nm technology may be in use.
Turning to AMD, there are reports that CEO Lisa Su will visit South Korea. She is expected to discuss memory supply with Samsung; the MI400X GPU will likely require twelve HBM4 stacks. She is also scheduled to visit Naver, where discussions related to AI computing infrastructure may take place. If I recall correctly, there are already plans to deploy around 260,000 NVIDIA GPUs there by 2030. AMD appears to be following NVIDIA’s lead.
In the quantum computing space, IBM has announced a reference architecture for QCSC (Quantum-Centric Supercomputing). This architecture consists of four layers: applications, application middleware, system orchestration, and hardware infrastructure. For example, within conventional CPU- and GPU-based quantum workflows, tasks requiring genuinely quantum execution results could be offloaded to quantum processing units (QPUs). The architectural validity of this approach is said to have been demonstrated through joint research involving IBM’s Quantum Heron processor and Japan’s Fugaku supercomputer, as well as other international collaborations.
Finally, regarding semiconductors and geopolitics: supply reductions of helium gas and bromine — both used in semiconductor manufacturing — are reportedly occurring due to the ongoing conflict involving the United States, Israel, and Iran. Helium is exported from Qatar, while bromine is exported from Israel. South Korea and Taiwan are said to be affected to some degree but have secured sufficient supply for the time being. However, prolonged conflict remains a concern. Semiconductor manufacturing depends on global supply chains, so disruptions in one region can easily create ripple effects elsewhere. Multi-sourcing strategies provide some resilience, but they also have their limits.
MARCH 13, 2026
By the way, NVIDIA’s GTC 2026 will begin next week. I wonder if the “five-layer cake theory of the AI industry” will appear again this year. On the GPU front, the next-generation architecture known as Feynman is expected to be announced.
What is drawing particular attention is inference chips, especially those based on Groq technology, which NVIDIA acquired last year. It is said that NVIDIA’s $20 billion acquisition of Groq has dramatically increased the valuation of startups developing NPUs for inference. When Intel acquired Nervana and Habana Labs, the deals were around $400 million (2016) and $2 billion (2019), respectively. In that sense, after the Groq acquisition, inference-focused NPU startups may no longer be easily acquired.
Another point to watch is whether a successor to the Vera CPU will be announced. Since Grace and Vera CPUs are expected to be shipped as standalone products to Meta, demand for Arm-based CPUs could increase going forward.
Another NVIDIA-related development: NVIDIA has reportedly partnered with Palantir Technologies. Palantir provides data analytics tools and has been involved in aggregating data for the U.S. government. The partnership appears to involve using NVIDIA’s Blackwell GPUs for the infrastructure layer of Palantir’s AIOS-RA (AI OS Reference Architecture). This will likely be formally announced at GTC.
Palantir has built services deeply integrated into government operations, and the emerging structure seems to be one in which AI capabilities are embedded into those service platforms, while NVIDIA provides the underlying AI infrastructure. This could enable rapid AI-assisted data analysis and decision support.
Turning to Intel, there are reports that the company has commented on CPU shortages. Production appears to be shifting toward high-end and mid-range products for hyperscalers and AI data centers. In addition to cutting-edge CPUs built on Intel 3 and Intel 18A processes, supply constraints may also affect Raptor Lake generation products manufactured on Intel 7. Entry-level CPUs for consumer boxed products, notebooks, and Chromebooks are likely to be impacted.
There were articles in early February stating that both Intel and AMD had informed Chinese manufacturers of CPU delivery delays. It seems that situation is continuing. While I do not have exact figures, Intel may have been exporting a significant volume of low-end CPUs for mini-PCs. The shift toward higher-end products may have reduced supply availability for China.
In silicon photonics news, a group called OCI MSA (Optical Compute Interconnect Multi-Source Agreement) has reportedly been established. Participants are said to include AMD, Broadcom, Microsoft, Meta, NVIDIA, and OpenAI. The group appears to consist of chip vendors deploying co-packaged optics (CPO), such as NVIDIA’s Spectrum-X and Broadcom’s Tomahawk, along with hyperscalers and AI cloud providers building AI data centers with those technologies.
Standardizing specifications and enabling multi-sourcing at this stage seems important. Currently, optical links operate at around 50 Gbps × 4 wavelengths, delivering 200 Gbps per fiber (one direction). The roadmap aims to increase this to 800 Gbps and ultimately to 3.2 Tbps per fiber. The scope is expected to cover not only CPO but also pluggable optical modules.
A brief note on Apple: the company will celebrate its 50th anniversary on April 1, and numerous related articles have appeared. Apple has also reportedly extended its agreement with Qualcomm regarding the Snapdragon 5G modem-RF system. Some reports suggest delays in Apple’s efforts to develop its own 5G modem.
Finally, AWS news: AWS and Cerebras have reportedly formed a partnership. Cerebras has announced that its CS-3 system will become available through AWS Bedrock. The arrangement appears to involve role separation between AWS Trainium and Cerebras’s WSE-3 to enable high-speed inference.
Inference workloads typically involve two stages: prefill (“understanding the prompt”) and decode (“generating the response”). Prefill is primarily compute-bound and will be handled by Trainium3. Decode is memory-bandwidth-bound, leveraging the exceptional bandwidth of the WSE-3, which stores model weights in SRAM to generate tokens. While GPUs may handle hundreds of tokens per second, this architecture is said to achieve several thousand TPS, enabling high-speed inference.
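To make the division of labor concrete, here is a toy sketch of the two stages; the function names and the routing to Trainium or the WSE-3 are placeholders of my own, not AWS or Cerebras APIs.

```python
# Toy illustration of the prefill/decode split described above. The backend
# assignment (prefill -> Trainium, decode -> WSE-3) reflects the reported
# arrangement; the functions themselves are placeholders, not real APIs.

def prefill(prompt_tokens):
    # Processes the whole prompt in one batched pass: mostly large matrix math,
    # so it is compute-bound and suits a throughput-oriented accelerator.
    kv_cache = [f"kv({t})" for t in prompt_tokens]    # stand-in for attention state
    return kv_cache

def decode(kv_cache, max_new_tokens):
    # Generates tokens one at a time; every step re-reads the model weights,
    # so it is bound by memory bandwidth (hence the SRAM-resident WSE approach).
    generated = []
    for i in range(max_new_tokens):
        generated.append(f"tok{i}")            # placeholder for the sampled token
        kv_cache.append(f"kv(tok{i})")         # cache grows with each new token
    return generated

cache = prefill(["Explain", "CPO", "in", "one", "line"])
print(decode(cache, max_new_tokens=5))
```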
OpenAI partnered with Cerebras around mid-January, and its appearance on AWS roughly two months later suggests that the “era of Cerebras” may finally be arriving.
MARCH 14, 2026
There are reports that ByteDance, the Chinese parent company of TikTok, will gain access to an AI cluster using NVIDIA’s Blackwell GPUs. ByteDance is said to have partnered with Aolani Cloud, a cloud provider in Malaysia. The cluster reportedly consists of about 36,000 B200 GPUs, likely configured as around 500 GB200 racks.
Export restrictions have limited imports of H200 GPUs into mainland China, so securing access to NVIDIA GPUs via Malaysia — where such restrictions may not apply — appears to be the strategy.
Turning to AMD and semiconductor manufacturing, there were reports that AMD has settled a lawsuit with Adeia. Adeia is said to hold a portfolio of patents related to hybrid bonding technology and had alleged that AMD infringed seven to ten patents. Hybrid bonding is a relatively recent technology and has been widely used in flash memory (SSDs).
In 2024, AMD’s 3D V-Cache was alleged to infringe certain patents, and litigation began around last autumn; the settlement has now been reached. Although this concerns manufacturing technology, it is somewhat curious that AMD — rather than TSMC, the manufacturer — was the party involved in the lawsuit. Details have not been disclosed, but AMD may have needed to secure licensing rights.
Before hybrid bonding, stacking chips required pads and micro-bumps between dies. Pad diameters and micro-bump diameters were typically around 30 µm, with similar spacing required. With the advent of face-to-face hybrid bonding, direct Cu-to-Cu connections became possible, eliminating the need for pads and bumps and enabling connection pitches as tight as 9 µm, as seen in technologies such as Intel’s Foveros.
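A simple geometric comparison shows why that pitch reduction matters; the square-grid layout assumed below is a simplification of mine, but it conveys the order of magnitude.

```python
# Connection-density comparison: ~30 um micro-bump pitch versus ~9 um hybrid
# bonding pitch (as in Foveros Direct). Assumes a square grid of connections,
# which is a simplification.
bump_pitch_um = 30
hybrid_pitch_um = 9

def connections_per_mm2(pitch_um):
    per_mm = 1000 / pitch_um
    return per_mm ** 2

ratio = connections_per_mm2(hybrid_pitch_um) / connections_per_mm2(bump_pitch_um)
print(f"Micro-bump:  ~{connections_per_mm2(bump_pitch_um):.0f} connections/mm^2")
print(f"Hybrid bond: ~{connections_per_mm2(hybrid_pitch_um):.0f} connections/mm^2")
print(f"Density improvement: ~{ratio:.0f}x")   # (30/9)^2, about 11x
```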
Previously, fillers were required to bridge the height gaps created by bumps, and thermal conductivity considerations were also significant. These requirements have largely disappeared. Hybrid bonding appears to be an essential technology for 3D integration.
MARCH 15, 2026
It seems that Elon Musk may be entering semiconductor manufacturing. Reports indicate that the “Terafab Project” will be launched on March 21. The term tera is one thousand times larger than giga, and gigafab has often been used to refer to TSMC’s 12-inch wafer fabrication plants. Perhaps this reflects a competitive stance toward TSMC.
According to Musk, it may not be necessary to maintain cleanliness across an entire room; instead, isolating a clean environment only around the wafer might suffice. This suggests maintaining cleanliness within wafer transport boxes and process chambers. Perhaps the docking of these boxes and chambers is conceptually similar to the docking of Starship with a space station — securely connecting and disconnecting sealed clean zones.
However, in semiconductor equipment, maintenance work would still seem to require cleanroom environments. I do not know the detailed feasibility of this approach.
AMD has also announced the OpenClaw framework. This appears to be a concept for running generative AI locally on PCs. It may be seen as a response to Intel’s AI PC vision, although current AI PC implementations — such as Copilot+ Windows 11 — still assume network connectivity to data centers. An “agent PC” that operates locally could represent a more forward-looking idea.
The proposed hardware configurations include RyzenClaw and RadeonClaw. The OpenClaw stack consists of WSL2 (Windows Subsystem for Linux) and LM Studio, allowing users to run generative AI models of their choice. RyzenClaw is said to feature 128 GB of system memory, while RadeonClaw includes 32 GB of VRAM — suggesting a relatively high-end price point.
Looking at the OpenClaw configuration, the idea may be to perform reinforcement learning on WSL2 while handling inference through LM Studio, thereby constructing a personal AI agent entirely in a local environment. I also have LM Studio installed on my PC, but once model sizes exceed 8B parameters and reach 12B or 20B, even 128 GB of memory begins to feel constrained.
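For reference, here is a rough way to estimate how much memory a local model needs; the bytes-per-parameter figures are common rules of thumb and my own assumption, not numbers from AMD’s announcement, and context (KV cache) plus any training state come on top of the weights.

```python
# Back-of-the-envelope memory estimate for running a local model (e.g. in LM Studio).
# Bytes-per-parameter values are rule-of-thumb assumptions, not vendor figures.

def model_footprint_gb(params_billion, bytes_per_param):
    return params_billion * bytes_per_param   # 1e9 params * bytes / 1e9 bytes-per-GB

for params in (8, 12, 20):
    fp16 = model_footprint_gb(params, 2.0)    # unquantized half precision
    q4   = model_footprint_gb(params, 0.5)    # ~4-bit quantization
    print(f"{params}B params: ~{fp16:.0f} GB at FP16, ~{q4:.0f} GB at 4-bit")
# KV cache for long contexts and any fine-tuning state add further memory,
# which is where a large system-RAM budget starts to matter.
```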
If one wants to perform reinforcement learning, an iGPU is likely insufficient, and a discrete GPU with around 16 GB of VRAM — along with WSL2 — would be required. (So I have not yet attempted reinforcement learning myself.) Until the practical benefits become clearer, the hardware requirements may feel somewhat excessive.
This blog text was translated by AI from the Japanese source blog.