This piece outlines the progression of ideas behind my work on AI infrastructure. The current expression of that thinking is captured in The Great AI Memory Foundry: A Foundational Thesis.
What follows is a collection of writing on how AI systems behave outside of controlled environments, where data access, memory, and system constraints begin to matter more than the model itself.
What started as a series of individual observations has developed into a broader view: the next limiting factor in AI is not compute alone, but how data moves and how memory is accessed across increasingly complex systems.
The pieces below reflect that progression, from early signals and architectural pressure points to a more complete view of what is actually limiting AI at scale.
1. Early Observations: Performance, Storage, and Data Movement
First signals from performance, latency, and memory-adjacent technologies that pointed to a shift away from compute alone.
2. The Real Constraint: Data Movement and System Boundaries
Where the underlying issue becomes clearer. The problem is not just models, but how data moves through systems, how memory is accessed, and how those boundaries impact real-world AI performance.
- AI Market Inflection Point: Hype vs Reality
- The AI Shift: The Pace of Change is Staggering
- AI Leadership Doesn’t Stop at Chips
- It was always about data movement
- What Three Decades of Data Infrastructure Taught Me About AI
2a. From Experimentation to Execution: AI PoC Purgatory
A recurring pattern across enterprise AI adoption is the gap between successful pilots and production deployment. The models may work in controlled environments, but the breakdown happens when they have to operate inside real systems with real constraints.
3. The Memory Shift
Where the core thesis begins to take shape. The architecture is no longer purely compute-centric. Memory, bandwidth, and locality are becoming the real pressure points.
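The shift described here can be made concrete with a standard roofline-style back-of-envelope check: a workload is memory-bound whenever its arithmetic intensity (FLOPs per byte moved) falls below the ratio of peak compute to peak bandwidth. The sketch below uses illustrative placeholder figures, not numbers for any specific accelerator discussed in these essays.

```python
# Roofline back-of-envelope: is a kernel compute-bound or memory-bound?
# Peak figures are illustrative placeholders, not a specific product.

PEAK_FLOPS = 1000e12   # hypothetical peak compute: 1000 TFLOP/s
PEAK_BW = 3e12         # hypothetical memory bandwidth: 3 TB/s


def bound(flops: float, bytes_moved: float) -> str:
    """Classify a kernel by arithmetic intensity against the ridge point."""
    intensity = flops / bytes_moved      # FLOPs per byte moved
    ridge = PEAK_FLOPS / PEAK_BW         # intensity where both limits meet
    return "compute-bound" if intensity >= ridge else "memory-bound"


# A large matmul reuses each operand many times: high intensity.
print(bound(flops=2 * 4096**3, bytes_moved=3 * 4096**2 * 2))  # compute-bound

# Streaming a large buffer with ~1 FLOP per byte: low intensity.
print(bound(flops=2e9, bytes_moved=2e9))  # memory-bound
```

The point the check illustrates: adding compute moves only one of the two ceilings, so workloads sitting below the ridge point stay limited by how fast data can be moved, not by how fast it can be processed.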
4. The Breaking Point: The Nebula Gap
The next-order constraint appears inside the system itself, where memory and data movement bottlenecks now show up within GPUs and across distributed environments.
5. Market Reality and Misinterpretation
A more recent view of how the market continues to misunderstand the underlying constraints, especially around memory, supply dynamics, and what is actually limiting performance in production AI systems.
5a. Mitigation vs. Resolution
Many of the industry’s current responses are not solutions. They are mitigations. Bandwidth upgrades, faster interconnects, and data-movement optimizations may buy time, but they do not resolve the deeper issue of memory locality and system balance.
