AMD’s Next-Gen Silicon Leak: 192GB Unified Memory and the First PRO 3D V-Cache CPU

The landscape for local AI agent development is shifting rapidly from “GPU-only” workflows toward more integrated, high-memory-density solutions. For builders at AgentRigs, memory is becoming the primary bottleneck: large language models (LLMs) and complex agentic loops demand both high memory bandwidth and large amounts of VRAM.

Recent leaks from PassMark benchmarks have revealed two significant developments in AMD’s upcoming “PRO” lineup: the Ryzen AI Max+ PRO 495 APU and the Ryzen 9 PRO 9965X3D. These chips represent two different but equally vital paths for AI builders: massive unified memory for model capacity and high-performance 3D V-Cache for orchestration efficiency [1], [2].

The Ryzen AI Max+ PRO 495: A Unified Memory Powerhouse

The most striking revelation for the AI community is the emergence of the “Strix Halo” silicon under the branding of the Ryzen AI Max+ PRO 495. This chip is not a traditional CPU but a high-performance APU (Accelerated Processing Unit) designed to bridge the gap between mobile efficiency and workstation-class memory capacity.

Breaking the VRAM Barrier

The standout specification for the Ryzen AI Max+ PRO 495 is the support for up to 192GB of unified memory [1]. In a traditional PC build, AI enthusiasts are often forced to choose between consumer GPUs with limited VRAM (typically 8GB to 24GB) or prohibitively expensive enterprise cards like the NVIDIA RTX 6000 Ada or H100.

With 192GB of unified memory, the AI Max+ PRO 495 changes the math for local LLM execution. Because the memory is unified, the integrated GPU (iGPU) can theoretically access a vast majority of that pool for model weights, similar to the architecture found in Apple’s M-series Ultra chips.

What 192GB of Unified Memory Enables:

  • Large Model Execution: Running a Llama-3 70B model at 8-bit quantization, or even at FP16 (roughly 140GB of weights), becomes possible on a single-chip system without splitting the model between GPU VRAM and slower system RAM.
  • Massive Context Windows: Large context windows (128k tokens and beyond) consume significant amounts of KV cache memory. This APU provides the headroom needed to maintain long-term “memory” for agents without swapping to disk.
  • Multi-Model Pipelines: AI agent builders often need to run a primary LLM alongside smaller “expert” models for vision, coding, or tool-calling. A 192GB pool allows these to reside in memory simultaneously for near-instant switching.
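As a rough sanity check on these claims, the sketch below estimates weight and KV cache footprints. It assumes the publicly documented Llama-3 70B layout (80 layers, 8 KV heads, head dimension 128), a nominal 70 billion parameters, and an FP16 KV cache. The helper names are hypothetical, and the result ignores activations, the OS, and whatever share of the pool the iGPU cannot address.

```python
GB = 1e9        # decimal gigabyte, as used in marketing capacities
GIB = 2 ** 30   # binary gibibyte

# Assumed Llama-3 70B attention layout (grouped-query attention).
LLAMA3_70B = dict(layers=80, kv_heads=8, head_dim=128)


def weight_bytes(params: float, bits_per_weight: float) -> float:
    """Bytes needed to hold the model weights at a given precision."""
    return params * bits_per_weight / 8


def kv_cache_bytes(layers, kv_heads, head_dim, context_tokens, bytes_per_value=2):
    """Bytes needed for the KV cache of one sequence.

    The leading factor of 2 covers keys and values; bytes_per_value=2
    corresponds to an FP16 cache.
    """
    return 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_value


def fits(pool_gb: float, *component_bytes: float) -> bool:
    """True if the listed components fit in a pool of pool_gb decimal GB."""
    return sum(component_bytes) <= pool_gb * GB


if __name__ == "__main__":
    weights = weight_bytes(70e9, 16)                          # FP16 weights
    cache = kv_cache_bytes(context_tokens=128 * 1024, **LLAMA3_70B)
    print(f"weights: {weights / GB:.0f} GB, KV cache: {cache / GIB:.0f} GiB")
    print("fits in 192GB:", fits(192, weights, cache))
```

Under these assumptions, FP16 weights come to about 140GB and a 128k-token KV cache to about 43GB (40 GiB), which leaves only a thin margin in a 192GB pool. A quantized model would leave far more room for additional “expert” models.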

Performance Expectations

Leaked PassMark data suggests that the AI Max+ PRO 495 retains a core configuration similar to its predecessors (likely 16 Zen 5 cores) while offering a modest performance uplift over previous Strix Halo iterations [1]. One plausible reading is that AMD is refining the interconnects and memory controller efficiency, both critical for feeding the RDNA 3.5+ graphics compute units, rather than simply raising clock speeds.

Feature               Ryzen AI Max+ PRO 495 (Leaked)
Architecture          Zen 5 (CPU) / RDNA 3.5+ (GPU)
Max Memory Support    Up to 192GB Unified LPDDR5X
Target Market         High-end Mobile Workstations / SFF AI Rigs
Core Count            16 Cores / 32 Threads

The Ryzen 9 PRO 9965X3D: Enterprise-Grade 3D V-Cache

While the AI Max+ focuses on memory capacity, the newly spotted Ryzen 9 PRO 9965X3D targets processing efficiency and low-latency execution. This marks the first time AMD has brought its 3D V-Cache technology to the Ryzen “PRO” lineup [2]. Until now it has appeared in consumer Ryzen X3D chips aimed largely at gaming, and in select EPYC server parts.

Why 3D V-Cache Matters for AI Agents

For those building AI agents, the “agentic loop” involves more than just model inference. It involves high-frequency logic processing, including:

  1. Orchestration: Python-based frameworks like LangChain or CrewAI managing complex logic.
  2. Vector Database Queries: Searching through local embeddings (RAG) to provide context.
  3. Tool-Calling: Executing code, parsing JSON, or making API calls based on model output.

These tasks are often latency-sensitive, with branchy code and irregular memory access, so they benefit significantly from a large L3 cache. The 3D V-Cache technology stacks an additional cache die vertically with the CPU core die, so more of the working set stays on-package and the processor spends less time waiting for data from system RAM. In practice, that reduces “stutter” between steps of the agent’s reasoning chain.
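To make the CPU-side work concrete, here is a minimal sketch of one step of such a loop, using only the Python standard library. Everything in it is hypothetical: the toy DOCS embeddings stand in for a real vector database, TOOLS stands in for a real tool registry, and model_output stands in for text returned by an LLM. It is not how LangChain or CrewAI implement these steps.

```python
import json
import math

# Hypothetical toy embeddings; a real rig would query a vector database.
DOCS = {
    "gpu": [1.0, 0.0, 0.0],
    "cache": [0.0, 1.0, 0.0],
    "memory": [0.0, 0.6, 0.8],
}

# Hypothetical tool registry mapping tool names to plain Python callables.
TOOLS = {"add": lambda a, b: a + b}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, k=1):
    """Vector search: return the k document names most similar to the query."""
    ranked = sorted(DOCS, key=lambda name: cosine(query_vec, DOCS[name]), reverse=True)
    return ranked[:k]


def run_tool_call(model_output: str):
    """Tool-calling: parse the model's JSON output and dispatch to a tool."""
    call = json.loads(model_output)
    return TOOLS[call["tool"]](*call["args"])


def agent_step(query_vec, model_output: str):
    """Orchestration: one pass of retrieve-then-act."""
    context = retrieve(query_vec)
    result = run_tool_call(model_output)
    return {"context": context, "result": result}


if __name__ == "__main__":
    print(agent_step([0.0, 1.0, 0.0], '{"tool": "add", "args": [2, 3]}'))
```

None of these steps touches the GPU. In a real agent they run many times per task, which is why CPU cache behavior can matter alongside raw inference speed.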

Specifications and Efficiency

The 9965X3D is a 16-core, 32-thread beast based on the Zen 5 architecture. According to PassMark leaks, it performs very similarly to the consumer-grade Ryzen 9 9950X3D [2]. However, the “PRO” designation implies a few key differences critical for professional agent builders:

  • Lower TDP & Thermal Stability: The PRO variant likely operates at a more conservative Thermal Design Power (TDP) to ensure stability in 24/7 workstation environments [2]. This is a boon for builders looking for high compute density in quiet, air-cooled setups.
  • Enhanced Security: AMD PRO technologies include full system memory encryption (AMD Memory Guard), which is vital if your AI agent is handling sensitive personal data or proprietary company information.
  • Manageability: For teams deploying multiple “Agent Rigs” across an office, the PRO management features allow for easier remote deployment and hardware-level monitoring.

Comparative Analysis: Which Path for Your Agent Rig?

Choosing between these two leaked chips depends entirely on where your specific bottleneck lies in the AI development lifecycle.

The Case for the AI Max+ PRO 495

If your primary goal is running the largest possible models locally with a minimal footprint, the 192GB unified memory capacity of the AI Max+ PRO 495 is the clear draw. It effectively turns a laptop or a small-form-factor (SFF) PC into a “Mini-Mac Studio,” matching the combined VRAM of eight 24GB RTX 4090s (in terms of capacity, not bandwidth or raw TFLOPS) in a single socket.
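The capacity comparison is simple arithmetic, sketched below together with a common rule of thumb: single-stream decoding of a dense model is roughly bounded by memory bandwidth divided by the bytes read per token. The function names are hypothetical, and the bandwidth in the example is a placeholder, not a leaked specification for this chip.

```python
import math


def cards_for_capacity(pool_gb: float, vram_per_card_gb: float) -> int:
    """Number of discrete cards needed to match a memory pool by capacity alone."""
    return math.ceil(pool_gb / vram_per_card_gb)


def decode_tokens_per_s_upper_bound(bandwidth_gb_s: float, model_gb: float) -> float:
    """Rough ceiling on single-stream decode speed for a dense model.

    Assumes every generated token reads all weights once and that memory
    bandwidth is the only limit; real throughput will be lower.
    """
    return bandwidth_gb_s / model_gb


if __name__ == "__main__":
    # 192GB pool vs. 24GB per RTX 4090.
    print(cards_for_capacity(192, 24))
    # Placeholder bandwidth (GB/s) and model size (GB), for illustration only.
    print(decode_tokens_per_s_upper_bound(256.0, 128.0))
```

The second function is a reminder that capacity determines which models fit, while bandwidth determines how fast they generate tokens.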

The Case for the Ryzen 9 PRO 9965X3D

If you already have a powerful discrete GPU (like an RTX 4090 or dual 3090s) and your bottleneck is the speed at which your agent can process logic, query databases, and manage multi-threaded tasks, the 9965X3D is the superior choice. The 16 Zen 5 cores combined with the large L3 cache make it far less likely that the “brain” of your agent rig will bottleneck the GPU.

Hardware Synergy in the Zen 5 Era

Both of these chips benefit from the underlying Zen 5 architecture improvements. Zen 5 improves performance for AVX-512, an instruction set that matters for CPU-based AI acceleration, data preprocessing, and vector math.
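Builders who want to confirm what a given machine exposes can check the CPU feature flags. The sketch below assumes a Linux x86 system, where /proc/cpuinfo lists flags such as avx512f. The function names are hypothetical, and other operating systems need a different approach.

```python
from pathlib import Path


def avx512_flags(cpuinfo_text: str) -> set[str]:
    """Return the AVX-512 feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # The line looks like "flags : fpu vme ... avx512f ...".
            _, _, value = line.partition(":")
            return {flag for flag in value.split() if flag.startswith("avx512")}
    return set()


def local_avx512_flags(path: str = "/proc/cpuinfo") -> set[str]:
    """Read the local CPU flags; returns an empty set if the file is absent."""
    p = Path(path)
    return avx512_flags(p.read_text()) if p.exists() else set()


if __name__ == "__main__":
    flags = local_avx512_flags()
    print(sorted(flags) if flags else "No AVX-512 flags reported")
```

An empty result means the kernel did not report AVX-512 support, in which case CPU inference libraries typically fall back to narrower vector paths.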

Furthermore, the shift toward higher memory capacities in the PRO line suggests that AMD recognizes the “Local AI” movement isn’t just a hobbyist niche but a professional requirement. Builders can now look forward to systems that don’t just excel at “gaming” or “office work,” but are purpose-built for the high-bandwidth, high-memory-pressure environments that modern AI agents demand.

Final Thoughts for Builders

The leak of the Ryzen AI Max+ PRO 495 and the Ryzen 9 PRO 9965X3D signals a maturing market for AI hardware. We are moving away from a world where AI builders have to “make do” with gaming hardware and into an era of specialized silicon.

A 192GB unified memory APU [1] could put 70B+ parameter models within reach of developers on the move, while the introduction of 3D V-Cache to the PRO line [2] should help the orchestration layer of our agents keep pace with the models they control. As these chips move from leaked benchmarks to retail shelves, the possibilities for localized, private, and powerful AI agents will expand considerably. For the AgentRigs community, the next generation of builds just got a lot more interesting.


Sources & Further Reading