Featured Publication

Compute Can't Handle the Truth: Why Communication Tax Prioritizes Memory and Interconnects in Modern AI Infrastructure

Myoungsoo Jung

arXiv (Technical Report)

2025

Research Areas
Coherent Interconnect

Abstract

Modern AI workloads such as large language models (LLMs) and retrieval-augmented generation (RAG) impose severe demands on memory, communication bandwidth, and resource flexibility. Traditional GPU-centric architectures struggle to scale because of growing inter-GPU communication overheads. This report introduces key AI concepts and explains how Transformers revolutionized data representation in LLMs. We analyze large-scale AI hardware and data center designs, identifying scalability bottlenecks in hierarchical systems. To address these bottlenecks, we propose a modular data center architecture based on Compute Express Link (CXL) that enables disaggregated scaling of memory, compute, and accelerators. We further explore accelerator-optimized interconnects, collectively termed XLink (e.g., UALink, NVLink, NVLink Fusion), and introduce a hybrid CXL-over-XLink design to reduce long-distance data transfers while preserving memory coherence. We also propose a hierarchical memory model that combines local and pooled memory, and evaluate lightweight CXL implementations, HBM, and silicon photonics for efficient scaling. Our evaluations demonstrate improved scalability, throughput, and flexibility in AI infrastructure.


Related Publications

MPI-over-CXL: Enhancing Communication Efficiency in Distributed HPC Systems
SPICE, 2025 (Featured)
Research Areas: Coherent Interconnect, Operating Systems, +1 more

ScalePool: Hybrid XLink-CXL Fabric for Composable Resource Disaggregation in Unified Scale-up Domains
DIMES, 2025 (Featured)
Research Areas: Coherent Interconnect, Architecture

CXL Topology-Aware and Expander-Driven Prefetching: Unlocking SSD Performance
IEEE Micro, 2025
Research Areas: Coherent Interconnect, Machine Learning, +1 more
© 2025 Panmnesia, Inc.
All rights reserved.