Articles with "cxl memory" as a keyword



OASIS: Outlier-Aware KV Cache Clustering for Scaling LLM Inference in CXL Memory Systems

Published in 2025 at "IEEE Computer Architecture Letters"

DOI: 10.1109/lca.2025.3567844

Abstract: The key-value (KV) cache in large language models (LLMs) now necessitates a substantial amount of memory capacity as its size proportionally grows with the context’s size. Recently, Compute-Express Link (CXL) memory becomes a promising method…

Keywords: memory; cxl memory; llm inference; oasis outlier ...

Memory Pooling With CXL

Published in 2023 at "IEEE Micro"

DOI: 10.1109/mm.2023.3237491

Abstract: Compute Express Link (CXL) has recently attracted great attention thanks to its excellent hardware heterogeneity management and resource disaggregation capabilities. Even though there is yet no commercially available product or platform integrating CXL into memory…

Keywords: memory; cxl memory; pooling cxl; memory pooling ...

Improving Key-Value Cache Performance With Heterogeneous Memory Tiering: A Case Study of Compute-Express-Link-Based Memory Expansion

Published in 2025 at "IEEE Micro"

DOI: 10.1109/mm.2024.3358861

Abstract: Compute Express Link (CXL) memory brings extra bandwidth and capacity via Peripheral-Component-Interconnect-Express-based memory expansion beyond double-data-rate-based dynamic random-access memory. This article introduces the CXL 2.0 memory expansion solution, which incorporates two parts: 1) a CXL…

Keywords: memory expansion; memory; cxl memory; express ...