# Rack-Scale Servers for the Post-Moore Era

Pooria Poorsarvi Tehrani, Mohammad Arman Soleimani

#### Motivation

## **Current Solutions**

- Hardware resources are underutilized and wasted in data centers



#### **Compute Express Link (CXL)**

Interconnect between processors and devices such as memory and accelerators

- Coherent access to system and device memory
- Enables memory pooling and device sharing
  - Example: Pond [Li '23] saves 7% DRAM using a shared memory pool
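To illustrate why pooling can reduce provisioned DRAM, here is a minimal sketch. It assumes every server's full demand could be served from one shared pool, which is a simplification. The function name `provisioned_dram` and the demand traces are hypothetical; they are not Pond's method or data.

```python
from typing import List, Tuple


def provisioned_dram(demand: List[List[int]]) -> Tuple[int, int]:
    """Compare DRAM provisioning with and without a shared pool.

    demand[s][t] is the memory (GB) used by server s at time step t.
    Both the layout and the numbers used below are illustrative assumptions.
    """
    # Without pooling, each server must be provisioned for its own peak.
    per_server = sum(max(trace) for trace in demand)
    # With an idealized pool, only the peak of the combined demand matters.
    pooled = max(sum(step) for step in zip(*demand))
    return per_server, pooled


# Three servers whose peaks occur at different time steps (made-up values).
traces = [[10, 30, 20], [30, 10, 20], [20, 20, 30]]
per_server, pooled = provisioned_dram(traces)
print(f"per-server: {per_server} GB, pooled: {pooled} GB")
```

The savings come entirely from peaks that do not coincide; if every server peaked at the same time step, the two totals would be equal.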

### Hardware Resource Disaggregation is a promising solution

- Lower maintenance costs
- Easier resource pooling
- Isolated points of failure
- Requires rearchitecting hardware and software
- Needs new tools and simulators for research
- The cost of CXL hardware erodes the savings from pooling [Levis '23]
- High latency hurts performance
- Coherency needs might be different from what CXL implements

## Rack-Scale Computing



# Proof of Concept

- Goal: Flexible Multi-Node Design
  - Slot in different simulators
  - Easily extend to add new nodes
  - Easily modify existing nodes
- Support different levels of simulation fidelity
- PoC scope: emulate just the CPU and disaggregated memory with the PICs on
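One way to picture the flexible multi-node goal is a small common interface that every node implements, so that a simple latency model and a detailed simulator are interchangeable. The sketch below is only an assumption about how such a design could look. The names `Node`, `MemoryNode`, `PIC`, `CPUNode`, and all latency values are hypothetical and do not describe the actual PoC.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol


class Node(Protocol):
    """Hypothetical interface for anything slotted into the rack."""
    name: str

    def access(self, addr: int) -> int:
        """Serve an access and return its latency in nanoseconds."""
        ...


@dataclass
class MemoryNode:
    """Low-fidelity memory model; a detailed simulator could replace it."""
    name: str
    latency_ns: int

    def access(self, addr: int) -> int:
        return self.latency_ns


@dataclass
class PIC:
    """Stand-in for a Pod Interface Chip that forwards requests to nodes."""
    hop_latency_ns: int
    routes: Dict[str, Node] = field(default_factory=dict)

    def attach(self, node: Node) -> None:
        # Adding a node only requires registering it under its name.
        self.routes[node.name] = node

    def forward(self, target: str, addr: int) -> int:
        # The request and the response each cross the fabric once.
        return 2 * self.hop_latency_ns + self.routes[target].access(addr)


@dataclass
class CPUNode:
    """CPU model that serves low addresses locally and the rest remotely."""
    name: str
    pic: PIC
    local_limit: int
    local_latency_ns: int
    remote_target: str

    def access(self, addr: int) -> int:
        if addr < self.local_limit:
            return self.local_latency_ns
        return self.pic.forward(self.remote_target, addr)


pic = PIC(hop_latency_ns=100)
pic.attach(MemoryNode("mem0", latency_ns=80))
cpu = CPUNode("cpu0", pic, local_limit=0x1000,
              local_latency_ns=80, remote_target="mem0")
print(cpu.access(0x10), cpu.access(0x2000))
```

Because `PIC` depends only on the `access` method, changing simulation fidelity amounts to attaching a different object under the same name.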

- Disaggregate hardware resources across a rack
- Use intra-rack fabrics for rapid data movement
- Communicate among servers via Pod Interface Chip (PIC)
- Many unclear hardware and software design elements





## How do we design a simulation infrastructure to use for deeper studies into rack-scale design?



- Investigate higher bandwidth serial fabrics
- Expand simulator functionality

