A few months ago, I posted a piece about PLDA expanding its support for two emerging protocol standards: CXL™ and Gen-Z™. The Compute Express Link (CXL) specification defines a set of three protocols that run on top of the PCIe PHY layer. The current revision of the specification, CXL 2.0, runs on the PCIe 5.0 PHY layer at a maximum link rate of 32 GT/s per lane. The specification has many parts and multiple implementation options, so a comprehensive support package will significantly help adoption. This is why PLDA brings flexible support for CXL to SoC and FPGA designers.
The three previously mentioned protocols that make up CXL are:
- CXL.io: very similar to traditional PCIe; it handles discovery, configuration, and the other functions that PCIe normally handles
- CXL.cache: gives CXL devices coherent, low-latency access to shared host memory
- CXL.mem: gives the host processor access to shared device memory
CXL defines 3 types of devices that leverage different combinations of these protocols depending on the use case.
As shown in Figure 1, a Type 1 device combines the CXL.io and CXL.cache channels. Typical Type 1 devices include PGAS (partitioned global address space) NICs and NICs with atomics.
Figure 2 illustrates a Type 2 device combining all 3 channels.
Type 2 devices may include accelerators with memory such as GPUs and other dense computation devices.
Figure 3 shows a Type 3 device with CXL.io and CXL.mem channels. A typical Type 3 device may be used for memory bandwidth expansion or for memory capacity expansion with storage-class memory.
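The mapping between device types and protocols described above can be summarized in a few lines of Python. This is only an illustrative sketch: the names `Protocol`, `DEVICE_TYPES`, and `device_type_for` are hypothetical and do not come from the CXL specification or from any PLDA product.

```python
from enum import Flag, auto


class Protocol(Flag):
    """The three CXL protocols (illustrative names, not a real API)."""
    IO = auto()      # CXL.io
    CACHE = auto()   # CXL.cache
    MEM = auto()     # CXL.mem


# Protocol combination used by each CXL device type, as described in the text.
DEVICE_TYPES = {
    1: Protocol.IO | Protocol.CACHE,                 # e.g. NICs with atomics
    2: Protocol.IO | Protocol.CACHE | Protocol.MEM,  # e.g. accelerators with memory
    3: Protocol.IO | Protocol.MEM,                   # e.g. memory expanders
}


def device_type_for(protocols):
    """Return the device type matching a protocol combination, or None."""
    for dev_type, required in DEVICE_TYPES.items():
        if protocols == required:
            return dev_type
    return None


if __name__ == "__main__":
    print(device_type_for(Protocol.IO | Protocol.MEM))
```

Note that every device type includes CXL.io, since discovery and configuration always go through it; the types differ only in whether they add CXL.cache, CXL.mem, or both.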
The goal of CXL is to maintain memory coherency between the CPU memory space and memory on attached devices, which improves performance and lowers complexity and cost. CXL.cache and CXL.mem support this strategy. To implement CXL in a complex SoC, an interface is required to transfer packets between the user application and the protocol controller. Several interconnect technologies are available:
AMBA AXI: a parallel, high-performance, synchronous, high-frequency, multi-master, multi-slave communication interface designed mainly for on-chip communication. It is widely used across the industry. The AMBA® AXI™ protocol is typically chosen to reduce time to market and ease integration.
CXL-cache/mem Protocol Interface (CPI): CPI allows different protocols to be mapped onto the same physical wires. It is a publicly available specification defined by Intel that aligns closely with the CXL specification. It is designed for CXL.cache and CXL.mem, which can share the same wires, and it is a lightweight, low-latency protocol.
AMBA CXS: a streaming protocol that enables high-bandwidth packet transmission between the user application and the protocol controller. Via the CXS interconnect, the designer can bypass the controller’s transaction layer, which can reduce latency. The CXS specification was designed by Arm to integrate seamlessly with Arm-based System-on-Chip solutions.
Each of these interfaces has its own benefits and use cases.
Here are some implementation examples:
- Option 1 (Figure 4): The designer chooses CPI for the cache and mem channels.
This is the most generic option, providing the lowest latency and the highest flexibility. It allows designers to implement custom memory and cache management that may be independent of the CPU architecture.
- Option 2 (Figure 5): The designer chooses CPI for the cache channel and AMBA AXI for the mem channel.
This option allows for custom cache management, while configuration and memory management are handled by the CPU subsystem via the NoC. It can be an interesting option for prototyping CXL.mem on an SoC or FPGA with a built-in AMBA AXI interconnect.
- Option 3 (Figure 6): The designer chooses CXS.
This option is specific to Arm-based SoCs and allows seamless connection to the Arm CoreLink Coherent Mesh Network interconnect and the Arm CPU subsystem. It supports coherent communication via CXL (to the CPU) and CCIX (to accelerators).
PLDA has designed a highly flexible IP to meet the needs of CXL implementation in a complex SoC or FPGA. Flexibility is a fundamental part of PLDA's DNA, and the company has deep domain expertise in PCIe. XpressLINK-SOC therefore fits naturally in the roadmap to support designers who need to implement CXL in a complex design. This parameterized soft IP product supports all the device types and many interconnect options.
In addition, PLDA has made a unique effort to design a CXL IP that supports FPGAs from Xilinx® and Intel®, enabling designers to easily prototype and bring up their CXL systems. This provides assurance of reliability for the final design.
The supported interconnect options for each protocol are:
- The AMBA AXI Protocol Specification for CXL.io traffic
- Either the Intel CXL-cache/mem Protocol Interface (CPI), the AMBA CXS Interface, or the AMBA AXI Protocol Specification for CXL.mem traffic
- Either the CPI or the AMBA CXS Interface for CXL.cache traffic
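As a rough illustration, the interface choices listed above can be written as a small lookup table and used to check the three implementation options described earlier. This is a sketch under stated assumptions, not a description of the IP's actual configuration mechanism. The names `SUPPORTED`, `OPTIONS`, and `unsupported_choices` are hypothetical, and the sketch assumes that CXL.io uses AMBA AXI in every option and that Option 3 uses CXS for both the cache and mem channels.

```python
# Interfaces available per protocol, taken from the list above.
SUPPORTED = {
    "CXL.io": {"AXI"},
    "CXL.mem": {"CPI", "CXS", "AXI"},
    "CXL.cache": {"CPI", "CXS"},
}

# The three implementation options from the article.
# Assumption: CXL.io is carried over AXI in each case, and Option 3
# uses CXS for both the cache and mem channels.
OPTIONS = {
    "Option 1": {"CXL.io": "AXI", "CXL.cache": "CPI", "CXL.mem": "CPI"},
    "Option 2": {"CXL.io": "AXI", "CXL.cache": "CPI", "CXL.mem": "AXI"},
    "Option 3": {"CXL.io": "AXI", "CXL.cache": "CXS", "CXL.mem": "CXS"},
}


def unsupported_choices(config):
    """Return (protocol, interface) pairs that are not in the SUPPORTED table."""
    return [
        (protocol, interface)
        for protocol, interface in config.items()
        if interface not in SUPPORTED.get(protocol, set())
    ]


if __name__ == "__main__":
    for name, config in OPTIONS.items():
        problems = unsupported_choices(config)
        print(name, "is valid" if not problems else f"has unsupported choices: {problems}")
```

The table also makes the one asymmetry visible: AMBA AXI is listed for CXL.mem but not for CXL.cache, so a configuration that routes cache traffic over AXI would be reported as unsupported.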