Sunway SW26010-Pro is China’s most powerful supercomputer chip to date

Why it matters: The mysterious CPU architecture employed by Beijing for its supercomputers has now been detailed. A new version of the Sunway SW26010 processor greatly improves the chip’s ability to crunch numbers, but it likely won’t mean the end of Western prowess in the HPC business, at least not yet.

Sunway SW26010 Pro is China’s latest homemade processor for High Performance Computing (HPC) machines, and the successor to the SW26010 chip used in the Sunway TaihuLight supercomputer. The Sunway (or Shenwei) CPU series is based on a seemingly custom RISC ISA, and it employs a manycore architecture to provide the high degree of parallel processing needed in HPC workloads.

Chinese researchers provided some juicy, previously unknown details about the Sunway processors at the SC23 International Conference for High Performance Computing, highlighting how rapidly the technology has evolved over the past few years. Sunway SW26010 Pro is seemingly four times more powerful than the SW26010 chip. It runs at higher clocks and has more cores with wider vector units.

Each Sunway SW26010 Pro chip seemingly has a maximum double-precision floating-point (FP64) throughput of 13.8 TFLOPS, a pretty remarkable figure given that an AMD EPYC 9654 CPU has a peak FP64 performance of around 5.4 TFLOPS. The chip employs the same base 64-bit RISC architecture as the previous generation, with some key enhancements.

Each SW26010-Pro CPU includes a whopping total of 384 computing cores, which are packed in six core groups (CG). A separate management processing element (MPE), a superscalar, out-of-order core with a vector engine, manages the computing traffic, which ultimately goes through a meager 128-bit DDR4-3200 memory interface.
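To put that memory interface in perspective, here is a rough back-of-the-envelope sketch in Python. It takes the article's figures at face value (a single 128-bit DDR4-3200 interface, 13.8 TFLOPS peak FP64, 384 cores in six groups); the function names are made up for illustration, and the result is a theoretical peak rather than a measured number.

```python
# Back-of-the-envelope estimate using only figures quoted in the article.
# Function names are hypothetical; results are theoretical peaks.

def peak_bandwidth_gb_s(bus_width_bits: int, mega_transfers_per_s: int) -> float:
    """Theoretical peak bandwidth of a memory interface in GB/s."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_s * 1e6 / 1e9

def bytes_per_flop(bandwidth_gb_s: float, peak_tflops: float) -> float:
    """Memory bytes available per floating-point operation at peak rates."""
    return (bandwidth_gb_s * 1e9) / (peak_tflops * 1e12)

# 128-bit interface, DDR4-3200 (3200 million transfers per second)
bandwidth = peak_bandwidth_gb_s(128, 3200)

# 13.8 TFLOPS peak FP64, as reported for the SW26010 Pro
balance = bytes_per_flop(bandwidth, 13.8)

# 384 computing cores split evenly across six core groups
cores_per_group = 384 // 6

print(f"Peak bandwidth: {bandwidth:.1f} GB/s")
print(f"Bytes per FLOP: {balance:.4f}")
print(f"Cores per group: {cores_per_group}")
```

Under those assumptions the interface tops out at about 51.2 GB/s, or well under a hundredth of a byte per floating-point operation. If each core group turned out to have its own 128-bit interface, the bandwidth figure would scale by six, but the balance would still be heavily tilted toward compute.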

The chip tries everything it can to reduce data movement between cores. It runs the computing cores at 2.25 GHz and the MPE at 2.10 GHz, while the previous Sunway SW26010 CPU ran both the cores and the MPE at 1.45 GHz. The previously employed DDR3 memory controller has also been replaced with a DDR4 one, which increases the total amount of RAM supported by one CPU from 32 GB to 96 GB.
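As a quick sanity check on the "wider vectors" claim, the quoted figures can be combined to estimate how many FP64 operations each computing core would need to complete per clock cycle to reach the stated peak. This is a sketch based purely on the numbers in the article; the helper name is hypothetical, and the vector-width reading in the comments is an assumption, not something the source confirms.

```python
# Rough consistency check using only numbers quoted in the article.
# The helper name is hypothetical.

def flops_per_cycle_per_core(peak_tflops: float, cores: int, clock_ghz: float) -> float:
    """FP64 operations each core must complete per cycle to hit the peak."""
    return (peak_tflops * 1e12) / (cores * clock_ghz * 1e9)

# SW26010 Pro: 13.8 TFLOPS, 384 computing cores at 2.25 GHz
per_core = flops_per_cycle_per_core(13.8, 384, 2.25)

# Generational ratios: clock speed and maximum RAM per CPU
clock_ratio = 2.25 / 1.45
ram_ratio = 96 / 32

print(f"FP64 ops per cycle per core: {per_core:.2f}")
print(f"Clock speed-up over SW26010: {clock_ratio:.2f}x")
print(f"RAM capacity increase: {ram_ratio:.0f}x")

# ASSUMPTION: roughly 16 ops per cycle would be consistent with, for example,
# an 8-wide FP64 vector unit issuing one fused multiply-add per cycle.
# The article does not state the actual vector width.
```

The result lands close to 16 operations per cycle per core. The clock bump alone accounts for only about a 1.55x gain, so most of the reported fourfold improvement would have to come from the extra cores and wider vectors.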

While SW26010-Pro seemingly represents a significant technological step forward for China’s HPC efforts, its unusual cache design and narrow memory interface will likely keep it from delivering outstanding results in the country’s advanced research. The new SW26010 Pro-based supercomputer seems to be designed with the ultimate goal of winning big in the TOP500 list, not solving modern computing problems faster.
