GPU graphics cards made in China

New Chinese GPU arrives to challenge Nvidia's AI dominance but falls woefully short - Loongson unveils LG200 GPGPU, up to 1 Tflops of performance per node​

At a high level, the Loongson LG200 is a highly parallel processor akin to AI and HPC GPUs by AMD and Nvidia. Loongson's LG200 supports the OpenCL 3.0 application programming interface (API) for compute, which is good enough for high-performance workloads including AI and HPC.
The block diagram of Loongson's LG200 depicts a processor organized in four clusters, each featuring 16 small ALUs, four bigger ALUs, and one huge ALU or a special-purpose unit. Unfortunately, we cannot draw any conclusions from analyzing the diagram, as it is light on actual technical detail.

Loongson has yet to disclose the specifications of its LG200 processor. We know it supports the INT8 data format for AI workloads and probably FP32 and FP64 for graphics and HPC workloads, respectively. Also, Loongson claims that the LG200's compute performance ranges from 256 GFLOPS to 1 TFLOPS per node, though it didn't disclose the precision it used for the metric.


Even if the company used FP64 for its performance claims, the processor is dramatically slower than modern GPUs.
https://www.tomshardware.com/pc-com...pc-gpu-up-to-1-tflops-of-performance-per-node


Blacklisted Chinese GPU developer secures $280 million in funding - Biren gets cash infusion from Guangzhou government-backed investors​

Biren Technology, a Chinese AI GPU designer, has recently obtained an investment of ¥2 billion (approximately $280 million USD) from investors backed by the Guangzhou government, reports Bloomberg. This investment follows the company's addition to the U.S. government's Entity List over a year ago and layoffs to cut costs. With $280 million, the company has enough funds for ongoing operations.
The addition of Biren to the U.S. Department of Commerce's Entity List posed significant challenges to the company, limiting Biren's access to TSMC's leading-edge process technologies. Biren's key challenge is ensuring a steady supply of its AI GPUs, possibly from China-based fab SMIC. To do so, Biren has to redesign its BR104 ASIC for SMIC's 2nd generation 7nm-class process technology or develop a new chip from scratch. Meanwhile, whether a Biren ASIC made by SMIC will be as competitive as its BR104 made by TSMC remains to be seen.
https://www.tomshardware.com/pc-com...on-from-guangzhou-government-backed-investors
 

Nvidia's biggest Chinese competitor unveils cutting-edge new AI GPUs — Moore Threads S4000 AI GPU and Intelligent Computing Center server clusters using 1,000 of the new AI GPUs​

[Image: screenshot from the Tom's Hardware article on the Moore Threads S4000]

Although Moore Threads didn't reveal everything there is to know about its S4000 GPU, it's certainly a major improvement over the S2000 and S3000. Compared to the S2000, the S4000 has over twice the FP32 performance, five times the INT8 performance, 50% more VRAM, and presumably much more memory bandwidth too. The new flagship also uses the second-generation MUSA (Moore Threads Unified System Architecture), while the S2000/S3000 used the first generation.
The S4000 also has critical GPU-to-GPU data capabilities, with a 240 GB/s data link from one card to another and RDMA support. This is a far cry from NVLink's 900 GB/s bandwidth on Hopper, but the S4000 is presumably a much weaker GPU, making such a high amount of bandwidth overkill.
Alongside the S4000, Moore Threads also revealed its KUAE Intelligent Computing Center. The company describes it as a "full-stack solution integrating software and hardware," with the full-featured S4000 GPU as the centerpiece. KUAE clusters use MCCX D800 GPU servers, which each have eight S4000 cards. Moore Threads says each KUAE Kilocard Cluster has 1,000 GPUs, which means a total of 125 MCCX D800 servers per cluster.
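The cluster and interconnect figures quoted above reduce to simple arithmetic. The following is a minimal Python sketch that uses only the numbers from the article; the function and constant names are illustrative and are not Moore Threads or Nvidia terminology.

```python
def servers_per_cluster(total_gpus: int, gpus_per_server: int) -> int:
    """Number of servers needed to host total_gpus, rounding up."""
    return -(-total_gpus // gpus_per_server)  # ceiling division


def bandwidth_ratio(reference_gb_s: float, other_gb_s: float) -> float:
    """How many times larger the reference link bandwidth is."""
    return reference_gb_s / other_gb_s


# Figures as quoted in the article (names are hypothetical).
KUAE_GPUS = 1000           # GPUs per KUAE Kilocard Cluster
GPUS_PER_MCCX_D800 = 8     # S4000 cards per MCCX D800 server
S4000_LINK_GB_S = 240      # S4000 card-to-card link
HOPPER_NVLINK_GB_S = 900   # NVLink on Hopper

print(servers_per_cluster(KUAE_GPUS, GPUS_PER_MCCX_D800))    # 125
print(bandwidth_ratio(HOPPER_NVLINK_GB_S, S4000_LINK_GB_S))  # 3.75
```

So the quoted NVLink figure is 3.75 times the S4000's card-to-card bandwidth, and 1,000 GPUs at eight per server gives exactly 125 servers.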

On the software side, Moore Threads claims KUAE supports mainstream large language models like GPT and frameworks like DeepSpeed. The company's MUSIFY tool apparently allows the S4000 to work with the CUDA software ecosystem based on Nvidia GPUs, which saves Moore Threads and China's software industry from having to reinvent the wheel.
https://www.tomshardware.com/pc-com...server-clusters-using-1000-of-the-new-ai-gpus
 
The article is in German, translated by Google


Moore Threads MTT-S80 & S30 tested: How fast the Chinese graphics cards are in the iGPU test course

Since the end of 2022, Moore Threads' MTT series has offered gaming graphics cards developed and manufactured in China. More recently, new drivers were said to have significantly increased their performance. ComputerBase got hold of the top model, the MTT-S80, and the small MTT-S30 and measured both.

[Image: screenshot from the ComputerBase test of the Moore Threads MTT-S80 and MTT-S30]


ComputerBase can confirm this with an MTT-S80 imported from China in early 2023. From slightly over 3,365 points, the score rises with the current driver (initially 240.50, then 240.60) to over 8,147 points, which corresponds to an increase of roughly 140 percent (factor 2.4).
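The driver uplift can be checked from the two scores quoted above. This is a small Python sketch; the function name is illustrative, and because the article gives both scores only as "over" a value, the result is approximate.

```python
def uplift(old_score: float, new_score: float) -> tuple[float, float]:
    """Return (speed-up factor, percentage increase) between two scores."""
    factor = new_score / old_score
    percent_increase = (factor - 1.0) * 100.0
    return factor, percent_increase


# Scores as quoted in the article.
factor, percent = uplift(3365, 8147)
print(f"factor {factor:.2f}, increase {percent:.0f} %")  # factor 2.42, increase 142 %
```

The computed values agree with the article's rounded "140 percent (factor 2.4)".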

And what does this mean for performance in games?

To find out, ComputerBase used the test course most recently applied to the iGPUs of the Ryzen 8000G processors. It should also give a good point of reference, because a Radeon RX 6400 (768 shaders) and a GeForce GTX 1650 (896 shaders) were included as well.

In addition to the MTT-S80, the significantly smaller MTT-S30 was also tested. Both ran the tests with the 240.60 driver on a Ryzen 7 5800X with 16 GB of DDR4-3200 CL14 on an MSI B550 Tomahawk (AM4). The motherboard remains a decisive factor: the Moore Threads graphics cards only work with selected models.

What runs?


But before any testing, one question comes first: which titles from the iGPU test course run on the Moore Threads graphics cards with current drivers? Answer: four of the eleven games, namely those that use DirectX 11 (in some cases as an alternative API); the rest require DirectX 12 and are out. And how fast are the two Moore Threads graphics cards in this reduced four-game course?

[Image: screenshot from the ComputerBase test of the Moore Threads MTT-S80 and MTT-S30]


On average, the MTT-S80 barely beats the 8-CU Vega iGPU of the AMD Ryzen 7 5700G from 2021; the current 8-CU RDNA 3 iGPU in the Ryzen 5 8600G is 60 percent ahead. Given the 4,096 shaders of the MTT-S80 GPU, this is still alarmingly little performance in practice. The MTT-S30, which is aimed at business PCs rather than gaming PCs, reaches a quarter of that performance level. The frametime results are comparable.

[Image: screenshot from the ComputerBase test of the Moore Threads MTT-S80 and MTT-S30]

https://www.computerbase.de/2024-03/moore-threads-mtt-s80-mtt-s30-china-gpus-test/
 