The first traces of Moore Threads' GPU programming software stack, dubbed MUSA, have surfaced online, furthering China's pursuit of tech autarky. MUSA serves as an alternative to Nvidia's CUDA environment and is compatible with Moore Threads' domestic MTT GPU lineup. There is no mention of the SDK being open source, so it is likely proprietary and won't be of much benefit to developers outside China.
The U.S. has implemented a series of export restrictions on China, covering advanced AI chips, high-bandwidth memory (HBM), manufacturing equipment, and silicon wafers from leading players like Intel, TSMC, and Samsung. In a bid to reduce reliance on Western hardware, China is hard at work developing its semiconductor ecosystem with in-house silicon, fab equipment, memory, CPUs, and even GPUs. GPUs are of particular importance, as modern-day machine learning (sometimes under the buzzword banner of AI) is largely accelerated by parallel computing, something at which GPUs excel.
A strong GPU programming ecosystem offers high-level abstraction, ready-to-use libraries, documentation, and profiling tools. With high-performance Nvidia GPU exports still in limbo, Moore Threads is offering an alternative to CUDA.
MUSA provides a compiler (MCC), a runtime library (MUSA Runtime), a comprehensive set of specialized libraries (MUSA-X), debuggers, and profilers. To ensure compatibility with existing CUDA code, the MUSA SDK also includes Musify, a tool that translates CUDA code for the MUSA environment, likely by translating PTX code at runtime, similar to ZLUDA.
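To give a rough sense of what CUDA translation can involve, here is a minimal Python sketch of the simpler source-level approach, in which CUDA names are rewritten in the text of a program (the style AMD's hipify-perl uses for HIP). This is not Musify's confirmed mechanism, and it is different from the runtime PTX translation speculated above. The `translate` helper is hypothetical, and the `musa*` names in the mapping are assumptions modeled on their CUDA counterparts.

```python
import re

# Hypothetical CUDA -> MUSA name mapping, for illustration only.
# The musa* names are ASSUMED to mirror their CUDA counterparts;
# they are not taken from Musify or from the article.
CUDA_TO_MUSA = {
    "cuda_runtime.h": "musa_runtime.h",
    "cudaMalloc": "musaMalloc",
    "cudaMemcpy": "musaMemcpy",
    "cudaMemcpyHostToDevice": "musaMemcpyHostToDevice",
    "cudaFree": "musaFree",
}

# Match whole identifiers only, trying longer names first so that
# cudaMemcpyHostToDevice is never cut short at cudaMemcpy.
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(
        re.escape(name)
        for name in sorted(CUDA_TO_MUSA, key=len, reverse=True)
    )
    + r")\b"
)


def translate(source: str) -> str:
    """Rewrite known CUDA names in `source`; leave everything else untouched."""
    return _PATTERN.sub(lambda match: CUDA_TO_MUSA[match.group(0)], source)


if __name__ == "__main__":
    cuda_snippet = (
        "#include <cuda_runtime.h>\n"
        "cudaMalloc(&d_buf, n);\n"
        "cudaMemcpy(d_buf, h_buf, n, cudaMemcpyHostToDevice);\n"
        "cudaFree(d_buf);\n"
    )
    print(translate(cuda_snippet))
```

A real porting tool has far more to handle than renaming, such as kernel launch syntax, library calls, and APIs with no direct equivalent, which is part of why runtime translation layers like ZLUDA take a different route.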
The MUSA SDK version 4.0.1 is compatible with x86 processors from Intel (on Ubuntu) and Hygon (on Kylin). Moore Threads is showcasing its stack through several demos on its website, including speech synthesis, AI image generation, image processing, and AI-powered 3D face modeling, to name a few. You can try out a number of these demos right now (though you might need an account), some of which are reportedly running on Moore Threads' MTT S3000 datacenter GPUs.
Despite CUDA's clear advantage in maturity, features, and support, MUSA could find many domestic customers in small-scale environments and evolve over time. AI developers and researchers envision a heterogeneous future, championing the adoption of hardware-agnostic and open-source platforms. Breaking free from CUDA's reign requires superior alternatives, with ROCm being a key contender. However, AMD's hardware support still trails Nvidia's.