AMD’s Radeon RX 7900 XTX has outperformed NVIDIA’s GeForce RTX 4090 in inference benchmarks of DeepSeek’s R1 AI model.
### AMD Swiftly Enhances Support for DeepSeek’s R1 LLMs, Delivering Unmatched Performance
DeepSeek’s latest AI model has taken the tech world by storm, and while much of the attention has focused on the computing power needed to train it, everyday consumers can achieve impressive inference results with AMD’s “RDNA 3” Radeon RX 7900 XTX GPU. According to DeepSeek R1 inference benchmarks published by Team Red, the flagship RX 7000 series GPU holds its own against, and in several of the distilled models even surpasses, NVIDIA’s competing flagship.
> DeepSeek performing very well on @AMDRadeon 7900 XTX. Learn how to run on Radeon GPUs and Ryzen AI APUs here: https://wccftech.com/amd-radeon-rx-7900-xtx-beats-nvidia-geforce-rtx-4090-in-deepseeks-ai-inference-benchmark/
>
> — David McAfee (@McAfeeDavid_AMD) January 29, 2025
The use of consumer GPUs for AI workloads has become increasingly feasible and appealing, mainly because they offer a strong performance-to-cost ratio compared with dedicated AI accelerators. Running models locally also adds a layer of privacy, a significant concern when dealing with models like those from DeepSeek. Fortunately, AMD has released a comprehensive guide on setting up DeepSeek R1 distillations on its hardware. Here’s a quick rundown:
1. Ensure your system is running the 25.1.1 Optional or newer Adrenalin driver.
2. Download LM Studio 0.3.8 or later from lmstudio.ai/ryzenai.
3. Install LM Studio and bypass the initial onboarding screen.
4. Head over to the Discover tab.
5. Select your DeepSeek R1 Distill. Smaller distills like Qwen 1.5B are incredibly fast and great for beginners, whereas larger ones enhance reasoning capabilities. All available options are highly competent.
6. On the right panel, ensure the “Q4_K_M” quantization is selected, then hit “Download.”
7. Once done, return to the chat tab, pick the DeepSeek R1 distill from the dropdown, and ensure “manually select parameters” is checked.
8. In the GPU offload layers section, slide all the way to the maximum so the entire model runs on the GPU.
9. Click “Model Load.”
10. Engage with a reasoning model that runs entirely on your local AMD setup! If you would rather drive it from code, see the sketch below.
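Beyond the built-in chat tab, LM Studio can also expose the loaded model through an OpenAI-compatible local server, started from the app’s Developer tab. The snippet below is a minimal sketch of chatting with a distill that way; it assumes the server is running on LM Studio’s default port 1234, and the model identifier is a hypothetical placeholder, so substitute whatever name LM Studio lists for the distill you downloaded.

```python
# Minimal sketch: chat with a locally loaded DeepSeek R1 distill through
# LM Studio's OpenAI-compatible server (assumed to be enabled and listening
# on its default address, http://localhost:1234).
from openai import OpenAI

# The local server ignores API keys, but the client requires a non-empty one.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    # Hypothetical identifier; replace with the name LM Studio reports
    # for your downloaded distill.
    model="deepseek-r1-distill-qwen-1.5b",
    messages=[{"role": "user", "content": "How many primes are below 30?"}],
    temperature=0.6,
)

# R1 distills emit their chain of thought inside <think>...</think> tags
# before the final answer, so expect both in the returned text.
print(response.choices[0].message.content)
```

Because every request stays on localhost, nothing leaves your machine, which is exactly the privacy benefit of running these models locally.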
If you hit any snags, AMD has also provided a YouTube tutorial breaking down each step in detail. By following these steps, you can run DeepSeek’s language models on your local machine and keep your data private. As the next wave of GPUs from both NVIDIA and AMD arrives, we anticipate a significant boost in inference power thanks to the integrated AI engines designed to handle these demanding workloads.