QwQ-32B is a 32 billion parameter language model that achieves performance comparable to DeepSeek-R1, which has 671 billion parameters, using reinforcement learning for scaling

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · edit-2 23 hours ago

can grab it here

I find it absolutely wild how quickly we went from needing a full-blown data centre to run models of this scale to being able to run them on a laptop.

QwQ-32B: Embracing the Power of Reinforcement Learning