DeepReinforce Open-Sources Ornith-1.0 Coding Models with Self-Learning RL Scaffolds
Decision Brief
What changed: DeepReinforce released Ornith-1.0, an open-source coding model family based on Gemma 4 and Qwen 3.5 that learns its own scaffolds during RL training. The 397B flagship scores 82.4% on SWE-Bench Verified.
Why it matters: If models can learn their scaffolds during RL training, RL frameworks that depend on fixed harnesses may be disrupted, so AI builders should take note of this approach.
Who should care: Open-source model users
Affected stack: Qwen
Builder action: Evaluate
Source confidence: Medium · Reliable media or first-hand reporting
DeepReinforce launched Ornith-1.0, an open-source coding model family built on Gemma 4 and Qwen 3.5. Its key innovation is that the models learn their reasoning scaffolds during RL training instead of relying on fixed harnesses. The 397B-parameter flagship model achieves 82.4% on SWE-Bench Verified, and all weights are released under the MIT license.
Summary basis: official / RSS source
Unless it says 'full article read', this summary is based only on publicly available content; it never pretends to have read restricted originals.
Sources
- MarkTechPost
Fast research-paper and ML tooling summaries, useful for infra and agent updates.