Artificial Intelligence Papers

Articles & links

Bayesian control for coding agents Theodore Papamarkou, Vladislav Smirnov, Viktor Mazanov, Artem Vazhentsev, Preslav Nakov, Timothy Baldwin, Artem Shelmanov https://t.co/1EUIZ7fmTy [𝚌𝚜.𝙰𝙸 𝚌𝚜.𝙲𝙻] https://t.co/5sFFzguwnn

Reinforcement Learning Towards Broadly and Persistently Beneficial Models Akshay V. Jagadeesh, Rahul K. Arora, Khaled Saab, Ali Malik, Mikhail Trofimov, Foivos Tsimpourlas, Johannes Heidecke, Karan Singhal https://t.co/Bd6xrwj5BN [𝚌𝚜.𝙰𝙸 𝚌𝚜.𝙲𝙻] https://t.co/4FcHmEfg80

SPIRAL: Learning to Search and Aggregate Jubayer Ibn Hamid, Ifdita Hasan Orney, Michael Y. Li, Omar Shaikh, Yoonho Lee, Dorsa Sadigh, Chelsea Finn, Noah Goodman https://t.co/CRBpj1Mjhk [𝚌𝚜.𝙰𝙸] https://t.co/kVEHyMHKpK

DART: Draft-Agreement Routing for Training-Free Adaptive Thinking Budgets in Hybrid Reasoning Models Jungseob Lee, Seongtae Hong, Seungjun Lee, Jaehyung Seo, Junyoung Son, Sugyeong Eo, Chanjun Park, … https://t.co/4OfjMyUCwq [𝚌𝚜.𝙰𝙸 𝚌𝚜.𝙲𝙻] 💬Code: https://t.co/X5kZyWhmRU https:/…

Agent-as-a-Router: Agentic Model Routing for Coding Tasks Pengfei Zhou, Zhiwei Tang, Yixing Ma, Jiasheng Tang, Yizeng Han, Zhenglin Wan, Fanqing Meng, Wei Wang, Bohan Zhuang, Wangbo Zhao, Yang You https://t.co/LA6OAwFlc2 [𝚌𝚜.𝙰𝙸] https://t.co/vf4PUnEp8r

Beyond Penalizing Mistakes: Stabilizing Efficiency Training in Large Reasoning Models via Adaptive Correct-Only Rewards Jungseob Lee, Seungyoon Lee, Seongtae Hong, Minhyuk Kim, Chanjun Park, … https://t.co/p3b0ygS8Qt [𝚌𝚜.𝙰𝙸 𝚌𝚜.𝙲𝙻] 💬Code: https://t.co/qK5JTlCFNR https://t.co/v4…

VISTA Architect: A graph database-oriented health AI system demonstrated in multidisciplinary tumor boards Tuomo Kiiskinen, Jason Fries, Philip Adamson, David Wu, … https://t.co/FztYEvHRVn [𝚌𝚜.𝙰𝙸 𝚌𝚜.𝙲𝙻 𝚌𝚜.𝙳𝙱 𝚌𝚜.𝙸𝚁] 💬Code: https://t.co/delaPGK4mg https://t.co/fBV6d1tjfQ
