Sangyun Lee, Sean McLeish, Tom Goldstein, Giulia Fanti. Language Models Need Sleep. https://arxiv.org/abs/2605.26099
arXiv cs.CL
Articles & links
Qihan Wang, Nicholas Tomlin, Michael Hu, Brian Dillon, Tal Linzen. Simulating Human Memory with Language Models. https://arxiv.org/abs/2605.25680
Zeli Su, Ziyin Zhang, Zewei Pan, Zhou Liu, Dingcheng Huang, Dehan Li, Zhankai Xu, Longfei Zheng, Xiaolu Zhang, Jun Zhou, Wentao Zhang. Source-Grounded Semantic Reinforcement Learning for Low-Resource Target-Language Generation. https://arxiv.org/abs/2605.29502
Yuxuan Ye, Raul Santos-Rodriguez, Edwin Simpson. Teaching Language Models to Check Grounded Claim Factuality with Human Test-Taking Strategies. https://arxiv.org/abs/2605.29712
Aditi Khandelwal, Marius Mosbach, Verna Dankers, Siva Reddy, Golnoosh Farnadi. Leveraging Routing Dynamics in Mixture-of-Experts Models for Efficient Language Adaptation. https://arxiv.org/abs/2605.29714
Chenghao Zhang, Guanting Dong, Yufan Liu, Tong Zhao, Zhicheng Dou. Towards Verifiable Multimodal Deep Research: A Multi-Agent Harness for Interleaved Report Generation. https://arxiv.org/abs/2605.29861
Jinze Li, Yixing Xu, Guanchen Li, Jinfeng Xu, Shuo Yang, Yang Zhang, Xuanwu Yin, Dong Li, Edith C. H. Ngai, Emad Barsoum. Beyond the Target: From Imitation to Collaboration in Speculative Decoding. https://arxiv.org/abs/2605.24793
Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, Leszek Rutkowski, Dacheng Tao. Better, Faster: Harnessing Self-Improvement in Large Reasoning Models. https://arxiv.org/abs/2605.24998
Xiang Cheng, Yulan Hu, Lulu Zheng, Zheng Pan, Xin Li, Yong Liu. GroupTravelBench: Benchmarking LLM Agents on Multi-Person Travel Planning. https://arxiv.org/abs/2605.25200
Jinyan Su, Jennifer Healey. Clarification Is Not Enough: Post-Clarification Answering Remains the Bottleneck in Multi-Turn QA. https://arxiv.org/abs/2605.25204
Russell Yang, Ruishi Chen, Pierce Kelaita, Riya Ranjan, Sibo Ma, Charles Dickens, Matthew Guillod, Megan Ma, Julian Nyarko. JudgmentBench: Comparing Rubric and Preference Evaluation for Quality Assessment. https://arxiv.org/abs/2605.25240
Yu Wang, Minghao Liu, Jiayun Wang, Jinrui Huang, Ankit Shah, Wei Wei. Inference Time Optimization with Confidence Dynamics. https://arxiv.org/abs/2605.25244