arxiv cs.CL

Computer Science -- Computation and Language
source: export.arxiv.org/rss/cs.CL
maintainer: @tmaehara.bsky.social

Articles & links

Zeli Su, Ziyin Zhang, Zewei Pan, Zhou Liu, Dingcheng Huang, Dehan Li, Zhankai Xu, Longfei Zheng, Xiaolu Zhang, Jun Zhou, Wentao Zhang
Source-Grounded Semantic Reinforcement Learning for Low-Resource Target-Language Generation
https://arxiv.org/abs/2605.29502

Yuxuan Ye, Raul Santos-Rodriguez, Edwin Simpson
Teaching Language Models to Check Grounded Claim Factuality with Human Test-Taking Strategies
https://arxiv.org/abs/2605.29712

Aditi Khandelwal, Marius Mosbach, Verna Dankers, Siva Reddy, Golnoosh Farnadi
Leveraging Routing Dynamics in Mixture-of-Experts Models for Efficient Language Adaptation
https://arxiv.org/abs/2605.29714

Chenghao Zhang, Guanting Dong, Yufan Liu, Tong Zhao, Zhicheng Dou
Towards Verifiable Multimodal Deep Research: A Multi-Agent Harness for Interleaved Report Generation
https://arxiv.org/abs/2605.29861

Jinze Li, Yixing Xu, Guanchen Li, Jinfeng Xu, Shuo Yang, Yang Zhang, Xuanwu Yin, Dong Li, Edith C. H. Ngai, Emad Barsoum
Beyond the Target: From Imitation to Collaboration in Speculative Decoding
https://arxiv.org/abs/2605.24793

Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, Leszek Rutkowski, Dacheng Tao
Better, Faster: Harnessing Self-Improvement in Large Reasoning Models
https://arxiv.org/abs/2605.24998

Xiang Cheng, Yulan Hu, Lulu Zheng, Zheng Pan, Xin Li, Yong Liu
GroupTravelBench: Benchmarking LLM Agents on Multi-Person Travel Planning
https://arxiv.org/abs/2605.25200

Jinyan Su, Jennifer Healey
Clarification Is Not Enough: Post-Clarification Answering Remains the Bottleneck in Multi-Turn QA
https://arxiv.org/abs/2605.25204

Russell Yang, Ruishi Chen, Pierce Kelaita, Riya Ranjan, Sibo Ma, Charles Dickens, Matthew Guillod, Megan Ma, Julian Nyarko
JudgmentBench: Comparing Rubric and Preference Evaluation for Quality Assessment
https://arxiv.org/abs/2605.25240

Yu Wang, Minghao Liu, Jiayun Wang, Jinrui Huang, Ankit Shah, Wei Wei
Inference Time Optimization with Confidence Dynamics
https://arxiv.org/abs/2605.25244
