AI Firehose

Daily-updated stream of AI research from arXiv

Articles & links

Cognitive science is set for a breakthrough through AI integration, which allows generalizable models of cognition to be built from naturalistic tasks. This approach reshapes how intelligence is understood, yielding insights and hypotheses about human cognition from complex data. https://arxiv.org/abs/…

[2502.20349] Naturalistic Computational Cognitive Science: Towards generalizable models and theories that capture the full range of natural behavior

CUSP is a new framework for evaluating how well AI forecasts scientific progress, and it reveals significant limits on prediction accuracy. Current models are unreliable and overconfident, indicating that stronger reasoning is needed before AI can contribute meaningfully. https://arxiv…

RoMo is a dataset of 820K high-quality 3D human motions with rich annotations for motion generation. Its taxonomy and curation enable fine-grained evaluation, paving the way for models that better capture complex motions. https://arxiv.org/abs/2605.26241

Research shows modern embedding-based retrieval systems achieve optimal performance at low dimensions, challenging the belief that higher dimensions are needed. This underscores the power of large-margin embeddings for greater AI efficiency. https://arxiv.org/abs/2605.23556

Source-Grounded Semantic Reinforcement Learning (SG-SRL) uses abundant source-language data to improve low-resource target-language generation. This method enhances semantic accuracy and factual coverage for languages that lack sufficient parallel data. https://arxiv.org/abs/2…

Researchers proposed a fine-tuning strategy for Mixture-of-Experts models, updating less than 2% of parameters while maintaining competitive performance on low-resource languages. Their insights into routing dynamics may enhance cross-lingual models. https://arxiv.org/abs/2605…

Eureka enhances feature engineering in enterprise AI by using LLMs for code generation, boosting Alibaba Cloud's GPU demand fulfillment by 16% and reducing migration losses by 33%. This indicates a shift toward scalable AI systems in resource management. https://arxiv.org/abs/…

PTAH is a multimodal report generation system that merges LLMs with visual evidence for more reliable outputs. It targets factual accuracy and integrates text and images effectively, strengthening deep research capabilities. https://arxiv.org/abs/2605.29861

Researchers at Bristol recast grounded claim factuality checking as a comprehension task for language models, reducing token use by 80%. Their method sets a new benchmark, showcasing the potential of human-inspired strategies in AI. https://arxiv.org/abs/2605.29712

CoSpec introduces a method for large language model inference in which draft and target models work collaboratively, exceeding traditional performance limits with faster processing and improved accuracy. https://arxiv.org/abs/2605.24793

A study presents Dynamic Variance-adaptive Advantage Optimization (DVAO) for multi-reward learning, allowing models to balance signals adaptively. DVAO significantly outperforms benchmarks, achieving high accuracy and compliance while maintaining stability. https://arxiv.org/a…

Research shows that while AI systems are getting better at asking clarifying questions, the main challenge in multi-turn question answering is giving accurate responses after those clarifications. This gap highlights the need for a better understanding of user intent in AI. https://arxiv.org/abs/2605.…
