Who's Who of AI

arxiv cs.CL

Trust 81 · Feed · @arxiv-cs-cl.bsky.social · 1,586 followers

Compute & infrastructure

Why they matter

Feed with public evidence across Compute & infrastructure.

AI signals: 559
Sources: 2
Discussions: 0
Latest signal: 1d ago

Computer Science -- Computation and Language
source: export.arxiv.org/rss/cs.CL
maintainer: @tmaehara.bsky.social

What they're sharing

Articles & links

Teagan Johnson, Elliott Ash, Andrew Piper, Maria Antoniak
Characterizing Narrative Content in Web-scale LLM Pretraining Data
https://arxiv.org/abs/2606.19468

AI Weekly's analysis:

A new arXiv preprint introduces NarraBERT, a RoBERTa-based classifier, and applies it to 3 million passages from the 3-trillion-token Dolma corpus.
The framework operationalizes three narrative elements, agency, setting, and events, across 11 interpretable dimensions, trained on 400 annotated passages.
The authors report narrative qualities are unequally distributed across pretraining sources and topics in ways current curation practices do not measure.

3 from the directory shared this · 39d ago

Mandana Samiei, Eunice Yiu, Anthony GX-Chen, Dongyan Lin, Jocelyn Shen, Blake A. Richards, Alison Gopnik, Doina Precup
Human Adults and LLMs as Scientists: Who Benefits from Active Exploration?
https://arxiv.org/abs/2606.06464

3 from the directory shared this · 53d ago

Maria Thomas, Kristina Gligoric, Nihar B. Shah
Mitigating LLM-based p-Hacking by Preregistering for the Next LLM
https://arxiv.org/abs/2606.27687

AI Weekly's analysis:

A new arXiv paper proposes preregistering LLM experiments and running the confirmatory analysis on the first eligible model released after registration.
Across 20 models from four providers and 11 configurations, the protocol blocked p-hack transfer in 73.9% and 72.7% of cases across two tasks.
The authors preregistered their own experiment; of 7 configurations that hacked the prior model, 6 failed to carry over to the next.

3 from the directory shared this · 29d ago

Nina Begus
World Wide Models: Literary Tools for Cultural AI
https://arxiv.org/abs/2607.02369

AI Weekly's analysis:

Nina Begus argues LLMs stage a cultural encounter that is 'massive, automated, and monolingual,' and proposes literary scholarship as the corrective toolkit.
The essay proposes applying world literature methods — macrostructure, circulation, and untranslatability — to build culturally literate AI.
The 15-page essay is forthcoming in MFS Modern Fiction Studies in 2027 and connects critical theory to structural monolingualism in AI.

3 from the directory shared this · 25d ago

Fitsum Reda, John Kamalu, Roger Waleffe, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro
Nemotron-TwoTower: Diffusion Language Modeling with Pretrained Autoregressive Context
https://arxiv.org/abs/2606.26493

AI Weekly's analysis:

NVIDIA's Nemotron-TwoTower splits an LM into a frozen autoregressive context tower and a trainable diffusion denoiser with bidirectional block attention.
The system is built on Nemotron-3-Nano-30B-A3B, a 30B hybrid Mamba-Transformer MoE backbone, and trained on roughly 2.1 trillion tokens.
The authors report retaining 98.7% of the autoregressive baseline's quality while delivering 2.42x higher wall-clock generation throughput.

3 from the directory shared this · 32d ago

Sangyun Lee, Sean McLeish, Tom Goldstein, Giulia Fanti
Language Models Need Sleep
https://arxiv.org/abs/2605.26099

[2605.26099] Do Language Models Need Sleep? Offline Recurrence for Improved Online Inference (arxiv.org)

3 from the directory shared this · 63d ago

Liancheng Gong, Zhiyang Wang, Yiwei Xu, Julia Mendelsohn
Persuasion Index: A Theory-Guided Framework for Persuasion Analysis
https://arxiv.org/abs/2606.14580

AI Weekly's analysis:

Persuasion Index is a taxonomy of 15 dimensions grounded in psychology and communication, implemented with 55 sub-features from lexicons and rule-based detectors.
The authors evaluate PI on four public datasets varying in domain, style, and outcome measures, and report linear models carry meaningful predictive signal while staying lightweight.
PI is released as an open-source package and web interface for principled and auditable analysis of human and AI-mediated communication.

2 from the directory shared this · 43d ago

Qihan Wang, Nicholas Tomlin, Michael Hu, Brian Dillon, Tal Linzen
Simulating Human Memory with Language Models
https://arxiv.org/abs/2605.25680

2 from the directory shared this · 63d ago

David Jurgens
The Future of NLP may not be at NLP Conferences: Scholarly Migration Patterns in Natural Language Processing
https://arxiv.org/abs/2607.02416

AI Weekly's analysis:

In the post-LLM era, established NLP authors lost 19.2pp of share at flagship *ACL main-conference tracks while gaining 14.8pp in Findings tracks.
General ML venues rose 8.6pp among established authors, even after adjusting for parallel growth across both fields.
Among debut authors with three or more first-author NLP papers, those mostly at *ACL fell from 84% in 2019 to 74% in 2024.

2 from the directory shared this · 25d ago

Akshat Gupta, Jermaine Lei, Alexander Lu, Gopala Anumanchipalli, Leshem Choshen
Automated Discovery Has No Universally Superior Harness
https://arxiv.org/abs/2607.18235

AI Weekly's analysis:

Researchers tested 30 budget-matched harnesses across 12 model-problem pairs, running over 3.1 million LLM rollouts to compare autonomous discovery systems.
The paper's headline finding is that no fixed harness is reliably superior across the evaluated model-problem pairs, with OpenEvolve variants often underperforming simpler alternatives.
An adaptive-allocation approach that prunes weak harnesses mid-run and reallocates budget to stronger candidates beat both fixed-harness commitment and non-adaptive ensembles.
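
The adaptive-allocation idea in the last point can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a simple successive-halving style rule, and the function name `adaptive_allocate`, the `keep_fraction` parameter, and the toy harnesses are all hypothetical. Each harness is modeled as a callable that takes a rollout count and returns one score per rollout.

```python
from typing import Callable, Dict, List

# Hypothetical interface: a harness takes a rollout count and
# returns one score per rollout (higher is better).
Harness = Callable[[int], List[float]]


def adaptive_allocate(
    harnesses: Dict[str, Harness],
    total_budget: int,
    rounds: int = 3,
    keep_fraction: float = 0.5,
) -> Dict[str, List[float]]:
    """Spend the budget in rounds, pruning the weakest harnesses after each round."""
    active = list(harnesses)
    scores: Dict[str, List[float]] = {name: [] for name in harnesses}
    per_round = total_budget // rounds
    for _ in range(rounds):
        # Split this round's budget evenly over the surviving harnesses.
        share = per_round // len(active)
        for name in active:
            scores[name].extend(harnesses[name](share))
        # Rank survivors by their best score so far, then keep the top fraction.
        active.sort(key=lambda n: max(scores[n], default=float("-inf")), reverse=True)
        keep = max(1, int(len(active) * keep_fraction))
        active = active[:keep]
    return scores


# Toy, deterministic harnesses (assumed for illustration only).
toy = {
    "weak": lambda n: [0.1] * n,
    "poor": lambda n: [0.2] * n,
    "mid": lambda n: [0.5] * n,
    "strong": lambda n: [0.9] * n,
}
result = adaptive_allocate(toy, total_budget=120)
spent = {name: len(vals) for name, vals in result.items()}
print(spent)
```

With these toy inputs the two weakest harnesses are pruned after the first round, and the strongest one ends up with 70 of the 120 rollouts, which is the reallocation effect the summary describes.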

2 from the directory shared this · 7d ago

Joseph Marvin Imperial, Junhong Liang, Belal Shoer, Abdullah Barayan, Rodrigo Wilkens, Omar Mussa, Dawn Knight, Eugénio Ribeiro, Ekaterina Kochmar, ...
ComplexityMT: Benchmarking the Interaction Between Text Complexity and Machine Translation
https://arxiv.org/abs/2606.05421

AI Weekly's analysis:

ComplexityMT benchmarks machine translation across Arabic, Dutch, English, French, Hindi and Russian using CEFR as the measure of text complexity.
Higher CEFR levels make texts more difficult to translate, and MT systems shift the CEFR level of the target versus the source for most languages.
The study evaluates three open-weight models, one closed model and one commercial machine translation system on two CEFR-grounded tasks.

2 from the directory shared this · 53d ago

Xinyu Geng, Xuanhua He, Sixiang Chen, Yanjing Xiao, Fan Zhang, Shijue Huang, Haitao Mi, Zhenwen Liang, Tianqing Fang, Yi R. Fung
DeepSearch-World: Self-Distillation for Deep Search Agents in a Verifiable Environment
https://arxiv.org/abs/2607.07820

AI Weekly's analysis:

A 9B-parameter web agent reaches 31.2% on BrowseComp, 61.5% on GAIA, and 93.4% on HotpotQA using self-distillation.
The DeepSearch-World environment supplies 420,000 multi-hop QA tasks built from entity-level random walks with reproducible search and page-reading.
The training loop cycles through trajectory generation, filtering, data mixing, and fine-tuning without relying on a larger teacher model.

2 from the directory shared this · 18d ago
