Tom Aarsen

Sentence Transformers, SetFit & NLTK maintainer. Machine Learning Engineer at 🤗 Hugging Face

The full training data is also released as `cross-encoder/ettin-reranker-v1-data`: ~143M (query, document, teacher score) triples, kept as 39 named splits. Built from @LightOnIO's pre-training data plus a re-scored subset of their fine-tuning data. huggingface.co/datasets/cro...
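For a rough idea of what one (query, document, teacher score) triple looks like in use, here is a minimal sketch. The `Triple` class, its field names, the toy rows, and the choice of a mean squared error objective are all assumptions for illustration; the post does not state the dataset's column names or the loss used in training.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One training row. Field names are hypothetical, not the dataset's columns."""
    query: str
    document: str
    teacher_score: float


def mse_to_teacher(student_scores, triples):
    """Mean squared error between student scores and teacher scores.

    MSE is an assumed distillation objective, used here only as an example.
    """
    if not triples:
        raise ValueError("need at least one triple")
    if len(student_scores) != len(triples):
        raise ValueError("one student score is required per triple")
    squared_errors = [
        (score - triple.teacher_score) ** 2
        for score, triple in zip(student_scores, triples)
    ]
    return sum(squared_errors) / len(triples)


# Made-up toy rows, not taken from the released data.
triples = [
    Triple("what is a reranker?", "A reranker rescores retrieved documents.", 4.0),
    Triple("what is a reranker?", "Bananas are rich in potassium.", -2.0),
]

# Scores a hypothetical student model might produce for the same pairs.
loss = mse_to_teacher([3.0, -1.0], triples)
print(loss)
```

The student is pushed toward the teacher's score for each pair, so no human relevance labels are needed.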
