UV Scripts Bring One-Command ML Pipelines to Hugging Face Jobs
TL;DR
- Each script is a self-contained Python file using PEP 723 inline dependency declarations, runnable with a single `uv run` command.
- The collection covers nine task categories, including OCR with 30+ models, audio transcription, vision detection, embeddings, and LLM inference.
- Scripts use standardized argument patterns so both humans and AI agents can run them locally or on Hugging Face Jobs GPU infrastructure.
Running a document through an OCR model, transcribing audio, or embedding a dataset typically means wiring up a Python environment, installing dependencies, and choosing among competing libraries. The uv-scripts-for-ai repository by davanstrien on GitHub takes a different approach: each script is a single Python file that declares its own dependencies inline using PEP 723, making it "a portable unit you run with `uv run`" -- no cloning, no virtual environment, no requirements file.
The collection covers nine task categories: OCR with more than 30 models, vision detection and segmentation, audio transcription and speech translation, dataset embedding and visualization, data filtering and deduplication, dataset creation from PDFs and URLs, synthetic data generation via LLMs, LLM and VLM inference across datasets, and named entity recognition. Scripts are composable, chaining together through Hugging Face Hub datasets so the output of one task feeds directly into the next.
The GPU path is equally minimal. The same script that runs locally with `uv run <script-url>` can be sent to managed GPU infrastructure with `hf jobs uv run --flavor l4x1 --secrets HF_TOKEN <script-url>`. That symmetry between local and cloud execution is what makes the project agent-friendly: because the scripts follow standardized argument patterns, an AI agent can invoke them exactly as a human would, without custom wrappers or environment configuration.
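The symmetry between the two commands can be sketched as a pair of small helpers that an agent or wrapper might use to build either invocation. This is an illustration only: the helper names, the example URL, and the dataset arguments are hypothetical, and only the `uv run` and `hf jobs uv run --flavor ... --secrets ...` shapes come from the text above.

```python
import shlex

# Hypothetical placeholder; the real script URLs are listed in the repository.
SCRIPT_URL = "https://example.com/ocr-script.py"


def local_command(script_url: str, *script_args: str) -> list[str]:
    """Build the argv for running a script on the local machine."""
    return ["uv", "run", script_url, *script_args]


def jobs_command(
    script_url: str,
    *script_args: str,
    flavor: str = "l4x1",
    secret: str = "HF_TOKEN",
) -> list[str]:
    """Build the argv for sending the same script to Hugging Face Jobs."""
    return [
        "hf", "jobs", "uv", "run",
        "--flavor", flavor,
        "--secrets", secret,
        script_url,
        *script_args,
    ]


# The script arguments below are assumed for illustration; each script
# documents its own arguments.
args = ("input-dataset", "output-dataset")
print(shlex.join(local_command(SCRIPT_URL, *args)))
print(shlex.join(jobs_command(SCRIPT_URL, *args)))
```

Only the command prefix changes between the two paths; the script URL and its arguments stay the same, which is the property that lets a caller switch between local and GPU execution without rewriting the call.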
The licensing picture requires attention. The code is Apache 2.0, but the underlying models carry individual licenses including MIT, Apache-2.0, and OpenRAIL-M variants, and the project places responsibility for checking those terms on the user. Anyone using these scripts in a commercial or production context needs to audit model licenses per task. The source also does not address costs for Hugging Face Jobs GPU runs, or how script versioning works when executing by URL -- both are worth understanding before incorporating the scripts into automated pipelines.
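A per-task license audit can be as simple as comparing each model's declared license against a list your organization has approved. The sketch below assumes you have already collected the license identifiers yourself; the model names are hypothetical placeholders, and the allowlist is an example, not legal guidance.

```python
# Example allowlist; which licenses are acceptable is a decision for the user.
ALLOWED_LICENSES = {"mit", "apache-2.0"}


def audit_licenses(
    model_licenses: dict[str, str], allowed: set[str]
) -> dict[str, str]:
    """Return the models whose license is not on the allowlist."""
    return {
        model: license_id
        for model, license_id in model_licenses.items()
        if license_id.lower() not in allowed
    }


# Hypothetical placeholders, not real models from the repository.
planned_models = {
    "example-org/ocr-model": "apache-2.0",
    "example-org/asr-model": "MIT",
    "example-org/vlm-model": "openrail-m",
}

needs_review = audit_licenses(planned_models, ALLOWED_LICENSES)
print(needs_review)
```

Anything the function returns still needs a human reading of the actual license terms; the check only flags which models fall outside the pre-approved set.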
For small teams and researchers who need GPU-accelerated OCR, transcription, or embedding without DevOps infrastructure, the practical value is immediate: a single command runs the task, locally or on a managed GPU.
Shared on Bluesky by 2 AI experts
Got a digitised collection that needs OCR? uv-scripts is a set of single-file Python scripts that OCR a whole image dataset to markdown in one command — 20+ open VLMs to pick from, nothing to install but uv. github.com/…
Originally reported by github.com
Original headline: GitHub - davanstrien/uv-scripts-for-ai: Self-contained UV scripts for data & ML tasks — OCR, vision, audio & more — run one in a command, locally or on Hugging Face Jobs. Built for humans and agents.