github.com via Hacker News

Modelplane v0.1 Brings Kubernetes-Style Control to AI Inference

By Alexis Dufresne Published June 23, 2026 at 18:33 UTC Updated June 23, 2026 at 18:35 UTC

open source inference ai infrastructure open-source inference mlops

TL;DR

Modelplane is a v0.1 open-source control plane built on Crossplane for orchestrating AI model deployments across cloud, neocloud, and on-premise infrastructure.
It provides unified, OpenAI-compatible endpoints and supports weighted canary rollouts, A/B traffic routing, Kubernetes-based scaling, and model weight caching.
The project is engine-agnostic and separates platform team responsibilities from developer-facing resources, with 56 GitHub stars at early release.

Running AI models in production across mixed infrastructure is still more art than science for most engineering teams. Modelplane is an open-source project aiming to bring Kubernetes-style declarative control to AI inference, built on Crossplane. The pitch is broad: run any model on any engine on any infrastructure, from a single GPU to disaggregated, multi-node deployments, while exposing a unified, OpenAI-compatible endpoint regardless of what sits underneath.

The architecture separates concerns by role. Platform teams provision inference clusters and define hardware recipes describing what a cluster can run. Developers then declare model deployment and service resources, and Modelplane handles the rest automatically, including scheduling model replicas onto compatible clusters, scaling via Kubernetes standards, routing traffic with weighted canary and A/B rollout capabilities, and caching model weights per cluster.

That positioning, infrastructure-level orchestration rather than a simple API proxy, is what separates this from lighter-weight routing tools. The system is described as continuously reconciling the fleet toward a declared state, and it is designed to be engine-agnostic without injecting configuration into containers, so teams are not locked into a specific serving runtime.

The honest caveat is that this is a v0.1 release, early and under active development. Crossplane itself carries a meaningful operational learning curve, so the realistic adoption surface is platform engineering teams already comfortable with Kubernetes, not individual developers. With 56 GitHub stars at launch, Modelplane is at the earliest stage of community formation, and the repository does not yet spell out which inference engines receive first-class support or what the roadmap looks like beyond v0.1.

Teams running self-hosted models across neoclouds or on-premise GPU clusters have relatively few mature orchestration options today, which is the gap Modelplane is stepping into. Whether it gains traction will depend on how quickly v0.1 stabilizes and whether Crossplane as a dependency feels like leverage or overhead to the platform teams it needs to win over.

Originally reported by github.com

Read the original article →

Original headline: Show HN: Modelplane — Open-Source Control Plane for AI Inference With Multi-Provider Routing and Observability