COrigami Generates Flat-Foldable Origami Designs From Text

TL;DR

  • COrigami filters 560,000 candidate designs down to 27,869 (5% overall) that satisfy flat-foldability constraints.
  • Direct LLM fine-tuning for origami plateaued at roughly 60% flat-foldability, which motivates a constrained pipeline.
  • A Gemini Flash VLM aesthetic evaluator, used to guide the RL refinement stage, reached 0.766 classification accuracy.

Generating origami from a text prompt sounds like a toy problem, but it runs into mathematics fast. A flat-foldable crease pattern must satisfy Kawasaki's and Maekawa's theorems at every interior vertex, and verifying the absence of global self-intersections is, according to this paper on arXiv, NP-hard. The straightforward alternative -- fine-tuning a language model end-to-end on origami crease patterns -- reportedly plateaued at around 60% flat-foldability compliance in the researchers' own tests, meaning roughly four in ten outputs do not fold flat.
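To make the local conditions concrete, here is a minimal sketch of a single-vertex check. It is not the paper's code, and the function names and input format (sector angles in degrees in cyclic order, creases labelled "M" or "V") are assumptions made for illustration. Kawasaki's theorem requires the alternating sum of the sector angles around an interior vertex to be zero; Maekawa's theorem requires the mountain and valley counts to differ by exactly two. Both are necessary local conditions only, so passing them says nothing about the global self-intersection problem.

```python
import math
from typing import Sequence


def satisfies_kawasaki(sector_angles_deg: Sequence[float], tol: float = 1e-6) -> bool:
    """Check Kawasaki's condition at one interior vertex.

    sector_angles_deg lists the angles between consecutive creases,
    in cyclic order around the vertex (hypothetical input format).
    """
    n = len(sector_angles_deg)
    # An interior flat-foldable vertex has an even, nonzero number of creases.
    if n == 0 or n % 2 != 0:
        return False
    # The sectors must close up around the vertex.
    if not math.isclose(sum(sector_angles_deg), 360.0, abs_tol=tol):
        return False
    # Alternating sum a1 - a2 + a3 - ... must vanish.
    alternating = sum(a if i % 2 == 0 else -a for i, a in enumerate(sector_angles_deg))
    return abs(alternating) <= tol


def satisfies_maekawa(assignments: Sequence[str]) -> bool:
    """Check Maekawa's condition: |mountains - valleys| == 2."""
    mountains = sum(1 for a in assignments if a == "M")
    valleys = sum(1 for a in assignments if a == "V")
    # Every crease must carry a valid label for the count to be meaningful.
    if mountains + valleys != len(assignments):
        return False
    return abs(mountains - valleys) == 2


def vertex_passes_local_checks(angles: Sequence[float], assignments: Sequence[str]) -> bool:
    """Both necessary local conditions; not a proof of global flat-foldability."""
    return (
        len(angles) == len(assignments)
        and satisfies_kawasaki(angles)
        and satisfies_maekawa(assignments)
    )


# A degree-4 vertex with angles 45, 90, 135, 90 and three mountains, one valley.
print(vertex_passes_local_checks([45, 90, 135, 90], ["M", "M", "M", "V"]))  # True
# Same creases with two mountains and two valleys violates Maekawa.
print(vertex_passes_local_checks([45, 90, 135, 90], ["M", "M", "V", "V"]))  # False
```

A checker like this is cheap to run per vertex, which is what makes compliance with the local rules easy to verify; the expensive part is the global layer-ordering question that the local checks leave open.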

The COrigami pipeline takes a neuro-symbolic route instead, chaining five stages: semantic generation from natural language, base packing, crease pattern solving, pattern shaping, and a final reinforcement learning refinement pass. The pipeline starts from 560,000 initial candidate designs and culls aggressively -- only 5.0% survive all stages, yielding a curated dataset of 27,869 models. The RL stage uses Gemini Flash as a vision-language aesthetic evaluator, which the team reports achieved 0.766 classification accuracy distinguishing higher-quality designs, with a double-tournament variant reaching 0.811.
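As a quick sanity check on the reported yield, the two counts above can be turned into a survival rate and a candidates-per-survivor figure. The variable names are illustrative; only the two counts come from the reporting.

```python
# Counts as reported for the COrigami pipeline.
initial_candidates = 560_000
surviving_models = 27_869

# Fraction of candidates that pass every stage.
survival_rate = surviving_models / initial_candidates
# How many candidates the pipeline generates per surviving model.
candidates_per_survivor = initial_candidates / surviving_models

print(f"{survival_rate:.1%}")            # 5.0%
print(f"{candidates_per_survivor:.1f}")  # 20.1
```

In other words, the pipeline discards about twenty candidates for every design it keeps, which is why the missing per-design compute cost is a relevant gap.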

The honest caveat is one the authors themselves state plainly: COrigami produces "structural starting points that human artists can further expand and shape," not finished pieces. Physical post-processing and manual layer thinning are still required because the whole pipeline operates under a zero-thickness paper assumption that does not survive contact with actual paper. Of the designs produced by the RL stage, 200 were manually selected by the team and put through a tournament to identify the top ten -- a detail that signals meaningful human curation remains in the loop.

What the reporting does not give you is any sense of how long the pipeline takes per valid design, or whether it generalises beyond the discrete box-pleated grid it currently operates on. Those gaps matter for anyone thinking about practical adoption.

The broader signal is about constraint satisfaction architecture. The authors describe their goal as showing "how AI systems can satisfy multi-objective physical constraints to enable reliable, mathematically grounded co-creativity" -- and origami, with its well-defined mathematical rules, is a good test domain precisely because compliance is checkable. For designers working on other physically rule-governed problems, the pipeline pattern here -- neural generation filtered by hard constraint solvers and refined by a learned aesthetic judge -- is the part worth examining.
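The pattern can be sketched in a few lines. This is a generic illustration under stated assumptions, not COrigami's implementation: `generate_filter_rank`, `is_valid`, and `score` are hypothetical names, the toy candidates are integers rather than crease patterns, and a simple sort by judge score stands in for the paper's reinforcement learning refinement.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def generate_filter_rank(
    candidates: Iterable[T],
    is_valid: Callable[[T], bool],
    score: Callable[[T], float],
    top_k: int,
) -> List[T]:
    """Keep candidates that pass a hard constraint, then rank by a learned score.

    is_valid plays the role of the symbolic constraint solver (pass/fail);
    score plays the role of the aesthetic judge (higher is better).
    Sorting here is a stand-in for RL refinement, not an equivalent.
    """
    # Hard constraints are applied first, so the judge never sees invalid designs.
    survivors = [c for c in candidates if is_valid(c)]
    # Rank the survivors by judge score, best first.
    ranked = sorted(survivors, key=score, reverse=True)
    return ranked[:top_k]


# Toy usage: integers as "designs", evenness as the hard constraint,
# closeness to 5.5 as the judge's preference.
best = generate_filter_rank(
    candidates=range(1, 9),
    is_valid=lambda x: x % 2 == 0,
    score=lambda x: -abs(x - 5.5),
    top_k=2,
)
print(best)  # [6, 4]
```

The ordering matters: because the hard filter runs before the judge, a high aesthetic score can never rescue a design that violates the constraints, which is the property that makes the output reliably compliant.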
