BuildWise · System diagrams

Facts in code,
judgment in the model.

● LIVE ON HUGGING FACE · DETERMINISTIC FEASIBILITY + GEOMETRY · LOCAL-EMBEDDING RAG

An AI slicer coach for 3D printing. You state your printer, your material, and what you're trying to make; BuildWise returns hardware-aware settings with an honest reason on every row. The spine is one rule: everything that can be made deterministic — specs, physical feasibility, geometry, retrieval — is computed in code and handed to the model as authoritative context. Claude is reserved for the one thing it's actually good at: reasoning over grounded facts and communicating them well. These five diagrams trace exactly where that line falls.

Blue — deterministic, computed in code

Amber — Claude · the judgment

Coral — refused in code (the gate says no)

01 / 05 — THE CORE LOOP

Many concerns funnel into one grounded prompt, and one model call.

What you provide on the left is turned into facts by deterministic code — printer and material specs, a feasibility verdict, the measured geometry of your part, and the engineering principles your goal depends on. All of it converges into a single assembled prompt — the narrow waist — which is handed to Claude for one low-temperature, cached call. The model reasons and writes the settings table; it never sources the facts itself.

Everything left of Claude is deterministic — same inputs, same context, every time. There is exactly one model call, at low temperature, over a cached system prompt. If the feasibility step returns a blocker, the assembled prompt carries a refuse verdict — the model is never handed a physics question it could get wrong.

02 / 05 — THE FEASIBILITY GATE

Whether a material can print on a printer is physics, not opinion — so it's decided in code.

Can polycarbonate print on a stock Ender 3? The hotend caps at 240 °C, PC needs ~260 °C, and PC requires an enclosure the machine doesn't have. A pure-code check compares material requirements against printer limits and returns a verdict — blockers that make it impossible, warnings that make it tricky. The impossible is refused before the model is ever called; Claude only communicates the verdict and pivots.

This is the single most important reliability decision in the system. The model is good at judgment and terrible at quietly inventing a hotend temperature; so the physics is taken away from it entirely. A blocker ends the request in code with the exact reason; Claude only ever sees a verdict it has to explain, never a number it has to guess.

03 / 05 — TWO KINDS OF KNOWLEDGE

Facts and reasoning are different things, so they live in different places.

A printer's max hotend temperature is a fact — it belongs in sourced, diffable JSON, read through typed repositories, never invented. Why walls beat infill for strength is reasoning — it belongs in a curated Markdown corpus, embedded locally and retrieved by relevance. Both are grounded; neither is left to the model to recall. An unlisted printer still gets sane advice by mapping to the closest of four archetypes.

The split is the point. Facts change by editing a JSON row and reading the diff; reasoning changes by adding a Markdown doc to the corpus. Neither path asks the model to remember something it might get wrong — it's handed both, already grounded.

04 / 05 — HYBRID RETRIEVAL

Plain semantic search let the material drown the intent. So retrieval pins the goal first.

A single embedding of "strength + PLA + Ender 3" once surfaced material-selection docs instead of the canonical walls-before-infill principle — the material words drowned the intent. The fix: for each stated optimization axis, pin its canonical principle with a metadata-filtered query, then fill the remaining slots with use-case reality-checks. The principle the goal depends on is always present, never left to chance — and an eval keeps it that way.

The eval is the part that matters most: retrieval relevance is held to a test, not to vibes. When it ran during development it caught a missing corpus doc — a goal with no principle to surface — and that class of regression has been fenced off ever since.

05 / 05 — THE GROUNDED PROMPT

A static brain that caches, a per-request body that grounds — inspectable for free.

The coach's behavior lives in one static system prompt — the engineering rules, the persona, the experience-level depth control, the output structure. Keeping it static means it caches on the API. Everything request-specific — the facts, the verdict, the geometry, the retrieved principles — rides in the user message. And dry-run assembles the entire thing without spending a token, so the prompt can be inspected and tuned for nothing.

It's built like production AI, not a demo: a cached static prompt, low temperature for repeatability, and a dry-run mode that lets the entire assembled prompt be read and tuned without a single paid call. The model is the last step, and the cheapest one to get right.