AI coaching platform for a behavioural-science startup
An assessment engine and LLM-powered coaching product: structured evaluations in, personalized AI feedback loops out — with the guardrails to make that safe.
Representative engagement. This story describes work we do, anonymized and with details changed. We publish named case studies only with client approval.
Context
A behavioural-science startup had a validated methodology — structured assessments that map how people make decisions — and a vision for an AI coach that could turn assessment results into ongoing, personalized guidance. What they didn't have was a product: the methodology lived in spreadsheets and human-run workshops.
The challenge
Turning a human coaching methodology into software is mostly a trust problem. The AI's feedback had to stay inside the methodology's frameworks, because an LLM freelancing its own pop-psychology advice would undermine the science the company was built on. Assessment data is also sensitive, so privacy boundaries had to be structural, not aspirational.
What we built
- An assessment engine — configurable instruments, scoring pipelines, and versioned frameworks, so the science team can evolve the methodology without engineering work.
- An LLM coaching layer — Claude-powered feedback grounded in each user's assessment results, constrained by the methodology's frameworks through structured prompting and retrieval.
- Evaluation before launch — golden-answer test suites scored on every prompt change, so quality regressions surface in CI, not in a user's coaching session.
- A clean data boundary — assessment data isolated with row-level security, and only the minimum necessary context passed to the model per request.
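As a rough illustration of the last two points, the sketch below trims an assessment record to an allow-list of fields before it reaches the model, then scores generated feedback against golden cases. It is a simplified Python sketch, not the engagement's code. Every name in it (GoldenCase, minimal_context, score_case, run_suite, ALLOWED_FIELDS, the field names, the 0.8 threshold) is hypothetical. Keyword matching stands in for whatever scoring rubric a real suite would use, and stub_generate stands in for the model call.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical allow-list: the only fields permitted to reach the model.
ALLOWED_FIELDS = frozenset({"framework_version", "dimension_scores"})


@dataclass(frozen=True)
class GoldenCase:
    """One golden-answer case: a model input plus what the feedback must and must not say."""
    name: str
    assessment_context: dict
    required_terms: tuple
    forbidden_terms: tuple


def minimal_context(record: dict, allowed_fields: frozenset) -> dict:
    """Drop every field that is not explicitly allow-listed."""
    return {key: value for key, value in record.items() if key in allowed_fields}


def score_case(case: GoldenCase, feedback: str) -> float:
    """Return 0.0 if any forbidden term appears, else the fraction of required terms present."""
    text = feedback.lower()
    if any(term.lower() in text for term in case.forbidden_terms):
        return 0.0
    if not case.required_terms:
        return 1.0
    hits = sum(term.lower() in text for term in case.required_terms)
    return hits / len(case.required_terms)


def run_suite(cases: list, generate: Callable[[dict], str], threshold: float = 0.8):
    """Score every case and report whether the mean score meets the threshold."""
    if not cases:
        raise ValueError("golden suite is empty")
    scores = {case.name: score_case(case, generate(case.assessment_context)) for case in cases}
    mean_score = sum(scores.values()) / len(scores)
    return mean_score >= threshold, scores


def stub_generate(context: dict) -> str:
    """Placeholder for the model call: names the lowest-scoring dimension."""
    dimension_scores = context["dimension_scores"]
    lowest = min(dimension_scores, key=dimension_scores.get)
    return f"Framework {context['framework_version']}: focus on {lowest}."


# Example record; the email never makes it into the model context.
record = {
    "user_email": "person@example.com",
    "framework_version": "v3",
    "dimension_scores": {"risk_tolerance": 0.2, "time_horizon": 0.7},
}
context = minimal_context(record, ALLOWED_FIELDS)
cases = [
    GoldenCase(
        name="lowest-dimension-is-named",
        assessment_context=context,
        required_terms=("risk_tolerance", "v3"),
        forbidden_terms=("diagnosis",),
    )
]
passed, scores = run_suite(cases, stub_generate)
```

In CI, a job along these lines would call run_suite with the real generation function on every prompt change and fail the build when it returns False.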
How it went
The product reached a working end-to-end flow within a month of kickoff, and the coaching layer then went through two evaluation-driven iterations before launch. The client's team now ships methodology updates on its own, and the coaching quality bar is enforced by the evaluation suite rather than by engineers reading transcripts.
Why it's representative
This is the shape of most of our AI Solutions work: a domain expert with real intellectual property, an LLM that must be constrained to respect it, and the unglamorous engineering — evaluation, data boundaries, versioning — that turns a demo into a product.
