Personal project
Swotta
A revision system for GCSE and A-Level students that takes spaced repetition, mastery tracking, and AI tutoring seriously.
Type: Revision and AI tutoring platform
Stack: Next.js 15, PostgreSQL + pgvector, Claude API
Tests: 1,605 across 84 files
Code: Open source (Polyform Noncommercial)
Why this project
Why the context problem matters more than the chat interface.
I'm interested in how memory works — the biological kind and the computational kind. How spaced repetition exploits the forgetting curve. How retrieval practice strengthens recall more than re-reading ever does. How confidence calibration (knowing what you don't know) is arguably the skill most students lack and most revision tools ignore. Swotta is where those interests meet an agentic AI problem: how to give an AI enough structured context about a student — what they know, what they've forgotten, where they're miscalibrated — that it can do something genuinely useful in the moment rather than just responding to a prompt.
What existing tools get wrong
- Most revision tools track time spent, not mastery gained.
- Students don't know what they don't know — confidence miscalibration is the core problem most tools ignore.
- AI tutoring without structured context is just a chatbot with a study-themed prompt.
- Spaced repetition research is well-established but poorly applied in real products.
Design constraints
- The system had to model a full UK exam specification as a relational topic graph with prerequisite edges.
- AI sessions needed structured context — mastery state, misconceptions, learning preferences, student materials — not just a blank prompt.
- The scheduling engine had to factor in exam proximity, topic weights, and behavioural signals, not just track overdue items.
- Multi-tenant identity needed to support both families and schools from the same schema.
What I built
Structured context assembly, not a chatbot with a study prompt.
Scheduling engine
Modified SM-2 spaced repetition factoring in exam proximity, topic weights from the actual specification, avoidance patterns, and confidence miscalibration. Picks both what to study and how — across 10 distinct session types.
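To make the idea concrete, here is a minimal sketch of an exam-aware SM-2 variant. The ease-factor and interval formulas are standard SM-2; the names (`reviewSm2`, `examAwareInterval`) and the proximity/weight modifiers are illustrative assumptions, not Swotta's actual code.

```typescript
// Sketch only: standard SM-2 plus a hypothetical exam-proximity modifier.

interface CardState {
  repetitions: number;   // consecutive successful reviews
  easeFactor: number;    // SM-2 ease factor, floored at 1.3
  intervalDays: number;  // current gap between reviews
}

function reviewSm2(state: CardState, quality: number): CardState {
  // Standard SM-2: quality is 0-5; below 3 resets the repetition streak.
  if (quality < 3) {
    return { ...state, repetitions: 0, intervalDays: 1 };
  }
  const ease = Math.max(
    1.3,
    state.easeFactor + (0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)),
  );
  const reps = state.repetitions + 1;
  const interval =
    reps === 1 ? 1 : reps === 2 ? 6 : Math.round(state.intervalDays * ease);
  return { repetitions: reps, easeFactor: ease, intervalDays: interval };
}

// Hypothetical exam-aware adjustment: never schedule a review past the
// exam, and pull heavily weighted topics forward.
function examAwareInterval(
  baseIntervalDays: number,
  daysUntilExam: number,
  topicWeight: number, // 0-1, share of marks the topic carries
): number {
  const cap = Math.max(1, daysUntilExam - 1);
  const urgency = 1 - 0.5 * topicWeight; // heavier topics reviewed sooner
  return Math.min(cap, Math.max(1, Math.round(baseIntervalDays * urgency)));
}
```

The key design point the sketch shows: the base algorithm stays untouched, and exam-specific pressure is applied as a post-hoc clamp on its output.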
AI study sessions
Conversational sessions powered by Claude. Each session receives the student's mastery level, known misconceptions, learning preferences, relevant chunks from their own materials (pgvector similarity search), and the qualification's mark scheme structure.
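As an illustration of what "structured context" can mean here, the sketch below renders typed student state into labelled prompt sections, skipping empty ones. Every field and section name is an assumption for the example; Swotta's real schema and its 15 prompt templates are not shown.

```typescript
// Illustrative context assembly: typed state in, labelled sections out.

interface StudentContext {
  masteryByTopic: Record<string, number>; // 0-1 mastery estimates
  misconceptions: string[];               // known misconception notes
  learningPreferences: string[];          // e.g. "worked examples first"
  retrievedChunks: string[];              // similarity hits from their materials
  markSchemeNotes: string[];              // qualification mark-scheme structure
}

// Render only non-empty sections so the model never sees blank headings.
function assembleSessionContext(ctx: StudentContext): string {
  const sections: [string, string[]][] = [
    [
      "Mastery",
      Object.entries(ctx.masteryByTopic).map(
        ([topic, m]) => `${topic}: ${Math.round(m * 100)}%`,
      ),
    ],
    ["Known misconceptions", ctx.misconceptions],
    ["Learning preferences", ctx.learningPreferences],
    ["Relevant material excerpts", ctx.retrievedChunks],
    ["Mark scheme notes", ctx.markSchemeNotes],
  ];
  return sections
    .filter(([, items]) => items.length > 0)
    .map(([title, items]) => `## ${title}\n${items.map((i) => `- ${i}`).join("\n")}`)
    .join("\n\n");
}
```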
Source ingestion
Upload PDFs of class notes or past papers. Pipeline extracts text, chunks at semantic boundaries, generates embeddings, and classifies each chunk against the curriculum topic graph.
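The chunking step can be sketched as boundary-aware greedy packing: split on paragraph breaks, then fill chunks up to a size budget without cutting a paragraph in half. This is a simplified stand-in; the pipeline's actual boundary detection and budgets are not shown.

```typescript
// Minimal boundary-aware chunker (assumed behaviour, not Swotta's code):
// split on blank lines, then pack whole paragraphs up to maxChars.
function chunkAtBoundaries(text: string, maxChars = 1200): string[] {
  const paragraphs = text
    .split(/\n{2,}/)
    .map((p) => p.trim())
    .filter(Boolean);
  const chunks: string[] = [];
  let current = "";
  for (const p of paragraphs) {
    // Start a new chunk when adding this paragraph would bust the budget.
    if (current && current.length + p.length + 2 > maxChars) {
      chunks.push(current);
      current = p;
    } else {
      current = current ? current + "\n\n" + p : p;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

A paragraph larger than the budget still becomes its own chunk, which keeps the invariant that no embedding ever spans a semantic boundary.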
Parent reporting
Weekly reports with mastery changes, misconception narratives, confidence calibration insights, and behavioural patterns. Reports that tell parents something useful, not just "studied for 3 hours."
Multi-tenant identity
Household-as-organisation model supporting B2C families and B2B schools from the same schema, with policies resolving through five layers.
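One plausible shape for layered policy resolution is most-specific-wins across an ordered list of scopes. The five layer names below are illustrative guesses, not Swotta's actual hierarchy.

```typescript
// Hypothetical five-layer resolution: the most specific layer that
// defines a setting wins; platform defaults are the fallback.
type Policy = Record<string, unknown>;

const LAYER_ORDER = [
  "student",
  "household",
  "school",
  "organisation",
  "platformDefault",
] as const;

type LayerName = (typeof LAYER_ORDER)[number];

function resolvePolicy(
  layers: Partial<Record<LayerName, Policy>>,
  key: string,
): unknown {
  for (const layer of LAYER_ORDER) {
    const policy = layers[layer];
    if (policy && key in policy) return policy[key];
  }
  return undefined; // no layer defines this setting
}
```

First-defined-wins keeps resolution a single ordered scan, so both a B2C household and a B2B school resolve through the same code path.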
Systems involved
More than a chatbot with a curriculum prompt.
Core systems
- Curriculum topic graph with prerequisite edges (40+ tables across 5 schema layers)
- Modified SM-2 scheduling engine with exam-aware prioritisation
- AI context assembly and 15 Markdown prompt templates
- pgvector similarity search for student materials
- Inngest background jobs for ingestion, reporting, and scheduling
- Terraform-managed GCP infrastructure (Cloud Run, Cloud SQL, europe-west2)
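The pgvector retrieval item above comes down to ranking chunk embeddings by cosine distance, which is what pgvector's `<=>` operator computes server-side. Here is an in-memory illustration of that ranking; the names are hypothetical and the real system does this in SQL, not application code.

```typescript
// In-memory mirror of a pgvector cosine-distance top-k query.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank stored chunks by distance to the query embedding; return the k
// closest ids (equivalent to ORDER BY embedding <=> $1 LIMIT k).
function topK(
  query: number[],
  chunks: { id: string; embedding: number[] }[],
  k: number,
): string[] {
  return [...chunks]
    .sort(
      (x, y) =>
        cosineDistance(query, x.embedding) - cosineDistance(query, y.embedding),
    )
    .slice(0, k)
    .map((c) => c.id);
}
```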
Why it was non-trivial
- The hard part wasn't the chat interface — it was context assembly: making each AI session genuinely useful by giving it structured information about the student rather than a generic prompt.
- The scheduling engine doesn't just track what's overdue. It models exam proximity, topic weights, avoidance patterns, and confidence miscalibration to decide both what to study and how.
- The system ingests a student's actual materials, classifies them against the curriculum, and retrieves relevant chunks during sessions — no generic content.