Design Decisions

Resolved questions about how Lugh handles specific challenges. These started as open questions in Known Limitations and Mitigations and were resolved through design discussion.

Facts vs opinions: understanding is not evaluation

“Understanding X” means understanding the facts of X, not evaluating X.

Every topic has facts: what a theory proposes, what its mechanisms are, what its proponents argue, what its critics argue, what historical events occurred, what empirical data shows. These are teachable. Opinions about whether the thing is good, effective, or desirable are the learner’s business, not the course’s.

This principle does the most work on politically charged topics (socialism, capitalism, religion) but applies everywhere. “Understanding MCAS” teaches what mast cells do, not whether your doctor is handling it correctly. “Understanding Design Patterns” teaches what Singleton does, not whether you should use it in your codebase.

The Feynman tutor tests factual understanding (“can you explain what this theory claims?”), not agreement (“do you think this theory is correct?”).

When historical sample sizes are small (e.g., “how many times has socialism been implemented”), the system keeps the discussion theoretical and notes where causal claims exceed the available evidence, rather than drawing conclusions.

Appeal to existing pedagogy

For any topic that’s taught in an academic setting, the Stage 0a agent should find existing course syllabi, textbooks, and pedagogical standards. A professor has already solved the sequencing and source selection problem. The agent should learn from that work rather than inventing a pedagogical approach from scratch.

This is no different from finding refactoring.guru for design patterns: it uses the best existing curriculum resources as a starting point.

Contested and uncertain source material

When sources disagree or the science is unsettled, the agent handles it the way a good teacher or podcaster would: transparently.

  • Present the disagreement honestly. “There are two schools of thought here…”
  • “I don’t know” and “the research is uncertain” are valid and important answers
  • The agent should never silently pick a side when legitimate disagreement exists
  • For scientific topics especially, distinguishing “established consensus,” “emerging research,” and “actively debated” is part of the teaching

This isn’t a special case to engineer around; it’s a feature. A learner who comes away understanding where the uncertainty lives has learned something more valuable than a falsely confident answer.
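One way to make the three-way distinction above explicit in data is to tag each claim with its evidence status. The following is a minimal sketch, not Lugh's actual data model: `EvidenceStatus`, `Claim`, and `render_claim` are hypothetical names, and the rule that a debated claim must carry at least two positions is an assumption chosen to reflect "never silently pick a side."

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class EvidenceStatus(Enum):
    # The three labels named in the text above.
    ESTABLISHED_CONSENSUS = "established consensus"
    EMERGING_RESEARCH = "emerging research"
    ACTIVELY_DEBATED = "actively debated"


@dataclass
class Claim:
    text: str
    status: EvidenceStatus
    positions: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumed rule: a debated claim must present at least two views,
        # so the script cannot silently pick a side.
        if self.status is EvidenceStatus.ACTIVELY_DEBATED and len(self.positions) < 2:
            raise ValueError("an actively debated claim needs at least two positions")


def render_claim(claim: Claim) -> str:
    """Format a claim with its status label and any competing positions."""
    line = f"[{claim.status.value}] {claim.text}"
    if claim.positions:
        line += "\n" + "\n".join(f"  - {p}" for p in claim.positions)
    return line
```

Whether the label is surfaced verbatim to the learner or only guides script generation is left open here.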

Model quality and the refinement chain

The self-check Feynman step (Stage 3c) is the primary defense against confidently wrong content. If the script can’t pass its own gate, it gets flagged before it ever reaches TTS.

The full refinement chain runs accuracy review (3b) → self-check (3c) → rubric extraction (3d), which creates multiple opportunities to catch errors. A single prompt might hallucinate; three sequential prompts, each checking the previous one’s work, are less likely to all fail in the same way.

Whether this is sufficient with local models can only be settled during implementation, but the architecture is designed to degrade gracefully: errors get caught downstream rather than propagating silently.
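The control flow of the chain can be sketched as a list of stages that run in order and stop at the first failure. This is an illustrative sketch under stated assumptions, not the pipeline's implementation: the names `StageResult`, `ChainOutcome`, and `run_refinement_chain` are hypothetical, and the three stand-in stages use trivial placeholder logic where the real stages would each call a model with its own prompt.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class StageResult:
    stage: str        # stage label, e.g. "3b"
    passed: bool
    notes: str = ""


@dataclass
class ChainOutcome:
    results: List[StageResult] = field(default_factory=list)

    @property
    def flagged_at(self) -> Optional[str]:
        """Label of the first stage that failed, or None if all passed."""
        for result in self.results:
            if not result.passed:
                return result.stage
        return None

    @property
    def ready_for_tts(self) -> bool:
        return self.flagged_at is None


# Hypothetical stage signature: takes the script text, returns a verdict.
Stage = Callable[[str], StageResult]


def run_refinement_chain(script: str, stages: List[Stage]) -> ChainOutcome:
    """Run stages in order and stop at the first failure."""
    outcome = ChainOutcome()
    for stage in stages:
        result = stage(script)
        outcome.results.append(result)
        if not result.passed:
            break  # flag here; later stages never see a failed script
    return outcome


# Stand-in stages with placeholder logic (assumptions, not real checks).
def accuracy_review(script: str) -> StageResult:
    return StageResult("3b", passed=bool(script.strip()))


def self_check(script: str) -> StageResult:
    # Placeholder gate: fails scripts that still contain a TODO marker.
    return StageResult("3c", passed="TODO" not in script, notes="self-check gate")


def rubric_extraction(script: str) -> StageResult:
    return StageResult("3d", passed=True)


CHAIN = [accuracy_review, self_check, rubric_extraction]
```

Calling `run_refinement_chain(script, CHAIN)` returns an outcome whose `flagged_at` names the stage that rejected the script, which is the information a reviewer would need before anything is sent to TTS.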

Feynman prompt adaptation per topic

The base Feynman protocol is universal, but Step 5 (falsification) requires enough topic-specific context in the context window to formulate meaningful “what would break if…” questions.

This means the tutor prompt isn’t static. It is assembled per episode from:

  • The base Feynman Tutor Prompt protocol
  • The episode’s learning objectives
  • The rubric generated in Stage 3d (which includes specific questions and edge cases)
  • The learner’s meta-assessment context (communication style, background)

The tutor therefore adapts to each episode: the protocol is the skeleton, and the content is dynamic.
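The per-episode assembly from the four inputs listed above might look like the following. This is a minimal sketch: `EpisodeContext`, `assemble_tutor_prompt`, and the section titles are hypothetical, and the real prompt layout may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EpisodeContext:
    learning_objectives: List[str]
    rubric: str            # output of Stage 3d
    learner_context: str   # meta-assessment summary


def assemble_tutor_prompt(base_protocol: str, episode: EpisodeContext) -> str:
    """Combine the static protocol with per-episode content, in a fixed order."""
    objectives = "\n".join(f"- {o}" for o in episode.learning_objectives)
    sections = [
        ("Protocol", base_protocol),
        ("Learning objectives", objectives),
        ("Rubric", episode.rubric),
        ("Learner context", episode.learner_context),
    ]
    # Section titles are an assumed convention, not a documented format.
    return "\n\n".join(f"=== {title} ===\n{body.strip()}" for title, body in sections)
```

Only `base_protocol` stays constant across episodes; everything in `EpisodeContext` is regenerated, which is what gives Step 5 its topic-specific material.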

Meta-assessment influence on content

When a learner identifies as neurodivergent, a visual learner, someone who learns through analogy, or provides other context about how they process information, this flows through the pipeline in two ways:

Script generation: Episodes are structured more bottom-up (concrete examples first, abstractions emerging from them) rather than top-down (definition first, examples after). Analogy density increases. Tangential connections that leverage the learner’s existing knowledge get woven in.

Tutor sessions: The Feynman protocol adapts its approach by offering recognition tasks alongside recall, accepting different communication styles in explanations, and using the learner’s own analogies as bridges rather than insisting on textbook framing.

This isn’t a separate code path; it’s context that gets included in the prompt for every stage that generates or evaluates content.
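The "context, not code path" idea can be illustrated with a single helper that every stage calls on its prompt. The sketch below is an assumption about shape only: `MetaAssessment`, its fields, and `with_learner_context` are hypothetical names, and the stage prompts are placeholders.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MetaAssessment:
    communication_style: str = ""
    background: str = ""
    notes: Tuple[str, ...] = ()

    def as_prompt_block(self) -> str:
        """Render the learner context as plain text, or '' if nothing is known."""
        lines = []
        if self.communication_style:
            lines.append(f"Communication style: {self.communication_style}")
        if self.background:
            lines.append(f"Background: {self.background}")
        lines.extend(f"Note: {n}" for n in self.notes)
        if not lines:
            return ""
        return "Learner context:\n" + "\n".join(lines)


def with_learner_context(stage_prompt: str, meta: MetaAssessment) -> str:
    """Append the same learner context to any stage prompt; no branching per stage."""
    block = meta.as_prompt_block()
    if not block:
        return stage_prompt  # no meta-assessment: prompt is unchanged
    return f"{stage_prompt}\n\n{block}"
```

Because script generation and tutor sessions both go through the same helper, a learner with no meta-assessment gets the unmodified prompts, and one with a meta-assessment gets the same additions everywhere.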