What Are Guardrails?

Guardrails are hard rules that are injected into every session alongside the practitioner prompt. They act as a safety layer, enforcing constraints that the AI must follow regardless of what the conversation contains or what the user asks.

Why Guardrails Exist

The practitioner prompt instructs the AI on how to behave. Guardrails enforce what it must always do and what it must never do. The distinction matters because:

  • Users may try to push the conversation into inappropriate territory
  • The AI may occasionally drift from its instructions under conversational pressure
  • Certain rules are non-negotiable (legal, safety, scope boundaries) and must be applied with zero flexibility

Guardrails are the enforcement mechanism for those non-negotiables.

How Guardrails Work

At the start of every session, the system assembles the full AI context:

  1. The practitioner base prompt
  2. Relevant knowledge base articles
  3. The patient's session history
  4. All active guardrails, appended as a rule block

The AI receives all of this together. Guardrails appear as explicit "you must" / "you must not" statements that the AI treats with high priority.
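
The assembly order above can be sketched in a few lines of Python. This is an illustration only, not the platform's actual implementation: the function name assemble_context, its parameters, the "RULES" header text, and the sample rule wording are all assumptions made for the example.

```python
from typing import List


def assemble_context(
    base_prompt: str,
    kb_articles: List[str],
    session_history: List[str],
    active_guardrails: List[str],
) -> str:
    """Hypothetical helper: join the four context parts in the listed order."""
    sections = [
        base_prompt,                  # 1. practitioner base prompt
        "\n\n".join(kb_articles),     # 2. relevant knowledge base articles
        "\n".join(session_history),   # 3. the patient's session history
    ]
    if active_guardrails:
        # 4. guardrails are appended last, as one explicit rule block
        rule_block = "\n".join(f"- {rule}" for rule in active_guardrails)
        sections.append("RULES (must be followed):\n" + rule_block)
    # Drop empty parts so the context contains no stray blank sections.
    return "\n\n".join(s for s in sections if s.strip())


context = assemble_context(
    base_prompt="You are a supportive practitioner assistant.",
    kb_articles=["Article: sleep hygiene basics."],
    session_history=["Session 1: patient discussed stress at work."],
    active_guardrails=[
        "You must not give medical advice.",
        "You must ask no more than three questions per reply.",
    ],
)
print(context)
```

Placing the rule block last keeps the guardrails together in one clearly delimited section, separate from the conversational material that precedes them.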

Guardrail Categories

Guardrails are organised by category:

Category      Purpose
boundary      What the AI must not engage with (medical advice, crisis support, etc.)
methodology   How the AI must conduct sessions (question cap, CTA format, etc.)
safety        Responses to distress, harm risk, or emergency signals
legal         Disclaimers and scope limitations
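
One way to picture the four categories is as a fixed set of values attached to each rule. The sketch below assumes a simple in-memory model; the names GuardrailCategory, Guardrail, and group_by_category are hypothetical and the sample rule text is invented. Only the four category values come from the table above.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class GuardrailCategory(Enum):
    """The four categories from the table above."""
    BOUNDARY = "boundary"
    METHODOLOGY = "methodology"
    SAFETY = "safety"
    LEGAL = "legal"


@dataclass
class Guardrail:
    """Hypothetical record: one rule plus the category it belongs to."""
    category: GuardrailCategory
    rule: str


def group_by_category(guardrails: List[Guardrail]) -> Dict[GuardrailCategory, List[str]]:
    """Collect rule texts under their category, preserving input order."""
    grouped: Dict[GuardrailCategory, List[str]] = defaultdict(list)
    for guardrail in guardrails:
        grouped[guardrail.category].append(guardrail.rule)
    return dict(grouped)


rules = [
    Guardrail(GuardrailCategory.BOUNDARY, "You must not give medical advice."),
    Guardrail(GuardrailCategory.LEGAL, "You must state that this is not a clinical service."),
    Guardrail(GuardrailCategory.BOUNDARY, "You must not provide crisis support."),
]
grouped = group_by_category(rules)
for category, texts in grouped.items():
    print(category.value, texts)
```

Using an enumeration means an unrecognised category name, such as GuardrailCategory("pricing"), raises a ValueError instead of silently creating a fifth category.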

Active vs Inactive Guardrails

Each guardrail has an is_active flag. Inactive guardrails are stored but not injected into sessions. This lets you draft guardrails, test them, and deactivate them without deleting them.
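
The effect of the flag can be sketched as a simple filter applied before injection. Apart from is_active, which is named above, everything here is an assumption for illustration: the StoredGuardrail record, the select_active function, and the sample rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StoredGuardrail:
    """Hypothetical stored record; only is_active is taken from the text above."""
    rule: str
    is_active: bool = True


def select_active(guardrails: List[StoredGuardrail]) -> List[str]:
    """Return the rule texts that would be injected into a session."""
    return [g.rule for g in guardrails if g.is_active]


stored = [
    StoredGuardrail("You must not give medical advice."),
    # A draft that is kept in storage but not yet injected.
    StoredGuardrail("You must end each reply with a single next step.", is_active=False),
]
injected = select_active(stored)
print(injected)
```

The inactive draft remains in stored, so it can be switched on later by setting is_active to True without recreating it.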