Boundaries are a first-class concern: layered orientation, clear support signposting, human oversight, and transparent limits. The assistant uses language to notice overload and readiness for reflection, then changes stance accordingly rather than treating support as a single on/off switch.
Safety here is enacted through relational containment: how the assistant slows, clarifies, redirects, or closes a thread when language suggests strain. It reads for intensity and direction, not diagnosis. Containment comes before interpretation.
In this project, support boundaries are not only about spotting danger language. They form a multi-layered concept: reading language for signs of overload, deciding whether the conversation needs containment or signposting, and recognising when the person may be steady enough for careful reflective work.
This means the same language-orientation logic can do two jobs: tighten boundaries when strain rises, and determine when a reflective Memory Reorientation stance may be appropriate rather than destabilising.
The immediate version can work through relatively transparent language recognition: keywords, repeated phrases, moral injury themes, direct safety language, and shifts in tone or coherence. That is enough to drive early containment and support-boundary logic.
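A minimal sketch of this transparent recognition layer, assuming hypothetical keyword lists and thresholds (they are illustrative, not a clinical lexicon):

```python
import re
from collections import Counter

# Illustrative phrase lists only; a real deployment would use
# reviewed lexicons, not these placeholders.
DANGER_TERMS = {"can't go on", "end it", "no way out"}
STRAIN_TERMS = {"my fault", "should have", "can't stop thinking"}

def score_turn(text: str, recent_turns: list) -> dict:
    """Score one turn for simple, transparent language signals."""
    lowered = text.lower()
    # Direct safety language: any-match, never a diagnosis.
    danger = any(term in lowered for term in DANGER_TERMS)
    # Moral-strain themes: count of matched phrases.
    strain = sum(term in lowered for term in STRAIN_TERMS)
    # Repetition: words in this turn that already dominate recent turns.
    words = set(re.findall(r"[a-z']+", lowered))
    recent_words = Counter(
        w for t in recent_turns for w in re.findall(r"[a-z']+", t.lower())
    )
    repetition = sum(1 for w in words if recent_words[w] >= 3)
    return {"danger": danger, "strain": strain, "repetition": repetition}
```

The point of the sketch is inspectability: every signal can be explained to a reviewer in plain language.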
Longer term, neural networks or sequence-aware models could make this richer by noticing patterns across time rather than single turns: loops, changes in coherence, persistent blame themes, or signs that the person is moving from dysregulation toward reflective capacity.
Even then, the purpose stays the same. These models are not there to diagnose the user. They are there to improve timing: when to slow down, when to signpost external help, and when it may be safe and useful to invite reflection.
Rather than producing a single fixed response, the system shifts between stances. Stance changes are explained to the user for transparency and trust, and the user can always choose to pause.
| Stance | Trigger | Behaviour |
|---|---|---|
| Steady support | Stable language | Open dialogue, gentle reflection. |
| Containment | Elevated arousal | Grounding, reduced pace, reduced cognitive load. |
| Support signpost | Persistent distress or moral strain | Explicit offer of human help, plus bounded reflective support where appropriate. |
| Immediate external support boundary | Immediate danger language | Stop reflection, acknowledge limits, clear external support options. |
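The table above can be sketched as a small stance selector. The signal names and the arousal threshold are assumptions chosen to mirror the trigger column, with danger language always taking precedence:

```python
from enum import IntEnum

class Stance(IntEnum):
    """Stances ordered by escalation level, mirroring the table."""
    STEADY = 0
    CONTAINMENT = 1
    SIGNPOST = 2
    EXTERNAL_BOUNDARY = 3

def choose_stance(signals: dict) -> Stance:
    """Map language signals to a stance; thresholds are illustrative."""
    if signals.get("danger"):
        # Immediate danger language: stop reflection, signpost externally.
        return Stance.EXTERNAL_BOUNDARY
    if signals.get("persistent_distress") or signals.get("moral_strain"):
        return Stance.SIGNPOST
    if signals.get("arousal", 0.0) > 0.6:
        return Stance.CONTAINMENT
    return Stance.STEADY
```

Because the checks run top-down, the most protective stance always wins when signals overlap.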
This differs from more conventional methods because it uses the dialogue itself as the monitoring surface. Instead of relying only on scheduled checklists, one-off triage, or fixed scripts, the companion can adapt turn by turn as language changes.
That does not mean it replaces existing approaches. It can work in conjunction with TRiM-style check-ins, clinician review, formal screening tools, or partner-led support pathways.
Safety is also about restraint. The assistant will not push trauma narrative or exposure, will not make promises, and will not present itself as the only support available. Ending safely is treated as success, not failure.
Human oversight is situational, not constant. When boundaries are crossed, the system routes to a human reviewer with minimal, relevant context.
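One way to make "minimal, relevant context" concrete is to define the escalation payload explicitly, so that what is *excluded* (full transcript, user identity, any diagnostic label) is visible in the schema. The field names here are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class EscalationPayload:
    """Minimal context routed to a human reviewer.

    Deliberately excludes the full transcript, user identity,
    and any diagnostic labelling.
    """
    stance: str            # stance at time of escalation
    trigger: str           # which boundary was crossed
    recent_turn_count: int # scale of the episode, not its content
    excerpt: str           # last flagged turn; a real system would redact identifiers

def build_escalation(stance: str, trigger: str, turns: list) -> EscalationPayload:
    excerpt = turns[-1] if turns else ""
    return EscalationPayload(stance, trigger, len(turns), excerpt)
```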
Support signposting is language- and region-sensitive. The assistant does not assume location unless the user has chosen a language pack. It offers options, not orders.
For demo pages, it is fine to keep helplines as placeholders until you finalise verified partners.
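A signpost lookup consistent with both points above might look like the following sketch: options are keyed by the user-chosen language pack, entries stay as placeholders, and no location is ever inferred. Pack names and wording are assumptions:

```python
from typing import List, Optional

# Placeholder entries only, per the guidance above: no real helplines
# are published until partners are verified.
SIGNPOSTS = {
    "en-GB": ["[Verified UK partner helpline: TBD]"],
    "en-US": ["[Verified US partner helpline: TBD]"],
}
DEFAULT = ["[General guidance: contact local emergency services if in danger]"]

def signpost_options(language_pack: Optional[str]) -> List[str]:
    """Offer options, not orders; never assume location."""
    if language_pack is None:
        # No pack chosen: stay neutral rather than guessing a region.
        return DEFAULT
    return SIGNPOSTS.get(language_pack, DEFAULT)
```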
Logs exist to review system behaviour, not to label users. This supports minimisation, proportionality, and ethical accountability.
Safety-first here is not an attempt to prevent all harm, which would be impossible. It is an attempt to reduce avoidable harm, preserve dignity under strain, and ensure automated presence never exceeds its ethical authority.