SAIL

SAIL is the implementation layer of the SAFE standard.

It operationalises safety within AI systems.

It does not define the standard.

It does not verify compliance.

Structural Separation

SAFE defines what safety is.

SAFE Labs verifies whether systems meet it.

SAIL implements safety within systems.

These roles are separate by design.

They cannot be combined.

SAFE

Defines the standard

SAFE Labs

Verifies systems independently

SAIL

Implements safety within systems

Function

SAIL enables systems to operate within the SAFE standard.

It supports:

Detection of defined risk patterns

Responses aligned to SAFE thresholds

Integration of human escalation pathways

Operation within verified safety parameters

SDK Integration

Implementation Reference

Trajectory-based harm detection

Governance-defined intervention thresholds

Privacy-preserving safeguarding signals

Audit-safe intervention records
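As a rough illustration of what a governance-defined intervention threshold could look like in code, the sketch below maps a confidence score to an intervention tier. Everything in it is an assumption made for illustration: the `TierThreshold` shape, the `tierForScore` function, and the numeric cut-offs are hypothetical and are not taken from the SAFE standard or the SAIL SDK.

```typescript
// Hypothetical sketch: names and threshold values are illustrative assumptions,
// not part of the SAFE standard or the SAIL SDK.
interface TierThreshold {
  tier: number;          // intervention tier to apply
  minConfidence: number; // inclusive lower bound on the confidence score
}

// Example rule set. Real thresholds would be set by governance, not hard-coded.
const exampleThresholds: TierThreshold[] = [
  { tier: 1, minConfidence: 0.5 },
  { tier: 2, minConfidence: 0.7 },
  { tier: 3, minConfidence: 0.9 },
];

// Return the highest tier whose threshold the score meets, or 0 for no intervention.
function tierForScore(score: number, thresholds: TierThreshold[]): number {
  if (!Number.isFinite(score) || score < 0 || score > 1) {
    throw new RangeError(`confidence score must be in [0, 1], got ${score}`);
  }
  let tier = 0;
  for (const t of thresholds) {
    if (score >= t.minConfidence && t.tier > tier) {
      tier = t.tier;
    }
  }
  return tier;
}
```

Keeping the thresholds in a data structure rather than in branching logic means a rule set can be swapped or versioned without touching the evaluation code.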

Example Integration

import { SafeGuard } from '@safe/sail-sdk';

// Initialize with governance rules
const guard = new SafeGuard({rules: 'SAFE-001-v1', mode: 'on-device'});

// Process conversation trajectory
const result = await guard.evaluate(conversation);

if (result.intervention_tier > 0) {
  // Trigger governance-defined response
  await guard.respond(result);
}

Audit Trail

Privacy-Preserving Records

SAIL audit records contain only the minimum information required for governance verification. No message content is stored.

Audit Record Structure

{
  "timestamp": "2025-01-15T14:32:00Z",
  "confidence_score": 0.73,
  "intervention_tier": 2,
  "rule_triggered": "boundary_erosion_v1"
}
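One way to keep records to these four fields is to build them by explicit field copy, so that nothing else attached to an evaluation result can leak into the audit trail. The sketch below assumes this approach. The `AuditRecord` field names mirror the record shown above, while `EvaluationLike` and `toAuditRecord` are hypothetical names introduced for illustration and are not part of the SAIL SDK.

```typescript
// Field names mirror the audit record shown above.
interface AuditRecord {
  timestamp: string;         // ISO 8601, UTC
  confidence_score: number;
  intervention_tier: number;
  rule_triggered: string;
}

// Hypothetical input shape (an assumption): an evaluation result that may
// carry extra fields which must never reach the audit trail.
interface EvaluationLike {
  confidence_score: number;
  intervention_tier: number;
  rule_triggered: string;
  [extra: string]: unknown;
}

// Hypothetical helper: copies only the four audit fields.
function toAuditRecord(result: EvaluationLike, at: Date = new Date()): AuditRecord {
  // Copy fields one by one instead of spreading `result`,
  // so unexpected properties are dropped by construction.
  return {
    // Note: toISOString() includes milliseconds, e.g. "...T14:32:00.000Z".
    timestamp: at.toISOString(),
    confidence_score: result.confidence_score,
    intervention_tier: result.intervention_tier,
    rule_triggered: result.rule_triggered,
  };
}
```

An allow-list of fields is easier to audit than a deny-list: a new property added upstream stays out of the record unless someone deliberately adds it.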

Constraint

SAIL cannot certify itself.

All implementations must be independently verified through SAFE Labs.

In Practice

SAIL operates in real time within messaging environments.

Below are examples of SAIL interventions from the working MVP.

[Image: SAIL Layer 4 intervention, escalation detection and user prompt]

Escalation Detection

SAIL detects rapid escalation from location sharing to video requests. Intervention is delivered directly to the user with the option to disengage.
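The escalation pattern described here, a location share followed shortly by a video request, can be sketched as a check over an ordered trajectory of categorical signals rather than message content. This is a minimal sketch under stated assumptions: the `Signal` type, the signal kinds, the time window, and `hasRapidEscalation` are all hypothetical and do not describe how SAIL actually detects escalation.

```typescript
// Hypothetical sketch: signal kinds, the window, and this function are
// illustrative assumptions, not SAIL's actual detection logic.
type SignalKind = "location_share" | "video_request" | "other";

interface Signal {
  kind: SignalKind;
  atMs: number; // event time in milliseconds since the Unix epoch
}

// True if a video request occurs within `windowMs` after a location share.
function hasRapidEscalation(signals: Signal[], windowMs: number): boolean {
  // Sort a copy so the caller's array is left untouched.
  const ordered = [...signals].sort((a, b) => a.atMs - b.atMs);
  let lastLocationShare: number | null = null;
  for (const s of ordered) {
    if (s.kind === "location_share") {
      lastLocationShare = s.atMs;
    } else if (s.kind === "video_request" && lastLocationShare !== null) {
      if (s.atMs - lastLocationShare <= windowMs) {
        return true;
      }
    }
  }
  return false;
}
```

Because the function sees only signal kinds and timestamps, it is compatible with the stated goal of not storing message content.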

[Image: SAIL message flagging, inappropriate content detected and confirmed]

Message Flagging

SAIL flags content that the receiver has confirmed as inappropriate. This applies to all users: it covers not only predatory behaviour but also conversations between peers where requests may cross safety thresholds.

Position

Any system can implement the SAFE standard.

SAIL is one implementation aligned to it.

Implementation does not equal verification.

Implementation is not proof.

