Sign the Scope (ROE and Authorization)

Establish written authorization, a defined scope, and rules of engagement before any other play runs. This is the gate. Nothing in this playbook is legal without it.

The Play

Every other play in this book is an action against a system. The only thing that separates a sanctioned assessment from a crime is a written authorization signed by someone with the authority to grant it. The trust boundary here is not technical; it is legal and organizational: the line between what you were asked to do and what you were not. The ROE makes that boundary explicit before you touch anything, so that when an attack succeeds, the success is authorized, the scope is known, and the blast radius was agreed in advance. It works because the people who own the risk decided to accept it on the record, in writing, with their names on it.

Before the Snap

Stand up an owned lab you control end to end: a self-hosted LLM endpoint or a deliberately vulnerable AI app on your own machine, on a network you own. Download the Red Team Guide ROE template and read NIST SP 800-115 section 6 (Security Assessment Planning). Identify, for this lab, who the "system owner" is (you), what the actual asset boundary is (which containers, ports, models, and data stores), and what is explicitly off limits even in your own environment (your real accounts, production keys, anything shared). Have the template open and ready to fill in before you write a single test.

Run It

  1. Read the planning material in NIST SP 800-115 (section 6, Security Assessment Planning): note the three-phase model (planning, execution, post-execution) and what the standard expects you to fix in writing before execution, namely scope, rules, and logistics.
  2. Open the Red Team Guide ROE template. Walk every section once top to bottom so you know what the document is asking for before you start filling it.
  3. Define and write the scope: the exact targets in your lab. List the hostnames, IPs, ports, model endpoints, and app URLs that are in scope. This is the allow list. If it is not on this list, it is not a target.
  4. Write the explicit exclusions: the out-of-scope assets and prohibited actions. Even in your own lab, name what you will not touch (real credentials, shared infrastructure, anything that could leak to production or third parties).
  5. Fill the rules of engagement: allowed test windows, allowed techniques, data-handling rules, and the stop conditions. Decide in advance what makes you halt and call it (unexpected data exposure, instability, anything outside scope).
  6. Fill in the points of contact and the authorization block: who owns the system, who approves, and how to reach them. In a solo lab you sign as both owner and operator, which is the point: you are putting your name on the boundary.
  7. Sign and date the authorization block. An unsigned ROE is a draft, not authority. The signature is the gate.
  8. Store the signed ROE where the engagement record lives, then verify: re-read it and confirm every target you intend to test appears in scope and nothing you test falls outside it.
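
The verification in step 8 can be sketched as a small check. This is a minimal illustration, not part of the ROE template: the function name `verify_plan` and the host values are hypothetical, and it assumes you have copied the in-scope and out-of-scope lists out of your signed ROE by hand.

```python
def verify_plan(planned, in_scope, out_of_scope):
    """Return (unlisted, excluded).

    unlisted: planned targets that are missing from the allow list.
    excluded: planned targets that appear on the exclusion list.
    """
    allow = {t.strip().lower() for t in in_scope}
    deny = {t.strip().lower() for t in out_of_scope}
    wanted = {t.strip().lower() for t in planned}
    unlisted = sorted(t for t in wanted if t not in allow)
    excluded = sorted(t for t in wanted if t in deny)
    return unlisted, excluded


# Hypothetical lab values, for illustration only.
IN_SCOPE = ["127.0.0.1:8080", "lab.local:11434"]
OUT_OF_SCOPE = ["api.example.com:443"]
PLANNED = ["lab.local:11434", "api.example.com:443"]

unlisted, excluded = verify_plan(PLANNED, IN_SCOPE, OUT_OF_SCOPE)
if unlisted or excluded:
    print("Do not start. Not on allow list:", unlisted, "| Excluded:", excluded)
else:
    print("Every planned target is in scope.")
```

The check is deliberately strict: a target must be on the allow list to pass, so anything unlisted fails even if nobody thought to exclude it.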

What You Learn

Authorization is not a formality you collect at the end; it is the precondition that makes everything else legal. The transferable lesson is that scope is a contract, not a suggestion: the gap between "what the ROE says" and "what you actually did" is the single most common way a sanctioned assessment turns into an incident. Define the boundary in writing, get it signed, and stay inside it. The failure classes are scope creep and verbal-only authorization, both of which leave you with no defense when something goes wrong.

Drive It with Claude Code

Read the ROE template at docs/roe-template.md, then draft a Rules of Engagement for my owned AI lab, ready for my signature: fill in the authorized target as my local self-hosted model endpoint, set the testing window to this week, list allowed techniques as prompt injection and jailbreak probing only, and define hard stop conditions plus an out-of-scope list. Output the completed ROE to ROE-mylab.md and flag any field I left blank.

## Rules of Engagement (ROE), AI/LLM Security Test
 
authorization:
  authorized_by: "<asset owner name, role>"      # the person who can grant consent
  operator: "<your name, contact>"
  signed_date: "<YYYY-MM-DD>"
  written_consent: true                            # must be true before any probe runs
 
scope:
  targets:                                         # owned or sanctioned ONLY
    - "lab.local model endpoint (self-hosted)"
  in_scope:
    - "prompt injection probing"
    - "jailbreak resistance testing"
    - "system-prompt extraction attempts"
  out_of_scope:
    - "any third-party or shared API key"
    - "production data, real PII"
    - "denial of service / resource exhaustion"
 
window:
  start: "<YYYY-MM-DDTHH:MMZ>"
  end:   "<YYYY-MM-DDTHH:MMZ>"
  timezone: "<TZ>"
 
rules:
  data_handling: "no real PII; synthetic prompts only; findings stored encrypted"
  hard_stops:                                      # stop immediately if hit
    - "evidence of a real user data leak"
    - "any access to a system outside the target list"
  deconfliction_contact: "<phone / channel for live abort>"
 
mapping:
  framework: "OWASP LLM Top 10 + MITRE ATLAS"      # tag every finding to both
  report_to: "<owner contact for the final report>"
 
# RULE: if a field above is blank, testing does NOT start.
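
The closing rule of the template (a blank field means testing does not start) can be checked mechanically. The sketch below is one possible approach, not part of the template: it assumes the ROE has already been loaded into a nested Python dict (for example with a YAML parser, not shown here), and the helper names `blank_fields` and `may_start` are hypothetical. It treats empty values and unfilled `<...>` placeholders as blank.

```python
import re

# Matches template placeholders such as "<YYYY-MM-DD>".
PLACEHOLDER = re.compile(r"^<.*>$")


def is_blank(value):
    """True for None, empty strings, and unfilled "<...>" placeholders."""
    if value is None:
        return True
    if isinstance(value, str):
        text = value.strip()
        return not text or bool(PLACEHOLDER.match(text))
    return False


def blank_fields(node, path=""):
    """Recursively collect the paths of blank fields in a nested ROE dict."""
    problems = []
    if isinstance(node, dict):
        if not node:
            problems.append(path)
        for key, value in node.items():
            problems += blank_fields(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        if not node:
            problems.append(path)
        for index, item in enumerate(node):
            problems += blank_fields(item, f"{path}[{index}]")
    elif is_blank(node):
        problems.append(path)
    return problems


def may_start(roe):
    """Testing may start only with no blank fields and written consent."""
    consent = roe.get("authorization", {}).get("written_consent") is True
    return consent and not blank_fields(roe)


# Partial, hypothetical ROE for illustration.
roe = {
    "authorization": {
        "authorized_by": "<asset owner name, role>",  # still a placeholder
        "operator": "Sam Operator, sam@lab.local",    # hypothetical value
        "signed_date": "",
        "written_consent": True,
    },
    "scope": {"targets": ["lab.local model endpoint (self-hosted)"]},
}
print("Blank fields:", blank_fields(roe))
print("May start:", may_start(roe))
```

A check like this only confirms that the fields are filled in. It cannot confirm that the person named in `authorized_by` actually has the authority to sign.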

Defend It

For the organization granting access: never authorize an assessment without a written, signed ROE that names the scope, the windows, the stop conditions, and the approving owner. Require a defined allow list and an explicit out-of-scope list, not "everything." Insist on named points of contact on both sides and a kill-switch contact reachable during the test.

For the operator: treat the signed ROE as a hard fence. Before each action, confirm the target is on the in-scope list. If you find yourself wanting to touch something off-list, stop and get the ROE amended and re-signed first. Re-scoping mid-engagement is allowed; doing it silently is not. ROE discipline is the mitigation, and it is enforced by the people who hold the pen, not by tooling.
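
As a memory aid for the operator's "before each action" check, here is a small sketch of a pre-action gate. It does not replace the signed document or the people enforcing it. The `action_allowed` function, the dict layout, and the dates and hostnames are all hypothetical and chosen only for illustration.

```python
from datetime import datetime, timezone


def action_allowed(target, now, roe):
    """Return (allowed, reason) for one action against `target` at time `now`."""
    if not roe["signed"]:
        return False, "ROE is unsigned"
    if target not in roe["targets"]:
        return False, f"{target} is not on the in-scope list"
    if not (roe["window_start"] <= now <= roe["window_end"]):
        return False, "outside the agreed test window"
    return True, "ok"


# Hypothetical signed ROE, reduced to the fields this gate needs.
ROE = {
    "signed": True,
    "targets": {"lab.local:11434"},
    "window_start": datetime(2025, 1, 6, 9, 0, tzinfo=timezone.utc),
    "window_end": datetime(2025, 1, 10, 17, 0, tzinfo=timezone.utc),
}

allowed, reason = action_allowed(
    "lab.local:11434", datetime.now(timezone.utc), ROE
)
print(allowed, reason)
```

If the gate says no because a target is off-list, the next step is the one described above: stop, amend the ROE, and get it re-signed. Editing the dict is not an amendment.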

Krypteia Agent (coming soon)

The Krypteia agent refuses to fire a single probe until a signed scope is loaded. It then runs the whole engagement inside those exact boundaries while a multi-agent crew maps every finding to OWASP LLM and MITRE ATLAS in an operator console.