Step 5: Validate Scenario Lifecycle
The scenario validator is the gate between “endpoint exists” and “tests can be written against it.” It drives the full SDK lifecycle against every scenario, iteratively fixes whatever is broken, and records the final, reconciled scenario trees as scenario-recipes.json for the Autonoma dashboard.
This step must pass before Step 6 (test generation) runs. A PostToolUse validation gate in the plugin blocks test-file writes until the sentinel autonoma/.endpoint-validated exists, so you cannot accidentally generate tests against broken scenario data.
Prerequisites
- autonoma/entity-audit.md (output from Step 2)
- autonoma/scenarios.md (output from Step 3)
- autonoma/.endpoint-implemented sentinel (output from Step 4)
- A running dev server that exposes the Environment Factory endpoint
- AUTONOMA_SHARED_SECRET and AUTONOMA_SIGNING_SECRET set in the server’s environment
What this produces
- autonoma/scenario-recipes.json: the validated create tree for every scenario, keyed by scenario name, with a variables block listing every {{token}} placeholder
- autonoma/.scenario-validation.json: terminal artifact recording validation status, preflight result, and any edits the agent made to scenarios.md
- autonoma/.endpoint-validated: sentinel that unlocks Step 6
- Uploaded scenario recipes on the Autonoma dashboard, attached to this generation
What the agent does
The iteration loop
For each scenario in scenarios.md, the agent runs an HMAC-signed discover → up → down loop against the live endpoint.
- discover: fetches the schema. Every model in the entity audit must appear under schema.models. Every model marked independently_created: true must have a factory registered on the handler.
- up: sends the scenario’s create tree. The agent verifies that the response includes a non-empty auth block, that every expected record exists in the database (read-only SELECT queries), and that refsToken is returned.
- down: tears down using the signed refs token. The agent verifies that every record created by up is gone and that nothing outside the refs was touched.
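As a rough illustration of what an HMAC-signed request in this loop might look like, the sketch below signs a JSON body with the shared secret and verifies it again. The header name `x-signature`, the choice of HMAC-SHA256 over the raw body, and the hex encoding are all assumptions made for the example; the protocol docs define the actual scheme.

```python
import hashlib
import hmac
import json


def sign_body(body: dict, shared_secret: str) -> tuple[bytes, str]:
    """Serialize the body once and sign exactly those bytes.

    Assumption: HMAC-SHA256 over the raw JSON body, hex-encoded.
    """
    raw = json.dumps(body, separators=(",", ":"), sort_keys=True).encode("utf-8")
    signature = hmac.new(shared_secret.encode("utf-8"), raw, hashlib.sha256).hexdigest()
    return raw, signature


def verify_signature(raw: bytes, signature: str, shared_secret: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = hmac.new(shared_secret.encode("utf-8"), raw, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)


def build_request(action: str, shared_secret: str, **fields) -> dict:
    """Build the body and headers for a discover, up, or down call."""
    raw, signature = sign_body({"action": action, **fields}, shared_secret)
    return {
        "body": raw,
        "headers": {
            "content-type": "application/json",
            "x-signature": signature,  # hypothetical header name
        },
    }
```

The important detail is that the signature covers the exact bytes that are sent, so the body is serialized once and reused rather than re-encoded by the HTTP client.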
If a scenario fails, the agent decides whether the handler is wrong or the scenario is wrong:
| Symptom | Fix |
|---|---|
| Factory missing, FK unresolved, handler crash, auth callback broken | Fix the handler in the backend and retry |
| Scenario references a model that doesn’t exist, requires an impossible unique constraint, or depends on a field the schema doesn’t have | Edit scenarios.md to match reality and retry |
The loop runs up to 5 iterations. If it still hasn’t converged, the agent stops and surfaces the failure — it does not write the validated sentinel.
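The triage table and the iteration cap can be sketched together as a bounded retry loop. This is only an illustration of the control flow: the symptom labels, `validate_all`, and `apply_fix` are hypothetical names, and the real agent decides on fixes by reading errors rather than matching fixed strings.

```python
from pathlib import Path
from typing import Callable, Optional

MAX_ITERATIONS = 5  # cap stated in the text above

# Symptom categories from the table above; the string labels are hypothetical.
HANDLER_SYMPTOMS = {"factory_missing", "fk_unresolved", "handler_crash", "auth_callback_broken"}
SCENARIO_SYMPTOMS = {"unknown_model", "impossible_unique_constraint", "unknown_field"}


def triage(symptom: str) -> str:
    """Return which side to fix: 'handler' or 'scenario'."""
    if symptom in HANDLER_SYMPTOMS:
        return "handler"
    if symptom in SCENARIO_SYMPTOMS:
        return "scenario"
    raise ValueError(f"unclassified symptom: {symptom}")


def run_validation(
    validate_all: Callable[[], Optional[str]],
    apply_fix: Callable[[str, str], None],
    sentinel: Path,
) -> bool:
    """Validate, fix, and retry at most MAX_ITERATIONS times.

    validate_all returns None on success or a symptom label on failure.
    The sentinel is written only if validation converges.
    """
    for iteration in range(1, MAX_ITERATIONS + 1):
        symptom = validate_all()
        if symptom is None:
            sentinel.write_text("validated\n")
            return True
        if iteration < MAX_ITERATIONS:
            # No point fixing after the last attempt: nothing would re-check it.
            apply_fix(triage(symptom), symptom)
    return False  # cap reached: surface the failure, never write the sentinel
```

Returning `False` without touching the sentinel is what keeps Step 6 blocked when the loop does not converge.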
Scenario recipes
Once every scenario passes, the agent emits scenario-recipes.json. Each recipe is the exact nested tree that was proven to work in up, plus a variables block mapping every {{token}} to the concrete value used during validation. The file is validated against ScenarioRecipesFileSchema (in @autonoma/types) by both the local preflight and the dashboard upload endpoint. Full field-by-field contract (including the variables tagged union and all rejection reasons) lives in the Scenario Recipe Schema reference. The shape is:
    {
      "version": 1,
      "source": {
        "discoverPath": "autonoma/discover.json",
        "scenariosPath": "autonoma/scenarios.md"
      },
      "validationMode": "endpoint-lifecycle",
      "recipes": [
        {
          "name": "standard",
          "description": "Realistic dataset for core flows",
          "create": {
            "Organization": [
              {
                "_alias": "org1",
                "name": "Acme",
                "projects": [
                  { "title": "{{project_title}}" }
                ]
              }
            ]
          },
          "variables": {
            "project_title": { "strategy": "literal", "value": "Launch Campaign" }
          },
          "validation": {
            "status": "validated",
            "method": "endpoint-up-down",
            "phase": "ok",
            "up_ms": 12,
            "down_ms": 8
          }
        }
      ]
    }

Required invariants (the upload endpoint rejects otherwise):
- version is the integer 1 (not the string "1.0").
- source is an object with BOTH discoverPath and scenariosPath as non-empty strings.
- validationMode is "sdk-check" or "endpoint-lifecycle".
- recipes is an array (not a map) with at least one entry; each entry has name, description, create, and validation.
- variables values use strategy: "literal" | "derived" | "faker". derived additionally requires source: "testRunId" and a format string. faker requires a generator id.
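The invariants listed above can be approximated with a small structural check. This is a sketch for illustration, not the real ScenarioRecipesFileSchema from @autonoma/types; in particular, the `generator` field name for the faker strategy is an assumption based on the wording "generator id".

```python
ALLOWED_MODES = {"sdk-check", "endpoint-lifecycle"}
REQUIRED_RECIPE_KEYS = ("name", "description", "create", "validation")


def check_variable(name: str, spec: dict) -> list[str]:
    """Check one entry of a recipe's variables block."""
    strategy = spec.get("strategy") if isinstance(spec, dict) else None
    if strategy == "literal":
        return []
    if strategy == "derived":
        ok = spec.get("source") == "testRunId" and isinstance(spec.get("format"), str)
        return [] if ok else [f"{name}: derived needs source 'testRunId' and a format string"]
    if strategy == "faker":
        # Assumption: the generator id lives under a "generator" key.
        return [] if spec.get("generator") else [f"{name}: faker needs a generator id"]
    return [f"{name}: unknown strategy {strategy!r}"]


def check_recipes_file(doc: dict) -> list[str]:
    """Return a list of invariant violations; an empty list means the file passes."""
    errors = []
    version = doc.get("version")
    if type(version) is not int or version != 1:
        errors.append("version must be the integer 1")
    source = doc.get("source")
    for key in ("discoverPath", "scenariosPath"):
        value = source.get(key) if isinstance(source, dict) else None
        if not isinstance(value, str) or not value:
            errors.append(f"source.{key} must be a non-empty string")
    mode = doc.get("validationMode")
    if not isinstance(mode, str) or mode not in ALLOWED_MODES:
        errors.append("validationMode must be sdk-check or endpoint-lifecycle")
    recipes = doc.get("recipes")
    if not isinstance(recipes, list) or not recipes:
        errors.append("recipes must be a non-empty array")
        return errors
    for i, recipe in enumerate(recipes):
        if not isinstance(recipe, dict):
            errors.append(f"recipes[{i}] must be an object")
            continue
        for key in REQUIRED_RECIPE_KEYS:
            if key not in recipe:
                errors.append(f"recipes[{i}] is missing {key}")
        for name, spec in recipe.get("variables", {}).items():
            errors.extend(check_variable(name, spec))
    return errors
```

Collecting every violation instead of stopping at the first one makes a failed upload easier to fix in a single pass.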
Preflight
Before uploading, the agent runs preflight_scenario_recipes.py against the file. Preflight is a deterministic Python check that enforces structural invariants:
- every scenario listed in scenarios.md frontmatter appears as a recipe
- every {{token}} referenced in a tree is declared in variables
- the create tree roots at the scope entity from discover
- variables values are concrete, not placeholder
If preflight fails, the agent stops — the dashboard never sees a malformed recipe.
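Two of the preflight checks above (scenario coverage and token declaration) are simple enough to sketch. The code below is not preflight_scenario_recipes.py itself, only an illustration of the idea; the token pattern assumes names made of letters, digits, and underscores.

```python
import json
import re

# Assumption: tokens look like {{name}} with a word-character name and no inner spaces.
TOKEN_RE = re.compile(r"\{\{(\w+)\}\}")


def tokens_in(tree) -> set[str]:
    """Collect every {{token}} name that appears anywhere in a create tree."""
    return set(TOKEN_RE.findall(json.dumps(tree)))


def undeclared_tokens(recipe: dict) -> set[str]:
    """Tokens used in the create tree but missing from the variables block."""
    return tokens_in(recipe.get("create", {})) - set(recipe.get("variables", {}))


def missing_recipes(scenario_names: list[str], recipes: list[dict]) -> list[str]:
    """Scenario names from the frontmatter that have no matching recipe."""
    present = {recipe.get("name") for recipe in recipes}
    return [name for name in scenario_names if name not in present]
```

Serializing the tree to JSON before scanning avoids writing a recursive walk: any token nested in a list or child object still shows up in the serialized text.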
Upload
On success, the plugin orchestrator uploads the recipes to /v1/setup/setups/:id/scenario-recipe-versions. The response must be 200 or 201. Upload failures also block Step 6.
Review checkpoint
After validation completes, review:
- Scenario edits: did the agent modify scenarios.md? If yes, read the edits carefully. A small edit (correcting a field name) is fine; a large structural change suggests the original scenario design missed something and is worth revisiting before moving on.
- Auth block: the up response’s auth block is what tests use to log in. Confirm it contains usable credentials (session cookie, JWT, etc.) for every role the scenarios define.
- Clean teardown: the agent verified down leaves no orphans. If your schema has triggers or cascade rules that the ORM doesn’t know about, this is where you’ll catch them.
- Upload success: the recipes uploaded successfully and are visible on the Autonoma dashboard for this generation.
What happens next
Step 6 (E2E Test Generation) consumes scenarios.md (possibly edited) as the source of truth for test data. Every {{token}} placeholder in the tests corresponds to a variable declared in scenario-recipes.json, so the test runner can substitute the real values at execution time.
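To make the substitution step concrete, here is a minimal sketch of how a runner could resolve {{token}} placeholders from a recipe's variables block. It covers only the literal and derived strategies, and it assumes a derived format string carries a `{testRunId}` slot; how the actual test runner resolves variables is not specified here.

```python
import re

TOKEN_RE = re.compile(r"\{\{(\w+)\}\}")


def resolve_variable(spec: dict, test_run_id: str) -> str:
    """Turn one variables entry into its concrete value."""
    strategy = spec["strategy"]
    if strategy == "literal":
        return str(spec["value"])
    if strategy == "derived":
        # Assumption: the format string contains a {testRunId} slot.
        return spec["format"].replace("{testRunId}", test_run_id)
    raise NotImplementedError(f"strategy {strategy!r} is not covered by this sketch")


def substitute(text: str, variables: dict, test_run_id: str) -> str:
    """Replace every {{token}} in text, failing loudly on undeclared tokens."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undeclared token: {name}")
        return resolve_variable(variables[name], test_run_id)

    return TOKEN_RE.sub(replace, text)
```

Raising on an undeclared token mirrors the preflight rule that every token in a tree must be declared in variables.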
Safety
The validator only writes through the SDK endpoint. It never runs INSERT, UPDATE, DELETE, DROP, or TRUNCATE directly, even if validation fails repeatedly. Read-only SELECT queries are used for database verification. The SDK’s down action is the only deletion path, and it only removes what the matching up created (verified by the signed refs token).
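One way to enforce the read-only rule mechanically is a guard that every verification query must pass before it runs. The sketch below is a deliberately conservative, hypothetical helper and not part of the plugin; keyword matching is not a substitute for running verification under a database role that has only SELECT privileges.

```python
import re

FORBIDDEN = re.compile(r"\b(INSERT|UPDATE|DELETE|DROP|TRUNCATE)\b", re.IGNORECASE)


def is_read_only(sql: str) -> bool:
    """Accept a single SELECT statement that contains no write keyword.

    Conservative on purpose: it also rejects harmless queries that merely
    mention a forbidden word, for example inside a string literal.
    """
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:  # more than one statement in the string
        return False
    if not re.match(r"(?i)select\b", statement):
        return False
    return FORBIDDEN.search(statement) is None
```

A false rejection only costs a rewritten query, while a false acceptance could let a write through, so the guard errs toward rejecting.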
The prompt
Scenario Validator: iterative fix loop + reality reconciliation
The Environment Factory endpoint exists (Step 4 wrote autonoma/.endpoint-implemented). Your job is to prove it actually works and keep iterating until it does. The E2E test generator (Step 6) is gated on your sentinel — if you do not write autonoma/.endpoint-validated, no tests get generated.
Database safety (absolute)
- ALL writes go through the SDK endpoint only. Never INSERT/UPDATE/DELETE/DROP/TRUNCATE via psql or raw SQL.
- You MAY run SELECT via psql / ORM read queries to verify data.
- The SDK’s down action deletes only what up created (signed refs token).
Inputs
- autonoma/entity-audit.md
- autonoma/scenarios.md (may contain mistakes you will correct)
- The handler file created in Step 4
- A running dev server
- AUTONOMA_SDK_ENDPOINT and AUTONOMA_SHARED_SECRET
Outputs
- autonoma/scenario-recipes.json
- autonoma/.scenario-validation.json
- autonoma/.endpoint-validated
The loop
Repeat until all three actions succeed for every scenario OR you exhaust 5 iterations:
- Fetch protocol docs (first iteration only):

      curl -sSfL "$(cat autonoma/.docs-url)/llms/protocol.txt"
      curl -sSfL "$(cat autonoma/.docs-url)/llms/scenarios.txt"
      curl -sSfL "$(cat autonoma/.docs-url)/llms/test-planner/step-5-validate.txt"

- Export working secrets:

      export AUTONOMA_SHARED_SECRET=${AUTONOMA_SHARED_SECRET:-$(openssl rand -hex 32)}
      export AUTONOMA_SIGNING_SECRET=${AUTONOMA_SIGNING_SECRET:-$(openssl rand -hex 32)}

- Run discover via curl with proper HMAC.
  - Response MUST contain schema.models, schema.edges, schema.relations, schema.scopeField.
  - Coverage check: every model in entity-audit.md MUST appear in schema.models.
  - Factory coverage check: every model with independently_created: true MUST be registered on the handler.
  - Factory-body integrity check (deterministic, MANDATORY): grep the handler for raw DB/ORM writes. Any inline ORM/raw-SQL create inside a factory body for a model marked independently_created: true is a FAIL. Fix the handler to import and call the audited function and restart.
- For each scenario in scenarios.md:
  - Build {action:"up", create:..., testRunId:"<scenario>-<iteration>"} from the scenario.
  - HMAC-sign and POST.
  - On failure, pick one of three paths:
    - Handler bug → fix the handler and restart.
    - Scenario bug (field does not exist, FK target wrong) → edit scenarios.md to match reality and restart. Log the change.
    - Unfeasible scenario → REMOVE it from scenarios.md with justification. Restart.
  - On 200, parse auth, refs, refsToken.
    - Auth check: auth MUST be non-null and contain at least one of { cookies, headers, token, user }.
    - Refs check: every top-level model in the create tree MUST appear in refs.
  - Verify DB state with a read-only SELECT for at least one refs id.
  - POST {action:"down", refsToken}. Expect {ok:true}.
  - Verify the refs rows are gone.
- After every scenario passes cleanly, emit the scenario recipes.

  Write autonoma/scenario-recipes.json:

      {
        "version": 1,
        "source": {
          "discoverPath": "autonoma/discover.json",
          "scenariosPath": "autonoma/scenarios.md"
        },
        "validationMode": "endpoint-lifecycle",
        "recipes": [
          {
            "name": "standard",
            "description": "Realistic dataset for core flows",
            "create": {
              "Organization": [{ "_alias": "org1", "name": "Acme Corp" }]
            },
            "variables": {
              "testRunId": { "strategy": "derived", "source": "testRunId", "format": "{testRunId}" }
            },
            "validation": { "status": "validated", "method": "endpoint-up-down", "phase": "ok", "up_ms": 12, "down_ms": 8 }
          }
        ]
      }

  Rules:
  - Top-level keys MUST be exactly version, source, validationMode, recipes
  - version must be integer 1
  - source MUST be an object with BOTH discoverPath (path to autonoma/discover.json) and scenariosPath (path to autonoma/scenarios.md) as non-empty strings. The dashboard /v1/setup/setups/:id/scenario-recipe-versions endpoint will reject the upload if either is missing.
  - validationMode must be sdk-check or endpoint-lifecycle
  - recipes MUST include standard, empty, and large
  - Every recipe MUST contain name, description, create, and validation
  - create MUST use a nested tree rooted at the scope entity. Do NOT use flat top-level model keys connected only by _ref.
  - If create contains {{token}} placeholders, include a variables object. Every {{token}} in create must match a key in variables; every key in variables must be used in create.
- Run preflight on the emitted recipes:

      python3 "$(cat /tmp/autonoma-plugin-root)/hooks/preflight_scenario_recipes.py" \
        autonoma/scenario-recipes.json

  This resolves tokenized payloads and re-runs signed up/down against the live endpoint. If preflight exits non-zero, fix the failing recipe and re-run.
- Write autonoma/.scenario-validation.json:

      {
        "status": "ok",
        "preflightPassed": true,
        "smokeTestPassed": true,
        "validatedScenarios": ["standard", "empty", "large"],
        "failedScenarios": [],
        "blockingIssues": [],
        "recipePath": "autonoma/scenario-recipes.json",
        "validationMode": "endpoint-lifecycle",
        "endpointUrl": "http://localhost:3000/api/autonoma"
      }

- Write the sentinel autonoma/.endpoint-validated via the Write tool (NOT touch) with a short plain-text report.
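The auth and refs checks from the per-scenario step above can be sketched as a single function over the up response. The response shape used here (refs as an object keyed by model name) is an assumption for illustration; the protocol docs are authoritative.

```python
AUTH_KEYS = {"cookies", "headers", "token", "user"}


def check_up_response(create: dict, response: dict) -> list[str]:
    """Return the problems found in an up response; empty means it passes."""
    problems = []
    auth = response.get("auth")
    if not isinstance(auth, dict) or not AUTH_KEYS.intersection(auth):
        problems.append("auth must be non-null and contain one of cookies, headers, token, user")
    # Assumption: refs maps each model name to the records created for it.
    refs = response.get("refs") or {}
    for model in create:
        if model not in refs:
            problems.append(f"refs is missing top-level model {model}")
    if not response.get("refsToken"):
        problems.append("refsToken is missing")
    return problems
```

Running this before the database SELECT keeps the cheap structural failures separate from the ones that need a query to diagnose.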
Iteration discipline
- One handler fix per iteration, then re-run everything.
- If the same scenario fails twice in a row with the same error, the scenario itself is probably wrong; prefer editing scenarios.md.
- If you have edited scenarios.md, re-read it from disk after every edit.
When you hit the 5-iteration cap
STOP and write a clear failure report. Do NOT write .endpoint-validated. Include the last failing curl body + response, which scenario(s) failed, and which handler file + line range is most likely at fault. The orchestrator surfaces this to the user.
scenarios.md reconciliation rules
Preserve the frontmatter shape (the validator hook checks it). Allowed:
- Drop a scenario entirely (decrement scenario_count, update the scenarios summary).
- Remove/rename fields on a model to match what discover reports.
- Adjust FK aliases so they reference models that actually exist.
- Flatten cross-branch references that the handler cannot resolve.
Disallowed: silently changing a scenario’s intent (e.g. renaming “admin with one project” to “user with one project” without reflecting that in the description).