Automation Performance

Achiral AI learns which of your automation rules are worth running. Every time an automation template finishes, Chiro records the outcome, updates a per-rule utility score, and uses that score the next time the rule could fire — ranking healthy rules first, gating shaky ones behind approval, and pausing rules that consistently fail.

This is a reweighting system, not a rule generator: it changes how often and how cautiously your existing templates fire. It never rewrites a template or invents new ones.

Two levels of rule memory

Achiral AI keeps a rule's learning at two separate, independent levels:

Org-level memory (this page). One shared utility per rule for the whole organisation, stored on the rule itself. It folds in everyone's outcomes together, is visible to owners and admins, and sets the rule's default firing behaviour for the organisation.
Member-level memory. A separate, private utility for each person, kept only when a run can be attributed to an individual. It never appears on this page and is managed by each member from their own account — see Your Automation Rules.

The two never overwrite each other: organisation outcomes never touch a member's private score, and a member's own pause or approval never changes the shared organisation rule. Everything below describes the org-level layer.

How a rule earns its score

Each template carries an organisation-level utilityScore between 0 and 1:

A brand-new rule starts at a neutral 0.5 — no rule is penalised before it has run.
A successful run raises the score; a failed run lowers it.
Older outcomes fade over time (a per-day decay), so a rule's score reflects its recent behaviour rather than its entire history.
A rule that has not run in a while drifts back toward the neutral 0.5 as its past outcomes decay.

Scores are computed from time-decayed success and trial counters, so a single new outcome shifts the score of a rule with a long, stable track record less than it shifts the score of a rule with only a handful of runs.
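This page does not give the exact formula. As a minimal sketch, one scheme that reproduces the behaviour described above is a smoothed ratio of decayed counters. The class name `RuleCounters`, the pseudo-count prior, and the formula itself are assumptions for illustration; only the 0.5 neutral score and the 0.98 per-day decay come from this page.

```python
from dataclasses import dataclass

DECAY_PER_DAY = 0.98    # shipped default from the Defaults table
PRIOR_SUCCESSES = 1.0   # assumed pseudo-counts: 1 / 2 gives 0.5 with no data
PRIOR_TRIALS = 2.0


@dataclass
class RuleCounters:
    """Hypothetical container for a rule's time-decayed counters."""
    successes: float = 0.0
    trials: float = 0.0

    def decay(self, days: float) -> None:
        # Fade both counters by the per-day factor for the elapsed time.
        factor = DECAY_PER_DAY ** days
        self.successes *= factor
        self.trials *= factor

    def record(self, success: bool) -> None:
        # Every scored run is one trial; only successes add to the numerator.
        self.trials += 1.0
        if success:
            self.successes += 1.0

    @property
    def utility_score(self) -> float:
        # Smoothed success ratio; equals 0.5 when both counters are zero.
        return (self.successes + PRIOR_SUCCESSES) / (self.trials + PRIOR_TRIALS)
```

Under these assumptions, a failure after two successes drops the score from 0.75 to 0.6, while a failure after fifty successes moves it by less than 0.02, and a rule left idle for a year decays to within 0.01 of 0.5.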

What counts as success or failure

When a run reaches a terminal state, it is attributed as follows:

Run outcome	Recorded as
Completed with every step succeeding	Success
Completed, but a step failed under an `onFailure: continue` / `skip` setting	Success, flagged degraded for observability
Failed (a step failed under `onFailure: stop`)	Failure
An approved step that then failed at execution	Failure
A required approval was rejected	Failure
Cancelled by a person	Neutral — no score change

The distinction on the last two rows is deliberate: rejecting an approval is a signal that the rule proposed the wrong action (a strong negative), whereas a person cancelling a run is treated as neutral.
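The attribution table can be read as a small decision function. The sketch below is illustrative only: the `RunResult` fields and the status strings are hypothetical names, not the product's actual data model.

```python
from dataclasses import dataclass
from enum import Enum


class Credit(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    NEUTRAL = "neutral"


@dataclass
class RunResult:
    """Hypothetical summary of a run that has reached a terminal state."""
    status: str                       # "completed", "failed", or "cancelled"
    approval_rejected: bool = False   # a required approval was rejected
    tolerated_step_failures: int = 0  # steps that failed under continue/skip


def attribute(run: RunResult) -> tuple[Credit, bool]:
    """Return (credit, degraded) following the table above."""
    if run.status == "cancelled":
        # A person cancelling a run leaves the score unchanged.
        return Credit.NEUTRAL, False
    if run.approval_rejected or run.status == "failed":
        # Covers onFailure: stop, approved-then-failed steps, and rejections.
        return Credit.FAILURE, False
    if run.status == "completed":
        # Still a success, but flagged degraded if any step failure was tolerated.
        return Credit.SUCCESS, run.tolerated_step_failures > 0
    raise ValueError(f"run is not in a terminal state: {run.status!r}")
```

For example, `attribute(RunResult("completed", tolerated_step_failures=1))` yields a success with the degraded flag set, whereas a rejected approval yields a failure regardless of the other fields.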

Delayed credit from operational insights

Rules triggered by an operational insight get a second, slower signal in addition to their mechanical run outcome.

A periodic maintenance pass looks back over insight-triggered runs once each insight has had time to resolve:

If the triggering insight resolved after the run, the rule receives positive credit — the automation plausibly helped.
If the insight stayed open past the maturity window, the rule receives negative credit.
If the insight was dismissed, or had already resolved before the run finished, the run is treated as neutral.

This credit is applied at most once per run, and a positive signal is never applied to a run that did not itself complete — a coincidental insight resolution cannot paper over a run that actually failed. When several runs were triggered by the same insight, the credit is collapsed so one resolved insight rewards the rule once rather than once per run.
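The delayed-credit rules above can be sketched as follows. Several details are assumptions: the seven-day maturity window (this page does not state its length), measuring that window from the end of the run, treating a failed run whose insight later resolved as neutral, and all class and function names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

MATURITY_WINDOW = timedelta(days=7)  # assumed; the real window is not documented here


@dataclass
class Insight:
    id: str
    resolved_at: Optional[datetime] = None
    dismissed: bool = False


@dataclass
class InsightRun:
    id: str
    insight_id: str
    finished_at: datetime
    completed: bool
    credited: bool = False  # guards the "at most once per run" rule


def delayed_credit(run: InsightRun, insight: Insight, now: datetime) -> Optional[int]:
    """Return +1, -1, 0 (neutral), or None if there is nothing to apply yet."""
    if run.credited:
        return None
    if insight.dismissed:
        return 0
    if insight.resolved_at is not None:
        if insight.resolved_at <= run.finished_at:
            return 0  # already resolved before the run finished
        # Positive credit is never given to a run that did not complete.
        return 1 if run.completed else 0
    if now - run.finished_at >= MATURITY_WINDOW:
        return -1  # insight stayed open past the maturity window
    return None    # not mature yet; look again on a later pass


def credit_pass(runs: list[InsightRun], insights: dict[str, Insight],
                now: datetime) -> dict[str, int]:
    """Collapse credit so each insight contributes a single signal per pass."""
    per_insight: dict[str, int] = {}
    for run in runs:
        credit = delayed_credit(run, insights[run.insight_id], now)
        if credit is None:
            continue
        run.credited = True
        previous = per_insight.get(run.insight_id, 0)
        # Keep a non-neutral signal if any run for this insight earned one.
        per_insight[run.insight_id] = credit if credit != 0 else previous
    return per_insight
```

With two completed runs triggered by the same insight, `credit_pass` returns a single `+1` for that insight rather than one per run, and a second pass over the same runs returns nothing because both are already marked as credited.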

How scores change firing behaviour

At firing time, a rule's score determines how — or whether — it runs:

Ranking. When several rules match the same trigger, they are ordered by score, with an exploration allowance so that under-tried rules still get a chance instead of being crowded out by established ones. Changed in v3.12.0: when a run can be attributed to an individual, this ranking also blends in that person's own track record with the rule — see Your Automation Rules.
Approval gating. A rule whose utility sits in the low band is downgraded so its actions require human approval before they execute, even if they would normally run automatically.
Suppression. A rule whose utility falls below the floor, and which has enough recent runs to be judged fairly, is paused from auto-firing for a cooldown period. Manually running a template is never blocked by suppression.
Auto-fire cap. The number of rules that auto-fire in a single trigger cycle is capped to prevent a noisy cycle from firing everything at once.

New rules are protected from premature suppression: a rule is only paused once it has accrued a minimum number of recent trials, so the system explores cold-start rules rather than starving them.
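To make the interaction of these four mechanisms concrete, here is a rough sketch of one trigger cycle using the shipped default thresholds. The exploration bonus (a term that shrinks as trials accumulate), the `deferred` label for rules beyond the cap, and the choice not to count approval-gated rules toward the cap are all assumptions; the page does not specify them. The cooldown timer and the per-member blending are left out.

```python
from dataclasses import dataclass
from math import sqrt

APPROVE_BELOW = 0.4        # shipped defaults from the Defaults table
SUPPRESS_BELOW = 0.25
MIN_TRIALS = 5
MAX_AUTO_FIRES = 5
EXPLORATION_WEIGHT = 0.2   # assumed; the real exploration allowance is not documented


@dataclass
class RuleStats:
    rule_id: str
    utility_score: float
    recent_trials: float


def plan_cycle(rules: list[RuleStats]) -> dict[str, str]:
    """Return rule_id -> decision, in ranking order, for one trigger cycle."""
    def rank_key(rule: RuleStats) -> float:
        # Under-tried rules get a bonus so established rules do not crowd them out.
        return rule.utility_score + EXPLORATION_WEIGHT / sqrt(1.0 + rule.recent_trials)

    plan: dict[str, str] = {}
    auto_fired = 0
    for rule in sorted(rules, key=rank_key, reverse=True):
        if rule.utility_score < SUPPRESS_BELOW and rule.recent_trials >= MIN_TRIALS:
            plan[rule.rule_id] = "suppressed"
        elif rule.utility_score < APPROVE_BELOW:
            plan[rule.rule_id] = "needs_approval"
        elif auto_fired < MAX_AUTO_FIRES:
            plan[rule.rule_id] = "auto_fire"
            auto_fired += 1
        else:
            plan[rule.rule_id] = "deferred"  # over the per-cycle cap
    return plan
```

Note how a rule scoring 0.2 after only two runs is gated for approval rather than suppressed: it is below the floor, but it has not yet reached the minimum trial count.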

Where to review performance

Performance is shown in your organisation workspace under the Memory tab (visible to owners and admins), in the Automation Performance panel. The panel lists each rule with its utility score, success and failure counts, last outcome, and a state badge:

Badge	Meaning
Healthy	Scoring well; fires normally
Learning	Too few recent runs to judge; still exploring
Approval-gated	Low utility; actions require approval before running
Suppressed	Paused from auto-firing during its cooldown

Suppressed rules are surfaced first. Owners and admins can lift a pause with the Un-suppress action, and can optionally reset the rule's score to the neutral 0.5 so that it gets a genuine fresh start.

Reviewing performance via the API

Two organisation-scoped endpoints back the panel:

GET /api/automation-templates/:id/performance — returns the utility score, success rate, run counts, last outcome, suppression state, and the active thresholds. Readable by any organisation member.
POST /api/automation-templates/:id/unsuppress — lifts the auto-fire cooldown; pass { "resetScore": true } to also reset the decayed counters to the neutral prior. Admin or owner only.
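A minimal sketch of how a client might build these two requests with Python's standard library. The host name and the bearer-token authentication are assumptions, as this page does not describe how requests are authenticated; only the paths and the `resetScore` body come from the text above.

```python
import json
from urllib.request import Request

BASE_URL = "https://achiral.example.com"  # hypothetical host


def performance_request(template_id: str, token: str) -> Request:
    """Build the GET request for a template's performance summary."""
    return Request(
        f"{BASE_URL}/api/automation-templates/{template_id}/performance",
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
        method="GET",
    )


def unsuppress_request(template_id: str, token: str,
                       reset_score: bool = False) -> Request:
    """Build the POST request that lifts the auto-fire cooldown."""
    headers = {"Authorization": f"Bearer {token}"}  # assumed auth scheme
    data = None
    if reset_score:
        # Also reset the decayed counters to the neutral prior.
        data = json.dumps({"resetScore": True}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(
        f"{BASE_URL}/api/automation-templates/{template_id}/unsuppress",
        data=data,
        headers=headers,
        method="POST",
    )
```

Either request can be sent with `urllib.request.urlopen`. The response field names are not documented on this page, so parsing is left out of the sketch.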

Shadow mode

Reweighting is controlled by the AUTOMATION_REWEIGHTING_ENABLED setting. When it is turned off, Achiral AI still records every outcome and keeps scores up to date, but it never changes firing behaviour — no ranking, no approval-gating, and no suppression. This lets an organisation accumulate accurate performance data before the behavioural half of the feature is switched on.

Defaults

The thresholds below are the shipped defaults and are configurable per deployment.

Setting	Default	Effect
Neutral starting score	0.5	Score for a rule with no outcomes
Per-day decay	0.98	How quickly older outcomes fade
Approve-below band	0.4	Utility under this gates actions for approval
Suppress-below floor	0.25	Utility under this (with enough trials) pauses auto-fire
Minimum trials before suppression	5	Recent runs required before a rule can be paused
Auto-fires per cycle	5	Cap on rules auto-firing in one trigger cycle
Suppression cooldown	24 hours	How long a paused rule stays out of the auto-fire pool
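To get a feel for the per-day decay default, the arithmetic below assumes the value is a multiplier applied to an outcome's weight once per day. That is the natural reading of the table, but it is an assumption rather than a documented formula.

```python
import math

DECAY_PER_DAY = 0.98  # shipped default from the table above


def remaining_weight(days: float) -> float:
    """Fraction of an outcome's original weight left after `days` days."""
    return DECAY_PER_DAY ** days


def half_life_days() -> float:
    """Days until an outcome counts for half of its original weight."""
    return math.log(0.5) / math.log(DECAY_PER_DAY)
```

Under that reading, an outcome keeps roughly 87% of its weight after a week and loses half of it after about 34 days, so raising the value toward 1 makes scores remember longer and lowering it makes them react faster.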