The Agent Risk Score: A Quantitative Posture Dashboard for CISOs

Every trading desk in Manhattan knows its Value-at-Risk before lunch. Most CISOs cannot tell you the agent-equivalent at four o’clock on a Friday.

Why the Agent Risk Score Matters Now

Security teams keep getting asked the same question by their audit committees: “How exposed are we to AI agents, in dollars?” The honest answer today is usually a shrug followed by some narrative bullets. That is not a posture. That is a vibe.

The capital-markets industry solved this problem three decades ago for portfolios. You decompose exposure into a small set of measurable factors, weight them, sum them, and produce a single rollup number you can defend in front of a regulator. The same discipline is overdue for agents.

Stat	Value	Source
Enterprises with at least one unsanctioned agent on a managed endpoint	88%	Ospiri signature pipeline, 2026
Average annualized incremental cost of a single uncontrolled agent incident	+$670K	Ospiri, measured against IBM Cost of a Data Breach baselines
Median window from first agent deployment to first material incident	12-18 months	Ospiri field data

The dashboard needs to exist before the incident, not after. This piece walks through the framework Ospiri uses with design partners.

What an Agent Risk Score Is — and Is Not

Most “AI risk” frameworks circulating today are qualitative posters: a 3×3 matrix with high/medium/low boxes and no formula underneath. A real risk score has to be quantifiable, comparable across endpoints, and decomposable into the factors a CISO can actually move.

Framework	What it measures	Decomposable	Comparable across endpoints
NIST AI RMF	Process maturity	Partial	No
ISO 42001 controls map	Policy presence	No	No
Vendor “AI risk rating”	Marketing	No	No
Agent Risk Score (this framework)	Operational exposure	Yes — four factors	Yes — endpoint, team, org

The Agent Risk Score is meant to behave like VaR for an agent fleet: a number you can mark daily, a methodology you can disclose, and a delta you can attribute to specific control changes.

The Formula

Agent Risk Score = (Permission Scope × Reversibility) + (Frequency × Drift)

Two halves, four factors. The first half captures static exposure: what the agent can touch, and how bad a single action would be. The second half captures behavioral exposure: how often the agent acts, and how fast its behavior is changing relative to its deployment baseline. Each factor is scored 0–10, so a single agent’s score ranges from 0 to 200.

Factor	Definition	Scoring rubric (0–10)
Permission Scope	Breadth of resources the agent can access — filesystem, network, identity, secrets, code execution	0 = read-only single directory; 10 = full kernel access with persistent credentials
Reversibility	How recoverable a single agent action is — higher score means less reversible	0 = sandboxed, copy-on-write, snapshot-restorable; 10 = irreversible writes to production systems
Frequency	Actions per hour against in-scope resources	0 = idle; 10 = >100 actions/hour with no human-in-the-loop
Drift	Behavioral delta from the agent’s deployment baseline — new directories, new syscalls, new endpoints, new tools	0 = no drift over 30 days; 10 = >50% new behavior signatures week-over-week
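The Drift rubric fixes only the two ends of the scale, so any scoring function has to choose what happens in between. A minimal sketch, assuming behavior signatures are plain strings and that the score rises linearly until it saturates at 50% new signatures week-over-week. The signature format, the function name, and the linear mapping are illustrative assumptions, not part of the framework:

```python
def drift_score(baseline: set[str], current: set[str]) -> float:
    """Map the share of new behavior signatures onto the 0-10 Drift scale."""
    if not current:
        return 0.0  # an idle agent shows no new behavior
    new_fraction = len(current - baseline) / len(current)
    # Assumption: linear in between, saturating at 50% new signatures.
    return round(min(new_fraction / 0.5, 1.0) * 10, 1)


# Hypothetical signatures: one new path appears alongside three known behaviors.
last_week = {"read:/repo/src", "exec:git", "net:api.github.com"}
this_week = last_week | {"read:/home/user/.ssh"}

print(drift_score(last_week, this_week))  # 1 of 4 signatures is new -> 5.0
```

A stepped or logarithmic mapping would fit the rubric equally well. What matters is that the same mapping is applied to every agent, so the scores stay comparable.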

Worked example. A Cursor instance on an engineer’s laptop with full filesystem and git push permissions, hard-to-revert commits straight to main, ~40 actions per hour, and mild drift after six weeks: Permission Scope 8, Reversibility 7, Frequency 6, Drift 3. Score = (8 × 7) + (6 × 3) = 74.

By comparison, a sandboxed Aider instance restricted to a project subdirectory with copy-on-write isolation: Permission Scope 3, Reversibility 2, Frequency 5, Drift 2. Score = (3 × 2) + (5 × 2) = 16.

Same engineer, same class of coding agent, and more than a four-fold exposure delta on the rollup (74 versus 16). That is the conversation you want to be having before the incident, not after.
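The two worked examples can be reproduced in a few lines. The sketch below is one possible encoding of the formula; the class and property names are illustrative, and the 0–10 range check simply follows the rubric:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentFactors:
    permission_scope: int
    reversibility: int  # higher means less reversible
    frequency: int
    drift: int

    def __post_init__(self):
        # Every factor is scored on the rubric's 0-10 scale.
        for name, value in vars(self).items():
            if not 0 <= value <= 10:
                raise ValueError(f"{name} must be in 0..10, got {value}")

    @property
    def static_exposure(self) -> int:
        return self.permission_scope * self.reversibility

    @property
    def behavioral_exposure(self) -> int:
        return self.frequency * self.drift

    @property
    def score(self) -> int:
        return self.static_exposure + self.behavioral_exposure


cursor = AgentFactors(permission_scope=8, reversibility=7, frequency=6, drift=3)
aider = AgentFactors(permission_scope=3, reversibility=2, frequency=5, drift=2)

print(cursor.score, aider.score, round(cursor.score / aider.score, 1))  # 74 16 4.6
```

Keeping the static and behavioral halves as separate properties makes it easy to see which half drives a given agent’s score.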

How the Score Rolls Up

The same logic that lets a portfolio manager view exposure at the security, sector, and book level applies here.

Rollup level	Aggregation	What it answers
Per-endpoint	Sum of agent scores resident on that host	Which laptops are the firm’s hot spots?
Per-team	Weighted average across endpoints, weighted by data-sensitivity tier	Which teams concentrate exposure?
Per-org	Mark-to-market sum, decomposed by factor	What is the firm’s agent VaR, and which factor drives it?

This is the view that puts a CISO in a defensible position when the audit committee asks for a number. “Our org-level Agent Risk Score is 4,200 today, up from 3,100 in March, with the increase concentrated in Frequency as engineering rolled out Claude Code.” That is auditable, and it is the kind of sentence a regulator can check against the disclosed methodology.
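The three rollup levels can be sketched as follows. The census rows, the tier weights, and the choice to decompose the org-level sum into its static and behavioral halves are all illustrative assumptions; the framework itself does not fix the weights or the exact decomposition:

```python
from collections import defaultdict

# Hypothetical weights per data-sensitivity tier; the framework does not fix them.
TIER_WEIGHTS = {"low": 1.0, "medium": 2.0, "high": 3.0}

# Illustrative census rows: (team, endpoint, tier, P, R, F, D).
AGENTS = [
    ("platform", "laptop-01", "high", 8, 7, 6, 3),
    ("platform", "laptop-01", "high", 3, 2, 5, 2),
    ("platform", "laptop-02", "low", 3, 2, 5, 2),
    ("data", "laptop-03", "medium", 5, 4, 2, 1),
]


def rollup(agents):
    endpoint = defaultdict(int)  # host -> sum of resident agent scores
    placement = {}               # host -> (team, tier)
    static = behavioral = 0
    for team, host, tier, p, r, f, d in agents:
        endpoint[host] += p * r + f * d
        placement[host] = (team, tier)
        static += p * r
        behavioral += f * d

    # Per-team: average of endpoint scores, weighted by sensitivity tier.
    weighted = defaultdict(float)
    weights = defaultdict(float)
    for host, (team, tier) in placement.items():
        w = TIER_WEIGHTS[tier]
        weighted[team] += endpoint[host] * w
        weights[team] += w
    team_avg = {t: weighted[t] / weights[t] for t in weighted}

    # Per-org: plain sum, split into the two halves of the formula.
    org = {"total": static + behavioral, "static": static, "behavioral": behavioral}
    return dict(endpoint), team_avg, org


endpoint, team, org = rollup(AGENTS)
print(endpoint)  # {'laptop-01': 90, 'laptop-02': 16, 'laptop-03': 22}
print(team)      # {'platform': 71.5, 'data': 22.0}
print(org)       # {'total': 128, 'static': 88, 'behavioral': 40}
```

Because the org-level figure is a plain sum, it grows with fleet size. Comparing it across time is meaningful, but comparing it across firms of different sizes would need normalization.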

Where the Score Plugs Into the Existing Stack

The whole point of expressing exposure as a number is so the existing security stack can act on it. The score is not a new dashboard you stare at — it is a feature that feeds the dashboards your team already pays for.

Existing system	How the Agent Risk Score plugs in	Outcome
UEBA (Splunk UBA, Defender for Identity)	Drift factor becomes a behavioral signal alongside user anomalies	Agents finally appear in the same anomaly view as humans
SIEM (Splunk, Datadog, Sentinel)	Per-endpoint and per-team scores ingested as a daily metric, threshold-alertable	Score crossing a threshold triggers a ticket, not a quarterly review
GRC (ServiceNow GRC, OneTrust, Archer)	Org-level score feeds the AI risk register with a defensible methodology	The auditor’s “show me your AI risk register” request is answered with numbers
EDR (CrowdStrike, SentinelOne, Defender)	Permission Scope and Reversibility derived from kernel-level telemetry the EDR already collects	Agent posture rides on the EDR sensor the firm already deploys

The score is deliberately stack-agnostic. The factors are computable from telemetry that exists in any environment running EDR plus an agent firewall — which, on our 12-to-18-month forecast, will be most large enterprises by end of 2027.
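Feeding the score into a SIEM as a daily, threshold-alertable metric could look something like the sketch below. It emits one JSON record per endpoint and flags any score above a threshold. The field names, the threshold value, and the JSON-lines shape are assumptions for illustration; a real pipeline would map them onto whatever ingestion schema the SIEM expects:

```python
import json
from datetime import date

ALERT_THRESHOLD = 80  # hypothetical per-endpoint threshold


def to_metric_events(endpoint_scores, day, threshold=ALERT_THRESHOLD):
    """Build one metric record per endpoint, flagging threshold breaches."""
    events = []
    for host, score in sorted(endpoint_scores.items()):
        events.append({
            "metric": "agent_risk_score",  # illustrative metric name
            "level": "endpoint",
            "entity": host,
            "date": day.isoformat(),
            "value": score,
            "breach": score > threshold,
        })
    return events


events = to_metric_events({"laptop-01": 90, "laptop-02": 16}, date(2026, 1, 5))
for event in events:
    print(json.dumps(event))  # one JSON line per endpoint, ready to ship
```

The same record shape works for team and org rollups by changing the `level` and `entity` fields, which keeps the alerting rule on the SIEM side to a single condition on `breach`.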

What CISOs Should Do This Quarter

This is not a six-quarter consulting engagement. The minimum viable score takes a fortnight if the telemetry is already flowing.

Step	Action	Output	Effort
1	Inventory agents on managed endpoints — sanctioned and shadow	Agent census	3 days
2	Score each agent on the four factors using the rubric above	First Permission/Reversibility/Frequency/Drift snapshot	2 days
3	Aggregate to per-endpoint, per-team, per-org rollups	First org-level Agent Risk Score	1 day
4	Wire the rollup into SIEM/UEBA as a metric, set threshold alerting	Continuous score with drift detection	1 week

The output of week two is a single number you can put in a board deck, with an honest methodology behind it. The output of quarter two is a downward trend on that number, attributable to specific agent governance controls you put in place.

The Bottom Line

If your firm cannot mark its agent exposure daily, your firm cannot price what is on its balance sheet. The Agent Risk Score gives you a defensible, decomposable, threshold-alertable number — and a methodology that survives an external audit because every factor maps to telemetry the EDR and the agent firewall are already collecting. The whole point of borrowing Value-at-Risk discipline from trading is that the score becomes comparable across endpoints, across teams, and across time. The conversation with the audit committee then shifts from “we are working on it” to “we are at 4,200, here is the driver, and here is the playbook.”

If your team is sizing this for the Q3 board cycle, request a working session. We will walk through your environment, score a representative sample of endpoints, and produce a first org-level Agent Risk Score you can defend. 90 minutes.
