The moment an agent gets its own permission to drive the machine, the interesting risk stops living in the prompt and starts living at the kernel.

Why Computer-Use Scopes Change the Endpoint Math Now

A conversational assistant exposes a narrow surface: it reads text, it writes text, and the worst outcome is a bad answer. An agent with a dedicated computer-use scope is a different instrument entirely. It opens applications, moves a cursor, clicks through dialogs, reads the screen, and writes to the filesystem — autonomously, at machine speed, across whatever the logged-in user can touch. The permission scope is the whole story. Once the operating system hands an agent that grant, the position you are holding is no longer “a model that might say something wrong.” It is “a process with broad endpoint authority and no hesitation.”

That reprices the endpoint. Until now, security teams have priced the risk of a Claude or a Copilot as a data-handling question: what does it read, and where does the text go? A computer-use grant turns it into an action-handling question, which is the harder one. The agent’s plan exists only briefly inside the model; by the time intent becomes a write() syscall or a UI click, the plan that produced it is no longer visible to anything on the endpoint. You cannot inspect a decision that has already become an action. So the control point has to sit where the action lands, not where the prompt was reasoned.

| Metric | Figure | Source |
| --- | --- | --- |
| Enterprises with at least one unsanctioned agent in production | 88% | Ospiri research, 2025 |
| Unauthorized agent transactions caused by internal violations through 2028 | ≥80% | Gartner |
| Time horizon for an unmanaged agent footprint to reach material scale | 12–18 months | Ospiri customer signal |
| Average global cost of a data breach | $4.88M | IBM Cost of a Data Breach Report, 2024 |

The 80% number is the one that should set policy here. A computer-use agent doing the wrong thing is overwhelmingly an internal-violation event — over-broad permission, a misread screen, a destructive command issued in good faith — not an attacker. That is the tail you are hedging.

Conversational Scope vs. Computer-Use Scope: Two Different Instruments

Treat the two grants as separate asset classes, because they fail in different places and demand different controls.

| Dimension | Conversational scope | Computer-use scope |
| --- | --- | --- |
| Primary action | Generates text | Drives the OS: clicks, types, reads screen, writes files |
| Blast radius | The output channel | Everything the logged-in user can reach |
| Failure mode | Wrong or leaked answer | Wrong or irreversible action |
| Where intent is visible | In the prompt and response | Gone by the time the syscall fires |
| Natural control point | Prompt guardrails, DLP on output | Per-tool policy at the OS / kernel boundary |
| Reversibility | High: discard the text | Low: a deletion or send has no undo |

The bottom row is the one that compounds. A bad answer is reversible at near-zero cost; you ignore it. A computer-use agent that empties a directory, fires a destructive shell command, or sends a file to the wrong recipient has produced an event with no undo button. When reversibility drops and frequency rises — and autonomous agents raise frequency by design — the expected loss is dominated by the few low-reversibility actions in the distribution. That is a tail-risk profile, and you price tail risk by capping the position, not by hoping the mean behaves.
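A back-of-the-envelope calculation makes the tail concrete. The sketch below multiplies volume, error rate, and recovery cost for three classes of action. Every number in it is a hypothetical placeholder chosen for illustration, not a figure from the sources cited above, and the `ActionClass` name is invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionClass:
    name: str
    actions_per_day: int    # hypothetical volume
    error_rate: float       # hypothetical chance a single action is wrong
    cost_per_error: float   # hypothetical recovery cost in dollars

    def expected_daily_loss(self) -> float:
        return self.actions_per_day * self.error_rate * self.cost_per_error


# Illustrative mix: many cheap reversible actions, few expensive irreversible ones.
classes = [
    ActionClass("read file", 5000, 0.01, 0.0),
    ActionClass("edit file (versioned)", 800, 0.01, 5.0),
    ActionClass("delete / send / overwrite", 40, 0.01, 2000.0),
]

total = sum(c.expected_daily_loss() for c in classes)
for c in classes:
    share = c.expected_daily_loss() / total
    print(f"{c.name:28s} {c.expected_daily_loss():8.2f}  {share:6.1%}")
```

With these placeholder inputs the irreversible class is well under 1% of the action volume and roughly 95% of the expected loss. Different inputs will move the split, but the shape holds whenever recovery cost differs by orders of magnitude between classes.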

The Anatomy of a Computer-Use Incident

Walk the sequence of an agent action and the gap becomes obvious. The point of failure is consistent: every layer that could have caught it sees the request, not the consequence.

  1. Plan forms in the model. The agent reasons about a multi-step task. This is the only moment the full intent exists in inspectable form — and it lives inside the model, not on the endpoint.
  2. Plan decomposes into tool calls. “Clean up the export folder” becomes a sequence of file operations. A prompt guardrail (Lakera, Protect AI) can see the instruction; it cannot see which paths the tool call will actually resolve to.
  3. Tool call hits the OS. The agent issues a real syscall or UI event. The plan is now discarded. EDR (CrowdStrike, SentinelOne, Defender) sees a process doing file I/O — but to EDR it looks like the user’s own legitimate activity, because it runs under the user’s session.
  4. Action completes. The file is gone, the command ran, the message sent. Observability tools log it after the fact. The audit trail is now a postmortem, not a brake.

The architectural lesson is that prompt-layer controls and process-layer controls each see a true but incomplete slice. Guardrails see intent without the resolved action. EDR sees the action without the intent. Neither can apply a per-tool policy at the instant the action is about to land, because neither sits at that boundary.
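The gap between the instruction and the resolved action in step 2 can be shown with nothing more than path resolution. In the minimal sketch below, two requests both read as export-folder cleanup at the text level, but only one stays inside the folder once the path is resolved. The helper name `resolves_inside` and the example paths are hypothetical, and the check assumes Python 3.9 or later for `Path.is_relative_to`.

```python
import tempfile
from pathlib import Path


def resolves_inside(root: Path, requested: str) -> bool:
    """Return True if `requested`, once resolved, still lives under `root`."""
    target = (root / requested).resolve()
    return target.is_relative_to(root.resolve())


with tempfile.TemporaryDirectory() as tmp:
    export_root = Path(tmp) / "exports"
    export_root.mkdir()

    # Both strings mention only the export folder's contents.
    print(resolves_inside(export_root, "old/report.csv"))          # True
    print(resolves_inside(export_root, "old/../../customers.db"))  # False
```

A filter that only sees the instruction text has no reason to treat the second request differently. The difference only appears at the point where the path is resolved, which is the boundary the argument above is pointing at.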

Pricing the Exposure: A Per-Action Risk Frame

If you want to size which computer-use grants to govern first, score them rather than argue about them.

Action Risk = (Permission Scope × Irreversibility) + (Frequency × Drift)

The structure matters. The first term is the static exposure: how much the agent is allowed to touch, multiplied by how hard the worst action is to undo. The second is the dynamic exposure: how often it acts, multiplied by how far its behavior is wandering from baseline. A read-only agent scoped to one directory has almost no static exposure, and as long as its drift stays near zero its score stays low however often it runs. A broadly scoped agent that can write, delete, and send, acting hundreds of times a day with rising drift, is concentrated tail risk on the endpoint.
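The formula is simple enough to run over an inventory. The sketch below assumes each factor has been normalized to a 0 to 1 scale, which is a choice made here for illustration and is not specified by the formula itself. The `Grant` class, the agent names, and the factor values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    agent: str
    permission_scope: float  # assumed 0..1: how much the agent may touch
    irreversibility: float   # assumed 0..1: how hard the worst action is to undo
    frequency: float         # assumed 0..1: how often it acts
    drift: float             # assumed 0..1: distance from behavioral baseline

    def action_risk(self) -> float:
        static = self.permission_scope * self.irreversibility
        dynamic = self.frequency * self.drift
        return static + dynamic


grants = [
    Grant("docs-reader", 0.1, 0.0, 0.9, 0.05),   # read-only, one directory
    Grant("ops-agent", 0.9, 0.9, 0.8, 0.6),      # broad write/delete/send
]

# Highest score first: this ordering is the exposure register.
for g in sorted(grants, key=Grant.action_risk, reverse=True):
    print(f"{g.agent:12s} {g.action_risk():.3f}")
```

On these placeholder values the broad agent scores about 1.29 and the read-only agent about 0.045. The absolute numbers mean little; the ranking is what decides which grants get governed first.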

| Factor | What raises it | Where it’s controllable |
| --- | --- | --- |
| Permission Scope | Broad filesystem, shell, network, and app grants | At the OS: scope the grant per tool, not per agent |
| Irreversibility | Delete, send, transact, overwrite operations | At the kernel: gate or copy-on-write the destructive class |
| Frequency | Autonomous, high-throughput task loops | Rate and context limits at the action boundary |
| Drift | New paths, new syscalls, new endpoints vs. baseline | Behavioral observability, not static rules |

Two of the four factors, Irreversibility and Frequency, can only be enforced at the action boundary. That is the structural argument for where the control belongs.

The Kernel-Scope Answer: Per-Tool Policy at the OS

So here is the bet. Per-tool policy at the OS layer is no longer a nice-to-have; with a dedicated computer-use scope in play, it is table stakes. The control has to distinguish not just which agent is acting but which tool it is invoking and what that specific action does — and it has to make that call at the moment the action resolves, alongside the existing EDR stack rather than on top of the prompt.

| Control point | What it enforces | What it cannot reach |
| --- | --- | --- |
| Prompt guardrails (Lakera, Protect AI) | Intent-level filtering before the model acts | The resolved action: the plan is gone downstream |
| EDR (CrowdStrike, SentinelOne, Defender) | Process and binary reputation, anomaly detection | Per-tool agent intent: it reads as the user’s own session |
| DLP / CASB (Purview, Forcepoint) | Data egress on monitored channels | Local filesystem and shell actions off the channel |
| Agent firewall at the kernel / tool boundary | Per-tool, per-action policy at the instant it lands: block, gate, or copy-on-write the irreversible class | Pure SaaS-embedded conversational use with no endpoint action |
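What "per-tool, per-action policy" might look like as a decision function is sketched below. This is an illustration of the idea only, not a description of any product's policy engine: the tool names, operation names, and the `POLICY` table are hypothetical, and a real enforcement layer would make this call from an OS-level hook rather than from application code.

```python
from enum import Enum
from typing import NamedTuple


class Decision(Enum):
    ALLOW = "allow"
    GATE = "gate"                    # pause for human approval
    COPY_ON_WRITE = "copy_on_write"  # proceed, but keep a recoverable copy
    BLOCK = "block"


class Action(NamedTuple):
    agent: str
    tool: str        # hypothetical tool label, e.g. "fs", "shell", "mail"
    operation: str   # hypothetical operation label, e.g. "read", "delete"


# Hypothetical policy keyed by (tool, operation), not by agent alone.
POLICY = {
    ("fs", "read"): Decision.ALLOW,
    ("fs", "write"): Decision.COPY_ON_WRITE,
    ("fs", "delete"): Decision.COPY_ON_WRITE,
    ("mail", "send"): Decision.GATE,
    ("shell", "exec"): Decision.GATE,
    ("shell", "format_disk"): Decision.BLOCK,
}


def decide(action: Action) -> Decision:
    # Unknown (tool, operation) pairs fall back to a human gate, not a hard block.
    return POLICY.get((action.tool, action.operation), Decision.GATE)


print(decide(Action("ops-agent", "fs", "read")).value)     # allow
print(decide(Action("ops-agent", "fs", "delete")).value)   # copy_on_write
print(decide(Action("ops-agent", "db", "drop")).value)     # gate
```

Keying the table on the tool and operation rather than on the agent is the point: the same agent gets a silent allow for reads and a recoverable path for deletes.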

The distinction that earns the architecture is block-on-deny versus copy-on-write. A pure block-on-deny posture stops the destructive action but also stops the productive 95% — and engineering teams rip that out within two quarters, the same way they revolted against early DLP. Copy-on-write lets the agent run, intercepts only the low-reversibility operations, and preserves a recoverable state. It is the difference between a control that survives political review and one that gets disabled the first sprint it slows someone down. This is exactly the seam our Claude firewall and broader agent firewall work is built to sit in.
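The simplest userspace approximation of that idea is snapshot-before-destroy: copy the file aside, then let the delete proceed. The sketch below does exactly that for a single file. It is an assumption-laden illustration, not kernel-level copy-on-write. The function names and the `.snapshots` directory are hypothetical, and it ignores name collisions, directories, and concurrent writers.

```python
import shutil
import tempfile
from pathlib import Path


def guarded_delete(target: Path, snapshot_dir: Path) -> Path:
    """Copy `target` into `snapshot_dir`, then delete it. Returns the snapshot path."""
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    snapshot = snapshot_dir / target.name
    shutil.copy2(target, snapshot)  # preserve content and metadata first
    target.unlink()                 # only then let the delete go through
    return snapshot


def restore(snapshot: Path, original: Path) -> None:
    """Put a snapshotted file back where it was."""
    shutil.copy2(snapshot, original)


with tempfile.TemporaryDirectory() as tmp:
    export = Path(tmp) / "export.csv"
    export.write_text("id,total\n1,42\n")

    snap = guarded_delete(export, Path(tmp) / ".snapshots")
    print(export.exists())                          # False: the agent's delete succeeded
    restore(snap, export)
    print(export.read_text() == "id,total\n1,42\n") # True: the state was recoverable
```

The agent's task completes without interruption, which is what keeps the control from being disabled, and the destructive step still has an undo.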

What CISOs Should Do This Quarter

Four steps, the longest of which takes two to four weeks.

| Step | Action | Output | Effort |
| --- | --- | --- | --- |
| 1 | Inventory which agents on the dev estate hold a computer-use or equivalent OS grant | A scoped list of action-capable agents | One sprint |
| 2 | Score each grant on Permission Scope × Irreversibility, and flag the broad, destructive-capable ones | A ranked exposure register | Half a day |
| 3 | Put per-tool policy at the action boundary for the top-scoring agents, copy-on-write the irreversible class | A live control on the highest-risk grants | Two to four weeks |
| 4 | Pipe agent action telemetry into the SIEM and baseline for drift | Closed-loop observability on behavior, not just intent | One week, parallel to step 3 |

Step 1 usually changes the budget conversation on its own — most teams discover more action-capable agents than the org chart admits. Step 3 is what survives the incident.
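For step 4, one crude way to turn action telemetry into a drift number is to measure how much of an agent's recent activity falls outside what it did during a baseline window. The sketch below treats each event as a `(tool, target)` pair and reports the fraction never seen before. The event shape, the function name, and the sample data are hypothetical, and a production baseline would need time windows and path normalization that are omitted here.

```python
def drift_score(baseline: set[tuple[str, str]],
                observed: list[tuple[str, str]]) -> float:
    """Fraction of observed (tool, target) events that never appear in the baseline."""
    if not observed:
        return 0.0
    unseen = sum(1 for event in observed if event not in baseline)
    return unseen / len(observed)


# Hypothetical baseline: what the agent touched during its first weeks.
baseline = {("fs", "/exports"), ("fs", "/tmp"), ("shell", "git")}

# Hypothetical recent window: one event reaches somewhere new.
recent = [
    ("fs", "/exports"),
    ("fs", "/tmp"),
    ("shell", "git"),
    ("fs", "/home/user/.ssh"),
]

print(drift_score(baseline, recent))  # 0.25
```

A score like this can feed the Drift factor in the risk formula directly, and a sustained rise is a reasonable trigger for re-scoring the grant.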

The Bottom Line

A dedicated computer-use scope moves the agent’s real risk off the prompt and onto the endpoint, and you cannot govern an action at the layer where only the intent was visible. Price the exposure as tail risk (permission scope times irreversibility, plus frequency times drift) and you will find the worst-case loss concentrated in a handful of low-reversibility operations that no prompt guardrail and no EDR module sits close enough to stop. Per-tool policy at the OS boundary is the control that does, and copy-on-write rather than block-on-deny is what keeps it deployed past the second quarter. Hold guardrails for intent, hold EDR for the process, and add a kernel-scope layer for the action; the three are complementary, not competing. If your team is sizing this for the Claude 5 and agent governance rollout this quarter, request a working session. In 90 minutes, we will inventory your action-capable agents, score each grant, and scope a per-tool enforcement layer for the destructive class.