The moment an agent gets its own permission to drive the machine, the interesting risk stops living in the prompt and starts living at the kernel.
Why Computer-Use Scopes Change the Endpoint Math Now
A conversational assistant exposes a narrow surface: it reads text, it writes text, and the worst outcome is a bad answer. An agent with a dedicated computer-use scope is a different instrument entirely. It opens applications, moves a cursor, clicks through dialogs, reads the screen, and writes to the filesystem — autonomously, at machine speed, across whatever the logged-in user can touch. The permission scope is the whole story. Once the operating system hands an agent that grant, the position you are holding is no longer “a model that might say something wrong.” It is “a process with broad endpoint authority and no hesitation.”
That reprices the endpoint. Security teams have so far marked the risk of a Claude or a Copilot as a data-handling question: what does it read, and where does the text go? A computer-use grant turns it into an action-handling question, which is the harder one. The agent’s plan exists for a few hundred milliseconds inside the model; by the time intent becomes a write() syscall or a UI click, the plan that produced it is gone. You cannot inspect a decision that has already become an action. So the control point has to sit where the action lands, not where the prompt was reasoned.
| Metric | Figure | Source |
|---|---|---|
| Enterprises with at least one unsanctioned agent in production | 88% | Ospiri research, 2025 |
| Unauthorized agent transactions caused by internal violations through 2028 | ≥80% | Gartner |
| Time horizon for an unmanaged agent footprint to reach material scale | 12–18 months | Ospiri customer signal |
| Average global cost of a data breach | $4.88M | IBM Cost of a Data Breach Report, 2024 |
The ≥80% figure is the one that should set policy here. A computer-use agent doing the wrong thing is overwhelmingly an internal-violation event (an over-broad permission, a misread screen, a destructive command issued in good faith), not an external attack. That is the tail you are hedging.
Conversational Scope vs. Computer-Use Scope: Two Different Instruments
Treat the two grants as separate asset classes, because they fail in different places and demand different controls.
| Dimension | Conversational scope | Computer-use scope |
|---|---|---|
| Primary action | Generates text | Drives the OS — clicks, types, reads screen, writes files |
| Blast radius | The output channel | Everything the logged-in user can reach |
| Failure mode | Wrong or leaked answer | Wrong or irreversible action |
| Where intent is visible | In the prompt and response | Gone by the time the syscall fires |
| Natural control point | Prompt guardrails, DLP on output | Per-tool policy at the OS / kernel boundary |
| Reversibility | High — discard the text | Low — a deletion or send has no undo |
The bottom row is the one that compounds. A bad answer is reversible at near-zero cost; you ignore it. A computer-use agent that empties a directory, fires a destructive shell command, or sends a file to the wrong recipient has produced an event with no undo button. When reversibility drops and frequency rises — and autonomous agents raise frequency by design — the expected loss is dominated by the few low-reversibility actions in the distribution. That is a tail-risk profile, and you price tail risk by capping the position, not by hoping the mean behaves.
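To make the tail-risk point concrete, here is a small arithmetic sketch. Every number in it is an illustrative assumption, not a figure from the research cited above: three action classes with the same error rate, where the destructive class is rare but expensive to get wrong.

```python
# Illustrative numbers only; none of these figures come from the article's sources.
# Each row: (label, actions per day, probability an action is wrong, cost if it is)
actions = [
    ("read file", 900, 0.01, 1),
    ("edit file", 90, 0.01, 50),
    ("delete directory", 10, 0.01, 20_000),
]


def expected_daily_loss(count: int, p_wrong: float, cost: float) -> float:
    """Expected loss per day for one action class."""
    return count * p_wrong * cost


losses = {label: expected_daily_loss(c, p, cost) for label, c, p, cost in actions}
total = sum(losses.values())

# Share of the total expected loss that comes from the low-reversibility class.
tail_share = losses["delete directory"] / total

for label, loss in losses.items():
    print(f"{label:>18}: {loss:10.2f}")
print(f"tail share of expected loss: {tail_share:.1%}")
```

Under these assumed inputs the deletes are 1% of the action volume and the large majority of the expected loss, which is why capping that one class moves the number more than improving the average.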
The Anatomy of a Computer-Use Incident
Walk the sequence of an agent action and the gap becomes obvious. The point of failure is consistent: every layer that could have caught it sees only part of the picture, either the request without the consequence or the action without the intent.
- Plan forms in the model. The agent reasons about a multi-step task. This is the only moment the full intent exists in inspectable form — and it lives inside the model, not on the endpoint.
- Plan decomposes into tool calls. “Clean up the export folder” becomes a sequence of file operations. A prompt guardrail (Lakera, Protect AI) can see the instruction; it cannot see which paths the tool call will actually resolve to.
- Tool call hits the OS. The agent issues a real syscall or UI event. The plan is now discarded. EDR (CrowdStrike, SentinelOne, Defender) sees a process doing file I/O — but to EDR it looks like the user’s own legitimate activity, because it runs under the user’s session.
- Action completes. The file is gone, the command ran, the message sent. Observability tools log it after the fact. The audit trail is now a postmortem, not a brake.
The architectural lesson is that prompt-layer controls and process-layer controls each see a true but incomplete slice. Guardrails see intent without the resolved action. EDR sees the action without the intent. Neither can apply a per-tool policy at the instant the action is about to land, because neither sits at that boundary.
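The gap between the instruction and the resolved action can be sketched in a few lines. This is a user-space illustration under stated assumptions, not how any of the named products work: the function names `resolve_targets` and `outside_root` are hypothetical, and the "export folder" layout is invented for the demo. The instruction names one folder, but one entry in it is a symlink that resolves somewhere else.

```python
import os
import tempfile
from pathlib import Path


def resolve_targets(base: Path, pattern: str) -> list[Path]:
    """Expand a tool call's glob into the real paths it would touch (symlinks followed)."""
    return sorted(p.resolve() for p in base.glob(pattern))


def outside_root(targets: list[Path], allowed_root: Path) -> list[Path]:
    """Return the resolved targets that fall outside the allowed root."""
    root = allowed_root.resolve()
    return [t for t in targets if not t.is_relative_to(root)]


with tempfile.TemporaryDirectory() as tmp:
    exports = Path(tmp, "exports")
    secrets = Path(tmp, "secrets")
    exports.mkdir()
    secrets.mkdir()
    (exports / "report.csv").write_text("ok")
    (secrets / "key.txt").write_text("do not delete")
    # A symlink inside the export folder that points somewhere else entirely.
    os.symlink(secrets / "key.txt", exports / "latest")

    # "Clean up the export folder" resolves to these concrete paths.
    targets = resolve_targets(exports, "*")
    flagged = [p.name for p in outside_root(targets, exports)]

print(flagged)  # ['key.txt']
```

A check on the instruction text would pass here, because the instruction only mentions the export folder. The out-of-scope target is visible only once the paths are resolved, which is the moment the action is about to land.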
Pricing the Exposure: A Per-Action Risk Frame
If you want to size which computer-use grants to govern first, score them rather than argue about them.
Action Risk = (Permission Scope × Irreversibility) + (Frequency × Drift)
The structure matters. The first term is the static exposure: how much the agent is allowed to touch, multiplied by how hard the worst action is to undo. The second is the dynamic exposure: how often it acts, multiplied by how far its behavior is wandering from baseline. A read-only agent scoped to one directory scores close to zero on the first term, and stays low overall at any frequency as long as its drift stays near zero. A broadly scoped agent that can write, delete, and send, acting hundreds of times a day with rising drift, is uncapped tail risk to the endpoint.
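The formula can be turned into a ranking with very little code. In this sketch the 0-to-1 scale for each factor, the `Grant` type, and the two example agents are all assumptions made for illustration; the article does not prescribe units or thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """One agent's computer-use grant, with each factor on an assumed 0..1 scale."""
    name: str
    permission_scope: float
    irreversibility: float
    frequency: float
    drift: float


def action_risk(g: Grant) -> float:
    """Action Risk = (Permission Scope x Irreversibility) + (Frequency x Drift)."""
    return g.permission_scope * g.irreversibility + g.frequency * g.drift


# Hypothetical grants, invented for the example.
grants = [
    Grant("read-only-indexer", permission_scope=0.1, irreversibility=0.0,
          frequency=0.9, drift=0.05),
    Grant("broad-writer", permission_scope=0.9, irreversibility=0.9,
          frequency=0.8, drift=0.5),
]

# Highest score first: this is the order in which to govern the grants.
ranked = sorted(grants, key=action_risk, reverse=True)

for g in ranked:
    print(f"{g.name:>18}: {action_risk(g):.3f}")
```

The read-only agent acts more often than the broad one in this example and still lands at the bottom of the ranking, because its static term is zero and its drift is low.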
| Factor | What raises it | Where it’s controllable |
|---|---|---|
| Permission Scope | Broad filesystem, shell, network, and app grants | At the OS — scope the grant per tool, not per agent |
| Irreversibility | Delete, send, transact, overwrite operations | At the kernel — gate or copy-on-write the destructive class |
| Frequency | Autonomous, high-throughput task loops | Rate and context limits at the action boundary |
| Drift | New paths, new syscalls, new endpoints vs. baseline | Behavioral observability, not static rules |
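Per the table, three of the four factors (Permission Scope, Irreversibility, and Frequency) are controlled at the OS, kernel, or action boundary; only Drift is handled by observability. That is the structural argument for where the control belongs.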
The Kernel-Scope Answer: Per-Tool Policy at the OS
So here is the bet. Per-tool policy at the OS layer is no longer a nice-to-have; with a dedicated computer-use scope in play, it is table stakes. The control has to distinguish not just which agent is acting but which tool it is invoking and what that specific action does — and it has to make that call at the moment the action resolves, alongside the existing EDR stack rather than on top of the prompt.
| Control point | What it enforces | What it cannot reach |
|---|---|---|
| Prompt guardrails (Lakera, Protect AI) | Intent-level filtering before the model acts | The resolved action — the plan is gone downstream |
| EDR (CrowdStrike, SentinelOne, Defender) | Process and binary reputation, anomaly detection | Per-tool agent intent — it reads as the user’s own session |
| DLP / CASB (Purview, Forcepoint) | Data egress on monitored channels | Local filesystem and shell actions off the channel |
| Agent firewall at the kernel / tool boundary | Per-tool, per-action policy at the instant it lands — block, gate, or copy-on-write the irreversible class | Pure SaaS-embedded conversational use with no endpoint action |
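As a rough picture of what "per-tool, per-action policy" means as a data structure, the sketch below keys decisions on a (tool, operation) pair rather than on the agent. The tool names, operation names, and the choice to gate anything unlisted are all hypothetical; a real enforcement layer would sit at the OS boundary rather than in application code.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"                  # let the action through untouched
    GATE = "gate"                    # pause for approval before it lands
    COPY_ON_WRITE = "copy_on_write"  # let it run, but preserve a recoverable state


# Hypothetical policy table keyed by (tool, operation), not by agent.
POLICY = {
    ("filesystem", "read"): Decision.ALLOW,
    ("filesystem", "write"): Decision.COPY_ON_WRITE,
    ("filesystem", "delete"): Decision.COPY_ON_WRITE,
    ("shell", "exec"): Decision.GATE,
    ("mail", "send"): Decision.GATE,
}


def decide(tool: str, operation: str) -> Decision:
    """Look up the decision for one action; unlisted pairs are gated (an assumption)."""
    return POLICY.get((tool, operation), Decision.GATE)


print(decide("filesystem", "read").value)    # allow
print(decide("filesystem", "delete").value)  # copy_on_write
print(decide("browser", "submit").value)     # gate
```

The same agent gets three different answers depending on which tool it invokes and what the operation does, which is the distinction the table above draws against agent-level or process-level controls.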
The distinction that earns the architecture is block-on-deny versus copy-on-write. A pure block-on-deny posture stops the destructive action but also stops the productive 95% — and engineering teams rip that out within two quarters, the same way they revolted against early DLP. Copy-on-write lets the agent run, intercepts only the low-reversibility operations, and preserves a recoverable state. It is the difference between a control that survives political review and one that gets disabled the first sprint it slows someone down. This is exactly the seam our Claude firewall and broader agent firewall work is built to sit in.
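The copy-on-write idea can be illustrated in user space with a delete that snapshots first. This is only a sketch of the behavior, assuming a hypothetical snapshot directory and the helper names `cow_delete` and `restore`; it is not a description of a kernel-level implementation, which would intercept the operation rather than wrap it.

```python
import shutil
import tempfile
import uuid
from pathlib import Path


def cow_delete(target: Path, snapshot_dir: Path) -> Path:
    """Snapshot a file, then delete it. Returns the snapshot path for later restore."""
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    snapshot = snapshot_dir / f"{uuid.uuid4().hex}-{target.name}"
    shutil.copy2(target, snapshot)  # preserve contents and metadata first
    target.unlink()                 # only then let the destructive action land
    return snapshot


def restore(snapshot: Path, original: Path) -> None:
    """Undo a cow_delete by copying the snapshot back to the original location."""
    original.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(snapshot, original)


with tempfile.TemporaryDirectory() as tmp:
    doc = Path(tmp, "exports", "q3.csv")
    doc.parent.mkdir()
    doc.write_text("revenue,42\n")

    # The agent's delete goes through, so its task is not blocked...
    snap = cow_delete(doc, Path(tmp, ".snapshots"))
    deleted = not doc.exists()

    # ...but the state is recoverable if the delete turns out to be wrong.
    restore(snap, doc)
    recovered = doc.read_text()

print(deleted, repr(recovered))
```

The design choice is that the agent never sees a denial: the operation completes from its point of view, and the cost of the control is storage for snapshots rather than interrupted work.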
What CISOs Should Do This Quarter
Four steps, the longest of which runs two to four weeks.
| Step | Action | Output | Effort |
|---|---|---|---|
| 1 | Inventory which agents on the dev estate hold a computer-use or equivalent OS grant | A scoped list of action-capable agents | One sprint |
| 2 | Score each grant on Permission Scope × Irreversibility — flag the broad, destructive-capable ones | A ranked exposure register | Half a day |
| 3 | Put per-tool policy at the action boundary for the top-scoring agents, copy-on-write the irreversible class | A live control on the highest-risk grants | Two to four weeks |
| 4 | Pipe agent action telemetry into the SIEM and baseline for drift | Closed-loop observability on behavior, not just intent | One week, parallel to step 3 |
Step 1 usually changes the budget conversation on its own — most teams discover more action-capable agents than the org chart admits. Step 3 is what survives the incident.
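For step 4, one simple way to express drift is the share of recent actions that never appeared in the baseline. The event shape (a tool name plus a path prefix), the sample data, and the idea of a single ratio are assumptions for illustration; a production baseline would likely weight by action class and time window.

```python
def drift_ratio(baseline: set[tuple[str, str]], observed: list[tuple[str, str]]) -> float:
    """Fraction of observed actions whose (tool, target) pair is absent from the baseline."""
    if not observed:
        return 0.0
    novel = [event for event in observed if event not in baseline]
    return len(novel) / len(observed)


# Hypothetical baseline built from an earlier observation window.
baseline = {
    ("filesystem.read", "/srv/exports"),
    ("filesystem.write", "/srv/exports"),
}

# Hypothetical recent telemetry: three familiar actions and one new one.
observed = [
    ("filesystem.read", "/srv/exports"),
    ("filesystem.write", "/srv/exports"),
    ("filesystem.read", "/srv/exports"),
    ("shell.exec", "/usr/bin"),
]

ratio = drift_ratio(baseline, observed)
print(f"drift: {ratio:.2f}")  # drift: 0.25
```

A ratio like this can feed the Drift factor in the scoring formula, so the ranked register from step 2 updates as behavior changes rather than staying a one-time snapshot.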
The Bottom Line
A dedicated computer-use scope moves the agent’s real risk off the prompt and onto the endpoint, and you cannot govern an action at the layer where only the intent was visible. Price the exposure as tail risk (permission scope times irreversibility, plus frequency times drift) and you will find the worst-case loss concentrated in a handful of low-reversibility operations that no prompt guardrail and no EDR module sits close enough to stop. Per-tool policy at the OS boundary is the control that does, and copy-on-write rather than block-on-deny is what keeps it deployed past the second quarter. Hold guardrails for intent, hold EDR for the process, and add a kernel-scope layer for the action; the three are complementary, not competing.
If your team is sizing this for the Claude 5 and agent governance rollout this quarter, request a working session. We will inventory your action-capable agents, score each grant, and scope a per-tool enforcement layer for the destructive class. 90 minutes.