Autonomous AI agents are popping up across enterprises, often without inventory or ownership. CIOs must now decide which agents deliver pipeline value and which only create risk.
06.05.2026
The answer doesn’t start with the technology; it starts with the journal entry. In most corporations we’ve spoken to over the past few quarters, the first productive supply-chain agents are bolted onto procurement functions: purchase requisitions are pre-checked, freight invoices are matched against contracts, carrier disputes are pre-qualified. That is exactly where volume, clear rulebooks, and a measurable error-cost tag converge.
What’s new in 2026 is the speed. At Google Cloud Next 26, Google positioned the Data Agent Kit as a portable toolbox of MCP tools, plug-ins, and skills that turn VS Code and the Gemini CLI into autonomous data workspaces. At RSA 2026, Microsoft extended Purview with real-time prompt DLP for Copilot Studio agents and released Agent 365, a control plane that maps every agent trace through Entra identity. Both moves shift the bottleneck: it’s no longer the model, but the identity and data gating that becomes the choke point.
The reality in the DACH mid-market looks different. Here three to twelve agents are running in production, another fifteen to thirty in pilot, and the CIO only learns about many of them because the license-management system flags anomalies. That’s sprawl. And sprawl is the costliest form of agent programs because it cedes control to the vendor.
Procurement Exception Handling is the classic example. At an industrial customer processing around 240,000 order items per year, roughly 9 percent fell into a manual clarification loop before agent deployment: price deviations below threshold, missing shipping addresses, contract mismatches. The agent resolves 71 percent of these cases without human intervention. That’s more than 15,000 transactions per year that no longer sit unresolved for four weeks. Cycle time fell from an average of 14 days to 1.8 days.
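The arithmetic behind these figures can be retraced in a few lines. The following is a minimal sketch that uses only the numbers quoted above; the variable names are illustrative and not taken from the customer's reporting.

```python
# Figures from the paragraph above; variable names are illustrative only.
order_items_per_year = 240_000
manual_rate_before = 0.09      # share of items entering manual clarification
agent_resolution_rate = 0.71   # share of those cases resolved without a human

manual_cases_before = order_items_per_year * manual_rate_before
resolved_by_agent = manual_cases_before * agent_resolution_rate
manual_cases_after = manual_cases_before - resolved_by_agent
manual_rate_after = manual_cases_after / order_items_per_year

print(f"Manual cases before agent: {manual_cases_before:,.0f}")
print(f"Resolved by agent:         {resolved_by_agent:,.0f}")
print(f"Still manual:              {manual_cases_after:,.0f}")
print(f"Manual rate after agent:   {manual_rate_after:.1%}")
```

The residual manual rate that falls out of this calculation, about 2.6 percent, matches the figure reported in the ROI snapshot.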
Freight Cost Auditing is the second clear-cut opportunity. Industry observations show discrepancies between agreed tariff matrices and actual carrier invoices averaging 1.4 to 2.1 percent. On a mid-sized logistics volume of 80 million Euro in freight costs, that’s 1.1 to 1.7 million Euro evaporating annually between tariff and invoice. An agent that automatically matches freight invoices against contracts and flags anomalies for the disputes desk reclaims a meaningful share of that loss.
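What such a matching step looks like can be sketched briefly. The example below is an illustration under stated assumptions, not a description of any vendor's product: the tariff matrix, carrier names, lanes, the 0.5 percent tolerance, and the function `audit_invoices` are all hypothetical. Only the 80 million Euro volume and the 1.4 to 2.1 percent spread come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FreightInvoice:
    invoice_id: str
    carrier: str
    lane: str
    weight_kg: float
    billed_eur: float


# Hypothetical agreed tariff matrix: (carrier, lane) -> EUR per kg.
AGREED_RATE_EUR_PER_KG = {
    ("CarrierA", "HAM-MUC"): 0.12,
    ("CarrierB", "VIE-ZRH"): 0.15,
}


def audit_invoices(invoices, tolerance=0.005):
    """Flag invoices billed above the agreed tariff or lacking a contract.

    Returns tuples of (invoice_id, expected_eur, overcharge_eur);
    both amounts are None when no tariff is on file for the carrier and lane.
    """
    findings = []
    for inv in invoices:
        rate = AGREED_RATE_EUR_PER_KG.get((inv.carrier, inv.lane))
        if rate is None:
            findings.append((inv.invoice_id, None, None))
            continue
        expected = rate * inv.weight_kg
        overcharge = inv.billed_eur - expected
        if overcharge > tolerance * expected:
            findings.append((inv.invoice_id, round(expected, 2), round(overcharge, 2)))
    return findings


invoices = [
    FreightInvoice("F-1", "CarrierA", "HAM-MUC", 1000, 120.00),  # matches tariff
    FreightInvoice("F-2", "CarrierA", "HAM-MUC", 2000, 245.00),  # billed above tariff
    FreightInvoice("F-3", "CarrierC", "BER-CGN", 500, 80.00),    # no contract on file
]
findings = audit_invoices(invoices)
for finding in findings:
    print(finding)

# Leakage band from the text: 1.4 to 2.1 percent of 80 million Euro.
freight_volume_eur = 80_000_000
leak_low = freight_volume_eur * 0.014
leak_high = freight_volume_eur * 0.021
print(f"Annual leakage: {leak_low:,.0f} to {leak_high:,.0f} EUR")
```

In this sketch the agent only flags; the decision to dispute stays with the disputes desk, which mirrors the division of labor described above.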
Order-to-Confirm Automation is the third spot where the numbers add up. When sales can deliver a reliable order confirmation to the customer in 6 hours instead of 36, cash conversion accelerates and cancellation rates in volatile markets drop measurably. Agents prove especially effective here when paired with ATP logic (Available-to-Promise) and when they treat the ERP, not an Excel approximation, as the single source of truth.
ROI Snapshot: Procurement Agent in a DACH Industrial Group
| Metric | Pre-Agent | After 9 Months |
|---|---|---|
| Order items per year | 240,000 | 240,000 |
| Manual clarification rate | 9 percent | 2.6 percent |
| Median clarification cycle time | 14 days | 1.8 days |
| FTE equivalents at the clarification desk | 11 | 4 (the other 7 shifted to audit) |
| Cash avoidance (discounts + claims) | Baseline | approx. 2.1 million Euro/year |
Anonymized figures from a DACH industrial group, nine months of live operation. Model calculations; not transferable to every sector. Source: board-level reporting, cross-checked against the patterns Gartner describes in its Three Building Blocks for Autonomous Supply Chain (May 2026).
The uncomfortable list is longer than the ROI list. Demand-forecasting agents that simply layer a language model over an existing forecast model rarely outperform the previous statistical stack. They shift the problem into a black box that’s more expensive to audit than an explainable ARIMA model. Supplier-risk agents that consolidate external sources often fail on traceability. And the ESG-reporting agent that synthesizes CSRD data points from unstructured supplier emails collides harder with CSRD audit obligations than any board member expects.
The issue runs deeper. Agents shine where rulebooks are clear, volumes are high, and a mistake carries a concrete cost tag. Agents disappoint where judgment, legal weighing, or genuine supplier relationships are required. That exact divide is a CIO task, not a vendor slide.
The uncomfortable tension isn’t technical. It’s organizational. Who in the group has decision-making and escalation responsibility for a procurement agent that reroutes an order against the contract? The CIO? The CPO? The COO? In most DAX-listed companies in 2026, this question still has no clean answer. That’s exactly why agent programs land on the board agenda rather than in the IT steering committee.
The second conflict is between risk and speed. Microsoft’s Purview DLP against prompt injection is real protection, but if sensitivity labels aren’t properly maintained across the group, it delays the first pilot by six to twelve weeks. Anyone who hasn’t kept their identity homework up to date pays twice for every agent.
One observation from recent board meetings: the interesting agent decisions aren’t made in the quarter a vendor announces a new suite. They’re made in the quarter the board realizes the license line is no longer just cloud–it’s agents. Until then, the CIO remains the only one who can measure sprawl.
Agent sprawl refers to the parallel, uncoordinated growth of autonomous AI agents across procurement, logistics, inventory, and supplier functions, often procured by individual departments. The defining issue is the absence of a central registry or unified identity layer, leading to duplicate agents, unresolved escalation paths, and compliance gaps. The result is that the impact per agent and the total license spend no longer correlate.
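A central registry does not need to be elaborate to surface these symptoms. The following is a minimal sketch, assuming a flat inventory with hypothetical fields (`owner`, `identity_id`, `escalation_contact`) and made-up agent names; a real registry would be fed from the identity platform and license management rather than from a hand-written list.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentRecord:
    name: str
    use_case: str
    owner: Optional[str]               # accountable business owner
    identity_id: Optional[str]         # ID in the central identity layer
    escalation_contact: Optional[str]  # who is called when the agent gets stuck


def sprawl_report(agents):
    """Summarize duplicates and missing ownership, identity, or escalation."""
    names_by_use_case = defaultdict(list)
    for agent in agents:
        names_by_use_case[agent.use_case].append(agent.name)
    return {
        "duplicates": {uc: names for uc, names in names_by_use_case.items() if len(names) > 1},
        "no_owner": [a.name for a in agents if not a.owner],
        "no_identity": [a.name for a in agents if not a.identity_id],
        "no_escalation": [a.name for a in agents if not a.escalation_contact],
    }


# Hypothetical inventory for illustration.
inventory = [
    AgentRecord("po-precheck-v1", "procurement-exceptions", "CPO office", "id-001", "buyer-desk"),
    AgentRecord("po-precheck-plant2", "procurement-exceptions", None, None, None),
    AgentRecord("freight-audit", "freight-cost-audit", "Logistics", "id-002", None),
]
report = sprawl_report(inventory)
for finding, value in report.items():
    print(finding, value)
```

Each non-empty entry in the report corresponds to one of the symptoms named above: duplicate agents, unresolved escalation paths, or agents outside the identity layer.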
Agents deliver measurable value in three areas: procurement-exception handling with high volume and clear rule sets, freight-cost auditing with the typical 1.4 to 2.1 percent spread between tariff and invoice, and order-to-confirm automation paired with ATP logic. Gartner’s Three Building Blocks for Autonomous Supply Chain confirms this pattern: high transaction volume, structured rules, and measurable error-cost tags.
Since RSA 2026, Purview has extended real-time data classification in Copilot Studio agents to include prompt DLP, integrating it into Agent 365 as a central control plane. For supply-chain agents, this means sensitivity labels on supplier data and contract attachments are enforced before the agent is invoked–not after data output. This is a prerequisite for agents working with contract or personal data in procurement processes.
Google’s Data Agent Kit is primarily a toolkit for data-centric agent development in BigQuery, Dataform, and Gemini environments, featuring MCP tools and plugins for VS Code and CLI. Microsoft Agent 365, by contrast, is an operations and identity plane for agents built on Entra ID, leveraging Purview, Defender, and Conditional Access as its control layer. In practice, large DACH corporations combine both ecosystems and decide lock-in at the domain level rather than across the entire group.
Gartner estimates the market for supply-chain software powered by agentic AI at $53 billion by 2030, up from under $2 billion in 2025. The forecast matters because vendor roadmaps from SAP, Oracle, and Microsoft are already aligned to this pace. For mid-sized firms, waiting until 2027 will cost more than running a well-gated pilot in 2026, because licensing models will have shifted toward per-use-case agent pricing by then.
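To put that pace into perspective, the implied annual growth rate can be derived from the two endpoints. This is a back-of-the-envelope sketch: it treats the 2025 figure as exactly $2 billion, and because the forecast says "under $2 billion", the result is a lower bound rather than a Gartner number.

```python
# Endpoints from the Gartner estimate quoted above.
start_usd_bn = 2.0   # assumption: "under $2 billion" treated as 2.0, an upper bound
end_usd_bn = 53.0
years = 2030 - 2025

# Compound annual growth rate implied by the two endpoints.
implied_cagr = (end_usd_bn / start_usd_bn) ** (1 / years) - 1
print(f"Implied annual growth: at least {implied_cagr:.1%}")
```

Under these assumptions the market would have to nearly double every year through 2030.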
About the author
Angelika Beierlein is COO at Evernine. She writes for digital chiefs from the boardroom perspective about leadership decisions that don’t make the quarterly report but keep the business running.
Source of header image: Pexels / Tom Fisk (px:1427107)