Agentic AI for Report Generation: Regulated-Grade Reference Architecture
How regulated UK firms use agentic AI to draft, review, and approve reports — with provenance, audit trail, and the human-in-the-loop checkpoints supervisors expect. Reference architecture and 5 patterns we have seen ship.
Published 30 April 2026 · By Sunny Patel, Founder, Agentic AI Associates
The thesis
Report generation is the highest-volume, lowest-risk place to start an agentic AI programme — and the highest-volume, highest-risk place to do it carelessly. Done well, agents reduce drafting time 50–80% on routine reports while improving consistency. Done carelessly, you ship hallucinated figures into a regulated submission. The difference is architecture, not model choice.
This page covers six report classes and the controls that apply to each, the failure modes we have seen most often, and a reference architecture that has shipped successfully in regulated UK firms.
Six report classes and what they need
| Report kind | Regulated | Risk tier | Human-in-loop | Notes |
|---|---|---|---|---|
| Regulatory return drafts | Yes | Material | Always (named approver before submission) | FCA RegData returns (formerly GABRIEL), COREP, FINREP: agent drafts, human signs |
| Internal MI / management reports | Sometimes | Limited–Significant | For Material decisions | Decision-supporting; flag where AI-influenced |
| Customer suitability or advice notes | Yes | Material | Always | Consumer Duty outcome 3 + 4 apply directly |
| Incident and post-mortem reports | Sometimes | Limited | Author-then-review | Agent extracts signals from logs; human owns narrative |
| Board and committee packs | No | Significant (reputation) | Editor-in-chief | Speed wins; quality risk if unreviewed |
| Audit and compliance reports | Yes | Material | Always | Independence of evidence collection matters |
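The table above can be held as a small controls map that the pipeline consults before it decides which checkpoints to enforce. The following is a minimal sketch, not a definitive implementation: the names `RiskTier`, `ReportClass`, `REPORT_CLASSES` and `requires_named_approver` are hypothetical, and the tier recorded for each class is the highest tier shown in the table.

```python
from dataclasses import dataclass
from enum import Enum


class RiskTier(Enum):
    LIMITED = 1
    SIGNIFICANT = 2
    MATERIAL = 3


@dataclass(frozen=True)
class ReportClass:
    label: str
    regulated: str       # "yes", "no" or "sometimes", as in the table
    max_tier: RiskTier   # highest tier the class reaches in the table
    review_mode: str     # human-in-the-loop mode from the table


# Hypothetical keys; the values mirror the six table rows.
REPORT_CLASSES = {
    "regulatory_return": ReportClass(
        "Regulatory return drafts", "yes", RiskTier.MATERIAL, "always"),
    "internal_mi": ReportClass(
        "Internal MI / management reports", "sometimes",
        RiskTier.SIGNIFICANT, "for material decisions"),
    "suitability_note": ReportClass(
        "Customer suitability or advice notes", "yes",
        RiskTier.MATERIAL, "always"),
    "incident_report": ReportClass(
        "Incident and post-mortem reports", "sometimes",
        RiskTier.LIMITED, "author-then-review"),
    "board_pack": ReportClass(
        "Board and committee packs", "no",
        RiskTier.SIGNIFICANT, "editor-in-chief"),
    "audit_report": ReportClass(
        "Audit and compliance reports", "yes", RiskTier.MATERIAL, "always"),
}


def requires_named_approver(key: str) -> bool:
    """Material-tier classes always need a named approver before release."""
    return REPORT_CLASSES[key].max_tier is RiskTier.MATERIAL
```

Keeping this map in firm-owned configuration, rather than in the agent's prompt, means the agent cannot talk itself out of a checkpoint.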
Reference architecture
The pattern that ships reliably across regulated workloads:
1. Trigger. Schedule, event (regulatory deadline, end-of-day close), or human request via an approved channel. The trigger is logged.
2. Source resolution. The agent identifies the source-of-truth datasets and retrieves data via typed connectors. No data is invented; every figure has a retrieval provenance record.
3. Template binding. The report template is owned by the firm, not the agent. It defines structure, fixed sections, and required figures. The agent fills narrative slots.
4. Drafting. The agent writes narrative around retrieved data. Each claim links to the retrieval that supports it. Numbers are inserted verbatim from retrieval, never paraphrased.
5. Self-review. A second agent (or the same agent in a different mode) checks the draft against the template and flags missing data, inconsistencies, or out-of-policy phrasing.
6. Human review. A named human reviewer receives the draft with provenance annotations visible. They edit, and every edit is logged.
7. Approval. A named SMF holder or delegate signs off. Identity and timestamp are logged.
8. Release. The final report is rendered without provenance annotations (these are kept in the audit record) and released to recipients.
9. Audit. The full chain is stored: trigger, retrievals, agent drafts, human edits, approval, release.
Steps 6 and 7 are non-negotiable for regulated outputs. The 50–80% time saving comes from steps 1–5; supervisor confidence comes from steps 6–9.
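One way to make steps 6 and 7 non-negotiable is to enforce them in the release path itself instead of relying on process. The sketch below assumes a hypothetical `ReportRun` object and `ReleaseBlocked` exception; a real system would persist the log to append-only storage and authenticate the actors, which this example does not attempt.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


class ReleaseBlocked(Exception):
    """Raised when a report reaches a gate without the required sign-offs."""


@dataclass
class ReportRun:
    report_id: str
    regulated: bool
    reviewed_by: Optional[str] = None
    approved_by: Optional[str] = None
    audit: list = field(default_factory=list)  # in-memory event log (step 9)

    def log(self, step: str, actor: str) -> None:
        self.audit.append({
            "step": step,
            "actor": actor,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def human_review(self, reviewer: str) -> None:  # step 6
        self.reviewed_by = reviewer
        self.log("human_review", reviewer)

    def approve(self, approver: str) -> None:  # step 7
        if self.reviewed_by is None:
            raise ReleaseBlocked("approval requires a completed human review")
        self.approved_by = approver
        self.log("approval", approver)

    def release(self) -> None:  # step 8
        # Approval already implies review, so one check covers both gates.
        if self.regulated and self.approved_by is None:
            raise ReleaseBlocked("regulated output needs a named sign-off")
        self.log("release", "system")
```

With this shape, calling `release()` on a regulated run before `human_review()` and `approve()` raises `ReleaseBlocked`, so the shortcut fails loudly instead of silently succeeding.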
Failure modes and mitigations
| Failure mode | Mitigation |
|---|---|
| Hallucinated figures | Numbers come only from typed retrieval against source-of-truth datasets; never generated by the LLM |
| Stale grounding | Every retrieval logs the document version + as-of date; report cites both inline |
| Untraceable reasoning | Each section ties claims to the retrieval that supports it; no claim without a link to source |
| Inconsistent boilerplate | Templates own structure; agent owns narrative; never let the agent re-draft fixed legal language |
| Vulnerable-customer mis-tone | Tone classifier + human review for any consumer-facing section; mandatory for Consumer Duty contexts |
| Approval shortcuts | Approver identity logged with timestamp; reports cannot be released without the named SMF or delegate sign-off |
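The first mitigation in the table can be backed by an automated check during self-review: any numeric token in the draft that does not match a retrieved value exactly is flagged. This is a rough sketch under stated assumptions. The function name `unsupported_figures` and the regular expression are illustrative, and the pattern will also flag dates and section numbers, so a real check would need an allow-list.

```python
import re

# Matches integers, thousands-separated numbers, decimals and percentages.
NUMBER = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?%?")


def unsupported_figures(narrative: str, retrieved: set[str]) -> list[str]:
    """Return numeric tokens in the draft that match no retrieved value.

    `retrieved` holds the exact strings returned by typed retrieval, so a
    paraphrased or rounded figure fails the check by design.
    """
    return [tok for tok in NUMBER.findall(narrative) if tok not in retrieved]


draft = "Liquidity coverage was 142.5% against 138.0% last month, with 3 breaches."
flagged = unsupported_figures(draft, {"142.5%", "138.0%"})
# flagged == ["3"]: the breach count has no retrieval behind it.
```

Comparing strings instead of parsed numbers is deliberate here: it enforces the verbatim rule as well as the provenance rule.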
Five patterns we have seen ship
- Pattern 1 — Daily ops MI digest. Replaces the analyst spending two hours each morning summarising overnight system signals. Agent reads ops logs, drafts a digest, and sends it to the ops lead. Limited tier; no human approval is needed because the digest is not customer-facing.
- Pattern 2 — Board pack drafter. Agent assembles standard sections (financials, KPIs, risk dashboard) from source-of-truth, drafts narrative for each. Editor-in-chief reviews. 60% time saving on the assembly phase.
- Pattern 3 — Suitability note co-author. Agent drafts the suitability note from client data + product fit; advisor reviews and edits before client sees. Material tier; human always approves; AI involvement disclosed in the note where required.
- Pattern 4 — Regulatory return pre-fill. Agent pre-fills standard returns (e.g. monthly liquidity, periodic suitability reports) from regulatory data sources, flags variances vs prior period, presents to compliance for review and submission.
- Pattern 5 — Audit working-paper extraction. Agent reads source documents, extracts findings against an audit template, drafts working papers. Auditor reviews, retains independence over evidence and conclusions.
Frequently asked questions
Is agentic AI for report generation acceptable for FCA-regulated firms?
Yes, with the right controls. The FCA AI Approach is technology-neutral; it expects the firm to operate AI within its existing risk and governance framework. For report generation specifically, the controls that matter are: provenance (every figure traceable to a source-of-truth), human approval before release for any regulated output, full audit trail of agent decisions, and clear identification of which sections were AI-drafted in audit working papers. With those in place, agentic report generation is no different in principle from any other operational technology.
Which reports should we automate first?
Start with high-volume, low-risk internal reports — operational MI, status summaries, log digests. The point is to build trust in the audit trail and the human-in-the-loop pattern before you ship anything regulator-facing. Once the operating model has run smoothly for a quarter on internal reports, expand into Significant-tier reports (board packs, internal incident reports). Material-tier reports (regulatory returns, customer suitability) come last, with full SM&CR mapping in place.
Can the agent draft regulatory returns directly?
It can draft them. It cannot submit them. The pattern that works in 2026: agent drafts the return from the regulated data sources, surfaces variances against prior submissions and flags any data-quality issues, presents the draft with a structured rationale to the named human approver (typically a senior compliance or finance professional under SMF16 or SMF24 oversight), and the human signs off. The agent is faster and more consistent at the drafting; the human carries the regulatory accountability.
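The variance check the agent runs before presenting a draft can be very simple. The sketch below is illustrative only: `flag_variances`, the field names and the 10% threshold are assumptions, not figures from any regulatory rule, and the firm would set its own tolerance per return.

```python
from decimal import Decimal


def flag_variances(
    current: dict[str, Decimal],
    prior: dict[str, Decimal],
    threshold: Decimal = Decimal("0.10"),  # assumed tolerance, set per return
) -> dict[str, str]:
    """Map each field that needs the approver's attention to a short reason."""
    flags: dict[str, str] = {}
    for name, value in current.items():
        if name not in prior:
            flags[name] = "no prior value"
        elif prior[name] == 0:
            if value != 0:
                flags[name] = "prior was zero"
        else:
            change = (value - prior[name]) / abs(prior[name])
            if abs(change) > threshold:
                flags[name] = f"{change:+.1%}"
    return flags


flags = flag_variances(
    {"lcr": Decimal("150"), "nsfr": Decimal("101"), "new_line": Decimal("5")},
    {"lcr": Decimal("120"), "nsfr": Decimal("100")},
)
# "lcr" and "new_line" are flagged; "nsfr" moved 1% and is not.
```

`Decimal` is used so that figures are compared as they were retrieved, without binary floating-point rounding.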
How do we handle confidence in agent-generated numbers?
Two rules. (1) Numbers in regulated reports never come from the LLM — they come from typed retrieval against source-of-truth systems, with the retrieval result rendered into the report verbatim. The LLM produces narrative around them, not the numbers themselves. (2) Every number is annotated with its source identifier in the working draft (visible to the human approver, removed from the final). This is the same provenance discipline you would apply to a manually-prepared regulated report.
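Both rules can be enforced at render time if the LLM writes named slots instead of digits. The following is a minimal sketch that assumes a hypothetical `{{name}}` slot syntax and `[src:...]` annotation format; neither is prescribed by the architecture above.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Retrieved:
    value: str      # exact string returned by the source system
    source_id: str  # identifier of the retrieval that produced it


SLOT = re.compile(r"\{\{(\w+)\}\}")


def render(narrative: str, figures: dict[str, Retrieved], annotate: bool) -> str:
    """Fill each {{slot}} with its retrieved value, verbatim.

    A slot with no matching retrieval raises KeyError, so the draft cannot
    contain a figure that was never retrieved.
    """
    def fill(match: re.Match) -> str:
        fig = figures[match.group(1)]
        return f"{fig.value} [src:{fig.source_id}]" if annotate else fig.value

    return SLOT.sub(fill, narrative)


figures = {"lcr": Retrieved("142.5%", "liq-db/2026-03#lcr")}
working = render("LCR was {{lcr}}.", figures, annotate=True)
final = render("LCR was {{lcr}}.", figures, annotate=False)
```

Here `working` carries the source identifier for the approver, while `final` is the same sentence with the annotation removed.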
What does the audit trail look like for a generated report?
Per generated report, you store: the agent ID and version hash, the prompt and grounding documents, every retrieval performed with source identifiers, the agent's draft output, every human edit with the editor identity and timestamp, the final approved version, the approver identity, and the release timestamp. This sits alongside the standard event-level audit trail for the agent itself. Cross-reference: see our 12-field audit trail schema.
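The fields listed above map naturally onto a single record per report. The sketch below is one possible shape, with hypothetical class and field names; it is not the 12-field schema referenced above, and storage, retention and tamper-evidence are left out.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class HumanEdit:
    editor: str
    at: str    # ISO 8601 timestamp
    diff: str  # what the editor changed


@dataclass
class ReportAuditRecord:
    agent_id: str
    agent_version_hash: str
    prompt: str
    grounding_documents: list[str]
    retrievals: list[dict]  # each entry carries a source identifier
    agent_draft: str
    human_edits: list[HumanEdit] = field(default_factory=list)
    final_version: str = ""
    approver: str = ""
    released_at: str = ""

    def missing_for_release(self) -> list[str]:
        """Names of release-time fields that are still empty."""
        required = ("final_version", "approver", "released_at")
        return [name for name in required if not getattr(self, name)]

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses such as HumanEdit.
        return json.dumps(asdict(self), sort_keys=True)
```

A completeness check such as `missing_for_release()` gives the release step something concrete to test before a record is written as final.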
Does the agent need to disclose itself in the report?
Internal reports: not required, but track which were AI-drafted in your audit log. Customer-facing reports: yes. The FCA position under Consumer Duty outcome 3 (consumer understanding) is that customers should be told when AI has materially influenced what they read. The most defensive pattern is a brief, plain-English disclosure footer where AI was the primary drafter, with a "right to human review" pointer.
How much time does this actually save?
In our engagements: 60–80% drafting-time reduction on routine internal MI; 40–60% on board packs; 30–50% on regulated returns. The savings are not in the human-approval step — that should not be shortened. They are in the drafting, evidence-gathering, and consistency-checking phases that human reviewers traditionally spent half their time on. Net effect: senior reviewer time refocuses on judgement and approval, not assembly.
Ship report generation with regulated-grade controls
A Phase-Gate Diagnostic produces the architecture, the controls map, and the SMF approval workflow your compliance team can sign off on. Two weeks, £6,500.