Reference Architecture · 9 min read

Agentic AI for Report Generation: Regulated-Grade Reference Architecture

How regulated UK firms use agentic AI to draft, review, and approve reports — with provenance, audit trail, and the human-in-the-loop checkpoints supervisors expect. Reference architecture and 5 patterns we have seen ship.

Published 30 April 2026 · By Sunny Patel, Founder, Agentic AI Associates

The thesis

Report generation is the highest-volume, lowest-risk place to start an agentic AI programme, and the highest-volume, highest-risk place to do it carelessly. Done well, agents reduce drafting time by 50–80% on routine reports while improving consistency. Done carelessly, you ship hallucinated figures into a regulated submission. The difference is architecture, not model choice.

This page covers six report classes and the controls that apply to each, the failure modes we have seen most often, and a reference architecture that has shipped successfully in regulated UK firms.

Six report classes and what they need

Report kind | Regulated | Risk tier | Human-in-loop | Notes
Regulatory return drafts | Yes | Material | Always (named approver before submission) | GABRIEL, COREP, FINREP: agent drafts, human signs
Internal MI / management reports | Sometimes | Limited–Significant | For Material decisions | Decision-supporting; flag where AI-influenced
Customer suitability or advice notes | Yes | Material | Always | Consumer Duty outcome 3 + 4 apply directly
Incident and post-mortem reports | Sometimes | Limited | Author-then-review | Agent extracts signals from logs; human owns narrative
Board and committee packs | No | Significant (reputation) | Editor-in-chief | Speed wins; quality risk if unreviewed
Audit and compliance reports | Yes | Material | Always | Independence of evidence collection matters
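
One way to make the table above enforceable is to encode it as a lookup that the pipeline consults before release. The sketch below is illustrative only: the names `ReportClass`, `Controls`, `CONTROLS` and `named_signoff_required` are hypothetical, and the mode strings are our own shorthand for the Human-in-loop column.

```python
from dataclasses import dataclass
from enum import Enum


class ReportClass(Enum):
    REGULATORY_RETURN = "Regulatory return drafts"
    INTERNAL_MI = "Internal MI / management reports"
    SUITABILITY_NOTE = "Customer suitability or advice notes"
    INCIDENT_REPORT = "Incident and post-mortem reports"
    BOARD_PACK = "Board and committee packs"
    AUDIT_REPORT = "Audit and compliance reports"


@dataclass(frozen=True)
class Controls:
    risk_tier: str
    human_in_loop: str  # shorthand for the Human-in-loop column (assumed encoding)


# Values transcribed from the table; the string encoding is an assumption.
CONTROLS = {
    ReportClass.REGULATORY_RETURN: Controls("Material", "always"),
    ReportClass.INTERNAL_MI: Controls("Limited-Significant", "material_decisions"),
    ReportClass.SUITABILITY_NOTE: Controls("Material", "always"),
    ReportClass.INCIDENT_REPORT: Controls("Limited", "author_then_review"),
    ReportClass.BOARD_PACK: Controls("Significant", "editor_in_chief"),
    ReportClass.AUDIT_REPORT: Controls("Material", "always"),
}


def named_signoff_required(report_class: ReportClass,
                           supports_material_decision: bool = False) -> bool:
    """Return True when release must be blocked until a named human signs off."""
    mode = CONTROLS[report_class].human_in_loop
    if mode == "always":
        return True
    if mode == "material_decisions":
        return supports_material_decision
    # Other modes still involve human review, but this sketch does not
    # model them as a hard release gate.
    return False
```

Keeping the mapping in data rather than in prompt text means the agent cannot talk its way past a control: the gate is evaluated by ordinary code outside the model.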

Reference architecture

The pattern that ships reliably across regulated workloads:

  1. Trigger. Schedule, event (regulatory deadline, end-of-day close), or human request via approved channel. Trigger is logged.
  2. Source resolution. Agent identifies the source-of-truth datasets, retrieves data via typed connectors. No data invented; every figure has a retrieval provenance record.
  3. Template binding. Report template owned by the firm (not the agent). Template defines structure, fixed sections, and required figures. Agent fills narrative slots.
  4. Drafting. Agent writes narrative around retrieved data. Each claim links to the retrieval that supports it. Numbers are inserted verbatim from retrieval, never paraphrased.
  5. Self-review. A second agent (or the same agent in a different mode) checks the draft against the template and flags missing data, inconsistencies, or out-of-policy phrasing.
  6. Human review. Named human reviewer receives the draft with provenance annotations visible. They edit. Edits logged.
  7. Approval. Named SMF holder or delegate signs off. Identity and timestamp logged.
  8. Release. Final report rendered without provenance annotations (kept in audit). Released to recipients.
  9. Audit. Full chain stored: trigger, retrievals, agent drafts, human edits, approval, release.

Steps 6 and 7 are non-negotiable for regulated outputs. The 50–80% time saving comes from steps 1–5. Supervisor confidence comes from steps 6–9.
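
The gating in steps 6 to 8 can be sketched as a small state holder that refuses to release a regulated report until review and approval have both been recorded. This is a minimal illustration under our own assumptions: `ReportRun`, `ReleaseBlocked` and the log field names are hypothetical, and a real system would persist the log rather than hold it in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


class ReleaseBlocked(Exception):
    """Raised when a required human step has not been recorded."""


@dataclass
class ReportRun:
    report_id: str
    regulated: bool
    draft: str
    reviewer: Optional[str] = None
    approver: Optional[str] = None
    audit_log: list = field(default_factory=list)

    def _log(self, event: str, actor: str) -> None:
        # Every step records who did it and when (UTC, ISO 8601).
        self.audit_log.append({
            "event": event,
            "actor": actor,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def record_review(self, reviewer: str, edited_draft: str) -> None:
        # Step 6: a named human reviews and may edit the draft.
        self.reviewer = reviewer
        self.draft = edited_draft
        self._log("human_review", reviewer)

    def record_approval(self, approver: str) -> None:
        # Step 7: approval is only meaningful after review for regulated output.
        if self.regulated and self.reviewer is None:
            raise ReleaseBlocked("approval requires a completed human review")
        self.approver = approver
        self._log("approval", approver)

    def release(self) -> str:
        # Step 8: regulated reports cannot leave without both human steps.
        if self.regulated and (self.reviewer is None or self.approver is None):
            raise ReleaseBlocked("regulated report lacks review or approval")
        self._log("release", "system")
        return self.draft
```

A typical run calls `record_review`, then `record_approval`, then `release`; calling `release` first on a regulated run raises `ReleaseBlocked` instead of returning the draft.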

Failure modes and mitigations

Failure mode | Mitigation
Hallucinated figures | Numbers come only from typed retrieval against source-of-truth datasets; never generated by the LLM
Stale grounding | Every retrieval logs the document version + as-of date; report cites both inline
Untraceable reasoning | Each section ties claims to the retrieval that supports it; no claim without a link to source
Inconsistent boilerplate | Templates own structure; agent owns narrative; never let the agent re-draft fixed legal language
Vulnerable-customer mis-tone | Tone classifier + human review for any consumer-facing section; mandatory for Consumer Duty contexts
Approval shortcuts | Approver identity logged with timestamp; reports cannot be released without the named SMF or delegate sign-off
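
As an illustration of the hallucinated-figures mitigation, a self-review step could scan the drafted narrative for numeric tokens and flag any that do not match a retrieved value verbatim. The sketch below is an assumption about how such a check might look, not a description of a specific product; `unsupported_numbers` and the regular expression are hypothetical, and a production check would also need to handle dates, ordinals and section numbers.

```python
import re

# Matches integers, thousands-separated numbers, decimals, and an optional
# trailing percent sign, e.g. "145", "1,250", "142.3%".
NUMBER_PATTERN = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?%?")


def unsupported_numbers(narrative: str, retrieved_values: set) -> list:
    """Return numeric tokens in the narrative that match no retrieved value.

    `retrieved_values` holds the verbatim strings returned by retrieval,
    so any paraphrased or invented figure shows up as unsupported.
    """
    return [
        token
        for token in NUMBER_PATTERN.findall(narrative)
        if token not in retrieved_values
    ]


retrieved = {"142.3%", "138.0%"}
ok_draft = "Liquidity coverage was 142.3% against 138.0% last month."
bad_draft = "Liquidity coverage rose to roughly 145% this month."
```

Here `unsupported_numbers(ok_draft, retrieved)` returns an empty list, while the second draft yields `["145%"]`, which the self-review agent would flag for the human reviewer.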

Five patterns we have seen ship

  • Pattern 1: Daily ops MI digest. Replaces the analyst spending two hours each morning summarising overnight system signals. Agent reads ops logs, drafts a digest, sends to ops lead. Limited tier; no human approval is needed because the digest is not customer-facing.
  • Pattern 2: Board pack drafter. Agent assembles standard sections (financials, KPIs, risk dashboard) from source-of-truth systems and drafts narrative for each. Editor-in-chief reviews. 60% time saving on the assembly phase.
  • Pattern 3: Suitability note co-author. Agent drafts the suitability note from client data and product fit; advisor reviews and edits before the client sees it. Material tier; human always approves; AI involvement disclosed in the note where required.
  • Pattern 4: Regulatory return pre-fill. Agent pre-fills standard returns (e.g. monthly liquidity, periodic suitability reports) from regulatory data sources, flags variances vs prior period, and presents the result to compliance for review and submission.
  • Pattern 5: Audit working-paper extraction. Agent reads source documents, extracts findings against an audit template, and drafts working papers. Auditor reviews and retains independence over evidence and conclusions.
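
The variance flagging in Pattern 4 can be sketched as a plain comparison of current against prior-period figures. Everything here is an assumption for illustration: the function name `flag_variances`, the field names, and the 10% threshold are hypothetical, and each firm would set its own thresholds per return line.

```python
from decimal import Decimal


def flag_variances(current: dict, prior: dict,
                   threshold: Decimal = Decimal("0.10")) -> list:
    """Return (field, reason) pairs for figures that need compliance attention."""
    flags = []
    for name, value in current.items():
        if name not in prior:
            flags.append((name, "no prior-period value"))
            continue
        previous = prior[name]
        if previous == 0:
            # Relative change is undefined; flag any move away from zero.
            if value != 0:
                flags.append((name, "prior-period value was zero"))
            continue
        change = (value - previous) / abs(previous)
        if abs(change) > threshold:
            flags.append((name, f"{change:+.1%} vs prior period"))
    return flags


current_period = {"lcr": Decimal("125"), "nsfr": Decimal("101"), "new_line": Decimal("5")}
prior_period = {"lcr": Decimal("100"), "nsfr": Decimal("100")}
flags = flag_variances(current_period, prior_period)
```

Using `Decimal` rather than floats keeps the comparison exact for reported figures. The flags are presented alongside the pre-filled return; they inform the compliance reviewer and do not change any figure.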

Frequently asked questions

Is agentic AI for report generation acceptable for FCA-regulated firms?

Yes, with the right controls. The FCA AI Approach is technology-neutral; it expects the firm to operate AI within its existing risk and governance framework. For report generation specifically, the controls that matter are: provenance (every figure traceable to a source-of-truth), human approval before release for any regulated output, full audit trail of agent decisions, and clear identification of which sections were AI-drafted in audit working papers. With those in place, agentic report generation is no different in principle from any other operational technology.

Which reports should we automate first?

Start with high-volume, low-risk internal reports — operational MI, status summaries, log digests. The point is to build trust in the audit trail and the human-in-the-loop pattern before you ship anything regulator-facing. Once the operating model has run smoothly for a quarter on internal reports, expand into Significant-tier reports (board packs, internal incident reports). Material-tier reports (regulatory returns, customer suitability) come last, with full SM&CR mapping in place.

Can the agent draft regulatory returns directly?

It can draft them. It cannot submit them. The pattern that works in 2026: the agent drafts the return from the regulated data sources, surfaces variances against prior submissions, flags any data-quality issues, and presents the draft with a structured rationale to the named human approver (typically a senior compliance or finance professional under SMF16 or SMF24 oversight), who signs off. The agent is faster and more consistent at the drafting; the human carries the regulatory accountability.

How do we handle confidence in agent-generated numbers?

Two rules. (1) Numbers in regulated reports never come from the LLM — they come from typed retrieval against source-of-truth systems, with the retrieval result rendered into the report verbatim. The LLM produces narrative around them, not the numbers themselves. (2) Every number is annotated with its source identifier in the working draft (visible to the human approver, removed from the final). This is the same provenance discipline you would apply to a manually-prepared regulated report.
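
The two rules can be sketched as a template fill in which the narrative contains only named slots and the figures arrive as verbatim strings from retrieval. This is a minimal illustration under stated assumptions: the `Figure` type, the `{{name}}` slot syntax, the `render` function and the source identifier are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Figure:
    value: str      # verbatim string from the source system, never reformatted
    source_id: str  # identifier of the retrieval that produced it
    as_of: str      # as-of date of the underlying data


SLOT = re.compile(r"\{\{(\w+)\}\}")


def render(narrative: str, figures: dict, working_draft: bool) -> str:
    """Fill slots with retrieved values; annotate sources in the working draft."""
    def fill(match):
        # A KeyError here means the narrative cites a figure nobody retrieved.
        figure = figures[match.group(1)]
        if working_draft:
            return f"{figure.value} [source: {figure.source_id}, as of {figure.as_of}]"
        return figure.value

    return SLOT.sub(fill, narrative)


figures = {"lcr": Figure("142.3%", "liquidity_db.lcr_daily", "2026-03-31")}
narrative = "The liquidity coverage ratio was {{lcr}}."
draft_text = render(narrative, figures, working_draft=True)
final_text = render(narrative, figures, working_draft=False)
```

The approver sees `draft_text` with the source annotation inline; `final_text` carries the same figure without it. Because the LLM only ever writes the slot name, it has no opportunity to alter the number.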

What does the audit trail look like for a generated report?

Per generated report, you store: the agent ID and version hash, the prompt and grounding documents, every retrieval performed with source identifiers, the agent's draft output, every human edit with the editor identity and timestamp, the final approved version, the approver identity, and the release timestamp. This sits alongside the standard event-level audit trail for the agent itself. Cross-reference: see our 12-field audit trail schema.
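
A per-report record along these lines could be sketched as a dataclass with one field per item listed above, plus a content digest so later tampering is detectable. This is an illustrative shape only, not the 12-field schema referenced above: the class names, field names and the choice of SHA-256 over canonical JSON are our assumptions.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class HumanEdit:
    editor: str
    edited_at: str  # ISO 8601 timestamp
    change: str     # description or diff of the edit


@dataclass
class ReportAuditRecord:
    agent_id: str
    agent_version_hash: str
    prompt: str
    grounding_documents: list
    retrievals: list          # each entry carries a source identifier
    draft_output: str
    final_version: str
    approver: str
    released_at: str          # ISO 8601 timestamp
    human_edits: list = field(default_factory=list)

    def digest(self) -> str:
        """SHA-256 over a canonical JSON form of the whole record."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


record = ReportAuditRecord(
    agent_id="report-drafter",
    agent_version_hash="abc123",
    prompt="Draft the monthly liquidity summary.",
    grounding_documents=["template-v4"],
    retrievals=[{"source_id": "liquidity_db.lcr_daily", "value": "142.3%"}],
    draft_output="The liquidity coverage ratio was 142.3%.",
    final_version="The liquidity coverage ratio was 142.3%.",
    approver="approver-a",
    released_at="2026-04-30T09:00:00+00:00",
    human_edits=[HumanEdit("reviewer-a", "2026-04-30T08:30:00+00:00", "no changes")],
)
```

Storing `record.digest()` separately from the record itself gives a cheap integrity check: any later change to a stored field produces a different digest.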

Does the agent need to disclose itself in the report?

Internal reports: not required, but track which were AI-drafted in your audit log. Customer-facing reports: yes. The FCA position under Consumer Duty outcome 3 (consumer understanding) makes clear that communications should be transparent to the customer when AI has materially influenced what they read. The most defensive pattern is a brief, plain-English disclosure footer where AI was the primary drafter, with a "right to human review" pointer.

How much time does this actually save?

In our engagements: 60–80% drafting-time reduction on routine internal MI; 40–60% on board packs; 30–50% on regulated returns. The savings are not in the human-approval step — that should not be shortened. They are in the drafting, evidence-gathering, and consistency-checking phases that human reviewers traditionally spent half their time on. Net effect: senior reviewer time refocuses on judgement and approval, not assembly.

Ship report generation with regulated-grade controls

A Phase-Gate Diagnostic produces the architecture, the controls map, and the SMF approval workflow your compliance team can sign off on. Two weeks, £6,500.
