Audit-trail architecture for accounting AI in Ireland

If you're running AI inside an Irish accounting practice and you can't show an auditor exactly what the model saw, what it produced, who approved it, and where the source data came from — you don't have AI, you have a liability. Every regulated firm I talk to has the same blind spot: they've bolted ChatGPT or Copilot onto their workflow, the partners are quietly pleased with the time savings, and nobody can answer a basic question from CCAB-I or Revenue about how a figure was derived. This article is about fixing that. Not with policy documents. With architecture.

Why the standard SaaS AI stack fails an Irish audit

The default pattern most practices fall into is: staff paste client data into a hosted LLM, get a draft, edit it, send it. There are three problems with that from an audit perspective, and none of them are about data residency — though that matters too.

First, there's no immutable record of the prompt. The chat history sits in a vendor's account, scoped to a user, deletable, and not bound to the engagement file. Second, there's no record of which version of the model produced the output — providers update weights without notice, so "the AI said X on Tuesday" is not reproducible by Friday. Third, the source documents the model saw aren't linked to the output. If a draft set of accounts cites a figure, you can't trace it back to the underlying invoice, bank line, or trial balance entry.

For ISQM 1, ISA 230, and the Code of Ethics obligations Chartered Accountants Ireland members operate under, that's not an inconvenience. It's a documentation failure. The standard requires you to demonstrate the basis for professional judgement. "The AI suggested it" is not a basis. "The AI suggested it, here is the prompt, the retrieved source documents, the model version, the reviewer who approved it, and the timestamp" — that's a basis.

The four artefacts every AI action must produce

Whenever a model touches client data in a regulated firm, the system should write four artefacts to an append-only store before the output is shown to the user. Not after. Before.

  • The input envelope. The full prompt, the system instructions, the retrieval context (every document chunk pulled from your knowledge base with its source ID), the user identity, the client/engagement ID, and the timestamp.
  • The model fingerprint. Model name, version hash, temperature, top-p, any tool definitions exposed to it, and the runtime environment. If you're running locally, this is the container digest.
  • The output envelope. The raw model response, any tool calls it made, any retries, and the final rendered result given to the user.
  • The disposition. What the human did with it — accepted, edited (with diff), rejected, escalated. Tied to the reviewer's identity and the time they signed off.

If any of those four are missing, the action is unauditable. The architectural rule I apply is simple: the UI cannot display model output to the user until the first three artefacts are committed. The fourth follows when the human acts.
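
The commit-before-display rule can be sketched in a few lines of Python. The field names, the artefact shapes, and the `store.append` interface are illustrative assumptions, not a fixed schema — the point is the ordering: nothing is returned for rendering until the first three artefacts are committed.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone

@dataclass
class InputEnvelope:
    prompt: str
    system_instructions: str
    retrieval_chunks: list          # [(source_id, text), ...] from the knowledge base
    user_id: str
    engagement_id: str
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

@dataclass
class ModelFingerprint:
    model_name: str
    version_hash: str
    temperature: float
    top_p: float
    tools: list
    runtime: str                    # e.g. container digest for local inference

@dataclass
class OutputEnvelope:
    raw_response: str
    tool_calls: list
    retries: int
    rendered_result: str

def commit_then_display(store, envelope_in, fingerprint, envelope_out):
    """Write the first three artefacts to the append-only store; only
    return the output for display once all three commits succeed."""
    for artefact in (envelope_in, fingerprint, envelope_out):
        store.append(json.dumps(asdict(artefact), sort_keys=True))
    return envelope_out.rendered_result   # the UI may now render this
```

The fourth artefact — the disposition — is written later, when the reviewer acts, and references these three by ID.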

Append-only storage, hash chaining, and what "immutable" actually means

"Immutable audit log" gets thrown around loosely. In practice it means three things working together: append-only writes (no UPDATE, no DELETE), hash chaining (each record contains the hash of the previous one, so tampering anywhere breaks verification everywhere downstream), and out-of-band attestation (a periodic hash signed and stored somewhere the operations team cannot reach — a separate trustee, a notary service, or even a printout in a safe for the paranoid).

Postgres with an insert-only role (INSERT granted, UPDATE and DELETE revoked) works fine for the append-only layer. You don't need a blockchain. What you need is the chain itself: every audit row stores prev_hash and row_hash = SHA256(prev_hash || canonical_json(row)). Verification is a single pass through the table. If a row is altered or deleted, the chain breaks at that point and every subsequent verification fails. That's the property that makes the log defensible.
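
The whole mechanism fits in a short Python sketch. In production the rows would live in Postgres rather than a list, but the hashing logic is identical; the genesis sentinel and function names are illustrative.

```python
import hashlib
import json

def row_hash(prev_hash: str, row: dict) -> str:
    # row_hash = SHA256(prev_hash || canonical_json(row))
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode()).hexdigest()

def append_row(log: list, row: dict) -> None:
    prev = log[-1]["row_hash"] if log else "0" * 64   # genesis sentinel
    log.append({"row": row, "prev_hash": prev, "row_hash": row_hash(prev, row)})

def verify_chain(log: list) -> bool:
    """Single pass: recompute every hash. An altered or deleted row
    breaks verification from that point onward."""
    prev = "0" * 64
    for entry in log:
        if entry["prev_hash"] != prev or entry["row_hash"] != row_hash(prev, entry["row"]):
            return False
        prev = entry["row_hash"]
    return True
```

Canonical JSON (sorted keys, fixed separators) matters: if two systems serialise the same row differently, their hashes won't match and verification fails for the wrong reason.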

The out-of-band step matters because someone with database admin rights could in principle truncate the table and rebuild a fresh chain. Periodically signing the head hash and depositing it externally — even just emailing it to a partner's personal address on a schedule — defeats that. It's the same logic as the old "bag and tag" exhibit chain, just digital.
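
A minimal sketch of the attestation step, with an HMAC standing in for a real asymmetric signature (Ed25519 or similar would be the production choice) — the trustee key is assumed to live outside the operations team's reach:

```python
import hmac
import hashlib
from datetime import datetime, timezone

def attest_head(head_hash: str, trustee_key: bytes) -> dict:
    """Sign the current chain head so it can be deposited externally.
    HMAC is a stand-in here; a real deployment would use an asymmetric
    signature so operations never holds the verification key's pair."""
    payload = f"{head_hash}|{datetime.now(timezone.utc).date().isoformat()}"
    sig = hmac.new(trustee_key, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": sig}

def verify_attestation(att: dict, trustee_key: bytes) -> bool:
    expected = hmac.new(trustee_key, att["payload"].encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, att["signature"])
```

Run it on a schedule, deposit the result with the trustee, and a rebuilt chain is detectable the moment its head no longer matches any attested value.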

Retrieval lineage: tying outputs back to source documents

This is where most internal AI projects fall apart. The model produces a paragraph that cites a figure. The auditor asks where the figure came from. The team says "from the trial balance". The auditor asks which line. Silence.

The fix is retrieval lineage. Every chunk that goes into the prompt context must carry a stable source identifier — not a filename, but a content-addressed reference: document hash, page or row number, and the version of the document at the moment of retrieval. When the model produces output, the system asks it to cite chunk IDs inline, then validates after the fact that the cited chunks were actually in context. Hallucinated citations get flagged before the user sees them.
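
The validation step is straightforward to sketch. The `[chunk:ID]` inline citation syntax is an assumed convention, not a standard — whatever syntax you instruct the model to use, the check is the same set difference:

```python
import re

# Assumed inline citation convention, e.g. "revenue was 1.2m [chunk:tb_041]"
CITATION = re.compile(r"\[chunk:([A-Za-z0-9_-]+)\]")

def validate_citations(output_text: str, context_chunk_ids: set) -> list:
    """Return chunk IDs the model cited that were NOT in the prompt
    context — i.e. hallucinated citations to flag before display."""
    cited = set(CITATION.findall(output_text))
    return sorted(cited - context_chunk_ids)
```

Anything this returns blocks the output from reaching the user until a human looks at it.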

For accounting work specifically, this means your RAG pipeline needs to index more than PDFs. It needs to index ledger lines, bank feed entries, and reconciliation states with the same rigour as policy documents. A good test: can your system answer "show me every AI-assisted output in the last quarter that referenced this invoice" in under a second? If not, the lineage isn't bidirectional, and you're going to struggle when a client query comes in.
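
Bidirectional lineage is just two indexes maintained together. A minimal in-memory sketch — in production these would be two indexed database tables, and the names here are illustrative:

```python
from collections import defaultdict

class LineageIndex:
    """Forward index (output -> sources) for audit documentation;
    reverse index (source -> outputs) for the invoice-style query."""
    def __init__(self):
        self.output_to_sources = defaultdict(set)
        self.source_to_outputs = defaultdict(set)

    def record(self, output_id: str, source_ids: list) -> None:
        for sid in source_ids:
            self.output_to_sources[output_id].add(sid)
            self.source_to_outputs[sid].add(output_id)

    def outputs_referencing(self, source_id: str) -> set:
        # "Show me every AI-assisted output that referenced this invoice"
        return self.source_to_outputs.get(source_id, set())
```

If the reverse index is populated at the same moment as the forward one, the sub-second query comes for free.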

This is one of the design constraints that drove the accounting build of the Intelligence Brain — the retrieval index has to treat structured ledger data as first-class, not as a bolt-on, because that's where the audit questions land.

Human-in-the-loop checkpoints that actually mean something

Most "human in the loop" controls I see in accounting AI deployments are theatre. A reviewer clicks accept on a draft they didn't read, the system records "approved", and everyone feels better. That's worse than no control, because it manufactures false assurance.

Real checkpoints have three properties. They're proportionate to risk — a draft cover letter doesn't need partner sign-off, a tax computation does. They capture the diff — what did the reviewer actually change, and was it material? And they record dwell time — if a reviewer "approved" a six-page document in eleven seconds, the log shows that, and quality assurance can sample those for re-review.
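
Capturing the diff and the dwell time is the fourth artefact from earlier. A sketch with illustrative names and shapes — the standard library's difflib does the heavy lifting:

```python
import difflib
import time

def record_disposition(log: list, reviewer_id: str, draft: str,
                       final: str, opened_at: float) -> dict:
    """Record what the reviewer did: the diff between draft and final,
    and how long the document was open before sign-off."""
    diff = list(difflib.unified_diff(draft.splitlines(),
                                     final.splitlines(), lineterm=""))
    disposition = {
        "reviewer": reviewer_id,
        "action": "edited" if diff else "accepted",
        "diff": diff,
        "dwell_seconds": round(time.time() - opened_at, 1),
    }
    log.append(disposition)
    return disposition
```

An empty diff plus a dwell time of a few seconds on a long document is exactly the pattern quality assurance should be sampling.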

The dwell-time signal is uncomfortable to implement because staff don't like being measured. But the alternative is that your audit defence rests on the claim that someone reviewed the work, with no evidence of review effort. Pick which conversation you'd rather have with a regulator.

Data residency, processor obligations, and where the model actually runs

Irish firms operate under GDPR, the Data Protection Act 2018, and — for client confidentiality — the Chartered Accountants Ireland code. If you're sending client data to a US-hosted model API, you're a controller relying on a processor relationship with a sub-processor chain you mostly can't see. Standard Contractual Clauses help. They don't eliminate the disclosure obligation to clients, and they don't help you when the regulator asks for a specific record of processing activities for a specific engagement.

On-premise or sovereign-cloud deployment fixes the residency question and simplifies the processor analysis to almost nothing — the data never leaves your infrastructure. It also lets you pin model versions, which means your audit trail's "model fingerprint" field stays valid for years rather than weeks. You decide when to upgrade. You decide when to retire a version. The model becomes a piece of software you own, not a service you rent.

The trade-off is operational. You now run GPUs, or you pay someone to run them for you in an Irish or EU data centre. For a mid-sized practice that's a real cost decision, but it's the cost of having defensible AI rather than convenient AI. The architectural shape I describe in the Intelligence Brain overview is built around that constraint — local inference, local retrieval, local logs, with the audit chain anchored externally.

Where to start this week

Don't try to build all of this at once. This week, do one thing: pick a single AI use case in your firm — drafting client correspondence, summarising a set of accounts, anything narrow — and write down the four artefacts it currently produces. Most firms will find they capture none of them. That gap analysis, on one page, is what you take to your next partners' meeting. Once the partners can see in concrete terms what an audit trail looks like and what's missing today, the conversation about whether to keep using public AI tools or move to a controlled deployment becomes much shorter.

Book a 30-minute assessment

Direct with Michael. No charge. No pitch deck.

Pick a slot →