Every Irish public body I've spoken to since founding IMPT has the same FOI problem, and it isn't the law. The 2014 Act, which replaced the 1997 original, is workable. The problem is the mechanics: a request lands by email, a decision-maker has four weeks, and somewhere in a tangle of shared drives, Outlook archives, SharePoint sites, finance systems and meeting minutes sit the records that have to be located, considered against Sections 28 to 41, redacted, scheduled, and released. The work is mostly searching and reading. That is precisely the work an on-premise intelligence layer is good at — and precisely the work you cannot push to a public cloud LLM without breaching the very confidentiality your exemptions are meant to protect.
Why FOI is a retrieval problem before it is a legal one
A typical non-personal FOI request might read: "All records held by the body relating to the procurement of [system X] between [date A] and [date B], including correspondence with the supplier, internal evaluations, and minutes of any meetings where the procurement was discussed." That single sentence triggers four distinct retrieval tasks. You need email between named and unnamed parties on a shifting topic. You need documents — tenders, evaluations, memos — that may or may not use the supplier's name in the filename. You need calendar entries and the minutes attached to them. And you need to detect records that refer to the procurement without naming it directly, because a competent requester will appeal a schedule that obviously misses things.
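To make the decomposition concrete, here is a minimal sketch of how a single request sentence expands into those four retrieval tasks. The structure and field names are illustrative assumptions, not any particular product's schema:

```python
from dataclasses import dataclass, field

@dataclass
class RetrievalTask:
    """One of the sub-searches an FOI request expands into."""
    kind: str     # "email", "document", "calendar", "implicit"
    query: str    # the semantic query for this sub-search
    filters: dict = field(default_factory=dict)  # date range, custodians, etc.

def expand_request(supplier: str, date_from: str, date_to: str) -> list[RetrievalTask]:
    """Hypothetical expansion of the procurement request above into
    the four retrieval tasks it implies."""
    dates = {"from": date_from, "to": date_to}
    return [
        RetrievalTask("email", f"correspondence with {supplier} about the procurement", dict(dates)),
        RetrievalTask("document", f"tenders, evaluations and memos on the {supplier} procurement", dict(dates)),
        RetrievalTask("calendar", f"meetings and minutes where the {supplier} procurement was discussed", dict(dates)),
        RetrievalTask("implicit", "records referring to the procurement by codename or description only", dict(dates)),
    ]

tasks = expand_request("Supplier X", "2022-01-01", "2023-06-30")
```

The fourth task is the one keyword search never generates on its own, and it is the one that decides whether the schedule survives an appeal.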
Keyword search inside Outlook and SharePoint will not find half of this. The supplier might be referred to internally by a project codename. Evaluation scoring sheets might live as Excel files named "Final scores v3 (clean)". Minutes might say "the new platform" rather than the system name. The Decision-Maker ends up doing what FOI officers across the country do every day — emailing five or six people asking "do you have anything on this?" — and hoping nothing is missed.
This is a semantic retrieval problem. It is the kind of thing a properly indexed, locally-hosted vector store with a competent reasoning layer on top can do in minutes rather than days. But it has to be local. The records being searched include the exact material the exemptions in Part 4 of the Act exist to protect.
The on-premise constraint is not optional for FOI work
I want to be blunt about this because it gets fudged in vendor pitches. If you send the contents of a public body's email archive to a hosted LLM API to do FOI triage, you have created a disclosure that the Act never contemplated and that the body cannot defend. It does not matter that the vendor says they don't train on your data. It does not matter that there is a DPA in place. The records may include third-party commercially sensitive information (Section 36), law enforcement and security-related material (Sections 32 and 33), Government deliberations (Section 29), and personal data of identifiable individuals (Section 37). Routing those through a US-hosted inference endpoint, even via an EU region, is not a defensible posture under either FOI or GDPR.
This is why the Intelligence Brain for public sector runs on hardware the body controls. The model weights, the vector index, the document store, the audit log — all of it sits inside the body's own network boundary. Nothing leaves. The reasoning layer can be a quantised open-weights model running on a single GPU server for a smaller body, or a clustered setup for a Department. The architecture is the same: ingestion, embedding, retrieval, reasoning, and output, all local.
What the ingestion layer actually has to handle
The unglamorous part of FOI AI is the ingestion. A real public body has records in: Exchange or Microsoft 365 mailboxes; SharePoint Online and on-prem SharePoint that someone forgot about; a finance system (Agresso, SAP, or similar) with attached invoices and POs; an HR system; case management systems specific to the body's function; shared drives with fifteen years of accreted folders; Teams chat history; and PDFs scattered across all of the above, half of them image-only scans.
An FOI-capable intelligence brain needs connectors that pull from each of these, OCR for the scans, and a normalisation step that produces a consistent document object: source system, original location, author, recipients, date created, date modified, content, and crucially, a stable identifier so the schedule of records you produce at the end actually points back to something. The OCR step is where most homegrown attempts fall over. Tesseract on a tilted scan of a 2009 letter produces gibberish. You need a modern document AI step — LayoutLM-style or a vision model — that handles tables, signatures, and handwritten margin notes.
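The normalised document object described above can be sketched in a few lines. The field names and the hashing choice are my assumptions for illustration; the point is that every record, whatever its source system, ends up in one shape with one deterministic identifier:

```python
from dataclasses import dataclass
from hashlib import sha256

@dataclass(frozen=True)
class RecordObject:
    """Normalised record produced by the ingestion layer."""
    source_system: str        # e.g. "exchange", "sharepoint", "finance"
    original_location: str    # path, mailbox folder, or URL in the source
    author: str
    recipients: tuple[str, ...]
    date_created: str         # ISO 8601
    date_modified: str
    content: str              # extracted or OCR'd text

    @property
    def stable_id(self) -> str:
        """Deterministic identifier derived from source and location,
        not from content, so re-extraction or re-OCR of the same record
        does not break the schedule's back-references."""
        key = f"{self.source_system}|{self.original_location}|{self.date_created}"
        return sha256(key.encode()).hexdigest()[:16]
```

Deriving the identifier from provenance rather than content is a deliberate choice in this sketch: OCR improves over time, and the schedule entry must keep pointing at the same record after re-processing.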
Embeddings then go into a local vector store. I lean toward Qdrant or Weaviate self-hosted, but the choice matters less than the discipline of versioning the embeddings alongside the model that produced them, so that when you swap the embedding model in eighteen months you can reindex without losing provenance.
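What that versioning discipline looks like in practice: each point carries, alongside its vector, a payload naming the model that produced the embedding. This is a structural sketch, not Qdrant's or Weaviate's actual API, and the model name is a placeholder:

```python
def index_entry(record_id: str, vector: list[float],
                embed_model: str, embed_version: str) -> dict:
    """One vector-store point, tagged with the embedding model that
    produced it so a later model swap can be handled per-version."""
    return {
        "id": record_id,
        "vector": vector,
        "payload": {
            "embedding_model": embed_model,      # placeholder model name
            "embedding_version": embed_version,  # bump when the model is swapped
        },
    }

def needs_reindex(entry: dict, current_version: str) -> bool:
    """Select points embedded with an older model for re-embedding.
    The record_id (and so the schedule's provenance) never changes."""
    return entry["payload"]["embedding_version"] != current_version
```

When the embedding model changes, you re-embed only the stale points, and because the identifiers are stable, nothing downstream — schedules, audit logs, previous decisions — needs touching.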
From request to schedule of records
Once the corpus is indexed, the FOI workflow becomes tractable. A Decision-Maker pastes the request into the system. The reasoning layer expands it — it identifies the entities (supplier names, system names, likely codenames inferred from the corpus itself), the date range, the record types in scope, and the people likely to hold relevant material. It runs a hybrid search: dense vector similarity for semantic matches, BM25 for exact phrase and name matches, and metadata filters for date and custodian.
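The hybrid step can be sketched with a simple weighted fusion of the two scores after the metadata filter is applied. Real systems often use reciprocal rank fusion instead; the weighting and the candidate shape here are assumptions for illustration:

```python
def hybrid_score(dense: float, bm25: float, alpha: float = 0.6) -> float:
    """Weighted fusion of a dense-similarity score and a normalised
    BM25 score. alpha is a tuning assumption, not a recommendation."""
    return alpha * dense + (1 - alpha) * bm25

def hybrid_search(candidates: list[dict], date_from: str, date_to: str,
                  alpha: float = 0.6, k: int = 10) -> list[dict]:
    """candidates: dicts with 'dense', 'bm25' and ISO 'date' keys —
    stand-ins for what the vector store and keyword index return."""
    # Metadata filter first: date range (ISO strings compare correctly).
    in_range = [c for c in candidates if date_from <= c["date"] <= date_to]
    # Then rank by the fused score.
    ranked = sorted(in_range,
                    key=lambda c: hybrid_score(c["dense"], c["bm25"], alpha),
                    reverse=True)
    return ranked[:k]
```

The dense score is what surfaces "the new platform" when the request says the system's name; the BM25 score is what guarantees an exact supplier-name match never ranks below a vague semantic one.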
The output is a candidate set, ranked, with each record annotated: why it was retrieved, which part of the request it appears to respond to, and a first-pass exemption flag — does this record contain personal data of a third party (Section 37)? Does it contain commercially sensitive information (Section 36)? Does it relate to a meeting of the Government (Section 28)? The flagging is a recommendation, not a decision. The Decision-Maker still decides. But instead of reading two thousand emails, they are reviewing a structured shortlist of perhaps two hundred, with the reasoning visible for each one.
The same system can draft the schedule of records in the format the body uses — record number, date, description, decision (grant, part-grant, refuse), exemption cited, page count. It can draft the initial decision letter with the standard recitals and the body-specific paragraphs. It cannot sign it. A human signs it, because a human is accountable under the Act.
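Drafting the schedule itself is mechanical once the annotations exist. A minimal sketch, with column names taken from the format described above (your body's template will differ):

```python
import csv
import io

SCHEDULE_COLUMNS = ["record_no", "date", "description", "decision", "exemption", "pages"]

def draft_schedule(records: list[dict]) -> str:
    """Assemble the schedule of records as CSV. This only drafts rows;
    the decision and exemption values come from the human review."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=SCHEDULE_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for i, rec in enumerate(records, start=1):
        writer.writerow({
            "record_no": i,
            "date": rec["date"],
            "description": rec["description"],
            "decision": rec["decision"],            # grant / part-grant / refuse
            "exemption": rec.get("exemption", ""),  # e.g. "s.37", blank on full grant
            "pages": rec["pages"],
        })
    return buf.getvalue()
```

The record numbers and back-references come straight from the stable identifiers minted at ingestion, which is why the ingestion discipline matters more than the drafting.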
Redaction is where most automation fails badly
Redaction is the step that vendors love to demo and that fails in production. The demo shows the system blacking out names and email addresses. The production reality is that under Section 37 you are redacting personal data of identifiable individuals — which includes the staff member's initials in a margin note, the desk phone number on a letterhead, the photograph on page 4, the licence plate visible in an annexed site photo, and the handwritten signature. Under Section 36 you are redacting pricing structures that may be inferable from totals elsewhere in the document.
A useful intelligence brain handles this by treating redaction as a two-pass process. First pass is detection: NER tuned on Irish names, organisation names, addresses with Eircode awareness, financial figures, and a body-specific dictionary of terms-of-art that have been redacted in previous releases. Second pass is consistency: if a name was redacted on page 3, it must be redacted on page 47, and the system must check whether the surrounding context makes it identifiable even when the name is gone. Then a human reviews. Always.
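The two-pass shape is worth seeing in miniature. This sketch stands in a tuned NER model with a supplied dictionary and a crude phone-number pattern — assumptions for illustration only — but the consistency pass is the real point:

```python
import re

def detect_pass(text: str, known_names: set[str]) -> set[str]:
    """First pass: detection. A production system uses NER tuned on
    Irish names, Eircodes, etc.; this sketch matches a dictionary of
    names plus a crude phone-number pattern."""
    found = {name for name in known_names if name in text}
    found |= set(re.findall(r"\b\d{2,4}[ -]\d{5,7}\b", text))
    return found

def consistency_pass(pages: list[str], targets: set[str]) -> list[str]:
    """Second pass: consistency. Anything redacted once is redacted
    on every page, so page 47 matches page 3."""
    redacted = []
    for page in pages:
        for target in targets:
            page = page.replace(target, "[REDACTED]")
        redacted.append(page)
    return redacted

pages = [
    "Letter from Aoife Murphy, tel 01-5551234.",
    "Ms Murphy's note on Aoife Murphy's scores.",
]
targets = detect_pass(" ".join(pages), {"Aoife Murphy"})
clean = consistency_pass(pages, targets)
```

Note what the sketch deliberately leaves exposed: "Ms Murphy's note" survives both passes, which is exactly the contextual-identifiability gap described above — the reason the human review step is not optional.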
The audit log records every redaction, who confirmed it, and which exemption was cited. When the requester appeals to the Information Commissioner, you produce that log. That is the difference between defensible automation and the kind of automation that costs you a finding against the body.
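A defensible audit entry needs very little: what was redacted, in which record, under which exemption, confirmed by whom, and when. A minimal append-only sketch (the field names are mine, not any standard's):

```python
import json
import time

def log_redaction(log: list[str], record_id: str, target: str,
                  exemption: str, confirmed_by: str) -> None:
    """Append one immutable audit entry as a JSON line — the trail
    produced when the requester appeals to the Information Commissioner."""
    log.append(json.dumps({
        "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "record_id": record_id,
        "redacted": target,
        "exemption": exemption,     # e.g. "s.37"
        "confirmed_by": confirmed_by,
    }))
```

In production this would be an append-only store rather than an in-memory list, but the invariant is the same: no redaction without an entry, and no entry without a named confirmer.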
What this changes for FOI officers
The point of an intelligence brain is not to replace the FOI officer. It is to give one officer the search reach that previously required ten people across five business units to provide. The legal judgement, the public-interest balancing test under Section 11(3), the decision to grant or refuse — those stay with the human, because the Act requires a named Decision-Maker and because the judgements are not mechanical. What changes is the proportion of the four-week window spent on retrieval versus on judgement. At the moment it is roughly the wrong way around in most bodies I've looked at.
There is also a longer-term effect. Once the corpus is indexed and the workflow is in place, the body starts to see patterns: which units generate the most requests, which record types are most often refused and on what grounds, which exemptions survive appeal and which don't. That feeds back into proactive publication under the Publication Scheme, which is the part of the Act everyone agrees with and almost nobody resources properly.
Where to start this week
If you run FOI in an Irish public body, the useful thing to do this week is not to procure anything. It is to take the last twenty closed requests and write down, for each, where the records actually came from, how long retrieval took, and what was missed on first pass. That document is the brief for any intelligence-brain project worth doing. If the answer is "retrieval took four hours and we missed nothing", you do not need this. If the answer is closer to what I hear most often — that retrieval ate two of the four weeks and an appeal surfaced records nobody had searched — then the case for a local, on-premise intelligence layer makes itself, and you can have a grown-up conversation about architecture rather than a vendor conversation about features.