AI-powered data room analysis for M&A due diligence | Magilium

Overview

For mid-market M&A, the document review phase of due diligence is the single biggest bottleneck. A typical deal involves several hundred files spanning financial, legal, commercial, tax, operational and ESG domains. The work of opening every file, ticking off requirements against a checklist, cross-referencing figures and writing up what is missing has historically fallen to junior analysts — slow, expensive, inconsistent, and one of the least liked parts of the job.

Magilium designed and built an AI-powered system that replaces this manual workflow with a structured, LLM-maintained wiki. The system reads each document once, accumulates findings against the firm's diligence requirements, surfaces contradictions and gaps, and produces a defensible assessment with citations to source documents. The wiki sits between the raw data room and the deal team, who read and edit it rather than the underlying PDFs.

The challenge

The dominant tools in this space — keyword matchers, filename-based checklists, and more recently RAG-based search — all fail at due diligence in different ways.

Filename matching is unreliable because filenames lie. A board pack PDF contains financial, legal, commercial and HR information all at once. A document called "Insurance Schedule" simultaneously contributes to financial, legal and ESG checklists. No rule could be written to catch this without enumerating every possible document type — which is to say, without solving comprehension in the first place.

RAG (retrieval-augmented generation) treats due diligence as a Q&A workload. It is not. Due diligence is a synthesis workload across hundreds of documents and dozens of interlocking requirements. RAG re-derives the answer from scratch every time a question is asked, has no way to reason about what is missing, and cannot reliably detect contradictions between documents: retrieval returns whichever passages best match the current query, and nothing puts two conflicting passages side by side unless a single question happens to retrieve both.

Single-pass "stuff everything into one big context" approaches fail differently again. They produce no reviewable artifact, cannot handle the new files that arrive every few days during a live deal, and degrade unpredictably on long contexts.

The real problem to solve was not "answer questions about a data room." It was "produce a working, defensible artifact that accumulates and stays current as the data room grows."

The approach

The system Magilium built treats the data room not as a search index but as input to an incrementally maintained wiki. The pipeline has six steps:

1. Extract. Python pulls text out of PDFs, Word, Excel and PowerPoint files, and images. Deterministic, cheap, runs once per file.

2. Scaffold. The system reads the firm's requirements schema and creates an empty wiki page for each diligence domain.

3. Ingest. An LLM reads each extracted document and updates every domain page the document is relevant to. A single insurance certificate may touch financial, legal and ESG pages in one pass. Evidence is recorded as specific facts — figures, dates, document names — never as "see attached".

4. Lint. The LLM rereads the wiki and looks for contradictions, placeholder text, stale evidence, missing cross-references and coverage gaps.

5. Assess. The LLM reads the linted wiki plus the firm's domain knowledge and produces a structured assessment: for each requirement, met or not, with the specific evidence pinned to it.

6. Export. Python turns the assessment into an Excel workbook for the deal team.
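
To make the ingest step concrete, here is a minimal sketch of how one extracted document might be handed to the model and its page updates written back. It assumes one markdown file per domain in a wiki directory; the names ingest_document and update_pages, and the shape of the model interface, are hypothetical and not taken from Magilium's code. The Python side only reads and writes files, and the decision about which pages a document touches stays with the model.

```python
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical model interface: takes the document name, its extracted text and
# the current wiki pages (domain -> markdown), and returns the full new text of
# every page it chose to update. All analysis happens behind this call.
UpdatePages = Callable[[str, str, Dict[str, str]], Dict[str, str]]


def ingest_document(doc_name: str, doc_text: str, wiki_dir: Path,
                    update_pages: UpdatePages) -> List[str]:
    """Feed one extracted document to the model and write back changed pages."""
    # Load every scaffolded domain page so the model sees the current state.
    pages = {p.stem: p.read_text(encoding="utf-8")
             for p in sorted(wiki_dir.glob("*.md"))}
    changed = update_pages(doc_name, doc_text, pages)

    # Only scaffolded domains may be written; anything else is a model error.
    unknown = sorted(set(changed) - set(pages))
    if unknown:
        raise KeyError(f"model returned unknown domain pages: {unknown}")

    for domain, new_text in changed.items():
        (wiki_dir / f"{domain}.md").write_text(new_text, encoding="utf-8")
    return sorted(changed)
```

Because the function returns the list of domains it wrote, a single document that touches the financial, legal and ESG pages shows up as one call with three changed files, which also makes for a readable git diff per document.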

The wiki is plain markdown in a git repo — human-readable, diffable, reviewable at every stage. When new files arrive on day five of a deal, the system extracts only the new ones, the LLM updates only the relevant pages, and the assessment is regenerated. Nothing is reprocessed unnecessarily.
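
One way to decide which files count as new is a content-hash manifest, sketched below. This is an assumption about how the incremental step could work, not a description of the actual implementation: the function names and the JSON manifest format are hypothetical. Hashing contents rather than trusting names or timestamps means a corrected file re-uploaded under the same name is also picked up.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_manifest(manifest_path: Path) -> Dict[str, str]:
    """Map of relative path -> digest recorded on earlier runs (empty if none)."""
    if manifest_path.exists():
        return json.loads(manifest_path.read_text(encoding="utf-8"))
    return {}


def pending_files(data_room: Path, manifest_path: Path) -> List[Path]:
    """Files that are new, or whose contents changed, since the last run."""
    seen = load_manifest(manifest_path)
    pending = []
    for path in sorted(p for p in data_room.rglob("*") if p.is_file()):
        key = path.relative_to(data_room).as_posix()
        if seen.get(key) != file_digest(path):
            pending.append(path)
    return pending


def mark_processed(data_room: Path, manifest_path: Path, paths: List[Path]) -> None:
    """Record the current digest of each processed file in the manifest."""
    seen = load_manifest(manifest_path)
    for path in paths:
        seen[path.relative_to(data_room).as_posix()] = file_digest(path)
    manifest_path.write_text(json.dumps(seen, indent=2, sort_keys=True),
                             encoding="utf-8")
```

The manifest should live outside the data room directory, otherwise it would be listed as a pending file itself.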

The architecture separates the system, the firm's diligence criteria, and a particular live deal. The base layer (prompts, scripts, conventions) is firm-agnostic. Each PE firm has a configuration directory containing its requirements schema and domain knowledge — what each requirement means, what good looks like, what counts as a red flag. Each deal is a project under that firm.
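
The three layers can be pictured as a directory layout. The sketch below is illustrative only: the folder names (base, firms, deals, wiki), the file name requirements.json and the class DealContext are all assumptions, since the text above does not specify the actual layout.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class DealContext:
    """Resolves where each layer lives for one deal (hypothetical layout)."""
    root: Path
    firm: str
    deal: str

    @property
    def base_dir(self) -> Path:
        # Firm-agnostic prompts, scripts and conventions.
        return self.root / "base"

    @property
    def firm_dir(self) -> Path:
        # Per-firm requirements schema and domain knowledge.
        return self.root / "firms" / self.firm

    @property
    def requirements_schema(self) -> Path:
        return self.firm_dir / "requirements.json"

    @property
    def deal_dir(self) -> Path:
        # Each deal is a project under its firm.
        return self.firm_dir / "deals" / self.deal

    @property
    def wiki_dir(self) -> Path:
        return self.deal_dir / "wiki"
```

Keeping the deal directory nested under the firm directory means a new deal inherits the firm's criteria simply by where it sits, with nothing to copy.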

One rule does most of the work: Python does I/O only. The LLM does all the analysis. No regex over wiki pages. No keyword scoring. No fuzzy filename matching. The moment the system starts reimplementing comprehension in deterministic code, it has rebuilt the broken thing it was designed to escape.

The outcome

The system replaces what was previously several days of junior-analyst time per deal with a process that runs in hours and produces a more thorough, more consistent and more defensible artifact.

The specific gains the deal team noticed first:

Coverage of every requirement is explicit. Gaps are first-class outputs. "No evidence found for requirement 3.2" appears in the assessment by construction, rather than being something the analyst might or might not have flagged.

Contradictions surface automatically. When two documents disagree about last year's revenue, both findings land on the same domain page during ingest, and the lint pass surfaces the conflict. This is where the real money in due diligence is hiding, and it is the failure mode every previous tool was structurally bad at.

Quality assessment, not just presence. The system can tell you that the management accounts are unsigned, the going concern footnote is hedged, or the VAT reconciliation is missing. The output is not "document found" but "document found, but unsigned and covering only six months, so it falls short of Level 3."

Incremental updates are cheap. New files cost work proportional to their size, not to the size of the data room. This matches the operational reality of a live deal, where files arrive and are corrected throughout the process.

The artifact is reviewable. The wiki is markdown in a git repo. A deal partner can open the financial page and read it. A junior analyst can spot-check the citations. There is no "trust the model" step — the model's work is sitting on disk in a form that can be read, edited and version-controlled.

Set up once, run many. A PE firm's diligence criteria and domain knowledge are configured once. From then on, every new deal reuses the same standard, which means the firm's diligence becomes more consistent across deals and across analysts than was previously possible.
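
The first gain, explicit coverage, is easiest to see in the shape of the output. The sketch below shows one way the guarantee could be enforced when the assessment is assembled: iterate over the requirement IDs in the firm's schema, not over whatever the model returned. The names AssessmentRow and complete_assessment and the shape of the findings dictionary are hypothetical. No analysis happens here; the code only checks that every requirement has a row.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AssessmentRow:
    requirement_id: str
    met: bool
    evidence: str


def complete_assessment(requirement_ids: List[str],
                        findings: Dict[str, dict]) -> List[AssessmentRow]:
    """Return one row per requirement, in schema order, with gaps made explicit."""
    # A finding for a requirement the schema does not know is a model error.
    unknown = sorted(set(findings) - set(requirement_ids))
    if unknown:
        raise ValueError(f"findings for unknown requirements: {unknown}")

    rows = []
    for rid in requirement_ids:
        finding = findings.get(rid)
        if finding is None:
            # The gap becomes a row of its own and cannot be silently dropped.
            rows.append(AssessmentRow(rid, False,
                                      f"No evidence found for requirement {rid}"))
        else:
            rows.append(AssessmentRow(rid, bool(finding["met"]),
                                      str(finding["evidence"])))
    return rows
```

Driving the loop from the schema is the design choice that matters: a requirement the model never mentioned still appears in the workbook, marked as not met.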

What it does not do

The system does not eliminate human judgement, and it does not run itself. The LLM still needs to be pointed at the documents, the prompts still need to be tuned for each firm's standards, and a human still needs to read the final assessment and decide whether to do the deal. Image and scanned-PDF handling needs visual review; very large data rooms still benefit from batching.

What the system does eliminate is the part nobody on the deal team wanted to do anyway: opening every file, ticking every box, cross-referencing every figure, and writing up what is missing. That work now lives on disk, in a wiki the LLM keeps current and a human can read. The deal team gets to do the part that actually requires judgement, with a substrate underneath them that is more thorough and more consistent than any junior analyst could be.

This is the pattern Magilium increasingly applies wherever documents accumulate against a known set of requirements — due diligence, regulatory submissions, tender responses, accreditation, audit. The same mechanics; the same payoff. Where the work is high-stakes, repetitive across instances and structured around a known set of requirements, the LLM-maintained wiki is the right artifact to put between an LLM and a pile of documents.