Why vertical AI is winning the compliance race

May 26, 2026

In 2023, New York attorney Steven Schwartz submitted a legal brief to a federal court in Manhattan citing six cases that simply did not exist. He had used ChatGPT to conduct his research and, crucially, had not verified a single citation against an authoritative legal database.

According to Sherlocq, the judge described the submission as “legal gibberish” and sanctioned both Schwartz and his colleague $5,000 each. Chief Justice John Roberts later highlighted the case in his 2023 year-end report as an early warning sign about AI risks in regulated professional environments.

Sherlocq recently examined AI for compliance and what practitioners need that generic tools cannot deliver.

For compliance practitioners, however, the case pointed to something more unsettling than one lawyer’s oversight. It illustrated a systemic mismatch that is playing out across every regulated industry — finance, insurance, legal — where professionals are being handed AI tools built for breadth and speed, then expected to use them in environments that demand precision, verifiability, and accountability.

When breadth meets depth

General-purpose large language models (LLMs) such as ChatGPT are remarkable tools. They synthesise information at scale, draft with fluency, and adapt across almost any task. But they were designed for breadth. Compliance work demands a very particular kind of depth that cuts against the grain of how consumer-facing AI is built.

A compliance professional does not need an AI that can write a poem, explain quantum mechanics, and summarise a news article in the same session. They need an AI that understands the difference between Regulation D and Regulation DD. That knows when a FINRA notice supersedes a prior interpretation. That can trace a specific obligation back to its originating statute without conflating jurisdictions. That can tell a practitioner not just what a rule says, but when it was last updated, whether it applies to their business line, and what the enforcement history looks like.

Generic AI is optimised for plausibility. Compliance work demands verifiability. In regulated environments, the gap between those two things can cost firms millions in fines and reputational damage.

The hallucination problem is not trivial

Hallucination, in which an AI model generates false information with high confidence, is a well-documented limitation of LLMs. For most consumer use cases, a hallucinated fact is a minor inconvenience. In compliance, it can constitute a material risk event.

Consider what a compliance officer actually does with AI output. They do not verify every sentence against source documents, since that would defeat the purpose of automation. They act on the output. They update policies. They brief boards. They report to regulators. They train staff. An AI that confidently cites a rule that no longer exists, or misquotes an exemption threshold by a decimal point, introduces error deep into institutional decision-making before anyone catches it.

Generic AI tools have made some progress on hallucination through retrieval-augmented generation (RAG), but they apply these techniques broadly across all domains. Vertical AI tools built specifically for compliance take a fundamentally different approach: they index authoritative regulatory sources, such as SEC releases, CFPB bulletins, PRA guidance, and ESMA technical standards, and constrain the model to reason within that corpus. When the answer is not in the source material, a well-built compliance AI says so, rather than improvising.
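As a rough illustration of "answer from the corpus or say so", the Python sketch below retrieves the best-matching passage from a tiny in-memory corpus and abstains when nothing matches well enough. Everything in it is an assumption made for illustration: the source identifiers, the placeholder passages (which are not real regulatory text), the keyword-overlap scoring, and the `min_overlap` threshold. A production system would use a proper retriever and a language model, not word counting.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    source_id: str  # hypothetical identifier, not a real citation
    text: str       # placeholder wording, not real regulatory language


# Toy corpus standing in for an index of authoritative sources.
CORPUS = [
    Passage("EXAMPLE-RULE-1",
            "Firms must retain customer complaint records for the period set by the rule."),
    Passage("EXAMPLE-RULE-2",
            "Deposit account disclosures must state the fees that apply to the account."),
]

STOPWORDS = {"the", "a", "an", "of", "for", "to", "is", "what",
             "must", "by", "that", "are", "do", "how"}


def tokens(text: str) -> set[str]:
    """Lower-case the text and keep only non-stopword alphabetic words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def answer(question: str, min_overlap: int = 2) -> dict:
    """Return the best-matching passage with its source, or abstain."""
    q = tokens(question)
    best = max(CORPUS, key=lambda p: len(q & tokens(p.text)))
    score = len(q & tokens(best.text))
    if score < min_overlap:
        # Nothing in the corpus supports an answer: say so instead of improvising.
        return {"status": "not_found", "answer": None, "source_id": None}
    return {"status": "grounded", "answer": best.text, "source_id": best.source_id}
```

With this toy corpus, a question about which fees deposit account disclosures must state comes back grounded in EXAMPLE-RULE-2, while an unrelated question (about baking bread, say) comes back as not_found. The design point is the explicit abstention branch: the system has a defined way to return nothing.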

Five things generic AI cannot do in regulated environments

There are specific capabilities that general-purpose tools consistently fail to deliver: citing specific regulatory provisions with version-accurate, jurisdiction-correct sourcing; flagging when guidance has been superseded, withdrawn, or is subject to active rulemaking; producing audit-ready outputs with traceable reasoning and source attribution; applying firm-specific policy logic on top of external regulatory requirements; and alerting practitioners to enforcement trends and examination priorities from live regulatory data.
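Two of those capabilities, jurisdiction-correct sourcing and flagging superseded or withdrawn guidance, depend on metadata that a plain text index does not carry. The Python sketch below shows one way such a check might look. The field names, status values, and document identifiers are hypothetical and do not describe any particular vendor's data model.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Status(Enum):
    IN_FORCE = "in_force"
    SUPERSEDED = "superseded"
    WITHDRAWN = "withdrawn"


@dataclass(frozen=True)
class GuidanceVersion:
    doc_id: str                         # hypothetical identifier
    jurisdiction: str                   # e.g. "US", "UK"
    effective_from: date
    status: Status
    superseded_by: Optional[str] = None


def check_citation(g: GuidanceVersion, jurisdiction: str, as_of: date) -> list[str]:
    """Return warnings to show alongside a citation; an empty list means none."""
    warnings = []
    if g.jurisdiction != jurisdiction:
        warnings.append(f"{g.doc_id} applies to {g.jurisdiction}, not {jurisdiction}")
    if g.status is Status.SUPERSEDED:
        warnings.append(f"{g.doc_id} has been superseded by {g.superseded_by}")
    elif g.status is Status.WITHDRAWN:
        warnings.append(f"{g.doc_id} has been withdrawn")
    if as_of < g.effective_from:
        warnings.append(f"{g.doc_id} is not effective until {g.effective_from.isoformat()}")
    return warnings
```

A citation to a superseded notice would then carry a warning naming its replacement, and a citation checked against the wrong jurisdiction or an earlier date would be flagged before it reaches a policy document.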

Auditability: the requirement generic tools ignore

There is a question no compliance officer should ever struggle to answer: “Why did your system produce this output, and what was it based on?” In a regulatory examination, a board review, or litigation discovery, the ability to explain and defend AI-assisted decisions is not optional; it is a core governance requirement.

Generic AI tools are built for end-user experience, not institutional accountability. They compress reasoning, elide uncertainty, and present outputs as finished products. They do not surface the sources behind each conclusion, flag where the model was uncertain, or produce logs that a regulator could review.

Vertical AI tools for compliance are architected with this constraint front and centre. Every output is tied to a specific regulatory source — the original document, with version and date. Reasoning chains are exposed, not hidden. Outputs are formatted for documentation, not just for reading. Audit trails and access logs are built into the product from the ground up, not retrofitted as an afterthought.
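To make the idea of a built-in audit trail concrete, here is a minimal Python sketch of an append-only log in which every output must cite at least one source with its version and date. The hash chaining between records is an assumption added here as one common way to make tampering detectable; the article does not say any specific product works this way, and all class and field names are hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SourceRef:
    source_id: str   # hypothetical identifier for the original document
    version: str
    published: str   # ISO date string


@dataclass(frozen=True)
class AuditRecord:
    question: str
    output: str
    sources: tuple   # tuple of SourceRef
    created_at: str
    prev_hash: str   # digest of the previous record; "" for the first

    def digest(self) -> str:
        """SHA-256 over the record's canonical JSON form."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


class AuditLog:
    def __init__(self):
        self._records = []

    def append(self, question, output, sources):
        """Add a record; refuse any output that has no source attribution."""
        if not sources:
            raise ValueError("every output must cite at least one source")
        prev = self._records[-1].digest() if self._records else ""
        record = AuditRecord(question, output, tuple(sources),
                             datetime.now(timezone.utc).isoformat(), prev)
        self._records.append(record)
        return record

    def verify(self) -> bool:
        """Check that each record still links to the digest of the one before it."""
        prev = ""
        for record in self._records:
            if record.prev_hash != prev:
                return False
            prev = record.digest()
        return True
```

Because each record stores the digest of its predecessor, editing an earlier entry after the fact breaks the chain and `verify()` returns False (the most recent entry is only protected once another record follows it). The refusal to log an output without sources reflects the principle described above: attribution is enforced at the point of writing, not reconstructed later.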

This is not a cosmetic distinction. It reflects a fundamentally different understanding of who the customer is. Generic AI is built for individual users. Compliance AI is built for the institution, and for the regulators that examine it.

Domain specificity as competitive advantage

The financial services compliance universe is technically demanding and vast. BSA/AML. Suitability and best interest standards. Capital adequacy under Basel III. Conduct risk frameworks. Consumer protection obligations spanning federal and state layers. Cross-border reporting under FATCA, CRS, and EMIR. Each domain has its own vocabulary, its own enforcement culture, and its own interpretive history. A model that has not been trained deeply on this material will produce outputs that sound reasonable to a generalist but raise immediate red flags for a practitioner.

This is where vertical AI earns its premium. The best compliance-focused platforms have invested years fine-tuning on curated regulatory corpora, building taxonomy structures that reflect how compliance professionals actually navigate the landscape, and training on firm-generated data — policies, escalations, examination findings — under appropriate data governance controls. The result is not a model that simply retrieves relevant text, but one that applies interpretive logic consistent with how regulators and practitioners actually think.

A compliance AI that can answer whether a given product feature requires a new regulatory filing is not a search engine with a chat interface — it is a structured reasoning system trained on the logic of regulatory interpretation. That is not something a general-purpose model, however large, can replicate without that domain investment.

Which AI is appropriate for which use cases?

The conversation among compliance leaders has shifted significantly. A year ago, the question was whether to use AI at all. Today, the question is which AI is appropriate for which use cases, and a clear taxonomy is emerging.

Generic tools — ChatGPT, Microsoft Copilot, or Claude in its standard consumer form — are useful for drafting communications, summarising public documents, and accelerating research on non-sensitive matters. They are not appropriate for regulatory interpretation, policy gap analysis, examination preparation, or anything where the output will drive institutional decisions without extensive human review.

Vertical compliance platforms, by contrast, are proving their value precisely in the high-stakes use cases where generic tools break down: automated horizon scanning that flags new guidance before it becomes effective; policy comparison engines that identify gaps between internal procedures and updated regulatory requirements; and case management tools that draw on enforcement history to calibrate investigation priorities. These applications require domain depth that cannot be improvised.

To read the full story, find the Sherlocq post here.
