Ensembling AI models to improve compliance risk detection

Ensembling, or the practice of combining multiple AI techniques, is becoming a cornerstone of modern risk detection in financial services, as firms grapple with increasingly complex digital communications.

According to Theta Lake, today’s workplace interactions span video calls, voice notes, instant messaging, collaborative documents, emojis, GIFs and even AI-generated content.

Platforms such as Microsoft Teams, Zoom and Webex, alongside tools like Microsoft Copilot and Zoom AI Companion, now sit at the heart of daily operations, creating vast volumes of multimodal data that must be monitored for regulatory, privacy and security risks.

This shift has driven widespread adoption of AI in compliance. Around 94% of financial services firms are either using or planning to use AI-based detections to supervise communications. However, relying on a single machine learning model has proven insufficient in such a dynamic and context-heavy environment. Risk behaviours are rarely explicit, often hidden behind ambiguous language or indirect intent, exposing the weaknesses of one-size-fits-all approaches.

Single models struggle because every algorithm is built on assumptions about how data behaves. Classical techniques, such as nearest-neighbour or maximum-margin classifiers, embed statistical biases by design. The same applies to modern large language models. While these systems can process vast datasets, they still inherit biases from their architectures, training methods and data sources. As a result, blind spots are inevitable, and organisations may not even realise which risks they are missing.

Ensemble modelling addresses this challenge by combining multiple models into a unified detection framework. Each model contributes a different analytical perspective, and together they compensate for individual weaknesses. In practice, this leads to higher accuracy and greater resilience. Weighted ensembles further refine this approach by prioritising whichever model performs best for specific types of data, allowing systems to adapt dynamically to changing communication patterns.
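The weighted-ensemble idea can be sketched in a few lines of Python. The two "models", the risky-phrase list, and the weights below are illustrative stand-ins, not Theta Lake's actual detectors:

```python
# Toy weighted ensemble: blend risk scores in [0, 1] from two deliberately
# different "models" (both hypothetical stand-ins).

def lexicon_model(text: str) -> float:
    """Return 1.0 if any known risky phrase appears (illustrative list)."""
    risky = {"keep this quiet", "off the record", "delete this"}
    return 1.0 if any(p in text.lower() for p in risky) else 0.0

def heuristic_model(text: str) -> float:
    """Stand-in for a second detector with different statistical biases."""
    return min(len(text) / 500, 1.0)

def weighted_ensemble(text: str, weights: dict[str, float]) -> float:
    """Combine per-model scores; weights could be retuned per data type."""
    scores = {"lexicon": lexicon_model(text), "heuristic": heuristic_model(text)}
    total = sum(weights.values())
    return sum(w * scores[name] for name, w in weights.items()) / total

# Re-weighting adapts the ensemble without rebuilding any individual model:
risk = weighted_ensemble("let's keep this quiet", {"lexicon": 0.7, "heuristic": 0.3})
```

Because the weights live outside the models, shifting emphasis between detectors for, say, chat versus email is a configuration change rather than a retraining exercise.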

At Theta Lake, ensemble techniques go beyond machine learning alone. Lexicons are used to capture known risky terms and phrases, while intelligent fuzzy matching identifies variations and near-misses. Machine learning models, meanwhile, focus on semantic meaning and contextual interpretation. By blending these fundamentally different techniques, detection becomes more robust, reducing both false negatives and false positives compared with single-model or lexicon-only systems.
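As a rough illustration of blending these techniques, the sketch below combines an exact lexicon check, fuzzy near-miss matching (here via Python's standard-library `difflib`), and a stubbed semantic model. The lexicon entries, similarity threshold, and `semantic_score` stub are assumptions for the example, not a description of Theta Lake's implementation:

```python
from difflib import SequenceMatcher

LEXICON = ["guaranteed returns", "insider tip"]  # illustrative entries

def lexicon_hit(text: str) -> bool:
    """Exact match against known risky phrases."""
    low = text.lower()
    return any(term in low for term in LEXICON)

def fuzzy_hit(text: str, threshold: float = 0.85) -> bool:
    """Catch near-misses like 'guranteed returns' via a similarity ratio."""
    low = text.lower()
    for term in LEXICON:
        n = len(term)
        # slide a term-sized window across the message
        for i in range(max(1, len(low) - n + 1)):
            if SequenceMatcher(None, term, low[i:i + n]).ratio() >= threshold:
                return True
    return False

def semantic_score(text: str) -> float:
    """Stub standing in for an ML model scoring semantic/contextual risk."""
    return 0.0  # a real system would call a trained classifier here

def detect(text: str) -> bool:
    # Blend fundamentally different techniques, as described above
    return lexicon_hit(text) or fuzzy_hit(text) or semantic_score(text) > 0.5
```

In this toy version, a misspelling such as "guranteed returns" escapes the exact lexicon but is still caught by the fuzzy layer, while the semantic layer would handle phrasing that matches no listed term at all.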

Data quality is another critical element of effective ensembling. Larger models cannot compensate for poorly labelled or unrepresentative datasets. Recent research from OpenAI highlighted that smaller, custom-trained classifiers can outperform general-purpose models for specific compliance tasks, stating:

“classifiers trained on tens of thousands of high-quality labeled samples can still perform better at classifying content than gpt-oss-safeguard does when reasoning directly from the policy. Taking the time to train a dedicated classifier may be preferred for higher performance on more complex risks.”

This reinforces the importance of accurate labels and diverse datasets. Theta Lake applies ensemble principles even before training begins, using proprietary algorithms to select informative samples and validate label quality, ensuring that models learn from meaningful and representative data.

A practical example of this approach is collusion detection. Indicators of collusion, such as secrecy or attempts to avoid detection, are rarely explicit. Effective monitoring combines natural language processing, lexicons, fuzzy matching and contextual analysis of surrounding conversations. The result is not just improved detection rates, but significantly fewer false positives, reducing alert fatigue for compliance teams.
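A minimal sketch of that contextual idea: flag a conversation only when distinct weak signals, such as secrecy plus evasion, co-occur across messages, rather than alerting on any single phrase. The term lists and co-occurrence rule here are hypothetical:

```python
# Hypothetical signal lists; a production system would use far richer
# lexicons, fuzzy matching, and ML scores as described in the article.
SECRECY_TERMS = {"between us", "don't tell", "keep this offline"}
EVASION_TERMS = {"use my personal", "delete after", "no email"}

def message_signals(msg: str) -> set[str]:
    """Return the weak-signal categories present in a single message."""
    low = msg.lower()
    found = set()
    if any(t in low for t in SECRECY_TERMS):
        found.add("secrecy")
    if any(t in low for t in EVASION_TERMS):
        found.add("evasion")
    return found

def flag_conversation(messages: list[str]) -> bool:
    """Alert only when distinct signals co-occur in the same thread,
    which cuts false positives versus single-message keyword rules."""
    signals: set[str] = set()
    for m in messages:
        signals |= message_signals(m)
    return {"secrecy", "evasion"} <= signals
```

Requiring corroborating signals across the surrounding conversation is one simple way to trade a small amount of recall for a large reduction in alert fatigue.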

Operationally, ensemble modelling is increasingly seen as a necessity. Language and behaviour constantly evolve, and no single model can anticipate every emerging risk. By aggregating signals from multiple techniques, ensemble systems are better equipped to detect out-of-distribution behaviour. They are also more efficient and scalable, allowing organisations to adjust weights and parameters rather than rebuild detection frameworks from scratch.

Ultimately, the lesson is one that has repeated across generations of AI: data diversity, label accuracy and model diversity all matter. Ensemble modelling recognises that complex compliance challenges cannot be solved by a single dominant approach. Instead, multiple analytical perspectives working together provide the robustness required for modern financial services.

Copyright © 2026 FinTech Global
