Federated learning is a machine learning technique that enables multiple entities to train a model together while keeping their data decentralized. As AI adoption accelerates, the technique could play a key role in the expansion of RegTech.
In the view of Laurence Hamilton, CEO of Consilient, the most powerful machine learning models typically require high-quality, diverse datasets. But that data is not always available at any one organization, he believes.
Hamilton explains, “Traditionally, the approach to machine learning and data analysis has involved collecting all data on a central server, which could be in a data centre or in the cloud. This means transporting and storing all the data in one place. Once centralized, machine learning algorithms can be applied to train models using the data.
“This has been the standard method for machine learning. However, this centralized approach has several limitations and is not suitable for many scenarios. Specifically, it fails when data cannot be consolidated on a central server or when the data available on one server is insufficient for training an effective model.”
He outlined ‘several issues’ currently facing organisations that limit the performance of in-house systems and machine learning models, and pointed to a number of reasons why aggregating all data in one location is inadequate for many critical real-world applications.
First, on regulation, Hamilton stressed that data privacy rules often prevent organisations from combining user data across different regions because data protection laws vary.
On user preference, Hamilton noted that some use cases demand that data never leave an organization due to strict data security requirements.
On data volume, Hamilton said, “Combining large datasets from various sources can be prohibitively expensive in terms of processing, management, and storage. Moreover, much of the data may be irrelevant or unnecessary. While traditional centralized machine learning has been the norm, it faces significant challenges in terms of regulatory compliance, user expectations, and data management, making it unsuitable for many important applications.”
Hamilton underlined that the financial services market currently wastes billions each year trying to identify anomalous behaviours. He added that bodies such as the Wolfsberg Group and the Egmont Group see collaboration as one of the core needs in addressing AML challenges globally; in his view, however, collaboration without sharing data will only ever scratch the surface of the change that is needed.
This is where federated learning, a branch of machine learning, can in Hamilton’s view foundationally change the way the AML industry operates, saving hundreds of millions of dollars in expense.
He said, “Federated learning is a type of machine learning that ‘learns’ by traveling across different data sets, which would be difficult to combine in one place. These data sets can be located in different organizations or in distinct parts of the same organization.
“As previously mentioned, FL is a game-changer for AML because financial crime detection is a fringe phenomenon for most institutions. Even major banks often have small populations of known risk behaviours among their monitored entities, making it difficult to differentiate between unusual but manageable high-risk behaviour and truly criminal activity. This challenge is particularly pronounced when identifying specific risk behaviours in underrepresented segments,” said Hamilton.
The Consilient CEO detailed that FL models are often trained on heterogeneous datasets across different entities, encoding a wide spectrum of money laundering risks. This broad exposure, he stresses, enables the creation of models that are not only more accurate in detecting financial crime but also more robust to variations across different jurisdictions, customer types and behaviours. It also, he claims, allows for the detection of risks that may not have been observed at any specific institution but have been identified elsewhere.
Through learning from multiple independent datasets, federated learning empowers banks to substantially reduce the number of false positive alerts, lower operational costs and improve the detection of criminal behaviour using insights from a wide array of data rather than just their own company data.
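For readers unfamiliar with the mechanics, the following is a minimal, illustrative sketch of federated averaging, the canonical federated learning algorithm, written with NumPy. It is not Consilient’s implementation; the model, data and institutions are invented for illustration, and a production system would add secure aggregation and other privacy protections.

```python
# Toy sketch of federated averaging (FedAvg): each client trains locally on its
# own private data and only model weights are sent back for averaging.
import numpy as np

def train_locally(weights, X, y, lr=0.1, epochs=5):
    """A few steps of logistic-regression gradient descent on one
    institution's private data; only the updated weights leave the site."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-X @ w))      # sigmoid risk scores
        grad = X.T @ (preds - y) / len(y)         # gradient of log-loss
        w -= lr * grad
    return w

def federated_round(global_weights, client_datasets):
    """One round: every client trains on its own data, then the server
    averages the returned weights, weighted by each client's dataset size."""
    local_models, sizes = [], []
    for X, y in client_datasets:
        local_models.append(train_locally(global_weights, X, y))
        sizes.append(len(y))
    sizes = np.array(sizes, dtype=float)
    return np.average(local_models, axis=0, weights=sizes / sizes.sum())

# Three hypothetical institutions, each holding private (features, labels) data.
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(200, 4)), rng.integers(0, 2, 200)) for _ in range(3)]

global_w = np.zeros(4)
for _ in range(10):
    global_w = federated_round(global_w, clients)  # raw data is never pooled
```

The key point of the sketch is that only weight vectors cross organisational boundaries; the training data stays where it was generated.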
“In conclusion, Federated Learning represents a paradigm shift in machine learning for RegTech, particularly in combating financial crime,” said Hamilton. “By enabling collaborative model training without data sharing, FL addresses the core limitations of traditional centralized approaches. It empowers financial institutions to develop more accurate, robust, and cost-effective solutions for detecting financial crimes, making it an essential innovation in the evolution of AML systems and beyond.”
He also stressed that FL offers a ‘transformative’ solution by shifting the computational process to the data rather than moving data to a central server.
He concluded, “FL trains models across decentralized, heterogeneous datasets from multiple organizations or parts of the same organization. This approach is especially impactful for AML, where criminal behaviour is rare and often context-dependent. By leveraging FL, organizations can collaboratively train models on diverse data while preserving privacy and complying with regulations. The results include reduced false positives, enhanced detection of complex criminal behaviours, and significant cost savings.”
A fast-growing technology
The role of federated learning is expected to grow quickly, with the technique well placed to meet the moment as organisations get to grips with machine learning models.
From the standpoint of RelyComply, federated learning will allow banks to train models across multiple local data repositories without the need to shift sensitive data around, unlike the centralised nature of traditional machine learning techniques.
The firm remarked, “One of the financial system’s most significant AML compliance gaps is the lack of cooperation and shared resources. This can significantly affect institutions in developing countries that need RegTech access in the face of higher-risk entities. Correspondent banks may break off relationships with them, retreating from the need to manage the risk of the respondents’ clients. If financial connections are severed, the integrity of the banking ‘security wall’ can fail, and fincrime can proliferate.”
This, the company outlines, highlights why worldwide regulation needs far more structure and transparency, as that can assist businesses in any jurisdiction in maintaining compliance and managing customer risk. Federated learning aims to help remedy this, reducing the ever-growing number of unregulated channels.
RelyComply added, “The method can help remove headaches around troublesome AI and data privacy issues. An aggregated model hinges on collective learning: it collates nuanced patterns of high-risk payments to cement robust frameworks for banks to engage with and adopt in line with their AML efforts. This is all without the need to directly release any data outside their control or region, which would open up the threat of dangerous data breaches and leaks.”
RelyComply also noted that this model training concept is still in its early days, has ‘yet to bear fruit’ and will depend on widespread uptake.
“Identifying niche customer risks of all kinds requires cooperation from institutions to achieve an effective, unified model for risk management. That did feel like a world away given the sprawling nature of fincrime techniques, increasing compliance laws and a gulf in RegTech capability. Still, federated learning’s collective mission may at least try to make a dream scenario more of a reality,” finished the company.
Bolstered collaboration
A key advantage of federated learning, from the standpoint of Corlytics chief data officer Oisín Boydell, is that it enables firms to collaborate in creating shared AI models for mutual benefit, but without having to share the raw data that the models are trained on.
He said, “An example of this is financial fraud detection – it is in all financial institutions’ best interests to reduce fraud across the financial sector, and AI fraud detection models trained on large, comprehensive datasets can be a powerful tool for this.”
Despite this, Boydell explained that no one financial institution has the full picture in terms of training data, and pooling data directly with competitors to train a more capable model is out of the question due to data privacy and confidentiality.
He continued, “Federated learning solves this, as it allows a model to be trained over multiple datasets belonging to different entities, without those individual datasets being shared directly. Instead, each financial institution trains its own partial model on its own data, with those partial models then being combined into a much more powerful fraud detection model without the underlying training data being exposed.
“From a technical perspective, federated learning is challenging to implement, and it is difficult to provide guarantees in practice against either the malicious or inadvertent exposure of proprietary training data. Vulnerabilities such as membership inference attacks, model inversion techniques and reconstruction attacks that expose the data used to train the model are always a risk, as with AI models in general,” Boydell stated.
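Boydell does not prescribe a specific defence, but one commonly discussed class of mitigations is to clip each participant’s model update and add noise before aggregation, in the spirit of differential privacy. The sketch below is purely illustrative; the clipping norm and noise scale are arbitrary placeholder values, not tuned recommendations, and it could be slotted into the aggregation step of the earlier federated averaging sketch.

```python
# Illustrative only: bound each client's influence by clipping its weight
# update, then add Gaussian noise to mask fine-grained detail, before averaging.
import numpy as np

def privatise_update(global_weights, local_weights, clip_norm=1.0,
                     noise_std=0.05, rng=np.random.default_rng(0)):
    """Return a clipped, noised version of one client's update."""
    update = local_weights - global_weights
    norm = np.linalg.norm(update)
    if norm > clip_norm:                              # cap the update's size
        update = update * (clip_norm / norm)
    update = update + rng.normal(0.0, noise_std, size=update.shape)
    return global_weights + update

def aggregate(global_weights, client_weights):
    """Average the privatised client models into a new global model."""
    privatised = [privatise_update(global_weights, w) for w in client_weights]
    return np.mean(privatised, axis=0)
```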
Critical AI security
According to Last Feremenga, director of data science at RegTech Saifr, as AI becomes increasingly integrated into financial firms, particularly through in-house developed solutions, the security of these technologies has become a critical concern.
Feremenga said, “Many of these in-house AI systems incorporate generative components, which pose unique challenges. When firms rely on closed-source models accessed via APIs, there’s a heightened risk of data leakage as sensitive data is transmitted to external services.
“Alternatively, firms utilizing open-source generative models must contend with the lack of robust guardrails typically required for production-level AI. These models may not be resilient against adversarial techniques such as prompt injection or jailbreaking, leaving systems vulnerable to manipulation and unintended outputs.”
The data science director believes that RegTechs can play a ‘pivotal role’ in addressing these security challenges by providing specialised expertise in regulatory compliance and technological safeguards.
“They can assist financial firms in implementing comprehensive quality assurance processes, including independent assessments and red teaming exercises to evaluate the safety and reliability of AI systems. Moreover, RegTechs can help in developing and enforcing stringent guardrails that help ensure AI models generate compliant and secure content, adhering to industry regulations and standards. Even for non-generative AI solutions, traditional safety concerns persist; and RegTechs can offer valuable support in helping to mitigate risks across the board, reinforcing the overall integrity and trustworthiness of AI technologies within the financial sector,” Feremenga remarked.
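As a rough illustration of the kind of input and output guardrails Feremenga describes, a firm might screen prompts and responses against policy checks before anything reaches, or leaves, a generative model. The sketch below is a toy example: the patterns are illustrative and the `call_model` function is a hypothetical stand-in for whatever API or open-source model a firm actually uses.

```python
# Toy sketch of pre- and post-generation guardrails around a generative model.
# The patterns are illustrative placeholders, not a real rule set.
import re

INJECTION_PATTERNS = [
    r"ignore (all|previous) instructions",    # crude prompt-injection signals
    r"reveal your system prompt",
]
BLOCKED_OUTPUT_PATTERNS = [
    r"\b\d{16}\b",                            # e.g. a bare 16-digit card-like number
]

def guarded_generate(prompt: str, call_model) -> str:
    """Screen the prompt, call the model, then screen the response."""
    for pattern in INJECTION_PATTERNS:
        if re.search(pattern, prompt, re.IGNORECASE):
            return "Request blocked by input guardrail."
    response = call_model(prompt)              # hypothetical model call
    for pattern in BLOCKED_OUTPUT_PATTERNS:
        if re.search(pattern, response):
            return "Response withheld by output guardrail."
    return response
```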
Bold step forward
In the view of Flagright growth manager Joseph Ibitola, federated learning represents a bold step forward in the evolution of machine learning, particularly for sectors like RegTech that operate under tight data privacy and security constraints.
He explained, “In essence, it allows machine learning models to be trained across decentralized data sources, such as multiple institutions, without sharing sensitive or proprietary data. For RegTech, this could be a game-changer in how compliance and fraud prevention technologies are developed and deployed.”
Federated learning also has the opportunity to play a pivotal role in RegTech, he believes, by enabling collaborative innovation without compromising confidentiality.
“For example, financial institutions could use federated learning to collectively train fraud detection models on a broader dataset, capturing trends and behaviors from across the industry, without ever exposing individual customer data. This ensures compliance with privacy regulations like GDPR while still benefiting from the insights that come with large-scale collaboration,” he remarked.
He finished by stating that federated learning can help standardise compliance approaches across jurisdictions. “Regulators and financial institutions could use the same decentralized models to harmonize enforcement, reduce redundancy, and cut compliance costs. In this way, federated learning could become the backbone of a smarter, more collaborative approach to regulation in the digital era,” said Ibitola.
4CRisk.ai, meanwhile, believes the RegTech sector can benefit strongly from federated learning, which enables AI models to be trained on data from a wide variety of sources without sharing private data, in multiple ways.
The firm explains, “First, it helps with compliance, security and privacy for organizations, especially those that must respect data residency requirements such as those under GDPR, while providing robust information to increase the accuracy of models across multiple scenarios. Second, by sharing information, organizations can develop better models and standardize on benchmarks and processes.
“Third, while this collaboration can pose technical challenges, it will ultimately improve the accuracy of more sophisticated risk, fraud and AML models and assessments,” it concluded.