The data mistake putting AI projects at risk

The rapid rise of AI across financial services has thrust an old computing warning firmly back into the spotlight: “garbage in/garbage out.”
Despite the extraordinary promise of machine learning and large language models (LLMs), their effectiveness is ultimately determined by the quality and relevance of the data they rely on. As organisations push deeper into AI adoption, this foundational principle has become increasingly difficult to ignore, according to AscentAI.
AscentAI lead regulatory advisor Jilaine Bauer works closely with firms navigating this shift and sees data quality as the defining factor in any successful AI programme. “It’s important for AI solutions to be trained on data sets that include industry-specific data to achieve greater accuracy, relevance and insights,” Bauer said. “For example, when working with an insurance company, it is important for an AI solution to be trained on data sets that include terminology and concepts relevant to the insurance sector and related subsectors. For FinTech firms delivering traditional financial services in new, engaging ways, such as digital banking, data sets may need to include new or different terms and concepts. And, for firms operating in more than one country, taxonomies and ontologies can help structure and categorise data so that it is applied in a consistent manner. Finally, perhaps the most important step we take at AscentAI to ensure the data is fit for the AI application is to develop use cases specific to each client and then scope the data we think applies, for client review and approval.”
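To make the taxonomy point concrete, the minimal Python sketch below shows how jurisdiction-specific insurance terms could be mapped onto canonical concepts before data reaches a model. The terms, mappings and function names are illustrative assumptions for this article, not a description of AscentAI’s actual data model.

```python
# Minimal sketch of how a taxonomy could normalise jurisdiction-specific
# terminology before it reaches an AI pipeline. The terms, mappings and
# function names below are illustrative, not AscentAI's actual tooling.

# Canonical concept -> jurisdiction-specific synonyms
INSURANCE_TAXONOMY = {
    "motor_insurance": {"auto insurance", "car insurance", "motor cover"},
    "personal_liability": {"liability insurance", "third-party liability"},
    "premium": {"premium", "contribution", "policy payment"},
}

# Invert the taxonomy into a lookup table of synonym -> canonical concept
TERM_LOOKUP = {
    synonym.lower(): concept
    for concept, synonyms in INSURANCE_TAXONOMY.items()
    for synonym in synonyms
}

def canonicalise(term: str) -> str:
    """Map a raw term to its canonical concept, or keep it as-is if unknown."""
    return TERM_LOOKUP.get(term.lower().strip(), term)

if __name__ == "__main__":
    for raw in ["Auto Insurance", "motor cover", "claims handler"]:
        print(f"{raw!r} -> {canonicalise(raw)!r}")
```

The design choice here is simply that normalising terminology once, at ingestion, keeps downstream training and retrieval consistent across countries and subsectors.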
As businesses become more aware of AI’s limitations, scepticism has grown. Models trained on flawed, poorly curated, or irrelevant data are far more likely to fail. Even the Institute of Electrical and Electronics Engineers (IEEE) has raised concerns, highlighting that simply scaling up a model does not guarantee better performance. The IEEE noted that, according to recent research, newer, larger LLMs have in some cases become less reliable.
Much of this unreliability stems from enormous datasets that developers cannot fully understand or validate. Bauer believes enhancing data governance is therefore essential for AI success. “It’s a really hard problem to solve, but the success of AI applications depends on it,” she said. “I think it’s a key determinant of whether you succeed or fail in leveraging the power of AI.”
To function accurately, AI systems require timely, structured, and clean data. This is pushing organisations to modernise their data governance frameworks so they can support both structured and unstructured information. Michelle Knight of Dataversity reinforces this point, stressing that existing governance programmes must evolve to manage the vast volumes of data AI depends on.
Writing for Dataversity, Knight said that today’s governance programmes “enforce roles, procedures and tools for some structured data throughout the company. Yet AI models learn from and use very large data sets, containing structured and unstructured data. All this data needs to be of good quality too, so that the AI model can respond accurately, completely, consistently, and relevantly.”
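As a rough illustration of what such quality rules can look like in practice, the sketch below applies a few automated checks (completeness, non-empty content, freshness) to a record before it is allowed into an AI pipeline. The field names, rules and thresholds are assumptions made for this example, not any vendor’s governance tooling.

```python
from datetime import datetime, timezone, timedelta

# Illustrative data-quality gate for records feeding an AI pipeline.
# Field names, rules and thresholds are assumptions for this sketch.

REQUIRED_FIELDS = {"record_id", "source", "text", "published_at"}
MAX_AGE = timedelta(days=365)  # "timely": reject stale records

def check_record(record: dict) -> list[str]:
    """Return a list of quality issues; an empty list means the record passes."""
    issues = []

    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        issues.append(f"missing fields: {sorted(missing)}")

    text = record.get("text", "")
    if not isinstance(text, str) or not text.strip():
        issues.append("empty or non-text content")

    published_at = record.get("published_at")
    if isinstance(published_at, datetime):
        if datetime.now(timezone.utc) - published_at > MAX_AGE:
            issues.append("record older than freshness threshold")
    else:
        issues.append("missing or malformed timestamp")

    return issues

if __name__ == "__main__":
    sample = {
        "record_id": "r-1",
        "source": "regulator",
        "text": "Example rule text.",
        "published_at": datetime.now(timezone.utc),
    }
    print(check_record(sample))          # [] -> passes
    print(check_record({"text": "  "}))  # lists the failed checks
```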
Knight also likened AI to an iceberg, warning that organisations often focus only on the visible potential while ignoring the extensive substructure beneath. If firms overlook data quality, she cautioned, the consequences can be catastrophic.
AscentAI has built its approach on this principle, applying strict governance to all AI and machine learning processes. The company uses 10 layers of redundancy, including automated and human reviews, to ensure data accuracy. Crucially, its models are trained solely on material published by national, state, and local regulatory bodies, ensuring inputs are both trusted and free from extraneous content. For AscentAI, high-quality, trusted and structured data is at the heart of responsible and effective AI deployment.
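Loosely inspired by that approach, the short sketch below shows how a pipeline might combine a source allowlist with an automated check that escalates borderline documents to human review. The domains, threshold and routing logic are hypothetical and stand in for whatever controls a firm actually runs, not AscentAI’s ten-layer process.

```python
from urllib.parse import urlparse

# Hypothetical source allowlist plus human-review escalation.
# Domains, threshold and routing are illustrative assumptions only.

ALLOWED_REGULATOR_DOMAINS = {"sec.gov", "fca.org.uk", "finra.org"}  # example list

def is_trusted_source(url: str) -> bool:
    """Accept only documents hosted on an approved regulatory domain."""
    host = urlparse(url).netloc.lower()
    return any(host == d or host.endswith("." + d) for d in ALLOWED_REGULATOR_DOMAINS)

def triage(document: dict) -> str:
    """Route a document: reject, escalate to human review, or accept."""
    if not is_trusted_source(document.get("url", "")):
        return "reject: untrusted source"
    if len(document.get("text", "")) < 200:  # arbitrary threshold for this sketch
        return "human review: unusually short document"
    return "accept"

if __name__ == "__main__":
    print(triage({"url": "https://www.sec.gov/rules/example", "text": "x" * 500}))
    print(triage({"url": "https://example-blog.com/post", "text": "x" * 500}))
```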




