Chasing the Data Instead of the Criminals

Ofir Reichenberg, Director of Product Management, Data Solutions

In detective stories, the clues are there and we watch the clever detective put them together. The clues fit together to explain how the criminal got into the locked safe, or escaped from a locked room. For financial crime, I’ve found that the biggest problem is often working out which clues belong to which investigation. A lot of activity is suspicious at first glance, yet deeper investigation can reveal a perfectly reasonable explanation. At other times, criminal activity is split across many accounts and entities, which makes it very hard to assemble into one story. We’ve often seen that preparing, gathering and organizing the data takes almost a quarter of the total investigation time, and that is before the actual investigation begins. When the collected data is incorrect, the clues don’t add up and the investigation reaches the wrong conclusion. Below, I’ll review many of the common issues we’ve run into and some of the ways to address them.

Missing and Incorrect Financial Crime Data

The most common problem we’ve encountered is missing data attributes. Sometimes, not all the data was correctly collected. Data that was manually entered or updated at any point is always at risk. Typing errors, copy-and-paste mishaps, unintended saves and more can all lead to invalid records. Follow-up migrations, validations and processing can make the errors worse or cause the data to disappear altogether. A user might type Columbia instead of Colombia, and the validation logic might leave an empty value for country. Or someone providing corporate information might confuse ISIN, CUSIP and SEDOL identifiers, leaving an invalid record for the future. Data entry systems are constantly becoming smarter. Validation is more interactive and gives users a chance to fix errors. Systems are getting better at catching those errors, but they are often still connected to older systems and data sources. We constantly see partial and invalid records in all data repositories.
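To make the Columbia/Colombia example concrete, here is a minimal sketch of validation that suggests a correction instead of silently blanking the field. It is an illustration only: the function name, the short country list and the 0.8 similarity cutoff are assumptions, and a real system would load a full reference list of countries.

```python
import difflib

# Hypothetical reference list; a real system would load a complete country list.
KNOWN_COUNTRIES = ["Colombia", "Comoros", "Cambodia", "Canada", "Chile"]


def check_country(raw_value, known=KNOWN_COUNTRIES, cutoff=0.8):
    """Return (status, value) rather than discarding an unrecognized entry.

    status is "ok" (exact match, ignoring case and surrounding spaces),
    "suggest" (a close match the user should confirm) or
    "unknown" (nothing similar; the original text is kept for review).
    """
    cleaned = raw_value.strip()
    by_lower = {name.lower(): name for name in known}
    if cleaned.lower() in by_lower:
        return ("ok", by_lower[cleaned.lower()])
    # Fuzzy lookup using the standard library's sequence matcher.
    matches = difflib.get_close_matches(
        cleaned.lower(), list(by_lower), n=1, cutoff=cutoff
    )
    if matches:
        return ("suggest", by_lower[matches[0]])
    return ("unknown", cleaned)
```

With this approach, an entry such as "Columbia" comes back as a suggestion for "Colombia" that the user can confirm, and an unrecognized value is preserved for review rather than lost.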

Financial Crime Data Keeps Changing

Getting the data complete and correct is only the start. Analytical models are only as good as the data coming in. Any problem with the data can quickly lead to mis-categorizing an entity and result in an incorrect risk rating or false alerts. Companies are constantly undergoing changes. For example, we recently reviewed a financial institution with over 20,000 monitored entities. Of those, more than 200 had material changes every year, and almost as many had more minor changes every day. We’ve often heard that fixing a bad record is 100x as expensive as preventing it. Keeping track of data changes is key.

Sometimes, the entity data is correct, but other things have changed. Most commonly, credit ratings and securities can be volatile. Recently S&P Global announced that the second quarter of 2020 was the first time that long-term issuer credit ratings were lowered for over 400 companies. Understanding the activities and reliability of entities is a major part of correctly assigning them risk. This requires going beyond the basic entity information and tracking other data about the entity.

New Types of Data

Historically, there have been relatively few types of data to track for every entity. The most common were personal information, addresses and regulatory information. Today, we see emerging types of data that can assist financial crime investigation. Examples include social media aggregators, “dark web” research, crypto-currency analysis, and more. Specialized data vendors are finding relevant sources and making them accessible. There are several challenges with using this information. The first is associating the right data with the right entity. This is “Entity Resolution” and requires a careful approach. Afterwards, we need to know which source to use for which investigation. We can do that by analyzing existing and past investigations. Then, we evaluate the data to understand how it changes over time. Like the previous challenges, this is not a one-time task. As usage patterns and criminal behavior evolve over time, the relative importance of each source changes, too. We need to track and adjust the sources we use.
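As a rough illustration of the entity resolution step, the sketch below matches an external record to internal entities on a normalized name plus country. Everything here is an assumption made for the example: the field names ("id", "name", "country"), the list of legal suffixes and the exact-match rule. Production systems typically use fuzzier, probabilistic matching across more attributes.

```python
import re

# Hypothetical list of legal-form suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "ltd", "llc", "corp", "plc", "sa", "gmbh"}


def normalize_name(name):
    """Lowercase, replace punctuation with spaces, and drop legal-form suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def match_key(record):
    """Build the comparison key: normalized name plus upper-cased country code."""
    return (normalize_name(record["name"]), record["country"].strip().upper())


def resolve(external_record, internal_records):
    """Return the ids of internal entities whose key equals the external record's key."""
    key = match_key(external_record)
    return [r["id"] for r in internal_records if match_key(r) == key]
```

Exact matching on a normalized key is deliberately conservative. A result with zero or several ids is a signal to route the record to a manual review rather than to guess.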

Process

Getting all the right data, pulling it together and keeping it up to date is challenging. Creating a process to control and enforce it is the only reliable approach. There are many relevant data sources and many valid ways to keep the data correct and up to date. In parallel, there are many automated tools for facilitating it. 

In my experience, a good process for managing the data requires addressing the existing data as well as planning for new data. This applies to data created in our systems or imported from outside. Any data entered into the system must be reviewed. This includes data entered by users, or ingested from other systems. This is also a common regulatory expectation. The reviews can be manual or automated, but finding and fixing bad data after it’s already in the system is much harder. In a similar manner, any changes to the data must be validated and audited. An ideal system also keeps old versions of the data for troubleshooting. 
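One way to picture validated, audited changes with retained history is the sketch below. It is not a description of any particular product: the class, the `require_country` rule and the in-memory list are hypothetical stand-ins for what would normally be a database with an audit table.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class VersionedRecord:
    """Keeps every accepted version of an entity record and who made the change."""
    entity_id: str
    history: list = field(default_factory=list)  # (timestamp, user, data) tuples

    @property
    def current(self):
        """Latest accepted version, or None if nothing has been stored yet."""
        return self.history[-1][2] if self.history else None

    def update(self, new_data, user, validate):
        """Append a new version only if it passes validation; old versions stay."""
        errors = validate(new_data)
        if errors:
            raise ValueError(f"rejected change to {self.entity_id}: {errors}")
        self.history.append((datetime.now(timezone.utc), user, dict(new_data)))


def require_country(data):
    """Hypothetical validation rule: a record must carry a non-empty country."""
    return [] if data.get("country") else ["country is missing"]
```

Because a rejected change never reaches the history, bad data is stopped at the door, and the earlier versions remain available for troubleshooting.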

The next part of any process is deciding which external data to bring in. There is a wide and growing variety of data. Choosing which sources to rely on for each attribute should follow a process that evaluates each source for correctness and impact. This depends on subject matter expertise, as different data works better in different situations.

The last part is conflict resolution. This addresses cases where new data differs from existing validated data. A good data management process provides ways to review those changes, both automatically and manually, and to make decisions.
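The split between automatic and manual review could look something like the following sketch. The policy is invented for illustration: the set of "material" fields, the trusted source name and the rule itself are assumptions, and each institution would define its own.

```python
# Hypothetical policy: which fields are material, and which sources are trusted.
MATERIAL_FIELDS = {"legal_name", "country", "ownership"}
TRUSTED_SOURCES = {"registry"}


def resolve_conflicts(existing, incoming, source):
    """Compare incoming data with the validated record.

    Returns (auto_applied, needs_review): two dicts that map a field name
    to an (old_value, new_value) pair.
    """
    auto_applied, needs_review = {}, {}
    for field_name, new_value in incoming.items():
        old_value = existing.get(field_name)
        if new_value == old_value:
            continue  # values agree, so there is no conflict
        change = (old_value, new_value)
        # Material fields and untrusted sources always go to a human reviewer.
        if field_name in MATERIAL_FIELDS or source not in TRUSTED_SOURCES:
            needs_review[field_name] = change
        else:
            auto_applied[field_name] = change
    return auto_applied, needs_review
```

In this sketch, a phone number change from a trusted registry would be applied automatically, while a change of country from the same source would still be queued for manual review.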

Moving Forward

Chasing the data is a never-ending race. Financial crime investigators have to constantly adjust to changing usage patterns. Trusting your data is the first step in analyzing and catching financial crime. We’ve found that the best way to do that is to proactively keep the data correct. Good processes for validating the data, and for using external data to update and augment it, are the foundation of any program. They make the next steps much simpler and more reliable.

Which part of the data management process is the most impactful? Drop us a note at info@niceactimize.com to share your thoughts.
