It’s not John, it’s James. In the US alone, an estimated 30,000 or more people share the same name, James Smith. In Korea, almost 20% of the population – some 10 million people – share the family name Kim. The world is also home to over 150 million people with the same given name, Mohamed. Cases of mistaken identity are common, particularly when searching over large volumes of data, but they needn’t be.
Almost all investigatory work, whether in law enforcement, counter-terrorism or the anti-money laundering (AML) and due diligence processes of a bank, requires accurate ways of searching for and discovering specific entities in large data sets. However, poor record keeping, missing or incomplete data and legacy matching logic hamper these efforts. False positives (where the wrong entity is selected) and, worse, false negatives (where a critical search result is missed altogether) are abundant.
Not only are names not unique, there is also no standard way of rendering them. Thus, James Smith can be Jim Smith, J Smith or J M Smith, and can also appear with a huge array of possible typos, transpositions, aliases, or renderings in different dialects, alphabets and scripts. Matching against “exact hit” names works when data quality is very high, but it produces no alerts at all if names have even the slightest variation, increasing the chances of criminals slipping through the net. Similarly, so-called “fuzzy matching”, which will alert if one or two characters are different, still cannot account for the sheer variety of cultural nuances in how names are rendered in different types of data.
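To make the limits of both approaches concrete, here is a minimal sketch in Python. It assumes "fuzzy matching" means a Levenshtein edit distance of at most two characters; the function names and the threshold are illustrative assumptions, not a description of any particular screening product.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def exact_match(a: str, b: str) -> bool:
    """'Exact hit' matching, ignoring only letter case."""
    return a.casefold() == b.casefold()


def fuzzy_match(a: str, b: str, max_edits: int = 2) -> bool:
    """Assumed 'fuzzy' rule: alert when at most max_edits characters differ."""
    return edit_distance(a.casefold(), b.casefold()) <= max_edits


query = "James Smith"
for candidate in ["James Smith", "James Smyth", "Jim Smith", "J M Smith"]:
    print(candidate, exact_match(query, candidate), fuzzy_match(query, candidate))
```

Under these assumptions the typo "James Smyth" is caught by the fuzzy rule, but "Jim Smith" and "J M Smith" are missed by both rules, because a nickname or an initial is not a small character-level change.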
The solution is to use data to drive a new type of matching logic: advanced Entity Resolution. Ripjar uses observations from millions of names, deriving matching logic from how names are used in real-world situations.
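As a rough illustration of what "deriving matching logic from observations" could look like, the Python sketch below learns a table of given-name variants from a handful of toy records. It is not Ripjar's method: the data, the `learn_variants` and `given_names_match` helpers, and the `min_count` threshold are all hypothetical and chosen only to show the idea of letting observed usage, rather than character similarity, decide which names are equivalent.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy observations: each set holds the given-name renderings
# seen across records already known to refer to one person.
observed = [
    {"james", "jim"},
    {"james", "jim", "jimmy"},
    {"james", "j"},
    {"mohamed", "muhammad"},
    {"mohamed", "mohammed", "muhammad"},
]


def learn_variants(observations, min_count=2):
    """Treat two renderings as variants once they have been observed
    together for the same person at least min_count times."""
    pair_counts = Counter()
    for names in observations:
        for a, b in combinations(sorted(names), 2):
            pair_counts[(a, b)] += 1

    variants = {}
    for (a, b), count in pair_counts.items():
        if count >= min_count:
            variants.setdefault(a, set()).add(b)
            variants.setdefault(b, set()).add(a)
    return variants


variants = learn_variants(observed)


def given_names_match(a: str, b: str) -> bool:
    """Match when the names are identical or are learned variants."""
    a, b = a.casefold(), b.casefold()
    return a == b or b in variants.get(a, set())


print(given_names_match("Jim", "James"))         # learned from two observations
print(given_names_match("Mohamed", "Muhammad"))  # learned from two observations
print(given_names_match("James", "John"))        # never observed together
```

With this toy data, "Jim" and "James" match even though they are three edits apart, while "James" and "Jimmy" do not, because that pair was seen only once. A production system would need far more evidence and additional signals beyond the name itself.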
Entity Resolution is an essential capability in the fight against financial crime, fraud and terrorism. By improving the quality of the data that is used to make decisions such as enforcing international sanctions or alerting to possible corruption or fraud, it can dramatically improve the effectiveness and efficiency of human analysts and allow small teams to scale investigations to the demands of the modern information environment.
Combining recent work in entity resolution and NLP means that analysts can now see the complete picture across structured and unstructured data. Data-driven approaches to name matching, covering transliterations, scripts and other real-world name variants, can give 90% more accuracy than legacy “fuzzy matching” technology. Robust data privacy controls mean that interconnected graphs of knowledge, resolving entities from all available data sources, can now be built without compromising user privacy or data protection.
If you would like to know more about Ripjar’s approach and how we have helped global institutions roll out breakthrough innovations in entity resolution to support their counter-financial crime programmes, please download our whitepaper or get in touch with the team below.
David Balson
Director of Intelligence
Contact us for a demo today