Fraud detection in the age of machine learning

E-commerce has grown continuously for years, especially in the retail sector, and the pandemic has dramatically accelerated that growth, pushing online sales to unprecedented levels. This is profitable for those who mastered the e-commerce format early on, but it also raises challenges: as online sales volumes grow, so does the exposure to fraud. Faced with this situation, retailers need to enhance their fraud detection capabilities.

According to a study by Juniper Research, losses due to online payment fraud will exceed $25 billion by 2024, a staggering 52% increase on today’s levels. What’s more, this is happening in spite of the revised Payment Services Directive (PSD2) in Europe and, in particular, its Strong Customer Authentication (SCA) requirement.

As such, the question of how to prevent online fraud is a key concern. For retailers, the challenge is to distinguish fraudulent transactions from authentic ones. In turn, this means finding a balance between the large financial losses that fraudulent transactions entail and the enormous effort that investigating every individual payment would involve.

Even if the latter were possible, questioning honest customers would surely damage customer relations. Furthermore, it is estimated that around 30% of purchases are abandoned after being wrongly rejected on suspicion of fraud, that is, after a ‘false positive’ from the fraud detection system. The task, therefore, is to discriminate carefully between fraudulent and non-fraudulent transactions.

Machine learning as a fraud detection solution

When it comes to fraud detection, machine learning is a powerful tool. However, methodical, expert-led development is essential. This begins by defining goals. When we talk about preventing or detecting fraud, we basically set three main objectives:

To identify known fraudulent behaviours;
To recognise new potentially fraudulent behaviours;
And to achieve both objectives while minimising false positives.

Until recently, it was common to pursue these objectives, especially the first two, through predefined business rules that modeled known fraudulent behaviours. Alternatively, general statistical methods were used to flag anomalous activity that might indicate fraud.

However, these techniques present two fundamental problems. First, they tend to flag legitimate transactions as fraudulent, that is to say, they produce many false positives, which in turn forces laborious manual reviews to correct the errors. Second, they are not flexible enough to adapt to new consumption patterns, so they become imprecise as customer behaviour changes.

This is where machine learning comes into play. Machine learning, one of the more mature artificial intelligence technologies, is now being used to optimise a wide variety of business processes, including fraud detection.

How to identify known fraudulent behaviours

To achieve the first objective, one would take a historical database of e-commerce transactions and label those transactions that were known to be fraudulent. This example corresponds to what we know as a supervised learning problem. This method involves building a model that learns from this historical data and is capable of identifying the patterns that characterise a fraudulent transaction.

Once the model has been trained on this historical data, we can use it to estimate the likelihood that a new transaction is fraudulent, and we can evaluate it against a set of performance metrics. One of the most important metrics to monitor is the false positive rate, which lets the business fine-tune its approach to e-commerce security and fraud.
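
As a rough illustration of what monitoring the false positive rate can look like, the sketch below assumes a trained model has already produced fraud probabilities for a labeled hold-out set. The scores, labels, thresholds and function names are all invented for the example; they do not come from a real data set or from any particular library.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # fraud correctly flagged
    fp: int  # authentic transaction wrongly flagged (false positive)
    tn: int  # authentic transaction correctly accepted
    fn: int  # fraud that slipped through


def confusion_at_threshold(scores, labels, threshold):
    """Count outcomes when every score >= threshold is flagged as fraud.

    scores: predicted fraud probabilities in [0, 1] (hypothetical model output)
    labels: 1 for known fraud, 0 for authentic
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label == 1:
            tp += 1
        elif flagged and label == 0:
            fp += 1
        elif not flagged and label == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def false_positive_rate(c):
    """Share of authentic transactions that were wrongly flagged."""
    authentic = c.fp + c.tn
    return c.fp / authentic if authentic else 0.0


def recall(c):
    """Share of fraudulent transactions that were caught."""
    fraud = c.tp + c.fn
    return c.tp / fraud if fraud else 0.0


# Made-up hold-out scores and labels, for illustration only.
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90, 0.95]
labels = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]

for threshold in (0.3, 0.5, 0.75):
    c = confusion_at_threshold(scores, labels, threshold)
    print(threshold, round(false_positive_rate(c), 2), round(recall(c), 2))
```

On this toy data, raising the threshold from 0.3 to 0.75 brings the false positive rate down from 0.5 to 0, but one of the four fraudulent transactions is then missed. Choosing the threshold is therefore a business decision as much as a technical one.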

That said, one of the main challenges will be a marked imbalance in the historical data used to train the model. Authentic transactions will almost certainly far outnumber fraudulent ones; generally speaking, fraudulent transactions will represent less than 5% of the samples. This will force us to apply resampling techniques, such as downsampling the authentic transactions, to better balance the data without losing the statistical properties of the original data set.
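
One simple way to rebalance the training data is to keep every fraudulent transaction and draw a uniform random sample of the authentic ones. The sketch below assumes each transaction is a dictionary with an "is_fraud" flag; that schema, the function name and the 4:1 ratio are illustrative assumptions, and other resampling schemes may suit a given data set better.

```python
import random


def downsample_majority(transactions, ratio=4, seed=0):
    """Keep all fraud rows and at most `ratio` authentic rows per fraud row.

    transactions: list of dicts with an "is_fraud" key (hypothetical schema)
    """
    fraud = [t for t in transactions if t["is_fraud"]]
    authentic = [t for t in transactions if not t["is_fraud"]]
    keep = min(len(authentic), ratio * len(fraud))
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sampled = rng.sample(authentic, keep)  # uniform sample without replacement
    balanced = fraud + sampled
    rng.shuffle(balanced)  # avoid all fraud rows appearing first
    return balanced


# Synthetic data: 20 fraudulent rows among 1,000 (2% fraud).
data = [{"id": i, "is_fraud": i < 20} for i in range(1000)]
balanced = downsample_majority(data, ratio=4)
fraud_share = sum(t["is_fraud"] for t in balanced) / len(balanced)
print(len(balanced), fraud_share)
```

Because the authentic rows are sampled uniformly, their distribution is preserved on average. Only the training set should be rebalanced in this way; evaluation is better done on data that keeps the original class proportions, so that the measured false positive rate reflects real traffic.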

What if the potentially fraudulent behaviour is new?

In the case of new fraud tactics, we find ourselves faced with a different problem. We have no prior knowledge that these behaviours are in fact fraudulent and, therefore, we do not have the labeled data that we had in the previous case. Thus, this issue corresponds to what we call an unsupervised learning problem.

More specifically, this is an anomaly detection problem. To address it, we need to identify transactions with unusual behaviour in real time and review them manually. This will enable us to make a final decision on whether they should be rejected due to high suspicion of fraud, again seeking to minimise the rate of false positives.
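
To give a feel for anomaly detection without labels, the sketch below scores a new transaction amount against a customer's recent history using a modified z-score (distance from the median, scaled by the median absolute deviation). This is a deliberately simple stand-in: a production system would typically look at many features and use a dedicated method such as an isolation forest. The amounts, function names and the 3.5 cut-off are assumptions made for the example.

```python
from statistics import median


def fit_baseline(history):
    """Summarise past amounts by their median and median absolute deviation."""
    med = median(history)
    mad = median([abs(v - med) for v in history])
    return med, mad


def anomaly_score(amount, baseline):
    """Modified z-score of a new amount relative to the baseline."""
    med, mad = baseline
    if mad == 0:
        # All past amounts were identical: any deviation is maximally unusual.
        return 0.0 if amount == med else float("inf")
    return 0.6745 * abs(amount - med) / mad


def needs_manual_review(amount, baseline, threshold=3.5):
    """Flag the transaction for a human analyst if it looks unusual."""
    return anomaly_score(amount, baseline) > threshold


# Made-up purchase history for one customer, in the same currency.
history = [42.0, 38.5, 51.0, 47.2, 39.9, 44.1, 45.3, 40.7, 49.8]
baseline = fit_baseline(history)

print(needs_manual_review(46.0, baseline))   # typical amount
print(needs_manual_review(980.0, baseline))  # far outside the usual range
```

The median and MAD are used rather than the mean and standard deviation because they are much less distorted by the very outliers we are trying to detect.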

A combination of approaches

So, to meet the three key objectives stated above, we need to adopt a combination of approaches. By combining supervised and unsupervised solutions, monitoring the performance of these predictive models and retraining them as needed, we can ensure that they continue to learn as behaviour patterns change. As a result, our fraud analysis and prevention capabilities will be greatly enhanced.
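
One possible way to combine the two signals is a small routing rule: reject only when the supervised model is highly confident, and send uncertain or unusual transactions to manual review instead of rejecting them outright. The function below is a hypothetical sketch; its name and all three thresholds are assumptions that would need to be tuned against the business's own false positive and fraud loss figures.

```python
def route_transaction(fraud_probability, anomaly_score,
                      reject_above=0.9, review_above=0.5, anomaly_above=3.5):
    """Return "reject", "review" or "accept" for one transaction.

    fraud_probability: output of a supervised model, in [0, 1]
    anomaly_score: output of an unsupervised detector (higher = more unusual)
    """
    if fraud_probability >= reject_above:
        return "reject"  # closely matches a known fraud pattern
    if fraud_probability >= review_above or anomaly_score > anomaly_above:
        return "review"  # uncertain or unusual: ask a human analyst
    return "accept"


print(route_transaction(0.95, 0.2))  # known pattern
print(route_transaction(0.10, 8.0))  # new, unusual behaviour
print(route_transaction(0.10, 0.4))  # ordinary purchase
```

Outcomes of the manual reviews can then be fed back as new labeled data, which is one way for the supervised model to keep learning as fraud tactics change.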

Today, according to statistics recently published in Forbes magazine, just 55% of online retailers feel confident that they’re implementing e-commerce fraud prevention best practices. Therefore, it is essential that, at times such as the present, we are prepared to protect our business. At the same time, we need to ensure our technique is nuanced enough to avoid overly aggressive measures that affect customer satisfaction and loyalty.

Connect with a fraud prevention specialist

Identifying the right professional to lead a machine learning fraud detection project is essential to building an effective tool. In times like these, it’s crucial you build an agile solution that can keep pace with a constantly evolving landscape. But, as we all know too well, hiring in-house artificial intelligence engineers is expensive, and for many small or midsize online retailers, an unviable overhead.

However, there are other options. Tailored talent platforms like Outvise give startups and corporate players alike access to the best freelance talent. Whether you’re looking for a seasoned professional or an innovative graduate, Outvise’s portfolio boasts thousands of machine learning specialists ready for on-site or remote work. Explore the platform to get your fraud detection capabilities up to speed.

Enrique Sahún Pérez

Data & AI Global Product & Business Development Manager at Nae.
Currently leading the Data & AI department at Nae globally, being responsible for both Business and Product Development, and managing different predictive analytics projects based on AI technologies and the definition of the Data-Driven transformation strategy for companies in sectors like Telecom, Banking, Insurance, Retail and Utilities.