How machine learning is taking on online retail fraud

Fraud is one of the biggest causes of lost revenue for online retailers. Fraugster and Riskified, two startups that operate in this space, share their insights and methods for safeguarding online retail.
Written by George Anadiotis, Contributor

Amazon Prime Day (APD) was, by all accounts, a huge success. At an estimated 60 percent increase in sales over 2016 and nearly $2 billion in revenue, it's hard to argue otherwise.

If you want to talk numbers though, let's consider this. What would you say if you were told that Amazon could lose nearly 5 percent of that revenue, or $100 million, due to fraud?

That's a lot of money. And it's not just Amazon on its Prime Day, it's every online retailer that is exposed to online fraud every single day.

Retail high points like APD or Christmas make things worse. What can be done to prevent this? Machine learning (ML) to the rescue. ZDNet talked to fraud prevention startups Fraugster and Riskified to get their insights.

The anatomy of fraud

According to industry blog Retail Minded, there are two main types of fraud -- chargeback fraud and card-testing fraud. Chargeback fraud involves purchases that are reported as never delivered and then charged back to the merchant by the credit card company.

Card-testing fraud happens when thieves with a list of stolen card numbers essentially "play the slots" by attempting purchase after purchase from an online store with different numbers until they find a card number that succeeds. They then use this number to make fraudulent purchases at other stores.
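A common merchant-side defense against card testing is a velocity check: flag any source that attempts many distinct card numbers in a short window. A minimal sketch of the idea (the threshold, window, and class are illustrative assumptions, not taken from either vendor):

```python
from collections import defaultdict, deque
import time

class CardTestingDetector:
    """Flags sources that attempt many distinct card numbers in a short window."""

    def __init__(self, max_cards=5, window_seconds=600):
        self.max_cards = max_cards
        self.window = window_seconds
        self.attempts = defaultdict(deque)  # source -> deque of (timestamp, card_hash)

    def is_suspicious(self, source_ip, card_hash, now=None):
        now = now if now is not None else time.time()
        q = self.attempts[source_ip]
        # Drop attempts that have aged out of the sliding window.
        while q and now - q[0][0] > self.window:
            q.popleft()
        q.append((now, card_hash))
        distinct_cards = len({h for _, h in q})
        return distinct_cards > self.max_cards

detector = CardTestingDetector()
# Six different card numbers from one IP within the window: the sixth is flagged.
flags = [detector.is_suspicious("203.0.113.7", f"card{i}", now=1000 + i)
         for i in range(6)]
```

Real systems key on device fingerprints and sessions as well as IPs, since attackers rotate addresses, but the sliding-window principle is the same.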

It takes both expertise and resources to be able to identify fraud. Behemoths like Amazon may be able to deal with this in-house, but most retailers cannot. And in any case, this is not something retailers would like to spend resources on.

According to a 2016 report, the average yearly financial expense attributed to fraud for retailers was 7.6 percent of annual revenue across all channels, including online and offline sales. Seven percent of that is attributable to chargebacks; 74 percent is for fraud management software, hardware and employees; and 19 percent comes from false positives -- transactions erroneously rejected as fraud.

And that is on a business-as-usual day. On APD, clients operating on Amazon have reportedly seen an increase of 150 percent in fraud attempts. Doing the math for chargebacks and false positives, we arrive at the 5 percent/$100 million figures.
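The arithmetic behind that estimate can be made explicit. Assuming the report's figures (7.6 percent of revenue spent on fraud, of which 7 percent goes to chargebacks and 19 percent to false positives) and the reported 150 percent spike in fraud attempts, a back-of-the-envelope check:

```python
# Back-of-the-envelope check of the ~5% / ~$100M estimate (inputs assumed from the article).
prime_day_revenue = 2_000_000_000   # ~$2B in Prime Day sales
fraud_cost_share = 0.076            # 7.6% of revenue spent on fraud (2016 report)

# Portions of that spend that are direct losses rather than tooling:
chargeback_share = 0.07             # 7% of fraud spend -> chargebacks
false_positive_share = 0.19         # 19% -> good orders wrongly rejected

baseline_loss = fraud_cost_share * (chargeback_share + false_positive_share)
# A 150% increase in fraud attempts means 2.5x the baseline loss rate:
prime_day_loss_rate = baseline_loss * 2.5

estimated_loss = prime_day_loss_rate * prime_day_revenue
# prime_day_loss_rate comes out near 0.049, i.e. roughly 5% of revenue,
# and estimated_loss lands close to the quoted $100 million.
```

This is a rough sanity check, not a rigorous loss model; it treats the 7.6 percent figure as uniform across retailers and assumes losses scale linearly with fraud attempts.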

Of course, the bulk of retailer spend attributed to fraud goes to fraud management software, hardware, and employees. Money well spent, as far as retailers are concerned, since it represents the resources required to minimize the impact of fraud.

This is an industry with considerable resources to spend and motivation to do so, producing and sitting on loads of data. Like any other domain with these characteristics, it seems ripe for automation by means of ML. Here is how Fraugster and Riskified approach this.

No false positives, we're positive

Riskified is a fraud management solution for enterprise online retailers, co-founded by Eido Gal and Assaf Feldman in 2012. Feldman is an MIT graduate with 15 years of experience developing machine learning algorithms, and Gal had been working on risk and identity solutions at various startups, including Fraud Sciences, which was purchased by PayPal.

Gal says that they realized there was a gap in the way the eCommerce industry managed risk: "while most retailers were relying on third-party solutions for some parts of their online business, such as payment processing and website creation, every merchant was trying to manage fraud in-house. Fraud prevention tools available in the market at that time generally provided retailers with a risk score per transaction, and the retailer's in-house fraud team was tasked with deciding whether to accept or reject the order."

Gal noted that scoring tools flagged any statistically risky transaction, and fraud teams were focused on preventing losses.

This combination meant that retailers ended up turning away many legitimate customers due to suspected fraud, and losing out on significant revenue. Riskified's vision was to outsource fraud detection to experts, allowing retailers to focus on growing revenue and improving customer service.

The company built an ML-based fraud detection system and adopted a business model that, they say, aligns its goals with retailers': driving sales to good customers while avoiding fraud. Instead of providing a risk score and charging a flat fee for every transaction, Riskified delivers an approve-or-decline decision for each transaction.


Riskified initially specialized in identifying false positives, but has expanded to cover other fraud scenarios as well. Image: Riskified

Riskified only charges a fee for approved orders, which are covered by a chargeback guarantee in case of fraud. Gal says this incentivizes Riskified to approve as many good transactions as possible, while the chargeback guarantee means the company takes on fraud liability for every order it approves, requiring it to identify fraud attempts accurately.

In order for this to work, Riskified's algorithms must either be less picky about what they approve, or smarter. Gal says that in legacy systems, each data element receives a score, which contributes to the overall risk score of the transaction.

For example, any order shipping to a re-shipper or placed via a proxy server will be "penalized," as these are potential indicators of fraudulent activity.
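The legacy approach Gal describes, where each risky attribute is penalized in isolation, can be sketched roughly as additive scoring against a threshold (the signals, weights, and threshold here are invented for illustration):

```python
# Illustrative additive risk scoring: each signal contributes independently,
# so context (e.g. a re-shipper that is routine for cross-border shoppers) is lost.
RISK_WEIGHTS = {
    "ships_to_reshipper": 30,
    "placed_via_proxy": 25,
    "billing_shipping_mismatch": 20,
    "high_order_value": 15,
}
DECLINE_THRESHOLD = 50

def legacy_risk_score(order_signals):
    """Sum the penalty of every risky signal present on the order."""
    return sum(RISK_WEIGHTS[s] for s in order_signals if s in RISK_WEIGHTS)

# A legitimate overseas shopper using a proxy and a US re-shipper is penalized
# twice and crosses the decline threshold, even though the combination is common.
score = legacy_risk_score({"ships_to_reshipper", "placed_via_proxy"})
declined = score >= DECLINE_THRESHOLD
```

The weakness is visible in the example: two individually risky but jointly benign signals push a good customer over the line, which is exactly the false-positive problem described above.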

"Riskified's ML models are far more complex, taking into account many more data points to uncover the context of the order. In this example, thanks to automatic data enrichment, our systems will have an indication that while the order is shipping to a US-based re-shipper, the item's final destination is in China.

We know that statistically, it's common for consumers based in China to use proxy servers when shopping online, and that to avoid high shipping costs, many good Chinese shoppers use re-shipping services. This insight is incorporated as a feature into our algorithms.

But our ML models consider many additional data points, such as the shopper's online behavior, their digital footprint, and their past transactions with any other merchant using Riskified's solution. Only after evaluating all the relevant data, the models reach a decision to approve or decline the transaction.

When we first launched Riskified, our entire service was identifying good orders that retailers planned to decline. We've since expanded our offering, and today most retailers use Riskified for their entire online volume."
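The contextual enrichment Gal describes can be approximated by features that evaluate combinations of signals rather than signals in isolation. A hedged sketch (the feature logic and field names are invented for illustration, not Riskified's actual model):

```python
def contextual_features(order):
    """Derive features from signal combinations instead of isolated penalties."""
    reshipper = order.get("ships_to_reshipper", False)
    proxy = order.get("placed_via_proxy", False)
    final_dest = order.get("final_destination")  # obtained via data enrichment

    # A re-shipper + proxy order whose enriched final destination is a country
    # where both are statistically common (e.g. China) is a benign pattern.
    benign_cross_border = reshipper and proxy and final_dest == "CN"
    return {
        "reshipper_without_context": reshipper and not benign_cross_border,
        "proxy_without_context": proxy and not benign_cross_border,
        "benign_cross_border": benign_cross_border,
    }

feats = contextual_features({
    "ships_to_reshipper": True,
    "placed_via_proxy": True,
    "final_destination": "CN",
})
```

Under this framing, the same two signals that doom the order in an additive system instead resolve to a single benign feature, which a downstream model can learn to treat as low risk.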

Look, mum, no rules

Fraugster, a German-Israeli payment security company founded in 2014, has its own approach here. Fraugster was founded by Max Laemmle and Chen Zamir. Laemmle says that after years of working in the payments industry, they experienced first-hand the challenges of fraud for e-commerce merchants.

He describes their vision as "to design and build an anti-fraud technology that could help create a fraud-free world." Laemmle says they found that all existing anti-fraud solutions were built on outdated technologies and could not deal with sophisticated cyber criminals:

"Existing rule-based systems as well as classical ML solutions are expensive and slow to adapt to new fraud patterns in real-time, hence inaccurate. Our team of intelligence and payment experts spent the last several years designing our proprietary technology from scratch. The result is advanced artificial intelligence (AI) technology which can not only eliminate payment fraud but also maximize revenues by reducing false positives."

Laemmle explains their approach as follows:

"Translating intuition from rules or processes equals a human dictating to a machine how to reason. This requires a lot of manual work. What our engine does is use ML techniques that don't substitute these things but substitute the human intuition part with which we reason.

The end result is a deterministic accurate system, trained not by a human but by a machine. Our engine requires a rich vocabulary and the capability to tie in these separate words into sentences and paragraphs that tell a narrative. We want to expand our vocabulary and continue to train the engine to choose the right vocabulary to tell the right story."


Laemmle reports that their clients operating on Amazon saw an increase of 150 percent in fraud attempts on APD.

"These are times when it's easier for fraud to pass through a manual review system or classical ML system due to more transactions and fewer resources.

Not because of lack of accuracy, because of lack of scalability and the necessary speed to adapt to new fraud patterns. A cyber criminal doesn't generally care about sales (they plan on getting the item for free anyways) but during sales time they go through a less adverse security system.

One, because there are more transactions and it's difficult for manual reviews to keep up and two, an item that is on sale might go through a system that is meant to look at lower priced items, think rule-based systems. Our technology is super scalable and self-learning so it can identify new fraud patterns as they emerge in real time.

All ML players have to build workarounds because they can't process data in real-time. This means they have to pre-segmentize the data, etc. Their solutions are not fully automated / frictionless. Fraugster is not using human analysts, rules, or models. Our engine operates fully autonomously without contributing any friction in the check-out process."

Mind the black box

Each company has its own approach and strengths, but the point here is not to compare them. The point is that these are some of the most widely influential applications of real-life big data innovation. Applications like these, even when operating in stealth as far as most of us are concerned, push the boundaries on a number of levels.

Equally important to the technical aspect are the aspects of transparency and compliance. Feldman elaborates:

"While recent EU law requires organizations that rely on ML for user-impacting decisions to fully explain the data behind those decisions, transparency into ML decisions is also a business requirement. In our industry, retailers need to know why a certain shopper's purchase was identified as fraud and subsequently declined.

In case of a serious fraud ring attack that resulted in high chargeback rates, online merchants are held accountable by the payment gateway/processor -- and need to explain why those fraudulent purchases were approved by the algorithms, and what has been done to ensure such cases are correctly identified going forward.

This has been a blind spot for the tech community, and is a key reason that many businesses are reluctant to leverage ML-based tools, which they consider to be "black box" solutions. Riskified has invested significant resources into providing retailers with transparency into our ML decisions.

This was achieved by translating the tools used by Riskified data scientists when researching ML decisioning into a visualization that coherently conveys the logic behind the models' decisions."
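One common way to deliver the kind of transparency Feldman describes is to surface each feature's contribution to a decision; for a linear model this is simply weight times feature value. A generic sketch with invented weights and feature names (not Riskified's actual model or visualization):

```python
import math

# Illustrative logistic model: weights and bias are invented for demonstration.
WEIGHTS = {
    "distinct_cards_from_ip": 0.9,
    "account_age_days": -0.01,
    "order_value_vs_history": 0.4,
}
BIAS = -2.0

def explain_decision(features, threshold=0.5):
    """Return fraud probability plus each feature's contribution to the score."""
    contributions = {k: WEIGHTS[k] * v for k, v in features.items()}
    logit = BIAS + sum(contributions.values())
    probability = 1 / (1 + math.exp(-logit))
    return {
        "probability": probability,
        "declined": probability >= threshold,
        # Sorted so an analyst sees the most incriminating signals first.
        "contributions": dict(sorted(contributions.items(),
                                     key=lambda kv: -kv[1])),
    }

result = explain_decision({
    "distinct_cards_from_ip": 6,
    "account_age_days": 2,
    "order_value_vs_history": 3,
})
```

Production systems typically use non-linear models, where per-feature attributions require more machinery, but the goal is the same: a ranked list of reasons an analyst or merchant can act on.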

As we have noted before, transparency and ML approaches seem to be at odds at present. The requirement for clarity comes not only from regulatory frameworks but, by and large, from business users too, as many practitioners note. While different approaches have been proposed to work around this issue, no perfect solution seems to exist at this time.
