Custom Ad Fraud Detection System [Case Study]

In this issue:

What were the tasks?
Challenge #1: Scalability
Challenge #2: Refining Data Analysis Algorithms
Challenge #3: Developing the Crawler
Technological Stack
Project team
The APP Solutions Expertise

Fraudulent and malicious activity is a big problem all over the Internet. It is an especially sensitive subject in the realm of digital advertising.

In many cases, fraudulent activity can throw a wrench into business operations and severely disrupt the workflow. For those who manage ads and ad space, detecting and eliminating fraud is critical to the business.

A few ad fraud statistics for you:

1 out of 5 ad-serving websites is only visited by fraud bots
20% of pay-per-click conversions were fraudulent in 2017
Mobile display ad fraud: 30 billion deceptive impressions per minute
Facebook disabled a total of almost 1.3 billion fake accounts in 2018 (and deleted 865 million spam posts)

The problem of fraud is critical in real-time bidding. Advertisers can get into hot water if fraudulent ad content slips through and causes trouble. Such incidents must be prevented entirely.

The APP Solutions created a custom ad fraud detection system, which is designed to detect and report fraudulent and malicious adverts before they cause any damage.

The fraud detection system is a twofold project, consisting of:

Ad data processing and analytics
Ad crawler

The system uses a multi-layered approach to monitor activity and report anything resembling unusual activity or malicious content before it can do damage.

Another element of the system is the crawler engine. Its purpose is to check the credibility of ad publishers. The Ad Crawler goes through adverts and assesses how likely their content is to be fraudulent or malicious. The method includes scanning ad publishers' websites, analyzing incoming requests, and reporting on them when fraud is detected.

The system also uses serverless tracking pixels to process information coming from mobile tracking systems.

What were the tasks?

This particular fraud detection system counters the following types of fraud:

Cookie Stuffing
Click fraud:
- Click bots
- Click farms
- Click spamming
- Click injection
Lead fraud:
- Lead bots
- Lead farms
Impression fraud

The prevention methods we implemented include:

Digital Footprint / Signature-based – this method uses predefined patterns to detect suspicious activity.
Anomaly-based – this method uses statistical analysis and historical data to inspect suspicious kinds of content and determine whether it is malicious.
Credential-based – this method is used by a web crawler to assess potential fraud activities.
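
To make the first two methods more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the project's actual rules: the patterns in SIGNATURES, the function names, and the threshold of three standard deviations are all hypothetical choices made for this example.

```python
import re
from statistics import mean, stdev
from typing import Sequence

# Hypothetical signatures: patterns that mark a user agent as suspicious.
# Illustrative placeholders only, not the rules used in the project.
SIGNATURES = [
    re.compile(r"headless", re.IGNORECASE),
    re.compile(r"bot|crawler|spider", re.IGNORECASE),
]


def matches_signature(user_agent: str) -> bool:
    """Signature-based check: True if any predefined pattern matches."""
    return any(pattern.search(user_agent) for pattern in SIGNATURES)


def is_anomalous(history: Sequence[float], current: float,
                 threshold: float = 3.0) -> bool:
    """Anomaly-based check: flag `current` if it lies more than `threshold`
    sample standard deviations from the mean of the historical values."""
    if len(history) < 2:
        return False  # not enough history to estimate a spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All historical values are identical; any deviation stands out.
        return current != mu
    return abs(current - mu) / sigma > threshold
```

For example, with hourly click counts of 100, 110, 90, 105, and 95 as history, a sudden hour with 400 clicks would be flagged, while an hour with 104 clicks would not. A production system would likely combine many such signals instead of relying on a single one.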

Challenge #1: Scalability

Fraud detection is a resource-demanding operation that requires a highly scalable system. Because the system processes a large amount of incoming information, it is critically important that it can scale with the workload.

To provide smooth and reliable scalability when processing large quantities of incoming information, we used Google Cloud Platform and its autoscaling features.

We also applied serverless tracking pixels to ensure scalability and balanced data processing from multiple sources.
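
As a rough illustration of what a tracking pixel handler does, the Python sketch below parses the request, builds an event record, and returns a transparent 1x1 GIF. It is written as a plain function so it could sit behind any serverless runtime. The function name, the query parameters `campaign` and `click_id`, and the shape of the event record are assumptions made for this example, not details of the actual system.

```python
import base64
import time
from typing import Optional
from urllib.parse import parse_qs, urlparse

# A transparent 1x1 GIF, the usual payload of a tracking pixel.
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def handle_pixel_request(url: str, headers: dict,
                         now: Optional[float] = None):
    """Return (event, response_headers, body) for one pixel request.

    The handler keeps no state, so any number of copies can run in
    parallel. Shipping the event to a queue or data pipeline is left
    to the caller.
    """
    query = parse_qs(urlparse(url).query)
    event = {
        "ts": now if now is not None else time.time(),
        # Hypothetical parameter names, chosen for illustration.
        "campaign": query.get("campaign", [None])[0],
        "click_id": query.get("click_id", [None])[0],
        "user_agent": headers.get("User-Agent", ""),
    }
    response_headers = {
        "Content-Type": "image/gif",
        # Ask clients not to cache, so every view reaches the handler.
        "Cache-Control": "no-store",
    }
    return event, response_headers, PIXEL_GIF
```

Because the function holds no state between calls, the platform can start as many instances as the incoming traffic requires, which is the property that makes this pattern a good fit for autoscaling.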

Challenge #2: Refining Data Analysis Algorithms

The centerpiece of the Fraud Detection System is an intricate set of data analysis algorithms that monitor and assess content and activity.

However, to make it work effectively, we needed to refine the fraud detection algorithms, which required a substantial study of various types of fraudulent and malicious ad content and activity. This research became the foundation of the algorithms.

Data analysis itself is resource-intensive. To keep the data processing workflow uninterrupted regardless of the workload, we used Apache Beam.

Challenge #3: Developing the Crawler

Another significant challenge came during the development of the crawler engine. We needed to refine its working process and include every type and variation of fraudulent and malicious ad content to make the assessment precise.

To ensure efficient monitoring, we needed to gather a database of references, which we did through extensive research of various fraudulent and malicious ad content.
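
One simple way a crawler can use such a reference database is to compare each ad it encounters against known bad samples. The Python sketch below shows the idea under loud assumptions: the two reference sets, their contents, and the `assess_ad` function are hypothetical, and the real database and matching logic are not described in this case study.

```python
import hashlib
from urllib.parse import urlparse

# Hypothetical reference database, reduced to two in-memory sets.
KNOWN_BAD_DOMAINS = {"malware.example", "fake-prizes.example"}
KNOWN_BAD_HASHES = {
    hashlib.sha256(b"<script>stealCookies()</script>").hexdigest(),
}


def assess_ad(landing_url: str, creative_markup: bytes) -> list:
    """Return the list of reasons an ad looks malicious (empty if none)."""
    reasons = []

    # Check the landing host and each of its parent domains.
    host = (urlparse(landing_url).hostname or "").lower()
    parts = host.split(".")
    candidates = {".".join(parts[i:]) for i in range(len(parts))}
    if candidates & KNOWN_BAD_DOMAINS:
        reasons.append("landing domain is in the reference database")

    # Exact-match check of the creative against known malicious samples.
    digest = hashlib.sha256(creative_markup).hexdigest()
    if digest in KNOWN_BAD_HASHES:
        reasons.append("creative matches a known malicious sample")

    return reasons
```

Exact hash matching only catches byte-for-byte copies of a known sample, so a real crawler would likely need fuzzier comparisons as well. The sketch is meant to show only the lookup pattern.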

Also, to make the crawler identifiable as a real mobile visitor, we developed a specialized app and used actual Android / iOS mobile devices.

Technological Stack

Google Cloud Platform
Apache Beam
Java
NodeJS
PHP
Symfony
Vue.JS
MySQL
MongoDB

Project team

This team was unusual in that the project's DevOps engineer also served as the Project Manager. This was possible because the client was an existing one, so most of the technical details were already known and the processes were already set up.

3 Senior Developers
2 QA Specialists
1 DevOps Engineer

The APP Solutions Expertise

This project was a real test of skill for our team. Because of the numerous technical challenges, we learned a lot about the subject during development. We performed thorough studies of fraudulent and malicious ad content. We also researched how fraudulent content tries to evade fraud detection.

That broadened the scope of our fraud detection solutions and allowed us to develop an extremely efficient fraud detection system that can operate under a significant workload without slowing down or crashing.
