Using Data Science to Detect Fraud in Employee Expense Reports

October 31, 2017


PredictX is a data services firm that serves a range of clients. This project supported a financial services firm that wanted to develop an automated system to detect fraud in employee expense reports. The business need for such a system is apparent from the aggregate scale of the problem across the UK:

- Employee fraud is estimated to cost UK businesses 88 million GBP per year
- Travel expense fraud is estimated to account for 17% of total business fraud
- Companies with purely manual fraud detection methods are twice as vulnerable to employee fraud as those with an automated system

The client had no existing record of fraudulent transactions, so the first step was to develop a screening tool that identifies suspicious transactions for the client’s travel experts to follow up.


The team began by defining what was meant by a suspicious transaction, and then examined the relationships between the categorical features to identify when the risk of a suspicious transaction may be higher. The team used feature embeddings to capture this context. For example, an embedding for the type of flight can reflect that higher prices are expected for first-class flights, or that there may be an expected connection between longer flights and more senior members of the client organisation. These relationships were represented as weights, which captures them in numerical form.

The team clustered the transactions by similarity and then used the clustering results to extract specific characteristics that could be tied back to fraudulent activity.
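To make the clustering step concrete, here is a minimal sketch. The source does not name the clustering algorithm the team used, so plain k-means is swapped in purely as an illustration. The two features (standardised price and standardised distance) and the sample values are hypothetical.

```python
import math

def kmeans(points, k, iterations=20):
    """Minimal k-means: returns (centroids, labels) for equal-length vectors."""
    # Deterministic initialisation: the first k points act as starting centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels

# Hypothetical transactions as (standardised price, standardised distance).
points = [(0.1, 0.2), (5.0, 5.1), (0.2, 0.1), (5.2, 4.9), (0.0, 0.3)]
centroids, labels = kmeans(points, k=2)
```

Transactions that land in small clusters, or far from their cluster's centroid, are natural candidates for the follow-up analysis described above.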

The project's core approach was then:

• Feature selection and feature engineering such as normalisation, standardisation, combination, etc.
• Transformation of categorical information into a numeric form usable by a model, using traditional methods such as one-hot encoding and more advanced techniques such as feature embeddings using neural networks
• Clustering algorithms and post analysis
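The first two steps in the list above can be sketched in a few lines. This is an illustrative example only: the cabin classes, prices and helper names (`standardise`, `one_hot`) are assumptions, not the team's actual pipeline, and the neural-network embeddings are not shown.

```python
from statistics import mean, pstdev

def standardise(values):
    """Rescale numeric values to zero mean and unit variance (z-scores)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: no spread to scale by.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def one_hot(values):
    """Map each category to a 0/1 indicator vector, one column per category."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        encoded.append(row)
    return categories, encoded

# Hypothetical transactions: cabin class and ticket price in GBP.
cabins = ["economy", "first", "economy", "business"]
prices = [120.0, 950.0, 140.0, 480.0]

categories, cabin_vectors = one_hot(cabins)
price_z = standardise(prices)

# One numeric feature row per transaction, ready for a clustering algorithm.
features = [vec + [z] for vec, z in zip(cabin_vectors, price_z)]
```

One-hot encoding treats every category as equally distant from every other, which is one reason a learned embedding can be more informative for features such as flight type.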

The Outcome:

The objective of the project was to identify suspicious travel-related transactions. The team built a tool that could be used easily via a custom dashboard. The tool analysed particularly suspicious transactions, compared them with similar transactions (by flight carrier, location, distance, seniority, etc.) and monitored the norms for such transactions.
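One way to compare a transaction with its peers is to score it against the norm of its own group. The sketch below assumes a robust z-score (distance from the group median, scaled by the median absolute deviation). The field names, the grouping keys and the sample data are hypothetical, and the source does not state which comparison the dashboard actually used.

```python
from collections import defaultdict
from statistics import median

def peer_deviation(transactions, key_fields, value_field="price"):
    """Score each transaction by its distance from the peer-group median.

    The score is |value - median| / MAD, where MAD is the median
    absolute deviation of the peer group.
    """
    groups = defaultdict(list)
    for t in transactions:
        groups[tuple(t[f] for f in key_fields)].append(t[value_field])

    scores = []
    for t in transactions:
        peers = groups[tuple(t[f] for f in key_fields)]
        med = median(peers)
        mad = median([abs(v - med) for v in peers])
        # Guard against zero spread (all peers identical).
        scores.append(abs(t[value_field] - med) / mad if mad else 0.0)
    return scores

# Hypothetical transactions grouped by cabin class and route.
transactions = [
    {"cabin": "economy", "route": "LHR-JFK", "price": 400},
    {"cabin": "economy", "route": "LHR-JFK", "price": 420},
    {"cabin": "economy", "route": "LHR-JFK", "price": 410},
    {"cabin": "economy", "route": "LHR-JFK", "price": 1500},
    {"cabin": "first", "route": "LHR-JFK", "price": 3000},
    {"cabin": "first", "route": "LHR-JFK", "price": 3100},
]
scores = peer_deviation(transactions, key_fields=("cabin", "route"))
most_unusual = max(range(len(scores)), key=lambda i: scores[i])
```

Grouping before scoring matters: the first-class fares here are far more expensive than the 1500 economy fare, yet only the economy fare stands out against its own peers.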

The team flagged approximately 5% of the transactions as suspicious, each with a severity rating for the anomaly. The models the team developed will ultimately save the company a significant amount of time and manpower by prioritising the records to be checked, and will also save significant funds when fraudulent activity is identified.
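As a rough sketch of how a fixed share of records might be flagged and ranked, the example below takes the top 5% of anomaly scores and assigns each a severity band. The scores, the three bands and the function names are assumptions made for illustration. The source does not describe how severity was actually computed.

```python
def flag_suspicious(scores, fraction=0.05):
    """Return indices of the highest-scoring share of records, most anomalous first."""
    n_flagged = max(1, round(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_flagged]

def severity(rank, n_flagged):
    """Hypothetical three-band severity based on position within the flagged set."""
    position = rank / n_flagged  # 0.0 is the most anomalous record
    if position < 1 / 3:
        return "high"
    if position < 2 / 3:
        return "medium"
    return "low"

# Hypothetical anomaly scores for 40 transactions, two of them clearly unusual.
scores = [float(i % 7) for i in range(40)]
scores[5] = 50.0
scores[17] = 30.0

flagged = flag_suspicious(scores)
ratings = {idx: severity(rank, len(flagged)) for rank, idx in enumerate(flagged)}
```

Sorting the flagged records by score gives reviewers a priority order, which is where the time saving described above would come from.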

Christopher Dancel

"We partnered with Pivigo to build a new analytical model for our product. We were very impressed with our team; their practicality and commercial thinking."
