Identify leaked documents


76% of documents identified without analyst


100% recall of critical cases


Digital Shadows


Digital Shadows is a UK cyber-intelligence company supplying enterprise-grade threat intelligence and data leakage information as a managed service. Among its offerings is a service that companies use to determine what confidential, sensitive and proprietary data is leaking across the organisational boundary. The company used the Big Data Incubator to improve its existing selection and scoring algorithms so that leaked documents are identified more efficiently, reducing the number that require assessment by an analyst.


The team consisted of four S2DS participants with analytical PhDs, working in collaboration with a mentor from Digital Shadows. The team spent the first part of the project understanding the dataset and the problem, then looked for new and optimal methods of extracting information from documents and analysed the results. Finally, using statistics, machine learning techniques and graph/network models, the team was able to radically improve the classification of documents.

We were delighted with S2DS – a very well run programme full of talented individuals that made a real difference to our business.

Alastair Paterson, CEO Digital Shadows
The Outcome

The team presented several algorithms to Digital Shadows. The recommended algorithm reduced the need for analyst verification by 76% without missing a single critical case. By introducing robust and consistent machine learning methods, constructed and implemented by data scientists, Digital Shadows could significantly reduce the time spent on analyst verification. As Digital Shadows continues its rapid growth, these efficiency gains will have a significant impact on the company, helping to maintain its position as the leading cyber-intelligence provider.
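To make the trade-off between analyst workload and recall concrete, here is a minimal sketch in Python. It is not the algorithm the team delivered, which is not described here. It assumes a hypothetical model that gives each document a risk score, and it uses made-up scores and labels. The function names `pick_threshold`, `workload_reduction` and `critical_recall` are illustrative only.

```python
def pick_threshold(scores, is_critical):
    """Return the highest threshold that still flags every critical document."""
    critical_scores = [s for s, c in zip(scores, is_critical) if c]
    if not critical_scores:
        raise ValueError("need at least one critical document")
    return min(critical_scores)


def workload_reduction(scores, threshold):
    """Fraction of documents scoring below the threshold (not sent to an analyst)."""
    skipped = sum(1 for s in scores if s < threshold)
    return skipped / len(scores)


def critical_recall(scores, is_critical, threshold):
    """Fraction of critical documents that are still sent to an analyst."""
    critical_scores = [s for s, c in zip(scores, is_critical) if c]
    flagged = sum(1 for s in critical_scores if s >= threshold)
    return flagged / len(critical_scores)


# Made-up scores and labels, for illustration only.
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.80, 0.90]
is_critical = [False, False, False, False, False, True, False, True]

threshold = pick_threshold(scores, is_critical)
reduction = workload_reduction(scores, threshold)
recall = critical_recall(scores, is_critical, threshold)

print(f"threshold={threshold}, reduction={reduction:.1%}, recall={recall:.0%}")
```

In practice a threshold chosen this way would need to be validated on held-out data, since full recall on the labelled examples does not guarantee full recall on unseen documents.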
