
How the UK crime agency repurposed Amazon cloud platform to analyse EncroChat cryptophone data

UK crime agency repurposed AWS-based analytics platform to triage EncroChat data and identify threats to life in messages sent on encrypted phone network

The UK’s National Crime Agency (NCA) repurposed its cloud-based data analytics platform to help identify threats to life in messages sent by suspected criminals over the encrypted EncroChat phone network.

After placing a “software implant” on an EncroChat server in Roubaix, investigators from France’s digital crime unit infiltrated the encrypted phone network in April 2020, capturing 70 million messages.

The operation, supported by Europol, led to arrests in the Netherlands, Germany, Sweden, France and other countries of criminals involved in drug trafficking, money laundering and firearms offences. More than 1,100 people have been convicted under the NCA’s investigation into the French EncroChat data, Operation Venetic, which has led to more than 3,000 arrests across the UK, and more than 2,000 suspects being charged.

UK police have seized nearly six and a half tonnes of cocaine, more than three tonnes of heroin and almost 14 and a half tonnes of cannabis, along with 173 firearms, 3,500 rounds of ammunition and £80m in cash from organised crime groups.

Europol supplied British investigators with overnight downloads of data gathered from phones identified as being in the UK, through Europol’s Large File Exchange, part of its Siena secure computer network.

With an estimated 9,000 UK-based EncroChat users, the NCA needed to quickly process a large volume of potentially incriminating data, so it tasked its National Cyber Crime Unit (NCCU) with categorising the material for human investigators to analyse. To automate preprocessing of the EncroChat material once it had been received, NCCU staff added pre-built capabilities from Amazon Web Services (AWS) to the unit's cloud data platform, including machine learning software capable of extracting text, handwriting and data from EncroChat text messages and photographs.

“For us, it’s about preventing harm and protecting the public,” said an NCCU spokesperson, quoted in a technology company case study. “We had a flood of unstructured data and had to operate swiftly to reduce harm to the public. Our data scientists could probably have devised ways of analysing this data themselves. But when we have more than 200 threats to life, we can’t afford to spend time doing that. Using off-the-shelf services from AWS enabled us to go from a standing start to a full capability in the space of hours. If we were to build it ourselves from scratch, that might have taken over a month of effort.”

From 10 to 300 users in two weeks

The NCCU was able to scale up its existing data analysis platform from tens of users in the NCA to 300 within two weeks of being informed of the EncroChat investigation.

Once the historic messages extracted from EncroChat’s in-phone database, called Realm, and live text messages sent from thousands of phones were processed, the NCA sent intelligence packages in the form of CSV files to Regional Organised Crime Units; the Police Service of Northern Ireland; Police Scotland; the Metropolitan Police; Border Force; the Prison Service; and HM Revenue & Customs.

These organisations were then responsible for analysing the data for further indications of threats to life, the drugs trade and other criminal activity.
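The triage-and-package step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say how the NCA flagged messages or what its CSV files contained, so the keyword list, the field names and the functions `triage` and `to_csv_package` are all hypothetical, and simple keyword matching stands in for the AWS machine learning services the NCCU actually used.

```python
import csv
import io

# Hypothetical watch-list; the article does not say how threats were detected.
THREAT_TERMS = ("kill", "shoot", "firearm")


def triage(messages, terms=THREAT_TERMS):
    """Return the messages whose text contains any watch-list term."""
    flagged = []
    for msg in messages:
        text = msg["text"].lower()
        hits = [term for term in terms if term in text]
        if hits:
            # Record which terms matched so a human analyst can see why.
            flagged.append({**msg, "matched_terms": ";".join(hits)})
    return flagged


def to_csv_package(rows):
    """Serialise flagged rows as CSV text (hypothetical column layout)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["handle", "text", "matched_terms"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Invented sample data, for illustration only.
sample = [
    {"handle": "user-a", "text": "Meet at the usual place"},
    {"handle": "user-b", "text": "Bring the firearm tonight"},
]
package = to_csv_package(triage(sample))
print(package)
```

In a sketch like this, the flagged rows carry the reason they were flagged, which reflects the article's point that the automated step only categorised material for human investigators to assess.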

The NCCU had been developing a cloud-based platform to analyse data for over three years before the EncroChat operation. Digital transformation consultancy Contino won the contract to build the platform on AWS.

By shifting from its on-premises infrastructure to the cloud, the NCCU said it has been able to spend more time on investigations, and less time on procuring and maintaining hardware and managing IT infrastructure.

“Previously, we had on-premises infrastructure, which required a lot of management and prevented us from doing the data science we wanted to do,” said an NCCU spokesperson. “Our small tech team spent a considerable amount of time building and managing infrastructure.

“This was a problem, because our recruitment and retention are based on providing people with engaging and challenging work fighting cyber crime, not administering IT.”

Advanced data processing

Within a year of beginning its pilot of the analytics platform – which used services including Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Relational Database Service (Amazon RDS) – the NCCU introduced more advanced data processing capabilities.

This included the Amazon EMR big data platform, which helps scale and automate data processing, and AWS Glue, a serverless data integration service that can combine and organise data from a wide range of sources.

As a law enforcement agency handling sensitive, and therefore potentially harmful, data, the NCA and NCCU also needed the platform to be secure, so they used Amazon GuardDuty to monitor network activity and help shield the platform from malicious activity.

“Moving data outside of our perimeter is not a decision we take lightly,” said an NCCU spokesperson. “The transparency of AWS, its shared security model, and the access we had to documentation and experts assisted us on that journey considerably.”

Holland’s drug-talk software

At the start of May 2021, the Netherlands Forensic Institute (NFI) announced that its forensic big data analysis (FBDA) team had similarly modified a computer model it had previously developed to scan for drug-related messages sent between suspected criminals in large volumes of communications data, as part of a research and development project.

The NFI told Computer Weekly at the time that the “drug-talk” software was developed in-house before being modified for “threat-to-life” detection and passed on to the police.

Using deep learning techniques, the FBDA team initially trained the model’s neural network in generic language comprehension by having it read webpages and newspaper articles, before introducing it to the messages of suspected criminals, so it could learn how they communicate.

“The team then began using similar techniques to develop a model to recognise life-threatening messages,” said the NFI in a statement. “That model was ready when the chats from EncroChat poured into the police in Driebergen on 1 April.”

NCA turned to cloud analytics to analyse hacked EncroChat messages  

2017: Police establish that EncroChat encrypted phones were used in a number of drug-related offences.

2018: National Crime Agency begins work on developing a National Data Exploitation Capability (NDEC).

21 December 2018: French investigators copy data from an EncroChat server at the OVH datacentre in Roubaix, France. The server data reveals that over 66,000 SIM cards are registered on EncroChat. Investigators are able to decrypt 3,500 files, including encrypted notes made by phone users.

May 2019: The UK’s National Crime Agency advertises for a deputy director for its National Data Exploitation Capability, which aims to deliver “industrial scale data analytics”. The NDEC programme, which aims to analyse and share data with UK and overseas law enforcement, has a budget of £30m for 2019/2020.

1 November 2019: The National Crime Agency puts out a tender for four security-cleared software developers to develop capabilities to exploit data from its offices in Vauxhall, London. The skills required include the ability to write SQL database queries, knowledge of data ingestion techniques and the ability to migrate data between different relational databases. The tender reveals that the NCA’s National Cyber Crime Unit had previously moved its data exploitation technology from in-house onto the Amazon Web Services cloud.

30 January 2020: A court in Lille, France, approves the use of a data interception device on the EncroChat server and on EncroChat handsets.

3 March 2020: The National Crime Agency applies for a Targeted Equipment Interference (TEI) warrant to authorise its collection of material intercepted by the French from EncroChat phones.

9 March 2020: The National Crime Agency takes part in a conference organised by Eurojust in The Hague with representatives of other countries to discuss how to exploit EncroChat data with the French and Dutch Joint Investigation Team working on the hacking operation.

16 March 2020: The NCA issues a £4m tender for a security-cleared technology delivery partner to expand the National Data Exploitation Capability. NDEC promises industrial-scale data analytics. The project includes the development of data exploitation tools, data acquisition and management. Staff will need enhanced security clearance to work on secret and top secret material.

20 March 2020: The Lille court in France approves an order to redirect data streams on the EncroChat server to enable the capture of EncroChat data.

24 March 2020: The NCA applies for an updated TEI warrant to authorise the additional collection of data about Wi-Fi hotspots that the EncroChat phones came into contact with.

1 April 2020: The French and Dutch Joint Investigation Team installs “Trojan horse” or “implant” software on an EncroChat server hosted in the OVH datacentre in Roubaix, France, which goes live.

7 April 2020: The French investigation is expanded from an investigation into the illegal supply of encryption technology in France to include illegal trade in drugs and weapons offences.

1 May 2020: The Lille court in France extends permission to continue technical measures against EncroChat’s infrastructure for one month.

1 June 2020: The Lille court extends permission to continue technical measures against EncroChat’s infrastructure for a further four months.

13 July 2020: NCA selects Kainos Software as a technology delivery partner for cloud data science services under a £4m contract, as part of a project to develop the National Data Exploitation Capability.

28 June 2020: EncroChat administrators succeed in closing down the EncroChat network after having discovered the hacking operation.
