Research and analysis

Cost-benefit awareness checklist

Published 7 November 2024

This checklist supports organisations to ensure that they have considered the costs and benefits associated with deploying privacy enhancing technologies (PETs). 

It serves as a companion to the Department for Science, Innovation and Technology (DSIT) and Information Commissioner’s Office’s (ICO) Privacy Enhancing Technologies Cost-Benefit Awareness Tool. This checklist is designed to give an overview of the different areas that should be considered before deploying a PETs-based solution.

The full tool considers how use of federated learning would impact a specific use case, compared to a baseline approach where more traditional methods are used. Alongside this, the tool considers the impacts of using a range of other PETs for input privacy and output privacy, including trusted execution environments (TEE), homomorphic encryption (HE), secure multiparty computation (SMPC), differential privacy and synthetic data. For more information about the use case underpinning the tool, or the baseline for comparison, readers should refer to the full tool.  

This checklist lists key areas that organisations should consider before deploying PETs. For each area, it then provides information (in tables) about the impacts that PETs would have if they were deployed to enable the use cases explored in the full tool. This additional information is not meant to detail the impact of PETs exhaustively, but to serve as an example of the types of impacts PETs may have.

We recommend using this checklist in tandem with the full tool. 

PETs cost-benefit checklist 

Before deciding to deploy a PETs-based solution, you should ensure that you have considered all of the following for both a PETs-based solution and alternative possible approaches:

1. The costs and benefits of data storage needed to facilitate a solution 

Baseline:
Data stored centrally in one place.

Federated learning:
Retain local control over data.
Minimise risk from data breach by minimising aggregation.
Avoid aggregation of security requirements from multiple organisations, and inflexibility due to complex multi-org governance regimes.

Synthetic data, differential privacy:
Greater privacy at rest using differential privacy or fully synthetic data.
Partial or hybrid synthetic data may require additional PETs to be stacked to be secure in storage.
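To illustrate what "retain local control over data" can mean in practice, the following is a minimal sketch of federated averaging, assuming a toy linear model and two hypothetical organisations (`site_a`, `site_b`). All names, data and parameters here are illustrative assumptions, not taken from the full tool. Each site trains on its own records and only the model weights are sent to the coordinator.

```python
# Minimal federated averaging sketch (illustrative assumptions throughout).
# Model: y = w * x + b. Raw (x, y) records never leave the site that holds them.

def local_update(weights, data, lr=0.02, epochs=20):
    """Run gradient descent on one site's own data and return updated weights."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_average(updates, sizes):
    """Combine site weights, weighting each site by its number of records."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return (w, b)

def train(sites, rounds=200):
    """Coordinator loop: only weights are exchanged, never the underlying data."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in sites]
        global_weights = federated_average(updates, [len(d) for d in sites])
    return global_weights

# Hypothetical data held by two organisations, both following y = 2x + 1.
site_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
site_b = [(3.0, 7.0), (4.0, 9.0)]

w, b = train([site_a, site_b])
print(f"learned w={w:.3f}, b={b:.3f}")
```

In this sketch the coordinator only ever sees pairs of weights, which is the property that avoids aggregating records in one place. Model updates can still leak information in some settings, which is one reason federated learning is often combined with other PETs.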
   

Read more about data storage costs and benefits at:  

2. How adopting PETs may impact on compute requirements 

Baseline:
Invest in advanced hardware in one location and use it efficiently.
The baseline example has variable latency and compute costs dependent on the scale of the data being processed.

Federated learning:
Spread compute costs among participants.

HE, TEE, SMPC:
HE solutions have a higher level of latency compared to other technologies because processing occurs directly on encrypted data. This may also result in higher compute overheads.
TEE has lower latency compared to HE since data is processed without encryption.
SMPC results in higher computational overheads than the baseline; however, these are trivial in comparison to federated learning.

Synthetic data, differential privacy:
Synthetic data and DP datasets have higher compute costs in generation. These costs scale with the complexity of the original/real datasets.
Dynamic synthetic data and global DP have continuous compute overheads.
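As a rough illustration of where SMPC overheads come from, the sketch below uses additive secret sharing to compute a joint sum. This is one simple SMPC building block, chosen here as an assumption for illustration. The full tool does not specify a protocol, and the modulus, function names and inputs are hypothetical.

```python
import secrets

# Assumed modulus for the share arithmetic; any sufficiently large modulus works.
MODULUS = 2**61 - 1

def share(value, n_parties):
    """Split a value into n random-looking shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]

def secure_sum(private_inputs):
    """Sum private inputs without any party seeing another party's input."""
    n = len(private_inputs)
    # Each party splits its input into n shares and sends one share to each party.
    all_shares = [share(v, n) for v in private_inputs]
    # Party j adds up the j-th share it received from every party.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # Only the partial sums are published; together they reveal just the total.
    return sum(partial_sums) % MODULUS

inputs = [120, 45, 310]  # hypothetical private values, one per organisation
total = secure_sum(inputs)
print(total == sum(inputs))
```

With n parties, this sketch generates and exchanges n × n shares to compute what a central baseline would do with a single addition. That is the kind of extra computation and communication a cost-benefit assessment needs to account for.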
 

Read more about compute costs at:

3. The trade-offs between privacy-preserving infrastructures and data utility 

Baseline:
High level of utility due to data being decrypted and fully visible to all users.
Data is visible to all parties and unprotected after decryption.

PETs:
HE solutions enable a greater degree of privacy than the baseline example because no external parties can view decrypted data at rest or in transit.
TEE and SMPC solutions enable a greater degree of privacy than the baseline example but rely on a greater level of trust between parties than HE.

Synthetic data, differential privacy:
Synthetic data and DP lose utility as more privacy protection is introduced, because this changes the distribution of trends in the datasets. Organisations seeking to deploy these technologies should consider the trade-off between the appropriate level of privacy when sharing data and the level of functionality required to use the solutions effectively.
Synthetic data and DP enable the sharing of sensitive data that could not previously be shared because of privacy concerns.
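The privacy and utility trade-off for differential privacy can be made concrete with a minimal sketch of the Laplace mechanism applied to a counting query. The count, the epsilon values and the function names below are illustrative assumptions. A smaller epsilon gives stronger privacy but adds more noise, so the released figure is less accurate.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def dp_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count with Laplace noise of scale sensitivity / epsilon."""
    scale = sensitivity / epsilon
    return true_count + laplace_noise(scale, rng)

def mean_abs_error(true_count, epsilon, trials=2000, seed=0):
    """Estimate the average error introduced at a given privacy level."""
    rng = random.Random(seed)  # fixed seed so the comparison is repeatable
    errors = [
        abs(dp_count(true_count, epsilon, rng) - true_count)
        for _ in range(trials)
    ]
    return sum(errors) / len(errors)

true_count = 1000  # hypothetical number of records matching a query
errors = {eps: mean_abs_error(true_count, eps) for eps in (0.1, 1.0, 10.0)}
for eps, err in errors.items():
    print(f"epsilon={eps}: mean absolute error is about {err:.2f}")
```

For a counting query with sensitivity 1, the expected absolute error is 1/epsilon, so moving from epsilon 10 to epsilon 0.1 makes the released count about a hundred times noisier. Whether that is acceptable depends on the level of functionality the use case requires.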
  

Read more about privacy-preserving infrastructures at:  

4. How the solution will create new testing and troubleshooting pathways or augment existing ones

Baseline:
The baseline uses traditional approaches to testing and patching.

PETs:
In HE and TEE solutions, data is not visible to the data processor, so testing and troubleshooting may be more difficult and require mitigation at additional cost. Some data is visible to the data processor in SMPC.
TEEs are hardware enabled, which may necessitate additional mitigations and resources for testing processes and iterative bug fixes.
 

Read more about testing and troubleshooting considerations at:  

5. The technical skills and resources required to implement a solution 

Baseline:
Standard technical skills and resources required.

PETs:
Privacy enhancing technologies require a more niche skillset from specialist developers familiar with PETs libraries. This may require upskilling existing team members or hiring new talent.

Read more about technical skills and resources at:   

6. The impacts of a potential solution on compliance with relevant regulations 

Baseline:
May require additional mitigations to ensure compliance with GDPR. This may prevent sensitive data from being processed if sufficient protections cannot be implemented.

PETs:
Privacy enhancing technologies may enhance compliance with data protection law by providing anonymisation, protection against data breaches, or additional security.
Organisations deploying PETs should consult legal teams and relevant ICO guidance.
 

Read more about compliance considerations at:   

7. The long-term costs and benefits that may occur as the PETs ecosystem develops 

Baseline:
Adding new data sources may result in a bespoke or semi-bespoke approach in each instance.
Some data assets are unmonetizable due to privacy/commercial/IP concerns.
Limited network effects due to centralised data and processes.

PETs:
Standardises approach to integrating new data sources, simplifying this process.
Value can be derived from previously inaccessible data sources through privacy preserving approaches.
More opportunities to benefit from federated learning as it becomes a more widely adopted approach.
 

Read more about long-term costs and benefits at:

8. The barriers and risks that might be encountered when developing a solution 

Develop a risk register using your own governance processes or consult the ICO guidance on identifying and addressing risks.