3 data privacy solutions for public health research
By Chuck Akin and Eddie Kirkland
Apr 22, 2024
4 MIN. READ

How can public health researchers leverage patient data without compromising privacy?

When it comes to patient data privacy, public health leaders are caught between a rock and a hard place. On one hand, they require high-quality and comprehensive data to fuel their research and develop effective interventions. But they also must protect privacy, as patient health data is frequently targeted in data breach schemes. In fact, between 2015 and 2022, 32% of all recorded data breaches were in the healthcare sector—almost double the number recorded in the financial and manufacturing sectors.

Complicating matters is the need to connect and combine patient data sets to derive insights. For example, if a patient’s name and Social Security number are the link between their HIV test results and their medical risk factors, a data breach could expose the patient’s HIV status, name, Social Security number, and more. Likewise, for smaller test groups like historically underrepresented communities and rare disease patients, it can be difficult to make progress on research because the risk of re-identifying individuals grows as the sample size shrinks.

“Despite the challenges, data sharing is essential. The public and health care sectors need to share data to prevent and control infectious disease outbreaks, chronic diseases, and other risks to the public. We saw the importance of such sharing during the COVID-19 pandemic when policymakers and the public wanted the most accurate assessment of risk. But such sharing must be done with the utmost caution and privacy protections, or other serious problems will result—including loss of trust in the health system.”

John Auerbach
Senior Vice President, Federal Health

How can agencies protect highly sensitive, personally identifiable information (PII) and protected health information (PHI) without restricting it so much that it can’t be used at all?

Promising health data privacy solutions

Here are three techniques we’ve been exploring and researching for our federal health clients:

Homomorphic encryption is a specific cryptographic technique that allows analysts to perform analytics and data processing with patient-level data—without needing to decrypt it first. Because the data is fully encrypted and never exposed, it remains unreadable even by those doing the computations, protecting patient privacy while offering the full research value of the data to agencies.

Using homomorphic encryption, our data scientists have successfully carried out analytics and trained classification models on data while it was fully encrypted—in other words, the data was not only encrypted both at rest and in motion/transport, but also while in use. We conducted analytics as both single-party and multi-party computations for a leading U.S. public health agency, helping them assess the limitations and opportunities homomorphic encryption presents for public health research.

Homomorphic encryption is best suited for simple computations on small to moderately sized quantitative datasets. However, advancements in techniques and hardware acceleration are gradually improving its performance.
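To make the idea of computing on encrypted data concrete, here is a minimal sketch using the Paillier scheme, which supports addition on ciphertexts. It is an illustration only and not the method or tooling used in the work described above: the scheme choice, the function names, the tiny fixed primes, and the two site counts are all assumptions made for the example, and the key size is far too small for real use.

```python
import math
import secrets

# Toy Paillier cryptosystem (additively homomorphic).
# ASSUMPTION: tiny fixed primes for readability only; real deployments use
# vetted libraries and keys of 2048 bits or more.

def keygen(p=1789, q=1861):
    """Return the public modulus n and the private key (lam, mu)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+)
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under public modulus n, with g = n + 1."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random blinding factor in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return ((1 + m * n) * pow(r, n, n2)) % n2

def decrypt(n, private_key, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)

# Hypothetical case counts from two sites, summed without decrypting either one.
n, private_key = keygen()
site_a = encrypt(n, 120)
site_b = encrypt(n, 87)
encrypted_total = add_encrypted(n, site_a, site_b)
total = decrypt(n, private_key, encrypted_total)
print(total)  # 207
```

The party doing the addition only ever sees ciphertexts; only the holder of the private key can read the total. Schemes that support richer computations than addition exist, but they carry the performance costs noted above.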

Confidential computing is an infrastructure technology that protects data while it is in use by processing it in a secure area of a main processor, which prevents unauthorized access or data manipulation. It works by establishing a security boundary, a secure enclave called a trusted execution environment (TEE), to isolate the computation from the rest of the system. Data is decrypted only within the TEE; once the computation is complete, the data is re-encrypted and returned to its original state.

Our data scientists have developed a proof-of-concept that demonstrates single- and multi-party computational analytics in a TEE in the cloud. While there are many intricacies to the confidential computing architecture, this technique is suitable for complex workloads and large datasets, and often requires collaboration with a cloud provider or an enterprise partner.

Privacy-preserved datasets use a hybrid of privacy masking techniques to create variant data sets that can be shared and protected at different levels for different purposes and population sizes, with varying levels of granularity. By mixing synthetic data in with the real data, or by masking certain fields, this approach bypasses the limitations typically seen in sophisticated analysis without losing the significance of the data set. The original data can then be repurposed in a variety of ways. We are exploring public health use cases with Anonos and their patented implementation of privacy-preserved datasets, Variant Twins.
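As a rough illustration of releasing the same data at different levels of granularity, the sketch below generalizes two fields and suppresses any group that is too small to share safely. It is a generic generalization-and-suppression example, not Anonos's Variant Twins implementation; the records, field names, and the threshold `k` are all hypothetical.

```python
from collections import Counter

# Hypothetical patient-level records (fabricated for illustration only).
records = [
    {"age": 34, "zip": "30301", "diagnosis": "flu"},
    {"age": 36, "zip": "30305", "diagnosis": "flu"},
    {"age": 38, "zip": "30309", "diagnosis": "asthma"},
    {"age": 52, "zip": "30310", "diagnosis": "diabetes"},
    {"age": 57, "zip": "30318", "diagnosis": "asthma"},
    {"age": 71, "zip": "31401", "diagnosis": "rare_condition"},
]

def generalize(record, age_bucket, zip_digits):
    """Mask a record by coarsening age into a range and truncating the ZIP code."""
    low = (record["age"] // age_bucket) * age_bucket
    return {
        "age_range": f"{low}-{low + age_bucket - 1}",
        "zip_prefix": record["zip"][:zip_digits] + "*" * (5 - zip_digits),
        "diagnosis": record["diagnosis"],
    }

def release(records, age_bucket, zip_digits, k):
    """Return generalized records, dropping any group with fewer than k members."""
    generalized = [generalize(r, age_bucket, zip_digits) for r in records]
    sizes = Counter((g["age_range"], g["zip_prefix"]) for g in generalized)
    return [g for g in generalized
            if sizes[(g["age_range"], g["zip_prefix"])] >= k]

# A coarse variant keeps most rows; a fine-grained one leaves every row unique.
coarse = release(records, age_bucket=10, zip_digits=3, k=2)
fine = release(records, age_bucket=5, zip_digits=5, k=2)
print(len(coarse), len(fine))  # 5 0
```

The coarse variant withholds only the single rare-condition record, while the fine-grained variant cannot safely release anything. That is the small-sample trade-off in miniature: more detail means smaller groups and a higher chance of singling someone out.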

These three techniques—homomorphic encryption, confidential computing, and privacy-preserved datasets—make it easier for risk-averse data owners to share their data, and the promise of privacy-preserving technologies is likely to play a prominent role in shaping the legal and regulatory landscape surrounding public health data management and sharing.

Making a choice

These are just three of many data privacy technologies now available. Some can be combined at scale, but none is the single best solution for every public health data privacy challenge, so it can be hard to know what to look for, especially with new techniques frequently coming online.

Our initial R&D work has helped our public health agency clients understand the fundamental differences in the use cases that these techniques apply to—when it’s prudent to use one versus another—and will help them make informed decisions moving forward.

A trusted partner with experience not only in data privacy research and development in general, but in the public health sector as well, is vital to applying the right technology to your unique challenge. Explore our health IT and data and analytics capabilities.

Meet the authors
  1. Chuck Akin, Vice President, Innovation Management, Health Informatics, and Technology

    Chuck is an expert in innovation management and technology investment strategy with over 25 years of experience.

  2. Eddie Kirkland, Principal Data Scientist

    Eddie is a statistics and data science expert with more than 20 years of experience in data and software engineering. He specializes in guiding data-rich projects from concept to delivery, working directly with clients to identify areas of need, developing custom solutions in an agile framework, and delivering clear and meaningful results.
