Healthcare Data Access in the Age of Data Privacy with Terry Myerson, CEO at Truveta

The Challenge of Accessing Healthcare Data while Maintaining Privacy

Healthcare data is highly sensitive, and it can be difficult to balance the need for accessibility with the need to maintain confidentiality. In the United States, several laws and regulations govern how healthcare data is handled.

One such law is HIPAA, the Health Insurance Portability and Accountability Act. HIPAA sets standards for the protection of individually identifiable health information, known as protected health information (PHI), and requires covered entities to implement security measures to safeguard it. However, these protections also limit how PHI can be shared for research and analysis.

To address this limitation, HIPAA defines two paths for de-identifying data: Safe Harbor and Expert Determination. Safe Harbor de-identifies PHI by removing a list of 18 specified identifiers, such as social security numbers, names, addresses, and dates of birth. However, Safe Harbor has limitations when it comes to analyzing data from specific populations or conditions.
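As a rough illustration of the Safe Harbor idea described above, the following Python sketch drops a few direct identifiers and coarsens two others. The field names and the `deidentify` helper are hypothetical, and the sketch handles only a handful of identifier types. It also ignores details such as population thresholds for ZIP code prefixes, so it should not be read as a compliant implementation.

```python
from datetime import date

# Hypothetical field names; a real record would carry many more identifiers.
DIRECT_IDENTIFIERS = {"name", "ssn", "address"}


def deidentify(record: dict) -> dict:
    """Return a copy with direct identifiers dropped and two fields coarsened."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

    # Keep only the year of birth rather than the full date.
    dob = out.pop("date_of_birth", None)
    if dob is not None:
        out["birth_year"] = dob.year

    # Keep only the first three digits of the ZIP code.
    zip_code = out.pop("zip", None)
    if zip_code is not None:
        out["zip3"] = str(zip_code)[:3]

    return out


record = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "address": "1 Example St",
    "date_of_birth": date(1980, 5, 17),
    "zip": "98101",
    "diagnosis": "asthma",
}
clean = deidentify(record)
print(clean)
```

The clinical field (`diagnosis`) survives untouched, while the fields that point to a person are either removed or made less precise.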

For example, if a record shows that an individual was born in a specific location on a specific date, that combination can be enough to single them out once it is linked with other data points. This is why fields such as birth dates and addresses have to be removed or made less precise.
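The effect of birth date and location on identifiability can be sketched with a small uniqueness check. The example below uses made-up rows and a hypothetical `unique_fraction` helper to measure how many records are the only one with a given combination of values; it is an illustration of the idea, not a formal risk assessment.

```python
from collections import Counter


def unique_fraction(rows, keys):
    """Fraction of rows whose combination of `keys` appears exactly once."""
    combos = Counter(tuple(r[k] for k in keys) for r in rows)
    unique = sum(1 for r in rows if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(rows)


# Made-up example data.
rows = [
    {"birth_date": "1980-05-17", "birth_year": 1980, "city": "Seattle"},
    {"birth_date": "1980-09-02", "birth_year": 1980, "city": "Seattle"},
    {"birth_date": "1980-11-30", "birth_year": 1980, "city": "Seattle"},
    {"birth_date": "1975-01-08", "birth_year": 1975, "city": "Tacoma"},
]

fine = unique_fraction(rows, ["birth_date", "city"])    # full date: every row is unique
coarse = unique_fraction(rows, ["birth_year", "city"])  # year only: fewer unique rows
print(fine, coarse)
```

With full birth dates every row in this toy dataset is unique, while keeping only the year leaves a single unique row.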

Rare diseases illustrate the problem. When only a handful of patients have a condition, the diagnosis itself can point to an individual, and removing enough detail to protect them may also remove valuable insights into that condition. To address this challenge, researchers use statistical methods to remove identifiable information while preserving meaningful data.

In addition to Safe Harbor, HIPAA offers a second path: Expert Determination. Under this path, a qualified expert applies statistical methods and determines that the risk of re-identifying an individual is very small. In practice, this often means reducing the granularity of the data. For example, instead of analyzing data at the individual level, researchers may use aggregated data to preserve anonymity while still capturing valuable insights.
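Reducing granularity through aggregation can be sketched as follows. The `aggregate_counts` helper and the `min_cell` threshold are assumptions made for illustration: groups smaller than the threshold are suppressed so that a rare condition in a small region does not expose an individual. The right threshold depends on the dataset and is not specified in the text above.

```python
from collections import Counter


def aggregate_counts(rows, keys, min_cell=3):
    """Count rows per combination of `keys`; suppress cells below `min_cell`.

    Suppressed cells are reported as None instead of their true count.
    """
    counts = Counter(tuple(r[k] for k in keys) for r in rows)
    return {combo: (n if n >= min_cell else None) for combo, n in counts.items()}


# Made-up individual-level rows.
rows = (
    [{"region": "WA", "condition": "asthma"}] * 3
    + [{"region": "WA", "condition": "rare_x"}]
)

table = aggregate_counts(rows, ["region", "condition"], min_cell=3)
print(table)
```

The common condition is reported as a count, while the single rare-condition patient is hidden behind a suppressed cell, which also shows the cost: the rare condition disappears from the analysis.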

Expert Determination also bears on how much granularity can be kept when de-identifying PHI. If a study requires more precise data than Safe Harbor allows, the researcher should remove or modify identifiable information in a way that keeps the risk of re-identification low without compromising the integrity of the data.

For instance, a researcher studying a specific geographic region or medical condition may work with aggregated data, using statistical methods to estimate demographic characteristics and disease prevalence without exposing any individual's records.

In summary, HIPAA provides two paths for de-identifying PHI: Safe Harbor and Expert Determination. Safe Harbor removes a fixed list of identifiers, while Expert Determination relies on statistical analysis, often reducing the granularity of the data, to protect individuals while preserving meaningful insights into healthcare outcomes.

To balance utility and anonymity, researchers use statistical methods to remove identifiable information while preserving valuable insights. They also need to account for the limits of de-identification when working with rare diseases or specific populations.