On July 15, 2024, Singapore’s Personal Data Protection Commission (PDPC) released a Proposed Guide on Synthetic Data Generation (Guide). The Guide, a resource within the Privacy Enhancing Technology (PET) Sandbox, aims to help organisations understand the techniques and potential applications of synthetic data generation, particularly in the context of artificial intelligence (AI), without compromising sensitive data.

In healthcare, where patient privacy and data accessibility are critical concerns, synthetic data presents a promising solution. For instance, it:

  • Preserves privacy: Strict privacy and confidentiality rules are necessary to protect highly sensitive data, but they often make data sharing difficult, slowing down research and innovation. Synthetic data offers a way around these challenges by mimicking the trends of real healthcare data without linking to actual patients. This allows healthcare organisations to avoid the complexities of data sharing agreements and privacy restrictions, thereby accelerating research and development while staying compliant.
  • Improves treatments: By simulating patient data, healthcare providers can evaluate the effectiveness and safety of treatments and tools without compromising patient privacy. This not only improves the accuracy of medical tools but also helps identify potential issues before those tools are used in clinical settings.
  • Enhances research and development: Synthetic data enhances the training and validation of machine learning models, particularly in medical imaging. Where real data is limited, datasets can be augmented with diverse and realistic synthetic images, reducing the cost and labour of annotating real images and allowing researchers to refine their models more effectively.
  • Enables predictive analytics and personalised medicine: Synthetic data is a valuable tool in predictive analytics, which is crucial for personalised medicine. Machine learning models trained on synthetic data can more accurately predict how patients will respond to treatments, leading to more personalised and effective care strategies.
  • Reduces risks of data breaches: By mimicking real data without including personal information, synthetic data offers a secure alternative for organisations relying on data for insights.
  • Expands data use in education and research: Synthetic data’s benefits extend beyond its primary applications. It can be used for educational purposes, allowing students and professionals to practise on realistic scenarios without compromising patient confidentiality.

To learn more about Singapore’s Guide, please see our original blog post on our Privacy World blog.