Publication | Open Access
Synthea: An approach, method, and software mechanism for generating synthetic patients and the synthetic electronic health care record
Year: 2017 | Citations: 399 | References: 16
The scarcity of freely distributable health records and health care's lag in IT and interoperability have hindered innovation, creating a need for synthetic data. The study aims to create a freely available, privacy-safe source of synthetic electronic health records for industrial, research, and educational use. Synthea is an open-source simulation platform that models patient lifespans, common primary-care encounters, and chronic conditions, using a scalable, community-extendable framework to generate realistic, privacy-preserving synthetic EHRs at scale. One million synthetic patient records are freely available in HL7 FHIR and C-CDA formats and are accessible through an HL7 FHIR API.
Our objective is to create a source of synthetic electronic health records that is readily available; suited to industrial, innovation, research, and educational uses; and free of legal, privacy, security, and intellectual property restrictions.

We developed Synthea, an open-source software package that simulates the lifespans of synthetic patients, modeling the 10 most frequent reasons for primary care encounters and the 10 chronic conditions with the highest morbidity in the United States.

Synthea adheres to a previously developed conceptual framework, scales via open-source deployment on the Internet, and may be extended with additional disease and treatment modules developed by its user community. One million synthetic patient records are now freely available online, encoded in standard formats (eg, Health Level-7 [HL7] Fast Healthcare Interoperability Resources [FHIR] and Consolidated-Clinical Document Architecture), and accessible through an HL7 FHIR application program interface.

Health care lags other industries in information technology, data exchange, and interoperability. The lack of freely distributable health records has long hindered innovation in health care. Approaches and tools are available to inexpensively generate synthetic health records at scale without accidental disclosure risk, lowering current barriers to entry for promising early-stage developments. By engaging a growing community of users, the synthetic data generated will become increasingly comprehensive, detailed, and realistic over time.

Synthetic patients can be simulated with models of disease progression and corresponding standards of care to produce risk-free realistic synthetic health care records at scale.
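As a rough illustration of what "simulating a lifespan with a model of disease progression and a corresponding standard of care" can mean, the toy sketch below steps a synthetic patient through one year at a time using a small state machine. It is not Synthea's implementation: the states, the transition probabilities, the care actions, and every name in the code (`TRANSITIONS`, `CARE`, `simulate_patient`, `simulate_population`) are hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass, field

# Hypothetical yearly transition probabilities for a toy chronic-condition
# module. The numbers are illustrative only and are not taken from Synthea.
TRANSITIONS = {
    "healthy": [("healthy", 0.97), ("prediabetes", 0.03)],
    "prediabetes": [("prediabetes", 0.90), ("diabetes", 0.10)],
    "diabetes": [("diabetes", 1.0)],
}

# Hypothetical care action recorded when a patient enters a state.
CARE = {
    "prediabetes": "lifestyle counseling",
    "diabetes": "metformin prescription",
}


@dataclass
class Patient:
    patient_id: int
    state: str = "healthy"
    record: list = field(default_factory=list)  # simplified synthetic record


def next_state(state, rng):
    """Sample the next state from the cumulative transition probabilities."""
    r = rng.random()
    cumulative = 0.0
    for target, probability in TRANSITIONS[state]:
        cumulative += probability
        if r < cumulative:
            return target
    return TRANSITIONS[state][-1][0]  # guard against floating-point rounding


def simulate_patient(patient_id, lifespan_years, rng):
    """Advance one patient year by year, logging each change of condition."""
    patient = Patient(patient_id)
    for age in range(lifespan_years):
        new_state = next_state(patient.state, rng)
        if new_state != patient.state:
            patient.state = new_state
            patient.record.append(
                {"age": age, "condition": new_state, "care": CARE.get(new_state)}
            )
    return patient


def simulate_population(n, lifespan_years=80, seed=0):
    """Generate n synthetic patients; a fixed seed makes runs reproducible."""
    rng = random.Random(seed)
    return [simulate_patient(i, lifespan_years, rng) for i in range(n)]


if __name__ == "__main__":
    for p in simulate_population(3):
        print(p.patient_id, p.state, p.record)
```

Because every record is produced by sampling from a model and not by transforming real patient data, no entry corresponds to an actual person. A realistic generator would also need demographics, encounters, and export to standard formats such as HL7 FHIR, none of which this sketch attempts.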