Publication | Open Access

Design and implementation of a standardized framework to generate and evaluate patient-level prediction models using observational healthcare data

Citations: 216 | References: 11 | Year: 2018

TLDR

The study aims to develop a standardized framework and accompanying open‑source software for building, sharing, and reproducing patient‑level prediction models across observational healthcare databases. The authors propose a five‑step standardized framework—defining the problem, selecting datasets, constructing variables, learning the model, and validating performance—and implement it as open‑source software built on the OMOP Common Data Model to enable model sharing and reproducibility across datasets. In a proof‑of‑concept study, the software produced 84 reproducible prediction models for 21 depression‑related outcomes across four observational databases, all of which are publicly available, demonstrating the framework’s capacity to facilitate model sharing and support extensive external validation.

Abstract

Objective: To develop a conceptual prediction model framework containing standardized steps and describe the corresponding open-source software developed to consistently implement the framework across computational environments and observational healthcare databases to enable model sharing and reproducibility.

Methods: Based on existing best practices, we propose a 5-step standardized framework for: (1) transparently defining the problem; (2) selecting suitable datasets; (3) constructing variables from the observational data; (4) learning the predictive model; and (5) validating the model performance. We implemented this framework as open-source software utilizing the Observational Medical Outcomes Partnership Common Data Model to enable convenient sharing of models and reproduction of model evaluation across multiple observational datasets. The software implementation contains default covariates and classifiers, but the framework enables customization and extension.

Results: As a proof-of-concept, demonstrating the transparency and ease of model dissemination using the software, we developed prediction models for 21 different outcomes within a target population of people suffering from depression across 4 observational databases. All 84 models are available in an accessible online repository to be implemented by anyone with access to an observational database in the Common Data Model format.

Conclusions: The proof-of-concept study illustrates the framework’s ability to develop reproducible models that can be readily shared and offers the potential to perform extensive external validation of models, and improve their likelihood of clinical uptake. In future work, the framework will be applied to perform an “all-by-all” prediction analysis to assess the observational data prediction domain across numerous target populations, outcomes and time, and risk settings.
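As a rough illustration of how the steps in the abstract fit together, the following Python sketch models a problem definition (step 1), pairs it with each database (step 2), and scores predictions with a rank-based AUC (step 5). It is not the authors' software or its API. Every name here (`PredictionProblem`, `enumerate_tasks`, `auc`, the placeholder database and outcome labels) is hypothetical, and the 365-day time-at-risk is an assumed value that the abstract does not state. Steps 3 and 4 (variable construction and model learning) are omitted.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class PredictionProblem:
    """Step 1: a transparent problem definition (field names are illustrative)."""
    target_cohort: str      # population the prediction is made for
    outcome: str            # event being predicted
    time_at_risk_days: int  # assumed window in which the outcome counts


def enumerate_tasks(databases, outcomes, target_cohort, time_at_risk_days):
    """Step 2: pair every database with every outcome's problem definition."""
    return [
        (db, PredictionProblem(target_cohort, outcome, time_at_risk_days))
        for db, outcome in product(databases, outcomes)
    ]


def auc(scores, labels):
    """Step 5: rank-based AUC, the chance a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    # Count pairwise wins; ties contribute half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Placeholder names: the abstract gives only the counts (4 databases, 21 outcomes).
databases = [f"database_{i}" for i in range(1, 5)]
outcomes = [f"outcome_{i}" for i in range(1, 22)]
tasks = enumerate_tasks(databases, outcomes, "depression", 365)
print(len(tasks))  # 4 databases x 21 outcomes = 84 tasks
```

The task count matches the 84 models reported in the results: one model per outcome per database. Keeping the problem definition as an immutable record is one way to make the same specification reusable across databases, which is the property the framework relies on for external validation.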
