Prognostic model research
(clinical prediction models)

Sample size for developing a prognostic model

  • Calculating the sample size required for developing a clinical prediction model (PDF)

  • A note on estimating the Cox‐Snell R² from a reported C statistic (AUROC) to inform sample size calculations for developing a prediction model with a binary outcome (PDF)

  • Minimum sample size for developing a multivariable prediction model: PART I ‐ continuous outcomes (PDF)

  • Minimum sample size for developing a multivariable prediction model: PART II ‐ binary & time‐to‐event outcomes (PDF)

  • Sample size for binary logistic prediction models: Beyond events per variable criteria (PDF)

  • No rationale for 1 variable per 10 events criterion for binary logistic regression analysis (PDF)

  • The problems with using a split-sample for model development and validation (blog)

  • Adaptive sample size determination for the development of clinical prediction models (PDF)

  • How Can Machine Learning be Reliable When the Sample is Adequate for Only One Feature? (blog)

  • Modern Modelling Techniques Are Data Hungry: A Simulation Study for Predicting Dichotomous Endpoints (PDF)

  • To tune or not to tune, a case study of ridge logistic regression in small or sparse datasets (PDF)

  • Developing clinical prediction models when adhering to minimum sample size recommendations: The importance of quantifying bootstrap variability in tuning parameters and predictive performance (PDF)

  • Impact of sample size on the stability of risk scores from clinical prediction models: a case study in cardiovascular disease (PDF)

Videos on sample size for model development & the pmsampsize software are available here

"Why the EPV ≥ 10 sample size rule is rubbish and what to use instead" - slides by Dr Maarten van Smeden available here

Sample size for external validation of a prognostic model

  • Minimum sample size calculations for external validation of a clinical prediction model with a time-to-event outcome (PDF)

  • Minimum sample size for external validation of a clinical prediction model with a binary outcome (PDF)

  • External validation of clinical prediction models: simulation-based sample size calculations were more reliable than rules of thumb (PDF)

  • Minimum sample size for external validation of a clinical prediction model with a continuous outcome (PDF)

(also see corresponding videos here)

  • Sample size considerations for the external validation of a multivariable prognostic model: a resampling study (PDF)

  • A calibration hierarchy for risk models was defined: from utopia to empirical data (PDF)

  • Substantial effective sample sizes were required for external validation studies of predictive logistic regression models (PDF)

  • Discrimination-based sample size calculations for multivariable prognostic models for time-to-event data (PDF)

  • Estimation of required sample size for external validation of risk models for binary outcomes (PDF)

Stages of prognostic model research

  • Prognosis Research Strategy (PROGRESS) 3: Prognostic Model Research (PDF)

  • Prognosis and prognostic research: what, why, and how? (PDF)

  • Prognosis and prognostic research: developing a prognostic model (PDF)

  • Prognosis and prognostic research: Validating a prognostic model (PDF)

  • Prognosis and prognostic research: application and impact of prognostic models in clinical practice (PDF)

  • Guide to presenting clinical prediction models for use in clinical settings (PDF)

  • Presentation of multivariate data for clinical use: The Framingham Study risk score functions (PDF)

Improving prognostic model research

  • Key steps and common pitfalls in developing and validating risk models (PDF)

  • Clinical prediction models to predict the risk of multiple binary outcomes: a comparison of approaches (PDF)

  • Clinical prediction models: diagnosis versus prognosis (PDF)

  • Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, ... (PDF)

  • Towards better clinical prediction models: seven steps for development and an ABCD for validation (PDF)

  • Everything you always wanted to know about evaluating prediction models (but were too afraid to ask) (PDF)

  • Variable selection – A review and recommendations for the practicing statistician (PDF)

  • State of the art in selection of variables and functional forms in multivariable analysis—outstanding issues (PDF)

  • Harnessing repeated measurements of predictor variables for clinical risk prediction: a review of existing methods (PDF)

Notes of caution:

  • Penalization and shrinkage methods produced unreliable clinical prediction models especially when sample size was small (PDF) (also see video here)

  • Inappropriate use of bivariable analysis to screen risk factors for use in multivariable analysis (PDF)

  • Poor performance of clinical prediction models: the harm of commonly applied methods (PDF)

  • Regression shrinkage methods for clinical prediction models do not guarantee improved performance ... (PDF)

  • Three myths about risk thresholds for prediction models (PDF)

  • Quantifying the impact of different approaches for handling continuous predictors on the performance of a prognostic model (PDF)

  • Calibration of clinical prediction rules does not just assess bias (PDF)

  • The harm of class imbalance corrections for risk prediction models: illustration and simulation using logistic regression (PDF)

  • Fine‐Gray subdistribution hazard models to simultaneously estimate the absolute risk of different event types: Cumulative total failure probability may exceed 1 (PDF)

Video on controversies in prediction modelling using statistical methods and machine learning available here

Video on COVID-19 related prediction models available here

Video on categorisation of continuous variables (and why not to do it!) available here

Video on the issue of treatment in prediction models available here

Evaluating the performance of a prognostic model

  • A calibration hierarchy for risk models was defined: from utopia to empirical data (PDF)

  • Calibration: the Achilles heel of predictive analytics (PDF)

  • Internal validation of predictive models: efficiency of some procedures for logistic regression analysis (PDF)

  • Prediction models need appropriate internal, internal-external, and external validation (PDF)

  • Construction and validation of a prognostic model across several studies ... (PDF)

  • Re-evaluation of the comparative effectiveness of bootstrap-based optimism correction methods in the development of multivariable clinical prediction models (PDF)

  • Assessment of predictive performance in incomplete data by combining internal validation & multiple imputation (PDF)

  • Validation and updating of risk models based on multinomial logistic regression (PDF)

  • Assessing calibration of multinomial risk prediction models (PDF)

  • A spline-based tool to assess and visualize the calibration of multiclass risk predictions (PDF)

  • Risk prediction models for discrete ordinal outcomes: Calibration and the impact of the proportional odds assumption (PDF)

  • External validation of a Cox prognostic model: principles and methods (PDF)

  • Assessing performance and clinical usefulness in prediction models with survival outcomes: practical guidance for Cox proportional hazards models (PDF)

  • Lessons learnt when accounting for competing events in the external validation of time-to-event prognostic models (PDF)

  • Tools for checking calibration of a Cox model in external validation: Approach based on individual event probabilities (PDF)

  • Graphical calibration curves and the integrated calibration index (ICI) for survival models (PDF)

  • External validation of clinical prediction models using big datasets from e-health records or IPD meta-analysis (PDF)

  • Net benefit approaches to the evaluation of prediction models, molecular markers, and diagnostic tests (PDF)

  • Extensions to decision curve analysis, a novel method for evaluating diagnostic tests, prediction models and molecular markers (PDF)

  • A simple, step-by-step guide to interpreting decision curve analysis (PDF)

  • Calibration of risk prediction models: impact on decision-analytic performance (PDF)

Improving prognostic survival models

  • Temporal recalibration for improving prognostic model development and risk predictions ... (PDF)

  • Prognostic Models With Competing Risks: Methods and Application to Coronary Risk Prediction (PDF)

  • Validation, calibration, revision and combination of prognostic survival models (PDF)

  • Flexible Parametric Survival Analysis Using Stata: Beyond the Cox Model (PDF)

  • Dynamic models to predict health outcomes: current status and methodological challenges (PDF)

  • Estimation of the Absolute Risk of Cardiovascular Disease and Other Events: Issues With the Use of Multiple Fine-Gray Subdistribution Hazard Models (PDF)

Counterfactual prediction models

  • A scoping review of causal methods enabling predictions under hypothetical interventions (PDF)

  • Treatment Drop-in—Making the Case for Causal Prediction (PDF)

  • Causal inference and counterfactual prediction in machine learning for actionable healthcare (PDF)

  • Explicit causal reasoning is needed to prevent prognostic models being victims of their own success (PDF)

  • Using marginal structural models to adjust for treatment drop-in when developing clinical prediction models (PDF)

VIDEOS: see here for lectures on counterfactual prediction and treatment
