

Regression Modeling for Individualized Cancer Risk Prediction Among American Adults

F Guo

F Guo*, J Deng, D Roffman, Yale New Haven Hospital, New Haven, CT


SU-I-GPD-T-92 (Sunday, July 30, 2017) 3:00 PM - 6:00 PM Room: Exhibit Hall

Purpose: As the burden of cancer continues to grow, accurate and easy-to-implement cancer risk predictive models are needed to improve early cancer detection and prevention. The goal of this study is to build a multivariate logistic regression model for individualized cancer risk prediction based on a large cohort of American adults and assess its efficacy.

Methods: Our data are drawn from the National Health Interview Survey (NHIS) Sample Adult Files for 1997-2015, available on the CDC website. We included survey responses from 555,183 individuals in our analyses. A literature review was conducted to identify major cancer risk factors, including sex, race, alcohol consumption, smoking status, body mass index (BMI), age, family history of cancer, diabetes status, and cardiovascular disease status. Logistic regression was used to model the risk of cancer given the status of these risk factors. Backward elimination was used to select the best-fitting model. A receiver operating characteristic (ROC) curve was plotted to assess the goodness of model fit.
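The two core steps described above, fitting a logistic regression and scoring it by the area under the ROC curve, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the two predictors (`smoker`, standardized `age`) and their generating coefficients are invented for the example, the fit uses plain gradient ascent rather than a statistics package, and backward elimination is not implemented.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit intercept and coefficients by batch gradient ascent on the log-likelihood."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = yi - sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict_proba(w, xi):
    # Predicted probability of the outcome for one row of predictors.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))


def roc_auc(y_true, scores):
    """AUC via the rank-sum identity: the chance that a random positive
    case scores higher than a random negative case (ties get average ranks)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rank_sum = sum(r for r, t in zip(ranks, y_true) if t == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


# Synthetic cohort (hypothetical; NOT the NHIS data used in the study).
random.seed(0)
X, y = [], []
for _ in range(2000):
    smoker = random.randint(0, 1)
    age = random.uniform(-1.0, 1.0)  # standardized age
    p_true = sigmoid(-1.0 + 0.8 * smoker + 1.2 * age)  # invented coefficients
    X.append([smoker, age])
    y.append(1 if random.random() < p_true else 0)

w = fit_logistic(X, y)
auc = roc_auc(y, [predict_proba(w, xi) for xi in X])
print("coefficients:", [round(v, 3) for v in w])
print("AUC:", round(auc, 4))
```

In practice a statistics package would also report standard errors and p-values, which a backward-elimination procedure needs in order to decide which predictor to drop at each step.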

Results: Our multivariate logistic regression indicates that female sex, white race, non-Hispanic ethnicity, alcohol consumption, smoking, underweight, older age, family history of cancer, diabetes, and cardiovascular disease are positively associated with cancer risk, with odds ratio estimates ranging from 1.06 to 2.15 (p<0.05). The effects of overweight and obesity on cancer risk are not statistically significant. The area under the ROC curve (AUC) reached 0.8018 for our model, substantially higher than the AUC values of 0.54-0.66 often observed in radiomic biomarker studies.
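To make the odds ratio range concrete, the sketch below converts an odds ratio to its logistic regression coefficient (the natural log of the odds ratio) and shows how it shifts a predicted probability. The baseline risk of 0.05 is a hypothetical value chosen for illustration and does not come from the study; only the endpoints 1.06 and 2.15 are taken from the reported range.

```python
import math


def odds_ratio_to_coef(odds_ratio):
    # In logistic regression, a coefficient is the natural log of the odds ratio.
    return math.log(odds_ratio)


def apply_odds_ratio(baseline_prob, odds_ratio):
    # Convert probability to odds, scale by the odds ratio, convert back.
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


baseline = 0.05  # hypothetical baseline risk, NOT a figure from the abstract
for or_value in (1.06, 2.15):  # endpoints of the reported odds ratio range
    coef = odds_ratio_to_coef(or_value)
    risk = apply_odds_ratio(baseline, or_value)
    print(f"OR={or_value}: coefficient={coef:.3f}, risk with factor={risk:.4f}")
```

Because odds ratios multiply odds rather than probabilities, the same odds ratio produces a different absolute change in risk depending on the baseline.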

Conclusion: This regression model is simple to implement and relatively robust in predicting cancer risk for individuals. With easy-to-obtain individual information as input, the model can be used by clinicians and public health professionals to estimate personal cancer risk, which would help identify high-risk individuals for targeted cancer prevention and early screening interventions.
