Date of Award
2016
Degree Type
Thesis
Degree Name
Master of Science in Statistics
Department
Computer Science and Statistics
First Advisor
Steffen Ventz
Abstract
Sparse regression models are an active area of statistical learning research. A subset of these models seeks to separate significant, non-trivial main effects from noise effects within the regression framework by imposing penalty terms on a likelihood-based estimator, yielding so-called “sparse” coefficient estimates in which many estimated effects are zero. As this area of the field is relatively recent, many published techniques have not yet been investigated across a wide range of applications. Our goal is to fit several penalty-based estimators for the Cox semiparametric survival model, using genomic covariates on breast cancer survival data where there are potentially many more covariates than observations. We use the elastic net family of estimators, special cases of which are the LASSO and ridge regression. Simultaneously, we aim to investigate whether the finer resolution of next-generation genetic sequencing techniques improves the predictive power of the breast cancer patient survival models. Models are compared using estimates of concordance, namely the c-statistic and a variant which we refer to as Uno’s C. We find that ridge regression models best fit our dataset. Concordance estimates suggest that finer-resolution genetic covariates improve model predictions, though further work with more observations is required.
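As a minimal illustration of how the LASSO and ridge regression arise as special cases of the elastic net, the sketch below evaluates an elastic net penalty on a small coefficient vector. The parametrization (a mixing weight `alpha` and an overall strength `lam`, with a factor of one half on the squared term) is a common convention assumed here, not one taken from the thesis, and the function name and example values are hypothetical.

```python
def elastic_net_penalty(beta, lam, alpha):
    """Elastic net penalty under an assumed, commonly used parametrization:
    lam * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2).
    alpha = 1 gives the LASSO penalty; alpha = 0 gives the ridge penalty."""
    l1 = sum(abs(b) for b in beta)      # sum of absolute coefficients
    l2_sq = sum(b * b for b in beta)    # sum of squared coefficients
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_sq)


# Hypothetical coefficient vector, for illustration only.
beta = [0.5, -1.0, 0.0, 2.0]

lasso = elastic_net_penalty(beta, lam=1.0, alpha=1.0)   # pure L1 penalty
ridge = elastic_net_penalty(beta, lam=1.0, alpha=0.0)   # pure squared-L2 penalty
mixed = elastic_net_penalty(beta, lam=1.0, alpha=0.5)   # equal blend of the two

print(lasso, ridge, mixed)
```

In a penalized Cox model, a term of this kind would be subtracted from the log partial likelihood before maximizing over the coefficients; the L1 part is what can set coefficients exactly to zero.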
Recommended Citation
Amin, Daven, "Risk Classification in High Dimensional Survival Models" (2016). Open Access Master's Theses. Paper 958.
https://digitalcommons.uri.edu/theses/958
Terms of Use
All rights reserved under copyright.