E-Thesis 497 views 342 downloads
Time-Series Embedded Feature Selection Using Deep Learning: Data Mining Electronic Health Records for Novel Biomarkers / GAVIN TSANG
Swansea University Author: GAVIN TSANG
DOI (Published version): 10.23889/SUthesis.61814
Abstract
As health information technologies continue to advance, routine collection and digitisation of patient health records in the form of electronic health records present as an ideal opportunity for data-mining and exploratory analysis of biomarkers and risk factors indicative of a potentially diverse d...
Published: |
Swansea
2022
|
---|---|
Institution: | Swansea University |
Degree level: | Doctoral |
Degree name: | Ph.D |
Supervisor: | Xie, Xianghua |
URI: | https://cronfa.swan.ac.uk/Record/cronfa61814 |
Abstract: |
As health information technologies continue to advance, routine collection and digitisation of patient health records in the form of electronic health records present as an ideal opportunity for data-mining and exploratory analysis of biomarkers and risk factors indicative of a potentially diverse domain of patient outcomes. Patient records have continually become more widely available through various initiatives enabling open access whilst maintaining critical patient privacy. In spite of such progress, health records remain not widely adopted within the current clinical statistical analysis domain due to challenging issues derived from such “big data”.Deep learning based temporal modelling approaches present an ideal solution to health record challenges through automated self-optimisation of representation learning, able to man-ageably compose the high-dimensional domain of patient records into data representations able to model complex data associations. Such representations can serve to condense and reduce dimensionality to emphasise feature sparsity and importance through novel embedded feature selection approaches. Accordingly, application towards patient records enable complex mod-elling and analysis of the full domain of clinical features to select biomarkers of predictive relevance.Firstly, we propose a novel entropy regularised neural network ensemble able to highlight risk factors associated with hospitalisation risk of individuals with dementia. The application of which, was able to reduce a large domain of unique medical events to a small set of relevant risk factors able to maintain hospitalisation discrimination.Following on, we continue our work on ensemble architecture approaches with a novel cas-cading LSTM ensembles to predict severe sepsis onset within critical patients in an ICU critical care centre. We demonstrate state-of-the-art performance capabilities able to outperform that of current related literature.Finally, we propose a novel embedded feature selection application dubbed 1D convolu-tion feature selection using sparsity regularisation. Said methodology was evaluated on both domains of dementia and sepsis prediction objectives to highlight model capability and generalisability. We further report a selection of potential biomarkers for the aforementioned case study objectives highlighting clinical relevance and potential novelty value for future clinical analysis.Accordingly, we demonstrate the effective capability of embedded feature selection ap-proaches through the application of temporal based deep learning architectures in the discovery of effective biomarkers across a variety of challenging clinical applications. |
---|---|
Item Description: |
ORCiD identifier: https://orcid.org/0000-0002-2035-1452 |
Keywords: |
Machine Learning, Feature Selection, Electronic Health Record, Deep Learning, Sepsis, Dementia |
College: |
Faculty of Science and Engineering |