E-Thesis 497 views 342 downloads
Time-Series Embedded Feature Selection Using Deep Learning: Data Mining Electronic Health Records for Novel Biomarkers / GAVIN TSANG
Swansea University Author: GAVIN TSANG
DOI (Published version): 10.23889/SUthesis.61814
Abstract
As health information technologies continue to advance, routine collection and digitisation of patient health records in the form of electronic health records present as an ideal opportunity for data-mining and exploratory analysis of biomarkers and risk factors indicative of a potentially diverse d...
Published: | Swansea 2022 |
---|---|
Institution: | Swansea University |
Degree level: | Doctoral |
Degree name: | Ph.D |
Supervisor: | Xie, Xianghua |
URI: | https://cronfa.swan.ac.uk/Record/cronfa61814 |
first_indexed |
2022-11-08T11:35:23Z |
---|---|
last_indexed |
2023-01-13T19:22:49Z |
id |
cronfa61814 |
recordtype |
RisThesis |
fullrecord |
<?xml version="1.0"?><rfc1807><datestamp>2022-11-08T11:44:39.3198379</datestamp><bib-version>v2</bib-version><id>61814</id><entry>2022-11-08</entry><title>Time-Series Embedded Feature Selection Using Deep Learning: Data Mining Electronic Health Records for Novel Biomarkers</title><swanseaauthors><author><sid>35ba5aa06ef4ebb54bfac247a47c1022</sid><firstname>GAVIN</firstname><surname>TSANG</surname><name>GAVIN TSANG</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2022-11-08</date><abstract>As health information technologies continue to advance, routine collection and digitisation of patient health records in the form of electronic health records present as an ideal opportunity for data-mining and exploratory analysis of biomarkers and risk factors indicative of a potentially diverse domain of patient outcomes. Patient records have continually become more widely available through various initiatives enabling open access whilst maintaining critical patient privacy. In spite of such progress, health records remain not widely adopted within the current clinical statistical analysis domain due to challenging issues derived from such “big data”. Deep learning based temporal modelling approaches present an ideal solution to health record challenges through automated self-optimisation of representation learning, able to manageably compose the high-dimensional domain of patient records into data representations able to model complex data associations. Such representations can serve to condense and reduce dimensionality to emphasise feature sparsity and importance through novel embedded feature selection approaches.
Accordingly, application towards patient records enable complex modelling and analysis of the full domain of clinical features to select biomarkers of predictive relevance. Firstly, we propose a novel entropy regularised neural network ensemble able to highlight risk factors associated with hospitalisation risk of individuals with dementia. The application of which, was able to reduce a large domain of unique medical events to a small set of relevant risk factors able to maintain hospitalisation discrimination. Following on, we continue our work on ensemble architecture approaches with a novel cascading LSTM ensembles to predict severe sepsis onset within critical patients in an ICU critical care centre. We demonstrate state-of-the-art performance capabilities able to outperform that of current related literature. Finally, we propose a novel embedded feature selection application dubbed 1D convolution feature selection using sparsity regularisation. Said methodology was evaluated on both domains of dementia and sepsis prediction objectives to highlight model capability and generalisability.
We further report a selection of potential biomarkers for the aforementioned case study objectives highlighting clinical relevance and potential novelty value for future clinical analysis. Accordingly, we demonstrate the effective capability of embedded feature selection approaches through the application of temporal based deep learning architectures in the discovery of effective biomarkers across a variety of challenging clinical applications.</abstract><type>E-Thesis</type><journal/><volume/><journalNumber/><paginationStart/><paginationEnd/><publisher/><placeOfPublication>Swansea</placeOfPublication><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic/><keywords>Machine Learning, Feature Selection, Electronic Health Record, Deep Learning, Sepsis, Dementia</keywords><publishedDay>7</publishedDay><publishedMonth>11</publishedMonth><publishedYear>2022</publishedYear><publishedDate>2022-11-07</publishedDate><doi>10.23889/SUthesis.61814</doi><url/><notes>ORCiD identifier: https://orcid.org/0000-0002-2035-1452</notes><college>COLLEGE NANME</college><CollegeCode>COLLEGE CODE</CollegeCode><institution>Swansea University</institution><supervisor>Xie, Xianghua</supervisor><degreelevel>Doctoral</degreelevel><degreename>Ph.D</degreename><degreesponsorsfunders>EPSRC (EP/N028139/1)</degreesponsorsfunders><apcterm/><funders/><projectreference/><lastEdited>2022-11-08T11:44:39.3198379</lastEdited><Created>2022-11-08T11:31:55.3132942</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer
Science</level></path><authors><author><firstname>GAVIN</firstname><surname>TSANG</surname><order>1</order></author></authors><documents><document><filename>61814__25688__4674b8f75f814726ad5879fe11c7e2e7.pdf</filename><originalFilename>Tsang_Gavin_PhD_Thesis_Final_Redacted_Signature.pdf</originalFilename><uploaded>2022-11-08T11:42:00.7407336</uploaded><type>Output</type><contentLength>1670287</contentLength><contentType>application/pdf</contentType><version>E-Thesis – open access</version><cronfaStatus>true</cronfaStatus><documentNotes>Copyright: The author, Gavin Tsang, 2022.</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language></document></documents><OutputDurs/></rfc1807> |
title |
Time-Series Embedded Feature Selection Using Deep Learning: Data Mining Electronic Health Records for Novel Biomarkers |
author_id_str_mv |
35ba5aa06ef4ebb54bfac247a47c1022 |
author_id_fullname_str_mv |
35ba5aa06ef4ebb54bfac247a47c1022_***_GAVIN TSANG |
author |
GAVIN TSANG |
author2 |
GAVIN TSANG |
format |
E-Thesis |
publishDate |
2022 |
institution |
Swansea University |
doi_str_mv |
10.23889/SUthesis.61814 |
college_str |
Faculty of Science and Engineering |
hierarchytype |
|
hierarchy_top_id |
facultyofscienceandengineering |
hierarchy_top_title |
Faculty of Science and Engineering |
hierarchy_parent_id |
facultyofscienceandengineering |
hierarchy_parent_title |
Faculty of Science and Engineering |
department_str |
School of Mathematics and Computer Science - Computer Science{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Mathematics and Computer Science - Computer Science |
document_store_str |
1 |
active_str |
0 |
description |
As health information technologies continue to advance, routine collection and digitisation of patient health records in the form of electronic health records present as an ideal opportunity for data-mining and exploratory analysis of biomarkers and risk factors indicative of a potentially diverse domain of patient outcomes. Patient records have continually become more widely available through various initiatives enabling open access whilst maintaining critical patient privacy. In spite of such progress, health records remain not widely adopted within the current clinical statistical analysis domain due to challenging issues derived from such “big data”. Deep learning based temporal modelling approaches present an ideal solution to health record challenges through automated self-optimisation of representation learning, able to manageably compose the high-dimensional domain of patient records into data representations able to model complex data associations. Such representations can serve to condense and reduce dimensionality to emphasise feature sparsity and importance through novel embedded feature selection approaches. Accordingly, application towards patient records enable complex modelling and analysis of the full domain of clinical features to select biomarkers of predictive relevance. Firstly, we propose a novel entropy regularised neural network ensemble able to highlight risk factors associated with hospitalisation risk of individuals with dementia. The application of which, was able to reduce a large domain of unique medical events to a small set of relevant risk factors able to maintain hospitalisation discrimination. Following on, we continue our work on ensemble architecture approaches with a novel cascading LSTM ensembles to predict severe sepsis onset within critical patients in an ICU critical care centre.
We demonstrate state-of-the-art performance capabilities able to outperform that of current related literature. Finally, we propose a novel embedded feature selection application dubbed 1D convolution feature selection using sparsity regularisation. Said methodology was evaluated on both domains of dementia and sepsis prediction objectives to highlight model capability and generalisability. We further report a selection of potential biomarkers for the aforementioned case study objectives highlighting clinical relevance and potential novelty value for future clinical analysis. Accordingly, we demonstrate the effective capability of embedded feature selection approaches through the application of temporal based deep learning architectures in the discovery of effective biomarkers across a variety of challenging clinical applications. |
published_date |
2022-11-07T05:21:04Z |
_version_ |
1821381612419416064 |
score |
11.04748 |