E-Thesis

A platform methodology for optimised cancer detection from Raman spectroscopy of human blood serum / FREYA WOODS

Swansea University Author: FREYA WOODS

  • E-Thesis – open access under embargo until: 22nd June 2027

DOI (Published version): 10.23889/SUthesis.60374

Abstract

Cancer remains the most lethal condition in the world, accounting for 10 million deaths worldwide, i.e. 1/6 of all deaths. Currently, cancer detection is primarily through symptomatic routes, wherein patients present with serious symptoms and undergo imaging and subsequently biopsy of suspected cance...

Published: Swansea 2022
Institution: Swansea University
Degree level: Doctoral
Degree name: Ph.D
Supervisor: Dunstan, Peter R. ; Harris, Dean A.
URI: https://cronfa.swan.ac.uk/Record/cronfa60374
first_indexed 2022-07-04T11:15:58Z
last_indexed 2023-01-13T19:20:27Z
id cronfa60374
recordtype RisThesis
fullrecord <?xml version="1.0"?><rfc1807><datestamp>2022-07-04T12:28:24.4288551</datestamp><bib-version>v2</bib-version><id>60374</id><entry>2022-07-04</entry><title>A platform methodology for optimised cancer detection from Raman spectroscopy of human blood serum</title><swanseaauthors><author><sid>63cd8479c64e3970dc1c124789dc8cd0</sid><firstname>FREYA</firstname><surname>WOODS</surname><name>FREYA WOODS</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2022-07-04</date><abstract>Cancer remains the most lethal condition in the world, accounting for 10 million deaths worldwide, i.e. 1/6 of all deaths. Currently, cancer detection is primarily through symptomatic routes, wherein patients present with serious symptoms and undergo imaging and subsequently biopsy of suspected cancer growths for histopathological confirmation of cancer. A diagnostic tool based on serum analysis could revolutionise current cancer pathways. Raman spectroscopy offers the ability to measure a complex biochemical fingerprint of a sample through vibrational energy shifts. Numerous Raman studies of biological samples (serum, plasma, tissue, cellular) exist, showing promising results for the detection of cancer. However, these studies are typically limited to proof-of-concept and stop short of larger-scale studies or of considering how these methods might interrupt current clinical pathways. Presented in this thesis is a methodology for the optimisation of cancer detection from Raman spectroscopy of human blood serum. The first results chapters show tools in R for the pre-processing of Raman spectra using a variety of techniques. This is followed by an application for quality control of Raman spectra, with a view to the safety nets required for tools to integrate into a clinical setting. These tools are then utilised for the task of optimising pre-processing specifically for colorectal cancer detection with human blood serum spectra using high-performance computing (HPC). In total, 2.4 million different pre-processing permutations are trialled. This methodology saw an improvement in diagnostic abilities, with sensitivity increasing by 14.6%, specificity increasing by 6.9%, positive predictive value (PPV) increasing by 3.4%, and negative predictive value increasing by 2.4% when compared to a standard pre-processing optimisation. A similar methodology using HPC is then utilised in chapter 7 to optimise machine learning algorithm selection and feature reduction for colorectal cancer detection from serum spectra. The feature reduction methods principal component analysis (PCA), factor analysis (FA), ElasticNet (EN), and random forest feature selection (RFFS), combined with the model types k-nearest neighbours (KNN), logistic regression (LR), support vector machines (SVM), and random forest (RF), are trialled. Traditional feature reduction methods such as PCA and FA were found to perform poorly compared to EN and RFFS. In addition, the model types SVM and RF outperform LR and KNN. This chapter also shows results from applying artificial neural network architectures for colorectal cancer detection, finding that linear machine learning methods (SVM/RF) outperform neural networks. Finally, the last results chapter presents the culmination of these optimised methods applied to building machine learning models for the detection of several cancer types: breast, pancreatic, lung, and colorectal cancer. The models achieved 90.9% sensitivity and 77.3% specificity for pancreatic cancer vs. controls, 90.3% sensitivity and 64.5% specificity for lung cancer vs. controls, 92.1% sensitivity and 65.8% specificity for breast cancer vs. controls, and finally 91.3% sensitivity and 44.0% specificity for colorectal cancer vs. controls, where models are thresholded to achieve a minimum of 90% sensitivity. This chapter also focuses on differences in metabolite profile from the Raman spectra between different cancer types and aims to elucidate biochemical linkages.</abstract><type>E-Thesis</type><journal/><volume/><journalNumber/><paginationStart/><paginationEnd/><publisher/><placeOfPublication>Swansea</placeOfPublication><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic/><keywords>Raman spectroscopy, cancer detection, machine learning</keywords><publishedDay>22</publishedDay><publishedMonth>6</publishedMonth><publishedYear>2022</publishedYear><publishedDate>2022-06-22</publishedDate><doi>10.23889/SUthesis.60374</doi><url/><notes>ORCiD identifier: https://orcid.org/0000-0001-9412-0967</notes><college>COLLEGE NANME</college><CollegeCode>COLLEGE CODE</CollegeCode><institution>Swansea University</institution><supervisor>Dunstan, Peter R. ; Harris, Dean A.</supervisor><degreelevel>Doctoral</degreelevel><degreename>Ph.D</degreename><degreesponsorsfunders>Cancer Research Wales; Research grant number: Raman-Usc-Crc - Cr Wales - Peter Dunstan</degreesponsorsfunders><apcterm/><lastEdited>2022-07-04T12:28:24.4288551</lastEdited><Created>2022-07-04T12:12:16.9347236</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Biosciences, Geography and Physics - Physics</level></path><authors><author><firstname>FREYA</firstname><surname>WOODS</surname><order>1</order></author></authors><documents><document><filename>Under embargo</filename><originalFilename>Under embargo</originalFilename><uploaded>2022-07-04T12:20:48.4492194</uploaded><type>Output</type><contentLength>23456000</contentLength><contentType>application/pdf</contentType><version>E-Thesis &#x2013; open access</version><cronfaStatus>true</cronfaStatus><embargoDate>2027-06-22T00:00:00.0000000</embargoDate><documentNotes>Copyright: The author, Freya E. R. Woods, 2022.</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language></document></documents><OutputDurs/></rfc1807>
title A platform methodology for optimised cancer detection from Raman spectroscopy of human blood serum
author_id_str_mv 63cd8479c64e3970dc1c124789dc8cd0
author_id_fullname_str_mv 63cd8479c64e3970dc1c124789dc8cd0_***_FREYA WOODS
author FREYA WOODS
author2 FREYA WOODS
format E-Thesis
publishDate 2022
institution Swansea University
doi_str_mv 10.23889/SUthesis.60374
college_str Faculty of Science and Engineering
hierarchytype
hierarchy_top_id facultyofscienceandengineering
hierarchy_top_title Faculty of Science and Engineering
hierarchy_parent_id facultyofscienceandengineering
hierarchy_parent_title Faculty of Science and Engineering
department_str School of Biosciences, Geography and Physics - Physics{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Biosciences, Geography and Physics - Physics
document_store_str 0
active_str 0
description Cancer remains the most lethal condition in the world, accounting for 10 million deaths worldwide, i.e. 1/6 of all deaths. Currently, cancer detection is primarily through symptomatic routes, wherein patients present with serious symptoms and undergo imaging and subsequently biopsy of suspected cancer growths for histopathological confirmation of cancer. A diagnostic tool based on serum analysis could revolutionise current cancer pathways. Raman spectroscopy offers the ability to measure a complex biochemical fingerprint of a sample through vibrational energy shifts. Numerous Raman studies of biological samples (serum, plasma, tissue, cellular) exist, showing promising results for the detection of cancer. However, these studies are typically limited to proof-of-concept and stop short of larger-scale studies or of considering how these methods might interrupt current clinical pathways. Presented in this thesis is a methodology for the optimisation of cancer detection from Raman spectroscopy of human blood serum. The first results chapters show tools in R for the pre-processing of Raman spectra using a variety of techniques. This is followed by an application for quality control of Raman spectra, with a view to the safety nets required for tools to integrate into a clinical setting. These tools are then utilised for the task of optimising pre-processing specifically for colorectal cancer detection with human blood serum spectra using high-performance computing (HPC). In total, 2.4 million different pre-processing permutations are trialled. This methodology saw an improvement in diagnostic abilities, with sensitivity increasing by 14.6%, specificity increasing by 6.9%, positive predictive value (PPV) increasing by 3.4%, and negative predictive value increasing by 2.4% when compared to a standard pre-processing optimisation. A similar methodology using HPC is then utilised in chapter 7 to optimise machine learning algorithm selection and feature reduction for colorectal cancer detection from serum spectra. The feature reduction methods principal component analysis (PCA), factor analysis (FA), ElasticNet (EN), and random forest feature selection (RFFS), combined with the model types k-nearest neighbours (KNN), logistic regression (LR), support vector machines (SVM), and random forest (RF), are trialled. Traditional feature reduction methods such as PCA and FA were found to perform poorly compared to EN and RFFS. In addition, the model types SVM and RF outperform LR and KNN. This chapter also shows results from applying artificial neural network architectures for colorectal cancer detection, finding that linear machine learning methods (SVM/RF) outperform neural networks. Finally, the last results chapter presents the culmination of these optimised methods applied to building machine learning models for the detection of several cancer types: breast, pancreatic, lung, and colorectal cancer. The models achieved 90.9% sensitivity and 77.3% specificity for pancreatic cancer vs. controls, 90.3% sensitivity and 64.5% specificity for lung cancer vs. controls, 92.1% sensitivity and 65.8% specificity for breast cancer vs. controls, and finally 91.3% sensitivity and 44.0% specificity for colorectal cancer vs. controls, where models are thresholded to achieve a minimum of 90% sensitivity. This chapter also focuses on differences in metabolite profile from the Raman spectra between different cancer types and aims to elucidate biochemical linkages.
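The abstract reports sensitivity, specificity, PPV and negative predictive value for models "thresholded to achieve a minimum of 90% sensitivity". The following is a minimal Python sketch of how such a threshold could be chosen from classifier scores. It is not the thesis's code (the abstract mentions tools in R, and the thesis is under embargo). The function names, the convention that a sample is called positive when its score is at or above the threshold, and the toy scores are all assumptions made purely for illustration.

```python
def threshold_for_min_sensitivity(scores, labels, min_sensitivity=0.90):
    """Return the highest threshold whose sensitivity is >= min_sensitivity.

    Assumed convention: label 1 = cancer, 0 = control, and a sample is
    predicted positive when score >= threshold.
    """
    positive_scores = sorted(
        (s for s, y in zip(scores, labels) if y == 1), reverse=True
    )
    if not positive_scores:
        raise ValueError("need at least one positive (cancer) sample")
    # Lowering the threshold to the k-th highest positive score detects
    # at least k positives; stop at the first k that meets the target.
    for detected, candidate in enumerate(positive_scores, start=1):
        if detected / len(positive_scores) >= min_sensitivity:
            return candidate
    return positive_scores[-1]


def _ratio(numerator, denominator):
    """Divide safely; return NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(scores, labels, threshold):
    """Compute sensitivity, specificity, PPV and NPV at a given threshold."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


# Toy data (made up, not from the thesis): 10 cancer and 10 control scores.
toy_scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65, 0.60, 0.40, 0.20,
              0.70, 0.55, 0.50, 0.45, 0.35, 0.30, 0.25, 0.15, 0.10, 0.05]
toy_labels = [1] * 10 + [0] * 10

toy_threshold = threshold_for_min_sensitivity(toy_scores, toy_labels, 0.90)
print(toy_threshold, diagnostic_metrics(toy_scores, toy_labels, toy_threshold))
```

Fixing sensitivity first and reading off specificity afterwards mirrors the way the results are stated in the abstract, where every model sits just above 90% sensitivity and specificity varies by cancer type. In practice the threshold would be chosen on held-out data rather than on the samples used to report the final metrics.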
published_date 2022-06-22T14:20:54Z
_version_ 1821415576015208448
score 11.048107