E-Thesis
Using Novel Data Types for Big Data Research in Epilepsy: Patient Records, Clinic Letters and Genetic Mutation / Arron S. Lacey
Swansea University Author: Arron S. Lacey
PDF | E-Thesis – open access
DOI (Published version): 10.23889/Suthesis.48905
Abstract
Introduction: The aims of this thesis were to explore novel data types in healthcare that could enhance epidemiology studies in epilepsy and to develop novel methods of analysing routinely collected linked healthcare data, unstructured free text in hospital clinic letters and genetic variation. Method...
Published: | 2019 |
---|---|
Institution: | Swansea University |
Degree level: | Doctoral |
Degree name: | Ph.D |
URI: | https://cronfa.swan.ac.uk/Record/cronfa48905 |
first_indexed | 2019-02-19T20:06:25Z |
---|---|
last_indexed | 2025-03-20T07:23:07Z |
id | cronfa48905 |
recordtype | RisThesis |
fullrecord | <?xml version="1.0"?><rfc1807><datestamp>2025-03-19T10:45:56.4932870</datestamp><bib-version>v2</bib-version><id>48905</id><entry>2019-02-19</entry><title>Using Novel Data Types for Big Data Research in Epilepsy: Patient Records, Clinic Letters and Genetic Mutation</title><swanseaauthors><author><sid>7af5c8bdd1197f85720e4f3d65e803eb</sid><ORCID>0000-0001-7983-8073</ORCID><firstname>Arron S.</firstname><surname>Lacey</surname><name>Arron S. Lacey</name><active>true</active><ethesisStudent>true</ethesisStudent></author></swanseaauthors><date>2019-02-19</date><abstract>Introduction: The aims of this thesis were to explore novel data types in healthcare that could enhance epidemiology studies in epilepsy and to develop novel methods of analysing routinely collected linked healthcare data, unstructured free text in hospital clinic letters and genetic variation. Method: The SAIL Databank was used to source linked healthcare data for people with epilepsy across Wales to study the effects of epilepsy and social deprivation, coding of epilepsy in GP records and the educational attainment of children born to mothers with epilepsy. Hospital clinic letters from Morriston Hospital in Swansea were analysed using Natural Language Processing techniques to extract rich clinic data not typically recorded as part of routinely collected data. An automated pipeline was developed to predict the pathogenicity of Single Nucleotide Polymorphisms to prioritize potential disease-causing genetic variation in epilepsy for further in-vitro analysis. Results: Incidence and prevalence of epilepsy were found to be strongly correlated with increased social deprivation; however, a 10-year retrospective follow-up study found that there was no increase in deprivation following a diagnosis of epilepsy, pointing to deprivation contributing to social causation of epilepsy rather than epilepsy causing social drift. An algorithm was developed to accurately source epilepsy patients from GP records. Sodium Valproate was found to reduce educational attainment in 7-year-olds by 12%. A Natural Language Processing pipeline was developed to extract rich epilepsy information from clinic letters. A pipeline was created to predict pathogenicity of epilepsy SNPs that performed better than commonly used software. Conclusion: This thesis presents novel studies in epilepsy using population-level healthcare data, unstructured clinic letters and genetic variation. New methods were developed that have the potential to be applied to other disease areas and used to link different data types into routinely collected healthcare records to enhance further research.</abstract><type>E-Thesis</type><journal/><volume/><journalNumber/><paginationStart/><paginationEnd/><publisher/><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic/><keywords>Big Data, Natural Language Processing, Genomics, Epilepsy</keywords><publishedDay>31</publishedDay><publishedMonth>12</publishedMonth><publishedYear>2019</publishedYear><publishedDate>2019-12-31</publishedDate><doi>10.23889/Suthesis.48905</doi><url/><notes/><college>COLLEGE NANME</college><department>Medicine</department><CollegeCode>COLLEGE CODE</CollegeCode><institution>Swansea University</institution><degreelevel>Doctoral</degreelevel><degreename>Ph.D</degreename><apcterm/><funders/><projectreference/><lastEdited>2025-03-19T10:45:56.4932870</lastEdited><Created>2019-02-19T12:50:16.2444788</Created><path><level id="1">Faculty of Medicine, Health and Life Sciences</level><level id="2">Swansea University Medical School - Medicine</level></path><authors><author><firstname>Arron S.</firstname><surname>Lacey</surname><orcid>0000-0001-7983-8073</orcid><order>1</order></author></authors><documents><document><filename>0048905-19022019141052.pdf</filename><originalFilename>Lacey_Arron_S_PhD_Thesis_Final.pdf</originalFilename><uploaded>2019-02-19T14:10:52.0570000</uploaded><type>Output</type><contentLength>6656199</contentLength><contentType>application/pdf</contentType><version>E-Thesis – open access</version><cronfaStatus>true</cronfaStatus><embargoDate>2019-02-18T00:00:00.0000000</embargoDate><copyrightCorrect>true</copyrightCorrect></document></documents><OutputDurs/></rfc1807> |
title | Using Novel Data Types for Big Data Research in Epilepsy: Patient Records, Clinic Letters and Genetic Mutation |
author_id_str_mv | 7af5c8bdd1197f85720e4f3d65e803eb |
author_id_fullname_str_mv | 7af5c8bdd1197f85720e4f3d65e803eb_***_Arron S. Lacey |
author | Arron S. Lacey |
author2 | Arron S. Lacey |
format | E-Thesis |
publishDate | 2019 |
institution | Swansea University |
doi_str_mv | 10.23889/Suthesis.48905 |
college_str | Faculty of Medicine, Health and Life Sciences |
hierarchytype | |
hierarchy_top_id | facultyofmedicinehealthandlifesciences |
hierarchy_top_title | Faculty of Medicine, Health and Life Sciences |
hierarchy_parent_id | facultyofmedicinehealthandlifesciences |
hierarchy_parent_title | Faculty of Medicine, Health and Life Sciences |
department_str | Swansea University Medical School - Medicine{{{_:::_}}}Faculty of Medicine, Health and Life Sciences{{{_:::_}}}Swansea University Medical School - Medicine |
document_store_str | 1 |
active_str | 0 |
description | Introduction: The aims of this thesis were to explore novel data types in healthcare that could enhance epidemiology studies in epilepsy and to develop novel methods of analysing routinely collected linked healthcare data, unstructured free text in hospital clinic letters and genetic variation. Method: The SAIL Databank was used to source linked healthcare data for people with epilepsy across Wales to study the effects of epilepsy and social deprivation, coding of epilepsy in GP records and the educational attainment of children born to mothers with epilepsy. Hospital clinic letters from Morriston Hospital in Swansea were analysed using Natural Language Processing techniques to extract rich clinic data not typically recorded as part of routinely collected data. An automated pipeline was developed to predict the pathogenicity of Single Nucleotide Polymorphisms to prioritize potential disease-causing genetic variation in epilepsy for further in-vitro analysis. Results: Incidence and prevalence of epilepsy were found to be strongly correlated with increased social deprivation; however, a 10-year retrospective follow-up study found that there was no increase in deprivation following a diagnosis of epilepsy, pointing to deprivation contributing to social causation of epilepsy rather than epilepsy causing social drift. An algorithm was developed to accurately source epilepsy patients from GP records. Sodium Valproate was found to reduce educational attainment in 7-year-olds by 12%. A Natural Language Processing pipeline was developed to extract rich epilepsy information from clinic letters. A pipeline was created to predict pathogenicity of epilepsy SNPs that performed better than commonly used software. Conclusion: This thesis presents novel studies in epilepsy using population-level healthcare data, unstructured clinic letters and genetic variation. New methods were developed that have the potential to be applied to other disease areas and used to link different data types into routinely collected healthcare records to enhance further research. |
published_date | 2019-12-31T07:31:24Z |