Journal article

Harmonising electronic health records for reproducible research: challenges, solutions and recommendations from a UK-wide COVID-19 research collaboration

Hoda Abbasizanjani, Fatemeh Torabi, Stuart Bedston, Thomas Bolton, Gareth Davies, Spiros Denaxas, Rowena Griffiths, Laura Herbert, Sam Hollings, Spencer Keene, Kamlesh Khunti, Emily Lowthian, Jane Lyons, Mehrdad A. Mizani, John Nolan, Cathie Sudlow, Venexia Walker, William Whiteley, Angela Wood, Ashley Akbari (CVD-COVID-UK/COVID-IMPACT Consortium)

BMC Medical Informatics and Decision Making, Volume: 23, Issue: 1

Swansea University Authors: Hoda Abbasizanjani, Fatemeh Torabi, Stuart Bedston, Gareth Davies, Rowena Griffiths, Laura Herbert, Emily Lowthian, Jane Lyons, Ashley Akbari

  • 62338_VoR.pdf

    PDF | Version of Record

    © The Author(s) 2023. This article is licensed under a Creative Commons Attribution 4.0 International License

Abstract

Background: The CVD-COVID-UK consortium was formed to understand the relationship between COVID-19 and cardiovascular diseases through analyses of harmonised electronic health records (EHRs) across the four UK nations. Beyond COVID-19, data harmonisation and common approaches enable analysis within an...

Published in: BMC Medical Informatics and Decision Making
ISSN: 1472-6947
Published: Springer Science and Business Media LLC 2023

URI: https://cronfa.swan.ac.uk/Record/cronfa62338
first_indexed 2023-02-17T15:50:46Z
last_indexed 2023-02-18T04:13:54Z
id cronfa62338
recordtype SURis
fullrecord <?xml version="1.0" encoding="utf-8"?><rfc1807 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><bib-version>v2</bib-version><id>62338</id><entry>2023-01-17</entry><title>Harmonising electronic health records for reproducible research: challenges, solutions and recommendations from a UK-wide COVID-19 research collaboration</title><swanseaauthors><author><sid>93dd7e747f3118a99566c68592a3ddcc</sid><ORCID>0000-0002-9575-4758</ORCID><firstname>Hoda</firstname><surname>Abbasizanjani</surname><name>Hoda Abbasizanjani</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>f569591e1bfb0e405b8091f99fec45d3</sid><ORCID>0000-0002-5853-4625</ORCID><firstname>Fatemeh</firstname><surname>Torabi</surname><name>Fatemeh Torabi</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>c79d07eaba5c9515c0df82b372b76a41</sid><firstname>Stuart</firstname><surname>Bedston</surname><name>Stuart Bedston</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>ab7964da48dd7d1b74a81f2935dc564b</sid><firstname>Gareth</firstname><surname>Davies</surname><name>Gareth Davies</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>381464f639f98bd388c29326ca7f862c</sid><firstname>Rowena</firstname><surname>Griffiths</surname><name>Rowena Griffiths</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>0d5765f5486b80e173366af9a61ee200</sid><ORCID>0000-0001-7580-7413</ORCID><firstname>Laura</firstname><surname>Herbert</surname><name>Laura Herbert</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>db5bc529b8a9dfca2b4a268d14e03479</sid><firstname>Emily</firstname><surname>Lowthian</surname><name>Emily 
Lowthian</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>1b74fa5125a88451c52c45bcf20e0b47</sid><ORCID/><firstname>Jane</firstname><surname>Lyons</surname><name>Jane Lyons</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>aa1b025ec0243f708bb5eb0a93d6fb52</sid><ORCID>0000-0003-0814-0801</ORCID><firstname>Ashley</firstname><surname>Akbari</surname><name>Ashley Akbari</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2023-01-17</date><deptcode>HDAT</deptcode><abstract>Background: The CVD-COVID-UK consortium was formed to understand the relationship between COVID-19 and cardiovascular diseases through analyses of harmonised electronic health records (EHRs) across the four UK nations. Beyond COVID-19, data harmonisation and common approaches enable analysis within and across independent Trusted Research Environments. Here we describe the reproducible harmonisation method developed using large-scale EHRs in Wales to accommodate the fast and efficient implementation of cross-nation analysis in England and Wales as part of the CVD-COVID-UK programme. We characterise current challenges and share lessons learnt. Methods: Serving the scope and scalability of multiple study protocols, we used linked, anonymised individual-level EHR, demographic and administrative data held within the SAIL Databank for the population of Wales. The harmonisation method was implemented as a four-layer reproducible process, starting from raw data in the first layer. Then each of the layers two to four is framed by, but not limited to, the characterised challenges and lessons learnt. We achieved curated data as part of our second layer, followed by extracting phenotyped data in the third layer. 
We captured any project-specific requirements in the fourth layer. Results: Using the implemented four-layer harmonisation method, we retrieved approximately 100 health-related variables for the 3.2 million individuals in Wales, which are harmonised with corresponding variables for &gt; 56 million individuals in England. We processed 13 data sources into the first layer of our harmonisation method: five of these are updated daily or weekly, and the rest at various frequencies providing sufficient data flow updates for frequent capturing of up-to-date demographic, administrative and clinical information. Conclusions: We implemented an efficient, transparent, scalable, and reproducible harmonisation method that enables multi-nation collaborative research. With a current focus on COVID-19 and its relationship with cardiovascular outcomes, the harmonised data has supported a wide range of research activities across the UK.</abstract><type>Journal Article</type><journal>BMC Medical Informatics and Decision Making</journal><volume>23</volume><journalNumber>1</journalNumber><paginationStart/><paginationEnd/><publisher>Springer Science and Business Media LLC</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic>1472-6947</issnElectronic><keywords>Population health, Data harmonisation, Common data model, Electronic health record, Trusted Research Environments, Reproducible research, SAIL databank, NHS digital TRE for England, COVID-19</keywords><publishedDay>16</publishedDay><publishedMonth>1</publishedMonth><publishedYear>2023</publishedYear><publishedDate>2023-01-16</publishedDate><doi>10.1186/s12911-022-02093-0</doi><url/><notes/><college>COLLEGE NANME</college><department>Health Data Science</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>HDAT</DepartmentCode><institution>Swansea University</institution><apcterm>External research funder(s) paid the OA fee (includes OA grants disbursed by the Library)</apcterm><funders>The 
British Heart Foundation Data Science Centre (Grant No SP/19/3/34678, awarded to Health Data Research (HDR) UK) funded co-development (with NHS Digital) of the trusted research environment, provision of linked datasets, data access, user software licences, computational usage, and data management and wrangling support, with additional contributions from the HDR UK Data and Connectivity component of the UK Government Chief Scientific Adviser’s National Core Studies programme to coordinate national COVID-19 priority research. Consortium partner organisations funded the time of contributing data analysts, biostatisticians, epidemiologists, and clinicians. This work was supported by the Con-COV team funded by the Medical Research Council (Grant Number: MR/V028367/1). This work was supported by Health Data Research UK, which receives its funding from HDR UK Ltd (HDR-9006) funded by the UK Medical Research Council, Engineering and Physical Sciences Research Council, Economic and Social Research Council, Department of Health and Social Care (England), Chief Scientist Office of the Scottish Government Health and Social Care Directorates, Health and Social Care Research and Development Division (Welsh Government), Public Health Agency (Northern Ireland), British Heart Foundation (BHF) and the Wellcome Trust. This work was supported by the ADR Wales programme of work. The ADR Wales programme of work is aligned to the priority themes as identified in the Welsh Government’s national strategy: Prosperity for All. ADR Wales brings together data science experts at Swansea University Medical School, staff from the Wales Institute of Social and Economic Research, Data and Methods (WISERD) at Cardiff University and specialist teams within the Welsh Government to develop new evidence which supports Prosperity for All by using the SAIL Databank at Swansea University, to link and analyse anonymised data. 
ADR Wales is part of the Economic and Social Research Council (part of UK Research and Innovation) funded ADR UK (Grant ES/S007393/1). This work was supported by the Wales COVID-19 Evidence Centre, funded by Health and Care Research Wales.</funders><projectreference/><lastEdited>2023-06-12T14:27:34.9938501</lastEdited><Created>2023-01-17T10:09:14.1424216</Created><path><level id="1">Faculty of Medicine, Health and Life Sciences</level><level id="2">Swansea University Medical School - Health Data Science</level></path><authors><author><firstname>Hoda</firstname><surname>Abbasizanjani</surname><orcid>0000-0002-9575-4758</orcid><order>1</order></author><author><firstname>Fatemeh</firstname><surname>Torabi</surname><orcid>0000-0002-5853-4625</orcid><order>2</order></author><author><firstname>Stuart</firstname><surname>Bedston</surname><order>3</order></author><author><firstname>Thomas</firstname><surname>Bolton</surname><order>4</order></author><author><firstname>Gareth</firstname><surname>Davies</surname><order>5</order></author><author><firstname>Spiros</firstname><surname>Denaxas</surname><order>6</order></author><author><firstname>Rowena</firstname><surname>Griffiths</surname><order>7</order></author><author><firstname>Laura</firstname><surname>Herbert</surname><orcid>0000-0001-7580-7413</orcid><order>8</order></author><author><firstname>Sam</firstname><surname>Hollings</surname><order>9</order></author><author><firstname>Spencer</firstname><surname>Keene</surname><order>10</order></author><author><firstname>Kamlesh</firstname><surname>Khunti</surname><order>11</order></author><author><firstname>Emily</firstname><surname>Lowthian</surname><order>12</order></author><author><firstname>Jane</firstname><surname>Lyons</surname><orcid/><order>13</order></author><author><firstname>Mehrdad 
A.</firstname><surname>Mizani</surname><order>14</order></author><author><firstname>John</firstname><surname>Nolan</surname><order>15</order></author><author><firstname>Cathie</firstname><surname>Sudlow</surname><order>16</order></author><author><firstname>Venexia</firstname><surname>Walker</surname><order>17</order></author><author><firstname>William</firstname><surname>Whiteley</surname><order>18</order></author><author><firstname>Angela</firstname><surname>Wood</surname><order>19</order></author><author><firstname>Ashley</firstname><surname>Akbari</surname><orcid>0000-0003-0814-0801</orcid><order>20</order></author><author><firstname>(CVD-COVID-UK/COVID-IMPACT</firstname><surname>Consortium)</surname><order>21</order></author></authors><documents><document><filename>62338__26616__75a7a5dcf8714498b19267ad2f58b239.pdf</filename><originalFilename>62338_VoR.pdf</originalFilename><uploaded>2023-02-17T15:51:25.9533593</uploaded><type>Output</type><contentLength>4073093</contentLength><contentType>application/pdf</contentType><version>Version of Record</version><cronfaStatus>true</cronfaStatus><documentNotes>© The Author(s) 2023. This article is licensed under a Creative Commons Attribution 4.0 International License</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language><licence>http://creativecommons.org/licenses/by/4.0/</licence></document></documents><OutputDurs/></rfc1807>
title Harmonising electronic health records for reproducible research: challenges, solutions and recommendations from a UK-wide COVID-19 research collaboration
author_id_str_mv 93dd7e747f3118a99566c68592a3ddcc
f569591e1bfb0e405b8091f99fec45d3
c79d07eaba5c9515c0df82b372b76a41
ab7964da48dd7d1b74a81f2935dc564b
381464f639f98bd388c29326ca7f862c
0d5765f5486b80e173366af9a61ee200
db5bc529b8a9dfca2b4a268d14e03479
1b74fa5125a88451c52c45bcf20e0b47
aa1b025ec0243f708bb5eb0a93d6fb52
author_id_fullname_str_mv 93dd7e747f3118a99566c68592a3ddcc_***_Hoda Abbasizanjani
f569591e1bfb0e405b8091f99fec45d3_***_Fatemeh Torabi
c79d07eaba5c9515c0df82b372b76a41_***_Stuart Bedston
ab7964da48dd7d1b74a81f2935dc564b_***_Gareth Davies
381464f639f98bd388c29326ca7f862c_***_Rowena Griffiths
0d5765f5486b80e173366af9a61ee200_***_Laura Herbert
db5bc529b8a9dfca2b4a268d14e03479_***_Emily Lowthian
1b74fa5125a88451c52c45bcf20e0b47_***_Jane Lyons
aa1b025ec0243f708bb5eb0a93d6fb52_***_Ashley Akbari
author Hoda Abbasizanjani
Fatemeh Torabi
Stuart Bedston
Gareth Davies
Rowena Griffiths
Laura Herbert
Emily Lowthian
Jane Lyons
Ashley Akbari
author2 Hoda Abbasizanjani
Fatemeh Torabi
Stuart Bedston
Thomas Bolton
Gareth Davies
Spiros Denaxas
Rowena Griffiths
Laura Herbert
Sam Hollings
Spencer Keene
Kamlesh Khunti
Emily Lowthian
Jane Lyons
Mehrdad A. Mizani
John Nolan
Cathie Sudlow
Venexia Walker
William Whiteley
Angela Wood
Ashley Akbari
(CVD-COVID-UK/COVID-IMPACT Consortium)
format Journal article
container_title BMC Medical Informatics and Decision Making
container_volume 23
container_issue 1
publishDate 2023
institution Swansea University
issn 1472-6947
doi_str_mv 10.1186/s12911-022-02093-0
publisher Springer Science and Business Media LLC
college_str Faculty of Medicine, Health and Life Sciences
hierarchytype
hierarchy_top_id facultyofmedicinehealthandlifesciences
hierarchy_top_title Faculty of Medicine, Health and Life Sciences
hierarchy_parent_id facultyofmedicinehealthandlifesciences
hierarchy_parent_title Faculty of Medicine, Health and Life Sciences
department_str Swansea University Medical School - Health Data Science{{{_:::_}}}Faculty of Medicine, Health and Life Sciences{{{_:::_}}}Swansea University Medical School - Health Data Science
document_store_str 1
active_str 0
description Background: The CVD-COVID-UK consortium was formed to understand the relationship between COVID-19 and cardiovascular diseases through analyses of harmonised electronic health records (EHRs) across the four UK nations. Beyond COVID-19, data harmonisation and common approaches enable analysis within and across independent Trusted Research Environments. Here we describe the reproducible harmonisation method developed using large-scale EHRs in Wales to accommodate the fast and efficient implementation of cross-nation analysis in England and Wales as part of the CVD-COVID-UK programme. We characterise current challenges and share lessons learnt. Methods: Serving the scope and scalability of multiple study protocols, we used linked, anonymised individual-level EHR, demographic and administrative data held within the SAIL Databank for the population of Wales. The harmonisation method was implemented as a four-layer reproducible process, starting from raw data in the first layer. Then each of the layers two to four is framed by, but not limited to, the characterised challenges and lessons learnt. We achieved curated data as part of our second layer, followed by extracting phenotyped data in the third layer. We captured any project-specific requirements in the fourth layer. Results: Using the implemented four-layer harmonisation method, we retrieved approximately 100 health-related variables for the 3.2 million individuals in Wales, which are harmonised with corresponding variables for > 56 million individuals in England. We processed 13 data sources into the first layer of our harmonisation method: five of these are updated daily or weekly, and the rest at various frequencies providing sufficient data flow updates for frequent capturing of up-to-date demographic, administrative and clinical information. Conclusions: We implemented an efficient, transparent, scalable, and reproducible harmonisation method that enables multi-nation collaborative research. 
With a current focus on COVID-19 and its relationship with cardiovascular outcomes, the harmonised data has supported a wide range of research activities across the UK.
published_date 2023-01-16T14:27:33Z