Journal article
Harmonising electronic health records for reproducible research: challenges, solutions and recommendations from a UK-wide COVID-19 research collaboration
BMC Medical Informatics and Decision Making, Volume: 23, Issue: 1
Swansea University Authors: Hoda Abbasizanjani , Fatemeh Torabi , Stuart Bedston, Gareth Davies, Rowena Griffiths, Laura Herbert , Emily Lowthian, Jane Lyons, Ashley Akbari
PDF | Version of Record
© The Author(s) 2023. This article is licensed under a Creative Commons Attribution 4.0 International License
DOI (Published version): 10.1186/s12911-022-02093-0
Abstract
Background: The CVD-COVID-UK consortium was formed to understand the relationship between COVID-19 and cardiovascular diseases through analyses of harmonised electronic health records (EHRs) across the four UK nations. Beyond COVID-19, data harmonisation and common approaches enable analysis within an...
Published in: | BMC Medical Informatics and Decision Making |
---|---|
ISSN: | 1472-6947 |
Published: |
Springer Science and Business Media LLC
2023
|
URI: | https://cronfa.swan.ac.uk/Record/cronfa62338 |
first_indexed |
2023-02-17T15:50:46Z |
---|---|
last_indexed |
2023-02-18T04:13:54Z |
id |
cronfa62338 |
recordtype |
SURis |
fullrecord |
<?xml version="1.0" encoding="utf-8"?><rfc1807 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><bib-version>v2</bib-version><id>62338</id><entry>2023-01-17</entry><title>Harmonising electronic health records for reproducible research: challenges, solutions and recommendations from a UK-wide COVID-19 research collaboration</title><swanseaauthors><author><sid>93dd7e747f3118a99566c68592a3ddcc</sid><ORCID>0000-0002-9575-4758</ORCID><firstname>Hoda</firstname><surname>Abbasizanjani</surname><name>Hoda Abbasizanjani</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>f569591e1bfb0e405b8091f99fec45d3</sid><ORCID>0000-0002-5853-4625</ORCID><firstname>Fatemeh</firstname><surname>Torabi</surname><name>Fatemeh Torabi</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>c79d07eaba5c9515c0df82b372b76a41</sid><firstname>Stuart</firstname><surname>Bedston</surname><name>Stuart Bedston</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>ab7964da48dd7d1b74a81f2935dc564b</sid><firstname>Gareth</firstname><surname>Davies</surname><name>Gareth Davies</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>381464f639f98bd388c29326ca7f862c</sid><firstname>Rowena</firstname><surname>Griffiths</surname><name>Rowena Griffiths</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>0d5765f5486b80e173366af9a61ee200</sid><ORCID>0000-0001-7580-7413</ORCID><firstname>Laura</firstname><surname>Herbert</surname><name>Laura Herbert</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>db5bc529b8a9dfca2b4a268d14e03479</sid><firstname>Emily</firstname><surname>Lowthian</surname><name>Emily 
Lowthian</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>1b74fa5125a88451c52c45bcf20e0b47</sid><ORCID/><firstname>Jane</firstname><surname>Lyons</surname><name>Jane Lyons</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>aa1b025ec0243f708bb5eb0a93d6fb52</sid><ORCID>0000-0003-0814-0801</ORCID><firstname>Ashley</firstname><surname>Akbari</surname><name>Ashley Akbari</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2023-01-17</date><deptcode>HDAT</deptcode><abstract>BackgroundThe CVD-COVID-UK consortium was formed to understand the relationship between COVID-19 and cardiovascular diseases through analyses of harmonised electronic health records (EHRs) across the four UK nations. Beyond COVID-19, data harmonisation and common approaches enable analysis within and across independent Trusted Research Environments. Here we describe the reproducible harmonisation method developed using large-scale EHRs in Wales to accommodate the fast and efficient implementation of cross-nation analysis in England and Wales as part of the CVD-COVID-UK programme. We characterise current challenges and share lessons learnt.MethodsServing the scope and scalability of multiple study protocols, we used linked, anonymised individual-level EHR, demographic and administrative data held within the SAIL Databank for the population of Wales. The harmonisation method was implemented as a four-layer reproducible process, starting from raw data in the first layer. Then each of the layers two to four is framed by, but not limited to, the characterised challenges and lessons learnt. We achieved curated data as part of our second layer, followed by extracting phenotyped data in the third layer. 
We captured any project-specific requirements in the fourth layer.ResultsUsing the implemented four-layer harmonisation method, we retrieved approximately 100 health-related variables for the 3.2 million individuals in Wales, which are harmonised with corresponding variables for > 56 million individuals in England. We processed 13 data sources into the first layer of our harmonisation method: five of these are updated daily or weekly, and the rest at various frequencies providing sufficient data flow updates for frequent capturing of up-to-date demographic, administrative and clinical information.ConclusionsWe implemented an efficient, transparent, scalable, and reproducible harmonisation method that enables multi-nation collaborative research. With a current focus on COVID-19 and its relationship with cardiovascular outcomes, the harmonised data has supported a wide range of research activities across the UK.</abstract><type>Journal Article</type><journal>BMC Medical Informatics and Decision Making</journal><volume>23</volume><journalNumber>1</journalNumber><paginationStart/><paginationEnd/><publisher>Springer Science and Business Media LLC</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic>1472-6947</issnElectronic><keywords>Population health, Data harmonisation, Common data model, Electronic health record, Trusted Research Environments, Reproducible research, SAIL databank, NHS digital TRE for England, COVID-19</keywords><publishedDay>16</publishedDay><publishedMonth>1</publishedMonth><publishedYear>2023</publishedYear><publishedDate>2023-01-16</publishedDate><doi>10.1186/s12911-022-02093-0</doi><url/><notes/><college>COLLEGE NANME</college><department>Health Data Science</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>HDAT</DepartmentCode><institution>Swansea University</institution><apcterm>External research funder(s) paid the OA fee (includes OA grants disbursed by the Library)</apcterm><funders>The 
British Heart Foundation Data Science Centre (Grant No SP/19/3/34678, awarded to Health Data Research (HDR) UK) funded co-development (with NHS Digital) of the trusted research environment, provision of linked datasets, data access, user software licences, computational usage, and data management and wrangling support, with additional contributions from the HDR UK Data and Connectivity component of the UK Government Chief Scientific Adviser’s National Core Studies programme to coordinate national COVID-19 priority research. Consortium partner organisations funded the time of contributing data analysts, biostatisticians, epidemiologists, and clinicians. This work was supported by the Con-COV team funded by the Medical Research Council (Grant Number: MR/V028367/1). This work was supported by Health Data Research UK, which receives its funding from HDR UK Ltd (HDR-9006) funded by the UK Medical Research Council, Engineering and Physical Sciences Research Council, Economic and Social Research Council, Department of Health and Social Care (England), Chief Scientist Office of the Scottish Government Health and Social Care Directorates, Health and Social Care Research and Development Division (Welsh Government), Public Health Agency (Northern Ireland), British Heart Foundation (BHF) and the Wellcome Trust. This work was supported by the ADR Wales programme of work. The ADR Wales programme of work is aligned to the priority themes as identified in the Welsh Government’s national strategy: Prosperity for All. ADR Wales brings together data science experts at Swansea University Medical School, staff from the Wales Institute of Social and Economic Research, Data and Methods (WISERD) at Cardiff University and specialist teams within the Welsh Government to develop new evidence which supports Prosperity for All by using the SAIL Databank at Swansea University, to link and analyse anonymised data. 
ADR Wales is part of the Economic and Social Research Council (part of UK Research and Innovation) funded ADR UK (Grant ES/S007393/1). This work was supported by the Wales COVID-19 Evidence Centre, funded by Health and Care Research Wales.</funders><projectreference/><lastEdited>2023-06-12T14:27:34.9938501</lastEdited><Created>2023-01-17T10:09:14.1424216</Created><path><level id="1">Faculty of Medicine, Health and Life Sciences</level><level id="2">Swansea University Medical School - Health Data Science</level></path><authors><author><firstname>Hoda</firstname><surname>Abbasizanjani</surname><orcid>0000-0002-9575-4758</orcid><order>1</order></author><author><firstname>Fatemeh</firstname><surname>Torabi</surname><orcid>0000-0002-5853-4625</orcid><order>2</order></author><author><firstname>Stuart</firstname><surname>Bedston</surname><order>3</order></author><author><firstname>Thomas</firstname><surname>Bolton</surname><order>4</order></author><author><firstname>Gareth</firstname><surname>Davies</surname><order>5</order></author><author><firstname>Spiros</firstname><surname>Denaxas</surname><order>6</order></author><author><firstname>Rowena</firstname><surname>Griffiths</surname><order>7</order></author><author><firstname>Laura</firstname><surname>Herbert</surname><orcid>0000-0001-7580-7413</orcid><order>8</order></author><author><firstname>Sam</firstname><surname>Hollings</surname><order>9</order></author><author><firstname>Spencer</firstname><surname>Keene</surname><order>10</order></author><author><firstname>Kamlesh</firstname><surname>Khunti</surname><order>11</order></author><author><firstname>Emily</firstname><surname>Lowthian</surname><order>12</order></author><author><firstname>Jane</firstname><surname>Lyons</surname><orcid/><order>13</order></author><author><firstname>Mehrdad 
A.</firstname><surname>Mizani</surname><order>14</order></author><author><firstname>John</firstname><surname>Nolan</surname><order>15</order></author><author><firstname>Cathie</firstname><surname>Sudlow</surname><order>16</order></author><author><firstname>Venexia</firstname><surname>Walker</surname><order>17</order></author><author><firstname>William</firstname><surname>Whiteley</surname><order>18</order></author><author><firstname>Angela</firstname><surname>Wood</surname><order>19</order></author><author><firstname>Ashley</firstname><surname>Akbari</surname><orcid>0000-0003-0814-0801</orcid><order>20</order></author><author><firstname>(CVD-COVID-UK/COVID-IMPACT</firstname><surname>Consortium)</surname><order>21</order></author></authors><documents><document><filename>62338__26616__75a7a5dcf8714498b19267ad2f58b239.pdf</filename><originalFilename>62338_VoR.pdf</originalFilename><uploaded>2023-02-17T15:51:25.9533593</uploaded><type>Output</type><contentLength>4073093</contentLength><contentType>application/pdf</contentType><version>Version of Record</version><cronfaStatus>true</cronfaStatus><documentNotes>© The Author(s) 2023. This article is licensed under a Creative Commons Attribution 4.0 International License</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language><licence>http://creativecommons.org/licenses/by/4.0/</licence></document></documents><OutputDurs/></rfc1807> |
title |
Harmonising electronic health records for reproducible research: challenges, solutions and recommendations from a UK-wide COVID-19 research collaboration |
spellingShingle |
Harmonising electronic health records for reproducible research: challenges, solutions and recommendations from a UK-wide COVID-19 research collaboration Hoda Abbasizanjani Fatemeh Torabi Stuart Bedston Gareth Davies Rowena Griffiths Laura Herbert Emily Lowthian Jane Lyons Ashley Akbari |
author_id_str_mv |
93dd7e747f3118a99566c68592a3ddcc f569591e1bfb0e405b8091f99fec45d3 c79d07eaba5c9515c0df82b372b76a41 ab7964da48dd7d1b74a81f2935dc564b 381464f639f98bd388c29326ca7f862c 0d5765f5486b80e173366af9a61ee200 db5bc529b8a9dfca2b4a268d14e03479 1b74fa5125a88451c52c45bcf20e0b47 aa1b025ec0243f708bb5eb0a93d6fb52 |
author_id_fullname_str_mv |
93dd7e747f3118a99566c68592a3ddcc_***_Hoda Abbasizanjani f569591e1bfb0e405b8091f99fec45d3_***_Fatemeh Torabi c79d07eaba5c9515c0df82b372b76a41_***_Stuart Bedston ab7964da48dd7d1b74a81f2935dc564b_***_Gareth Davies 381464f639f98bd388c29326ca7f862c_***_Rowena Griffiths 0d5765f5486b80e173366af9a61ee200_***_Laura Herbert db5bc529b8a9dfca2b4a268d14e03479_***_Emily Lowthian 1b74fa5125a88451c52c45bcf20e0b47_***_Jane Lyons aa1b025ec0243f708bb5eb0a93d6fb52_***_Ashley Akbari |
author |
Hoda Abbasizanjani Fatemeh Torabi Stuart Bedston Gareth Davies Rowena Griffiths Laura Herbert Emily Lowthian Jane Lyons Ashley Akbari |
author2 |
Hoda Abbasizanjani Fatemeh Torabi Stuart Bedston Thomas Bolton Gareth Davies Spiros Denaxas Rowena Griffiths Laura Herbert Sam Hollings Spencer Keene Kamlesh Khunti Emily Lowthian Jane Lyons Mehrdad A. Mizani John Nolan Cathie Sudlow Venexia Walker William Whiteley Angela Wood Ashley Akbari (CVD-COVID-UK/COVID-IMPACT Consortium) |
format |
Journal article |
container_title |
BMC Medical Informatics and Decision Making |
container_volume |
23 |
container_issue |
1 |
publishDate |
2023 |
institution |
Swansea University |
issn |
1472-6947 |
doi_str_mv |
10.1186/s12911-022-02093-0 |
publisher |
Springer Science and Business Media LLC |
college_str |
Faculty of Medicine, Health and Life Sciences |
hierarchytype |
|
hierarchy_top_id |
facultyofmedicinehealthandlifesciences |
hierarchy_top_title |
Faculty of Medicine, Health and Life Sciences |
hierarchy_parent_id |
facultyofmedicinehealthandlifesciences |
hierarchy_parent_title |
Faculty of Medicine, Health and Life Sciences |
department_str |
Swansea University Medical School - Health Data Science{{{_:::_}}}Faculty of Medicine, Health and Life Sciences{{{_:::_}}}Swansea University Medical School - Health Data Science |
document_store_str |
1 |
active_str |
0 |
description |
Background: The CVD-COVID-UK consortium was formed to understand the relationship between COVID-19 and cardiovascular diseases through analyses of harmonised electronic health records (EHRs) across the four UK nations. Beyond COVID-19, data harmonisation and common approaches enable analysis within and across independent Trusted Research Environments. Here we describe the reproducible harmonisation method developed using large-scale EHRs in Wales to accommodate the fast and efficient implementation of cross-nation analysis in England and Wales as part of the CVD-COVID-UK programme. We characterise current challenges and share lessons learnt. Methods: Serving the scope and scalability of multiple study protocols, we used linked, anonymised individual-level EHR, demographic and administrative data held within the SAIL Databank for the population of Wales. The harmonisation method was implemented as a four-layer reproducible process, starting from raw data in the first layer. Then each of the layers two to four is framed by, but not limited to, the characterised challenges and lessons learnt. We achieved curated data as part of our second layer, followed by extracting phenotyped data in the third layer. We captured any project-specific requirements in the fourth layer. Results: Using the implemented four-layer harmonisation method, we retrieved approximately 100 health-related variables for the 3.2 million individuals in Wales, which are harmonised with corresponding variables for > 56 million individuals in England. We processed 13 data sources into the first layer of our harmonisation method: five of these are updated daily or weekly, and the rest at various frequencies providing sufficient data flow updates for frequent capturing of up-to-date demographic, administrative and clinical information. Conclusions: We implemented an efficient, transparent, scalable, and reproducible harmonisation method that enables multi-nation collaborative research. With a current focus on COVID-19 and its relationship with cardiovascular outcomes, the harmonised data has supported a wide range of research activities across the UK. |
published_date |
2023-01-16T14:27:33Z |
_version_ |
1768503591599865856 |
score |
11.037319 |