Journal article

Improving opportunities for data linkage within Children Looked After administrative records in Wales

Grace Bailey, Alexandra Lee, Saira Ahmed, Ieuan Scanlon, Laura Cowley, Amy Stuart, Ian Farr, Caroline Brooks, Laura North, Lucy Griffiths

International Journal of Population Data Science, Volume: 10, Issue: 1

Swansea University Authors: Grace Bailey, Alexandra Lee, Saira Ahmed, Ieuan Scanlon, Laura Cowley, Ian Farr, Caroline Brooks, Laura North, Lucy Griffiths

Abstract

Introduction: Linkage of population-based administrative data is a powerful tool for studying important public issues. To overcome confidentiality and disclosure issues, records are de-identified and allocated a unique identifier. Within the Secure Anonymised Information Linkage (SAIL) Databank, these...

Published in: International Journal of Population Data Science
ISSN: 2399-4908
Published: Swansea University 2025
Online Access: Check full text

URI: https://cronfa.swan.ac.uk/Record/cronfa68946
first_indexed 2025-02-23T16:01:37Z
last_indexed 2025-03-14T04:21:02Z
id cronfa68946
recordtype SURis
fullrecord <?xml version="1.0"?><rfc1807><datestamp>2025-03-12T15:15:32.6460714</datestamp><bib-version>v2</bib-version><id>68946</id><entry>2025-02-23</entry><title>Improving opportunities for data linkage within Children Looked After administrative records in Wales</title><swanseaauthors><author><sid>1e09a407fca9e8047e7738b18d381130</sid><ORCID>0000-0003-4646-3134</ORCID><firstname>Grace</firstname><surname>Bailey</surname><name>Grace Bailey</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>7c6dc217555b0fea264ff0dd7d0aa374</sid><firstname>Alexandra</firstname><surname>Lee</surname><name>Alexandra Lee</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>2bf49b38ca1517d326228b3e8fdf6e78</sid><firstname>Saira</firstname><surname>Ahmed</surname><name>Saira Ahmed</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>9fcb224c6bd804a4d41a2a8570a71185</sid><firstname>Ieuan</firstname><surname>Scanlon</surname><name>Ieuan Scanlon</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>a80501f280e89fee276510b25fc68e77</sid><firstname>Laura</firstname><surname>Cowley</surname><name>Laura Cowley</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>3c02e7e9c2b064ee3e96e83b9777dde4</sid><firstname>Ian</firstname><surname>Farr</surname><name>Ian Farr</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>ac99c6134cf75b4c3e5f63cbb1a149ee</sid><ORCID>0000-0002-4612-5867</ORCID><firstname>Caroline</firstname><surname>Brooks</surname><name>Caroline Brooks</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>a255822cf77a0184cb6922e9fbea39e9</sid><firstname>Laura</firstname><surname>North</surname><name>Laura 
North</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>e35ea6ea4b429e812ef204b048131d93</sid><ORCID>0000-0001-9230-624X</ORCID><firstname>Lucy</firstname><surname>Griffiths</surname><name>Lucy Griffiths</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2025-02-23</date><deptcode>MEDS</deptcode><abstract>Introduction: Linkage of population-based administrative data is a powerful tool for studying important public issues. To overcome confidentiality and disclosure issues, records are de-identified and allocated a unique identifier. Within the Secure Anonymised Information Linkage (SAIL) Databank, these are known as Anonymised Linking Fields (ALFs). Assignment of an ALF enables linkage of individuals across multiple routinely collected datasets. Within the Children Looked After (CLA) Wales dataset, only 37% of the children have an ALF, limiting linkage to other datasets and, as a result, potential research. There are also other known data issues, including discrepancies with the week of births, duplicate identifiers and year-on-year changes in identifiers. Objectives: To improve accuracy and availability of the ALFs in the CLA dataset, and overall research quality. Methods: Using several datasets within the SAIL Databank, we developed a six-step CLA matching algorithm to improve the ALF matching rate and correct for data errors. To assess the performance of our algorithm, we benchmarked against routine ALFs already identified via the algorithm currently used by SAIL. Results: Our algorithm increased ALF matching by 25%, assigning 61% of individuals an ALF. Inconsistent weeks of birth, and incorrect and duplicate identifiers were resolved. 
When benchmarking against the current ALF-assigning algorithm used by SAIL, our algorithm had an overall sensitivity of 90%. Conclusion: We have developed an algorithm which demonstrates comparable ALF matching performance to the current algorithm used within SAIL, and which greatly improves the ALF matching in the CLA dataset. This algorithm may help to overcome potential bias due to missing data, and increases the potential for linkage to other datasets. Further development and refinement could result in the algorithm being applied to other datasets in SAIL.</abstract><type>Journal Article</type><journal>International Journal of Population Data Science</journal><volume>10</volume><journalNumber>1</journalNumber><paginationStart/><paginationEnd/><publisher>Swansea University</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic>2399-4908</issnElectronic><keywords>administrative data linkage; children looked after; SAIL Databank</keywords><publishedDay>19</publishedDay><publishedMonth>2</publishedMonth><publishedYear>2025</publishedYear><publishedDate>2025-02-19</publishedDate><doi>10.23889/ijpds.v10i1.2383</doi><url/><notes/><college>COLLEGE NANME</college><department>Medical School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>MEDS</DepartmentCode><institution>Swansea University</institution><apcterm>Other</apcterm><funders>This work was supported by Health and Care Research Wales and Administrative Data Research (ADR) Wales. LJG is a member of the Children&#x2019;s Social Care Research and Development Centre (CASCADE) partnership, which receives infrastructure funding from Health and Care Research Wales (HCRW) (517199). 
LEC is a research fellow, funded by Health and Care Research Wales (SCF-22-07).</funders><projectreference/><lastEdited>2025-03-12T15:15:32.6460714</lastEdited><Created>2025-02-23T15:22:43.3531424</Created><path><level id="1">Faculty of Medicine, Health and Life Sciences</level><level id="2">Swansea University Medical School - Health Data Science</level></path><authors><author><firstname>Grace</firstname><surname>Bailey</surname><orcid>0000-0003-4646-3134</orcid><order>1</order></author><author><firstname>Alexandra</firstname><surname>Lee</surname><order>2</order></author><author><firstname>Saira</firstname><surname>Ahmed</surname><order>3</order></author><author><firstname>Ieuan</firstname><surname>Scanlon</surname><order>4</order></author><author><firstname>Laura</firstname><surname>Cowley</surname><order>5</order></author><author><firstname>Amy</firstname><surname>Stuart</surname><order>6</order></author><author><firstname>Ian</firstname><surname>Farr</surname><order>7</order></author><author><firstname>Caroline</firstname><surname>Brooks</surname><orcid>0000-0002-4612-5867</orcid><order>8</order></author><author><firstname>Laura</firstname><surname>North</surname><order>9</order></author><author><firstname>Lucy</firstname><surname>Griffiths</surname><orcid>0000-0001-9230-624X</orcid><order>10</order></author></authors><documents><document><filename>68946__33801__003651af64bc447b8db94f8d6d7c2d5b.pdf</filename><originalFilename>68946.VoR.pdf</originalFilename><uploaded>2025-03-12T15:14:10.8451184</uploaded><type>Output</type><contentLength>1881729</contentLength><contentType>application/pdf</contentType><version>Version of Record</version><cronfaStatus>true</cronfaStatus><documentNotes>&#xA9; The Authors. Open Access under CC BY 4.0.</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language><licence>https://creativecommons.org/licenses/by/4.0/deed.en</licence></document></documents><OutputDurs/></rfc1807>
title Improving opportunities for data linkage within Children Looked After administrative records in Wales
spellingShingle Improving opportunities for data linkage within Children Looked After administrative records in Wales
Grace Bailey
Alexandra Lee
Saira Ahmed
Ieuan Scanlon
Laura Cowley
Ian Farr
Caroline Brooks
Laura North
Lucy Griffiths
title_short Improving opportunities for data linkage within Children Looked After administrative records in Wales
title_full Improving opportunities for data linkage within Children Looked After administrative records in Wales
title_fullStr Improving opportunities for data linkage within Children Looked After administrative records in Wales
title_full_unstemmed Improving opportunities for data linkage within Children Looked After administrative records in Wales
title_sort Improving opportunities for data linkage within Children Looked After administrative records in Wales
author_id_str_mv 1e09a407fca9e8047e7738b18d381130
7c6dc217555b0fea264ff0dd7d0aa374
2bf49b38ca1517d326228b3e8fdf6e78
9fcb224c6bd804a4d41a2a8570a71185
a80501f280e89fee276510b25fc68e77
3c02e7e9c2b064ee3e96e83b9777dde4
ac99c6134cf75b4c3e5f63cbb1a149ee
a255822cf77a0184cb6922e9fbea39e9
e35ea6ea4b429e812ef204b048131d93
author_id_fullname_str_mv 1e09a407fca9e8047e7738b18d381130_***_Grace Bailey
7c6dc217555b0fea264ff0dd7d0aa374_***_Alexandra Lee
2bf49b38ca1517d326228b3e8fdf6e78_***_Saira Ahmed
9fcb224c6bd804a4d41a2a8570a71185_***_Ieuan Scanlon
a80501f280e89fee276510b25fc68e77_***_Laura Cowley
3c02e7e9c2b064ee3e96e83b9777dde4_***_Ian Farr
ac99c6134cf75b4c3e5f63cbb1a149ee_***_Caroline Brooks
a255822cf77a0184cb6922e9fbea39e9_***_Laura North
e35ea6ea4b429e812ef204b048131d93_***_Lucy Griffiths
author Grace Bailey
Alexandra Lee
Saira Ahmed
Ieuan Scanlon
Laura Cowley
Ian Farr
Caroline Brooks
Laura North
Lucy Griffiths
author2 Grace Bailey
Alexandra Lee
Saira Ahmed
Ieuan Scanlon
Laura Cowley
Amy Stuart
Ian Farr
Caroline Brooks
Laura North
Lucy Griffiths
format Journal article
container_title International Journal of Population Data Science
container_volume 10
container_issue 1
publishDate 2025
institution Swansea University
issn 2399-4908
doi_str_mv 10.23889/ijpds.v10i1.2383
publisher Swansea University
college_str Faculty of Medicine, Health and Life Sciences
hierarchytype
hierarchy_top_id facultyofmedicinehealthandlifesciences
hierarchy_top_title Faculty of Medicine, Health and Life Sciences
hierarchy_parent_id facultyofmedicinehealthandlifesciences
hierarchy_parent_title Faculty of Medicine, Health and Life Sciences
department_str Swansea University Medical School - Health Data Science{{{_:::_}}}Faculty of Medicine, Health and Life Sciences{{{_:::_}}}Swansea University Medical School - Health Data Science
document_store_str 1
active_str 0
description Introduction: Linkage of population-based administrative data is a powerful tool for studying important public issues. To overcome confidentiality and disclosure issues, records are de-identified and allocated a unique identifier. Within the Secure Anonymised Information Linkage (SAIL) Databank, these are known as Anonymised Linking Fields (ALFs). Assignment of an ALF enables linkage of individuals across multiple routinely collected datasets. Within the Children Looked After (CLA) Wales dataset, only 37% of the children have an ALF, limiting linkage to other datasets and, as a result, potential research. There are also other known data issues, including discrepancies with the week of births, duplicate identifiers and year-on-year changes in identifiers. Objectives: To improve accuracy and availability of the ALFs in the CLA dataset, and overall research quality. Methods: Using several datasets within the SAIL Databank, we developed a six-step CLA matching algorithm to improve the ALF matching rate and correct for data errors. To assess the performance of our algorithm, we benchmarked against routine ALFs already identified via the algorithm currently used by SAIL. Results: Our algorithm increased ALF matching by 25%, assigning 61% of individuals an ALF. Inconsistent weeks of birth, and incorrect and duplicate identifiers were resolved. When benchmarking against the current ALF-assigning algorithm used by SAIL, our algorithm had an overall sensitivity of 90%. Conclusion: We have developed an algorithm which demonstrates comparable ALF matching performance to the current algorithm used within SAIL, and which greatly improves the ALF matching in the CLA dataset. This algorithm may help to overcome potential bias due to missing data, and increases the potential for linkage to other datasets. Further development and refinement could result in the algorithm being applied to other datasets in SAIL.
published_date 2025-02-19T08:18:42Z
_version_ 1829814531093692416
score 11.058331