Conference Paper/Proceeding/Abstract
Opportunities and Challenges of Automatic Speech Recognition Systems for Low-Resource Language Speakers
CHI Conference on Human Factors in Computing Systems (CHI '22), April 29–May 5, 2022, New Orleans, LA, USA. ACM, New York, NY, USA, Pages: 1 - 17
Swansea University Authors: Thomas Reitmaier, Matt Jones, Simon Robinson, Jen Pearson
PDF | Version of Record
Distributed under the terms of a Creative Commons Attribution 4.0 (CC-BY) Licence.
DOI (Published version): 10.1145/3491102.3517639
Abstract
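Automatic Speech Recognition (ASR) researchers are turning their attention towards supporting low-resource languages, such as isiXhosa or Marathi, with only limited training resources. We report and reflect on collaborative research across ASR & HCI to situate ASR-enabled technologies to suit the needs and functions of two communities of low-resource language speakers, on the outskirts of Cape Town, South Africa and in Mumbai, India. We build on longstanding community partnerships and draw on linguistics, media studies and HCI scholarship to guide our research. We demonstrate diverse design methods to: remotely engage participants; collect speech data to test ASR models; and ultimately field-test models with users. Reflecting on the research, we identify opportunities, challenges, and use-cases of ASR, in particular to support pervasive use of WhatsApp voice messaging. Finally, we uncover implications for collaborations across ASR & HCI that advance important discussions at CHI surrounding data, ethics, and AI.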
Published in: | CHI Conference on Human Factors in Computing Systems (CHI '22), April 29–May 5, 2022, New Orleans, LA, USA. ACM, New York, NY, USA |
---|---|
ISBN: | 978-1-4503-9157-3 |
Published: | New York, NY, USA: ACM Digital Library, 2022 |
URI: | https://cronfa.swan.ac.uk/Record/cronfa59573 |
first_indexed | 2022-03-10T14:58:46Z |
---|---|
last_indexed | 2024-11-14T12:15:45Z |
id | cronfa59573 |
recordtype | SURis |
fullrecord |
<?xml version="1.0"?><rfc1807><datestamp>2024-07-11T15:42:26.4457754</datestamp><bib-version>v2</bib-version><id>59573</id><entry>2022-03-10</entry><title>Opportunities and Challenges of Automatic Speech Recognition Systems for Low-Resource Language Speakers</title><swanseaauthors><author><sid>ccd66b64d11d76b9cd8b28e9d42a0ff0</sid><ORCID>0000-0003-2078-6699</ORCID><firstname>Thomas</firstname><surname>Reitmaier</surname><name>Thomas Reitmaier</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>10b46d7843c2ba53d116ca2ed9abb56e</sid><ORCID>0000-0001-7657-7373</ORCID><firstname>Matt</firstname><surname>Jones</surname><name>Matt Jones</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>cb3b57a21fa4e48ec633d6ba46455e91</sid><ORCID>0000-0001-9228-006X</ORCID><firstname>Simon</firstname><surname>Robinson</surname><name>Simon Robinson</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>6d662d9e2151b302ed384b243e2a802f</sid><ORCID>0000-0002-1960-1012</ORCID><firstname>Jen</firstname><surname>Pearson</surname><name>Jen Pearson</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2022-03-10</date><deptcode>MACS</deptcode><abstract>Automatic Speech Recognition (ASR) researchers are turning their attention towards supporting low-resource languages, such as isiXhosa or Marathi, with only limited training resources. We report and reflect on collaborative research across ASR &amp; HCI to situate ASR-enabled technologies to suit the needs and functions of two communities of low-resource language speakers, on the outskirts of Cape Town, South Africa and in Mumbai, India. We build on longstanding community partnerships and draw on linguistics, media studies and HCI scholarship to guide our research.
We demonstrate diverse design methods to: remotely engage participants; collect speech data to test ASR models; and ultimately field-test models with users. Reflecting on the research, we identify opportunities, challenges, and use-cases of ASR, in particular to support pervasive use of WhatsApp voice messaging. Finally, we uncover implications for collaborations across ASR &amp; HCI that advance important discussions at CHI surrounding data, ethics, and AI.</abstract><type>Conference Paper/Proceeding/Abstract</type><journal>CHI Conference on Human Factors in Computing Systems (CHI '22), April 29–May 5, 2022, New Orleans, LA, USA. ACM, New York, NY, USA</journal><volume/><journalNumber/><paginationStart>1</paginationStart><paginationEnd>17</paginationEnd><publisher>ACM Digital Library</publisher><placeOfPublication>New York, NY, USA</placeOfPublication><isbnPrint>978-1-4503-9157-3</isbnPrint><isbnElectronic/><issnPrint/><issnElectronic/><keywords>Speech/language, automatic speech recognition, mobile devices</keywords><publishedDay>29</publishedDay><publishedMonth>4</publishedMonth><publishedYear>2022</publishedYear><publishedDate>2022-04-29</publishedDate><doi>10.1145/3491102.3517639</doi><url/><notes/><college>COLLEGE NANME</college><department>Mathematics and Computer Science School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>MACS</DepartmentCode><institution>Swansea University</institution><apcterm>External research funder(s) paid the OA fee (includes OA grants disbursed by the Library)</apcterm><funders>UKRI</funders><projectreference>EP/T024976/1</projectreference><lastEdited>2024-07-11T15:42:26.4457754</lastEdited><Created>2022-03-10T14:57:32.7140042</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer 
Science</level></path><authors><author><firstname>Thomas</firstname><surname>Reitmaier</surname><orcid>0000-0003-2078-6699</orcid><order>1</order></author><author><firstname>Electra</firstname><surname>Wallington</surname><order>2</order></author><author><firstname>Dani Kalarikalayil</firstname><surname>Raju</surname><order>3</order></author><author><firstname>Ondrej</firstname><surname>Klejch</surname><order>4</order></author><author><firstname>Jennifer</firstname><surname>Pearson</surname><order>5</order></author><author><firstname>Matt</firstname><surname>Jones</surname><orcid>0000-0001-7657-7373</orcid><order>6</order></author><author><firstname>Peter</firstname><surname>Bell</surname><order>7</order></author><author><firstname>Simon</firstname><surname>Robinson</surname><orcid>0000-0001-9228-006X</orcid><order>8</order></author><author><firstname>Jen</firstname><surname>Pearson</surname><orcid>0000-0002-1960-1012</orcid><order>9</order></author></authors><documents><document><filename>59573__22666__12859841e0c34bc9949e1c476fc39f76.pdf</filename><originalFilename>chi22-533.pdf</originalFilename><uploaded>2022-03-24T14:23:03.5419140</uploaded><type>Output</type><contentLength>1854554</contentLength><contentType>application/pdf</contentType><version>Version of Record</version><cronfaStatus>true</cronfaStatus><documentNotes>Distributed under the terms of a Creative Commons Attribution 4.0 (CC-BY) Licence.</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language><licence>https://creativecommons.org/licenses/by/4.0/</licence></document></documents><OutputDurs/></rfc1807> |
title | Opportunities and Challenges of Automatic Speech Recognition Systems for Low-Resource Language Speakers |
author_id_str_mv | ccd66b64d11d76b9cd8b28e9d42a0ff0 10b46d7843c2ba53d116ca2ed9abb56e cb3b57a21fa4e48ec633d6ba46455e91 6d662d9e2151b302ed384b243e2a802f |
author_id_fullname_str_mv | ccd66b64d11d76b9cd8b28e9d42a0ff0_***_Thomas Reitmaier 10b46d7843c2ba53d116ca2ed9abb56e_***_Matt Jones cb3b57a21fa4e48ec633d6ba46455e91_***_Simon Robinson 6d662d9e2151b302ed384b243e2a802f_***_Jen Pearson |
author | Thomas Reitmaier Matt Jones Simon Robinson Jen Pearson |
author2 | Thomas Reitmaier Electra Wallington Dani Kalarikalayil Raju Ondrej Klejch Jennifer Pearson Matt Jones Peter Bell Simon Robinson Jen Pearson |
format | Conference Paper/Proceeding/Abstract |
container_title | CHI Conference on Human Factors in Computing Systems (CHI '22), April 29–May 5, 2022, New Orleans, LA, USA. ACM, New York, NY, USA |
container_start_page | 1 |
publishDate | 2022 |
institution | Swansea University |
isbn | 978-1-4503-9157-3 |
doi_str_mv | 10.1145/3491102.3517639 |
publisher | ACM Digital Library |
college_str | Faculty of Science and Engineering |
hierarchytype | |
hierarchy_top_id | facultyofscienceandengineering |
hierarchy_top_title | Faculty of Science and Engineering |
hierarchy_parent_id | facultyofscienceandengineering |
hierarchy_parent_title | Faculty of Science and Engineering |
department_str | School of Mathematics and Computer Science - Computer Science{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Mathematics and Computer Science - Computer Science |
document_store_str | 1 |
active_str | 0 |
description | Automatic Speech Recognition (ASR) researchers are turning their attention towards supporting low-resource languages, such as isiXhosa or Marathi, with only limited training resources. We report and reflect on collaborative research across ASR & HCI to situate ASR-enabled technologies to suit the needs and functions of two communities of low-resource language speakers, on the outskirts of Cape Town, South Africa and in Mumbai, India. We build on longstanding community partnerships and draw on linguistics, media studies and HCI scholarship to guide our research. We demonstrate diverse design methods to: remotely engage participants; collect speech data to test ASR models; and ultimately field-test models with users. Reflecting on the research, we identify opportunities, challenges, and use-cases of ASR, in particular to support pervasive use of WhatsApp voice messaging. Finally, we uncover implications for collaborations across ASR & HCI that advance important discussions at CHI surrounding data, ethics, and AI. |
published_date | 2022-04-29T20:10:20Z |
_version_ | 1821346963490078720 |
score | 11.04748 |