Conference Paper/Proceeding/Abstract
Cultivating Spoken Language Technologies for Unwritten Languages
Proceedings of the CHI Conference on Human Factors in Computing Systems
Swansea University Authors: Thomas Reitmaier , Jen Pearson , Matt Jones , Simon Robinson
PDF | Version of Record
This work is licensed under a Creative Commons Attribution International 4.0 License.
DOI (Published version): 10.1145/3613904.3642026
Abstract
We report on community-centered, collaborative research that weaves together HCI, natural language processing, linguistic, and design insights to develop spoken language technologies for unwritten languages. Across three visits to a Banjara farming community in India, we use participatory, technical...
Published in: | Proceedings of the CHI Conference on Human Factors in Computing Systems |
---|---|
ISBN: | 979-8-4007-0330-0 |
Published: | New York, NY, USA: ACM, 2024 |
URI: | https://cronfa.swan.ac.uk/Record/cronfa65595 |
first_indexed | 2024-03-14T11:19:18Z |
---|---|
last_indexed | 2024-11-25T14:16:25Z |
id | cronfa65595 |
recordtype | SURis |
fullrecord |
<?xml version="1.0"?><rfc1807><datestamp>2024-07-11T14:41:10.4215649</datestamp><bib-version>v2</bib-version><id>65595</id><entry>2024-02-08</entry><title>Cultivating Spoken Language Technologies for Unwritten Languages</title><swanseaauthors><author><sid>ccd66b64d11d76b9cd8b28e9d42a0ff0</sid><ORCID>0000-0003-2078-6699</ORCID><firstname>Thomas</firstname><surname>Reitmaier</surname><name>Thomas Reitmaier</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>6d662d9e2151b302ed384b243e2a802f</sid><ORCID>0000-0002-1960-1012</ORCID><firstname>Jen</firstname><surname>Pearson</surname><name>Jen Pearson</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>10b46d7843c2ba53d116ca2ed9abb56e</sid><ORCID>0000-0001-7657-7373</ORCID><firstname>Matt</firstname><surname>Jones</surname><name>Matt Jones</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>cb3b57a21fa4e48ec633d6ba46455e91</sid><ORCID>0000-0001-9228-006X</ORCID><firstname>Simon</firstname><surname>Robinson</surname><name>Simon Robinson</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2024-02-08</date><deptcode>MACS</deptcode><abstract>We report on community-centered, collaborative research that weaves together HCI, natural language processing, linguistic, and design insights to develop spoken language technologies for unwritten languages. Across three visits to a Banjara farming community in India, we use participatory, technical, and creative methods to engage community members, collect spoken language photo annotations, and develop an information retrieval (IR) system. Drawing on orality theory, we interrogate assumptions and biases of current speech interfaces and create a simple application that leverages our IR system to match fluidly spoken queries with recorded annotations and surface corresponding photos. 
In-situ evaluations show how our novel approach returns reliable results and inspired the co-creation of media retrieval use-cases that are more appropriate in oral contexts. The very low (< 4h) spoken data requirements makes our approach adaptable to other contexts where languages are unwritten or have no digital language resources available.</abstract><type>Conference Paper/Proceeding/Abstract</type><journal>Proceedings of the CHI Conference on Human Factors in Computing Systems</journal><volume/><journalNumber/><paginationStart/><paginationEnd/><publisher>ACM</publisher><placeOfPublication>New York, NY, USA</placeOfPublication><isbnPrint/><isbnElectronic>979-8-4007-0330-0</isbnElectronic><issnPrint/><issnElectronic/><keywords>speech/language, zero-resource information retrieval, co-creation field study</keywords><publishedDay>11</publishedDay><publishedMonth>5</publishedMonth><publishedYear>2024</publishedYear><publishedDate>2024-05-11</publishedDate><doi>10.1145/3613904.3642026</doi><url/><notes/><college>COLLEGE NANME</college><department>Mathematics and Computer Science School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>MACS</DepartmentCode><institution>Swansea University</institution><apcterm/><funders/><projectreference/><lastEdited>2024-07-11T14:41:10.4215649</lastEdited><Created>2024-02-08T11:37:14.8656282</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer Science</level></path><authors><author><firstname>Thomas</firstname><surname>Reitmaier</surname><orcid>0000-0003-2078-6699</orcid><order>1</order></author><author><firstname>Dani 
Kalarikalayil</firstname><surname>Raju</surname><orcid>0000-0003-1854-5271</orcid><order>2</order></author><author><firstname>Ondrej</firstname><surname>Klejch</surname><orcid>0000-0001-5495-967x</orcid><order>3</order></author><author><firstname>Electra</firstname><surname>Wallington</surname><orcid>0000-0003-4113-2352</orcid><order>4</order></author><author><firstname>Nina</firstname><surname>Markl</surname><orcid>0000-0001-9906-9961</orcid><order>5</order></author><author><firstname>Jen</firstname><surname>Pearson</surname><orcid>0000-0002-1960-1012</orcid><order>6</order></author><author><firstname>Matt</firstname><surname>Jones</surname><orcid>0000-0001-7657-7373</orcid><order>7</order></author><author><firstname>Peter</firstname><surname>Bell</surname><orcid>0000-0002-9597-9615</orcid><order>8</order></author><author><firstname>Simon</firstname><surname>Robinson</surname><orcid>0000-0001-9228-006X</orcid><order>9</order></author></authors><documents><document><filename>65595__30427__80d0809bb9684e59b260a8cbe19efd14.pdf</filename><originalFilename>65595.VoR.pdf</originalFilename><uploaded>2024-05-21T16:04:59.1576306</uploaded><type>Output</type><contentLength>2156446</contentLength><contentType>application/pdf</contentType><version>Version of Record</version><cronfaStatus>true</cronfaStatus><documentNotes>This work is licensed under a Creative Commons Attribution International 4.0 License.</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language><licence>https://creativecommons.org/licenses/by/4.0/</licence></document></documents><OutputDurs/></rfc1807> |
title | Cultivating Spoken Language Technologies for Unwritten Languages |
author_id_str_mv | ccd66b64d11d76b9cd8b28e9d42a0ff0 6d662d9e2151b302ed384b243e2a802f 10b46d7843c2ba53d116ca2ed9abb56e cb3b57a21fa4e48ec633d6ba46455e91 |
author_id_fullname_str_mv | ccd66b64d11d76b9cd8b28e9d42a0ff0_***_Thomas Reitmaier 6d662d9e2151b302ed384b243e2a802f_***_Jen Pearson 10b46d7843c2ba53d116ca2ed9abb56e_***_Matt Jones cb3b57a21fa4e48ec633d6ba46455e91_***_Simon Robinson |
author | Thomas Reitmaier Jen Pearson Matt Jones Simon Robinson |
author2 | Thomas Reitmaier Dani Kalarikalayil Raju Ondrej Klejch Electra Wallington Nina Markl Jen Pearson Matt Jones Peter Bell Simon Robinson |
format | Conference Paper/Proceeding/Abstract |
container_title | Proceedings of the CHI Conference on Human Factors in Computing Systems |
publishDate | 2024 |
institution | Swansea University |
isbn | 979-8-4007-0330-0 |
doi_str_mv | 10.1145/3613904.3642026 |
publisher | ACM |
college_str | Faculty of Science and Engineering |
hierarchytype | |
hierarchy_top_id | facultyofscienceandengineering |
hierarchy_top_title | Faculty of Science and Engineering |
hierarchy_parent_id | facultyofscienceandengineering |
hierarchy_parent_title | Faculty of Science and Engineering |
department_str | School of Mathematics and Computer Science - Computer Science{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Mathematics and Computer Science - Computer Science |
document_store_str | 1 |
active_str | 0 |
description | We report on community-centered, collaborative research that weaves together HCI, natural language processing, linguistic, and design insights to develop spoken language technologies for unwritten languages. Across three visits to a Banjara farming community in India, we use participatory, technical, and creative methods to engage community members, collect spoken language photo annotations, and develop an information retrieval (IR) system. Drawing on orality theory, we interrogate assumptions and biases of current speech interfaces and create a simple application that leverages our IR system to match fluidly spoken queries with recorded annotations and surface corresponding photos. In-situ evaluations show how our novel approach returns reliable results and inspired the co-creation of media retrieval use-cases that are more appropriate in oral contexts. The very low (< 4h) spoken data requirements make our approach adaptable to other contexts where languages are unwritten or have no digital language resources available. |
published_date | 2024-05-11T05:32:33Z |
_version_ | 1821382334902960128 |
score | 11.04748 |