Conference Paper/Proceeding/Abstract

Prompt Balance Matters: Understanding How Imbalanced Few-Shot Learning Affects Multilingual Sense Disambiguation in LLMs

Deshan Sumanathilaka, Nicholas Micallef, Julian Hough

Proceedings of the Workshop on Beyond English: Natural Language Processing for all Languages in an Era of Large Language Models, Pages: 7 - 15

Swansea University Authors: Deshan Sumanathilaka, Nicholas Micallef, Julian Hough

  • GlobalNLP_2025_Submission.pdf

    PDF | Accepted Manuscript

    Author accepted manuscript document released under the terms of a Creative Commons CC-BY licence using the Swansea University Research Publications Policy (rights retention).

    Download (461.14KB)

Abstract

Recent advances in Large Language Models (LLMs) have significantly reshaped the landscape of Natural Language Processing (NLP). Among the various prompting techniques, few-shot prompting has gained considerable attention for its practicality and effectiveness. This study investigates how few-shot pr...

Published in: Proceedings of the Workshop on Beyond English: Natural Language Processing for all Languages in an Era of Large Language Models
ISBN: 978-954-452-105-9
Published: Shoumen, Bulgaria INCOMA Ltd. 2025
Online Access: https://acl-bg.org/proceedings/2025/GlobalNLP%202025/index.html
URI: https://cronfa.swan.ac.uk/Record/cronfa70198
first_indexed 2025-08-19T12:29:58Z
last_indexed 2025-11-07T05:09:10Z
id cronfa70198
recordtype SURis
fullrecord <?xml version="1.0"?><rfc1807><datestamp>2025-11-05T12:53:12.8095236</datestamp><bib-version>v2</bib-version><id>70198</id><entry>2025-08-19</entry><title>Prompt Balance Matters: Understanding How Imbalanced Few-Shot Learning Affects Multilingual Sense Disambiguation in LLMs</title><swanseaauthors><author><sid>2fe44f0c1e7d845dc21bb6b00d5b2085</sid><ORCID>0009-0005-8933-6559</ORCID><firstname>Deshan</firstname><surname>Sumanathilaka</surname><name>Deshan Sumanathilaka</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>1cc4c84582d665b7ee08fb16f5454671</sid><ORCID>0000-0002-2683-8042</ORCID><firstname>Nicholas</firstname><surname>Micallef</surname><name>Nicholas Micallef</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>082d773ae261d2bbf49434dd2608ab40</sid><ORCID>0000-0002-4345-6759</ORCID><firstname>Julian</firstname><surname>Hough</surname><name>Julian Hough</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2025-08-19</date><deptcode>MACS</deptcode><abstract>Recent advances in Large Language Models (LLMs) have significantly reshaped the landscape of Natural Language Processing (NLP). Among the various prompting techniques, few-shot prompting has gained considerable attention for its practicality and effectiveness. This study investigates how few-shot prompting strategies impact the Word Sense Disambiguation (WSD) task, particularly focusing on the biases introduced by imbalanced sample distributions. We use the GLOSSGPT prompting method, an advanced approach for English WSD, to test its effectiveness across five languages: English, German, Spanish, French, and Italian. Our results show that imbalanced few-shot examples can cause incorrect sense predictions in multilingual languages, but this issue does not appear in English. 
To assess model behavior, we evaluate both the GPT-4o and LLaMA-3.1-70B models and the results highlight the sensitivity of multilingual WSD to sample distribution in few-shot settings, emphasizing the need for balanced and representative prompting strategies.</abstract><type>Conference Paper/Proceeding/Abstract</type><journal>Proceedings of the Workshop on Beyond English: Natural Language Processing for all Languages in an Era of Large Language Models</journal><volume/><journalNumber/><paginationStart>7</paginationStart><paginationEnd>15</paginationEnd><publisher>INCOMA Ltd.</publisher><placeOfPublication>Shoumen, Bulgaria</placeOfPublication><isbnPrint/><isbnElectronic>978-954-452-105-9</isbnElectronic><issnPrint/><issnElectronic/><keywords/><publishedDay>12</publishedDay><publishedMonth>9</publishedMonth><publishedYear>2025</publishedYear><publishedDate>2025-09-12</publishedDate><doi/><url>https://acl-bg.org/proceedings/2025/GlobalNLP%202025/index.html</url><notes/><college>COLLEGE NANME</college><department>Mathematics and Computer Science School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>MACS</DepartmentCode><institution>Swansea University</institution><apcterm>Not Required</apcterm><funders>Hough&#x2019;s work is supported by the EPSRC grant EP/X009343/1 &#x2018;FLUIDITY&#x2019;.</funders><projectreference/><lastEdited>2025-11-05T12:53:12.8095236</lastEdited><Created>2025-08-19T13:23:59.7274514</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer 
Science</level></path><authors><author><firstname>Deshan</firstname><surname>Sumanathilaka</surname><orcid>0009-0005-8933-6559</orcid><order>1</order></author><author><firstname>Nicholas</firstname><surname>Micallef</surname><orcid>0000-0002-2683-8042</orcid><order>2</order></author><author><firstname>Julian</firstname><surname>Hough</surname><orcid>0000-0002-4345-6759</orcid><order>3</order></author></authors><documents><document><filename>70198__34970__42f21359da4b40b2890548861d3bdf10.pdf</filename><originalFilename>GlobalNLP_2025_Submission.pdf</originalFilename><uploaded>2025-08-19T13:26:51.6417858</uploaded><type>Output</type><contentLength>472207</contentLength><contentType>application/pdf</contentType><version>Accepted Manuscript</version><cronfaStatus>true</cronfaStatus><documentNotes>Author accepted manuscript document released under the terms of a Creative Commons CC-BY licence using the Swansea University Research Publications Policy (rights retention).</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language><licence>https://creativecommons.org/licenses/by/4.0/deed.en</licence></document></documents><OutputDurs/></rfc1807>
title Prompt Balance Matters: Understanding How Imbalanced Few-Shot Learning Affects Multilingual Sense Disambiguation in LLMs
author_id_str_mv 2fe44f0c1e7d845dc21bb6b00d5b2085
1cc4c84582d665b7ee08fb16f5454671
082d773ae261d2bbf49434dd2608ab40
author_id_fullname_str_mv 2fe44f0c1e7d845dc21bb6b00d5b2085_***_Deshan Sumanathilaka
1cc4c84582d665b7ee08fb16f5454671_***_Nicholas Micallef
082d773ae261d2bbf49434dd2608ab40_***_Julian Hough
author Deshan Sumanathilaka
Nicholas Micallef
Julian Hough
author2 Deshan Sumanathilaka
Nicholas Micallef
Julian Hough
format Conference Paper/Proceeding/Abstract
container_title Proceedings of the Workshop on Beyond English: Natural Language Processing for all Languages in an Era of Large Language Models
container_start_page 7
publishDate 2025
institution Swansea University
isbn 978-954-452-105-9
publisher INCOMA Ltd.
college_str Faculty of Science and Engineering
hierarchytype
hierarchy_top_id facultyofscienceandengineering
hierarchy_top_title Faculty of Science and Engineering
hierarchy_parent_id facultyofscienceandengineering
hierarchy_parent_title Faculty of Science and Engineering
department_str School of Mathematics and Computer Science - Computer Science{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Mathematics and Computer Science - Computer Science
url https://acl-bg.org/proceedings/2025/GlobalNLP%202025/index.html
document_store_str 1
active_str 0
description Recent advances in Large Language Models (LLMs) have significantly reshaped the landscape of Natural Language Processing (NLP). Among the various prompting techniques, few-shot prompting has gained considerable attention for its practicality and effectiveness. This study investigates how few-shot prompting strategies impact the Word Sense Disambiguation (WSD) task, particularly focusing on the biases introduced by imbalanced sample distributions. We use the GLOSSGPT prompting method, an advanced approach for English WSD, to test its effectiveness across five languages: English, German, Spanish, French, and Italian. Our results show that imbalanced few-shot examples can cause incorrect sense predictions in multilingual languages, but this issue does not appear in English. To assess model behavior, we evaluate both the GPT-4o and LLaMA-3.1-70B models and the results highlight the sensitivity of multilingual WSD to sample distribution in few-shot settings, emphasizing the need for balanced and representative prompting strategies.
published_date 2025-09-12T05:30:14Z