E-Thesis 677 views 902 downloads

Comparative Evaluation of Translation Memory (TM) and Machine Translation (MT) Systems in Translation between Arabic and English / KHALED MILAD

Swansea University Author: KHALED MILAD

  • Comparative Evaluation of Translation Memory (TM) and Machine Translation (MT) Systems in Translation between Arabic and English.final.pdf

    PDF | E-Thesis – open access

    © 2021 by Khaled Mamer Ben Milad. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

DOI (Published version): 10.23889/SUthesis.57439

Abstract

In general, advances in translation technology tools have enhanced translation quality significantly. Unfortunately, however, it seems that this is not the case for all language pairs. A concern arises when the users of translation tools want to work between different language families such as Arabi...

Published: Swansea 2021
Institution: Swansea University
Degree level: Doctoral
Degree name: Ph.D
Supervisor: Rothwell, Andrew; Parra, Maria Fernandez
URI: https://cronfa.swan.ac.uk/Record/cronfa57439
first_indexed 2021-07-22T15:29:12Z
last_indexed 2021-08-19T03:32:38Z
id cronfa57439
recordtype RisThesis
fullrecord <?xml version="1.0"?><rfc1807><datestamp>2021-08-18T16:28:50.6970089</datestamp><bib-version>v2</bib-version><id>57439</id><entry>2021-07-22</entry><title>Comparative Evaluation of Translation Memory (TM) and Machine Translation (MT) Systems in Translation between Arabic and English</title><swanseaauthors><author><sid>e1cf6009b6a1ecae6d60ad61b530dcb8</sid><firstname>KHALED</firstname><surname>MILAD</surname><name>KHALED MILAD</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2021-07-22</date><abstract>In general, advances in translation technology tools have enhanced translation quality significantly. Unfortunately, however, it seems that this is not the case for all language pairs. A concern arises when the users of translation tools want to work between different language families such as Arabic and English. The main problems facing Arabic&lt;&gt;English translation tools lie in Arabic&#x2019;s characteristic free word order, richness of word inflection &#x2013; including orthographic ambiguity &#x2013; and optionality of diacritics, in addition to a lack of data resources. The aim of this study is to compare the performance of translation memory (TM) and machine translation (MT) systems in translating between Arabic and English. The research evaluates the two systems based on specific criteria relating to needs and expected results. The first part of the thesis evaluates the performance of a set of well-known TM systems when retrieving a segment of text that includes an Arabic linguistic feature. As it is widely known that TM matching metrics are based solely on the use of edit distance string measurements, it was expected that the aforementioned issues would lead to a low match percentage. The second part of the thesis evaluates multiple MT systems that use the mainstream neural machine translation (NMT) approach to translation quality.
Due to a lack of training data resources and its rich morphology, it was anticipated that Arabic features would reduce the translation quality of this corpus-based approach. The systems&#x2019; output was evaluated using both automatic evaluation metrics including BLEU and hLEPOR, and TAUS human quality ranking criteria for adequacy and fluency. The study employed a black-box testing methodology to experimentally examine the TM systems through a test suite instrument and also to translate Arabic English sentences to collect the MT systems&#x2019; output. A translation threshold was used to evaluate the fuzzy matches of TM systems, while an online survey was used to collect participants&#x2019; responses to the quality of the MT systems&#x2019; output. The experiments&#x2019; input of both systems was extracted from Arabic&lt;&gt;English corpora, which was examined by means of quantitative data analysis. The results show that, when retrieving translations, the current TM matching metrics are unable to recognise Arabic features and score them appropriately. In terms of automatic translation, MT produced good results for adequacy, especially when translating from Arabic to English, but the systems&#x2019; output appeared to need post-editing for fluency. Moreover, when retrieving from Arabic, it was found that short sentences were handled much better by MT than by TM.
The findings may be given as recommendations to software developers.</abstract><type>E-Thesis</type><journal/><volume/><journalNumber/><paginationStart/><paginationEnd/><publisher/><placeOfPublication>Swansea</placeOfPublication><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic/><keywords>Translation memory retrieval, Fuzzy matches, Neural machine translation, Machine translation systems, Adequacy and Fluency, Arabic&amp;lt;&amp;gt;English translation</keywords><publishedDay>22</publishedDay><publishedMonth>7</publishedMonth><publishedYear>2021</publishedYear><publishedDate>2021-07-22</publishedDate><doi>10.23889/SUthesis.57439</doi><url/><notes>Figure 1.3 is excluded from the Creative Commons License as copyright rests with the original author and is reproduced under the section 30 educational exception of CDPA 1988.</notes><college>COLLEGE NANME</college><CollegeCode>COLLEGE CODE</CollegeCode><institution>Swansea University</institution><supervisor>Rothwell, Andrew., Parra, Maria Fernandez .</supervisor><degreelevel>Doctoral</degreelevel><degreename>Ph.D</degreename><degreesponsorsfunders>Libyan Government</degreesponsorsfunders><apcterm/><lastEdited>2021-08-18T16:28:50.6970089</lastEdited><Created>2021-07-22T16:12:32.1998688</Created><path><level id="1">Faculty of Humanities and Social Sciences</level><level id="2">School of Culture and Communication - Modern Languages, Translation, and Interpreting</level></path><authors><author><firstname>KHALED</firstname><surname>MILAD</surname><order>1</order></author></authors><documents><document><filename>57439__20442__4c79f9697fd2405abd4f4e5a203c30d3.pdf</filename><originalFilename>Comparative Evaluation of Translation Memory (TM) and Machine Translation (MT) Systems in Translation between Arabic and English.final.pdf</originalFilename><uploaded>2021-07-22T17:05:19.8235187</uploaded><type>Output</type><contentLength>2368779</contentLength><contentType>application/pdf</contentType><version>E-Thesis &#x2013; open 
access</version><cronfaStatus>true</cronfaStatus><documentNotes>&#xA9; 2021 by Khaled Mamer Ben Milad. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language><licence>http://creativecommons.org/licenses/by/4.0/</licence></document></documents><OutputDurs/></rfc1807>
title Comparative Evaluation of Translation Memory (TM) and Machine Translation (MT) Systems in Translation between Arabic and English
author_id_str_mv e1cf6009b6a1ecae6d60ad61b530dcb8
author_id_fullname_str_mv e1cf6009b6a1ecae6d60ad61b530dcb8_***_KHALED MILAD
author KHALED MILAD
author2 KHALED MILAD
format E-Thesis
publishDate 2021
institution Swansea University
doi_str_mv 10.23889/SUthesis.57439
college_str Faculty of Humanities and Social Sciences
hierarchytype
hierarchy_top_id facultyofhumanitiesandsocialsciences
hierarchy_top_title Faculty of Humanities and Social Sciences
hierarchy_parent_id facultyofhumanitiesandsocialsciences
hierarchy_parent_title Faculty of Humanities and Social Sciences
department_str School of Culture and Communication - Modern Languages, Translation, and Interpreting{{{_:::_}}}Faculty of Humanities and Social Sciences{{{_:::_}}}School of Culture and Communication - Modern Languages, Translation, and Interpreting
document_store_str 1
active_str 0
description In general, advances in translation technology tools have enhanced translation quality significantly. Unfortunately, however, it seems that this is not the case for all language pairs. A concern arises when the users of translation tools want to work between different language families such as Arabic and English. The main problems facing Arabic<>English translation tools lie in Arabic’s characteristic free word order, richness of word inflection – including orthographic ambiguity – and optionality of diacritics, in addition to a lack of data resources. The aim of this study is to compare the performance of translation memory (TM) and machine translation (MT) systems in translating between Arabic and English. The research evaluates the two systems based on specific criteria relating to needs and expected results. The first part of the thesis evaluates the performance of a set of well-known TM systems when retrieving a segment of text that includes an Arabic linguistic feature. As it is widely known that TM matching metrics are based solely on the use of edit distance string measurements, it was expected that the aforementioned issues would lead to a low match percentage. The second part of the thesis evaluates multiple MT systems that use the mainstream neural machine translation (NMT) approach to translation quality. Due to a lack of training data resources and its rich morphology, it was anticipated that Arabic features would reduce the translation quality of this corpus-based approach. The systems’ output was evaluated using both automatic evaluation metrics including BLEU and hLEPOR, and TAUS human quality ranking criteria for adequacy and fluency. The study employed a black-box testing methodology to experimentally examine the TM systems through a test suite instrument and also to translate Arabic English sentences to collect the MT systems’ output.
A translation threshold was used to evaluate the fuzzy matches of TM systems, while an online survey was used to collect participants’ responses to the quality of the MT systems’ output. The experiments’ input of both systems was extracted from Arabic<>English corpora, which was examined by means of quantitative data analysis. The results show that, when retrieving translations, the current TM matching metrics are unable to recognise Arabic features and score them appropriately. In terms of automatic translation, MT produced good results for adequacy, especially when translating from Arabic to English, but the systems’ output appeared to need post-editing for fluency. Moreover, when retrieving from Arabic, it was found that short sentences were handled much better by MT than by TM. The findings may be given as recommendations to software developers.
published_date 2021-07-22T04:13:11Z