Journal article

How Different Is Arabic from Other Languages?

Ahmed Masrai, Jim Milton

Journal of Applied Linguistics and Language Research, Volume: 3, Issue: 1, Pages: 15 - 35

Swansea University Author: Jim Milton

Abstract

This study examines Zipf’s law as a predictor of the relationship between word frequency and lexical coverage in Arabic. Zipf’s law has been applied in a number of languages, such as English, French and Greek, and revealed useful information. However, word derivation processes are far more regular and...

Published in: Journal of Applied Linguistics and Language Research
ISSN: 2376-760X
Published: 2016
Online Access: Check full text

URI: https://cronfa.swan.ac.uk/Record/cronfa26020
first_indexed 2016-02-02T01:54:45Z
last_indexed 2018-02-09T05:07:36Z
id cronfa26020
recordtype SURis
fullrecord <?xml version="1.0"?><rfc1807><datestamp>2017-08-21T09:57:54.5857275</datestamp><bib-version>v2</bib-version><id>26020</id><entry>2016-01-29</entry><title>How Different Is Arabic from Other Languages?</title><swanseaauthors><author><sid>7d251e1952cec9d77ed4fc21346fec8d</sid><ORCID>0000-0003-0446-1149</ORCID><firstname>Jim</firstname><surname>Milton</surname><name>Jim Milton</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2016-01-29</date><deptcode>CACS</deptcode><abstract>This study examines Zipf&#x2019;s law as a predictor of the relationship between word frequency and lexical coverage in Arabic. Zipf&#x2019;s law has been applied in a number of languages, such as English, French and Greek, and revealed useful information. However, word derivation processes are far more regular and extensive in Arabic than they are in English and it is suspected that how words are defined may significantly affect the outcome of this kind of analysis. The concept of the lemma as applied to English could be redrawn for Arabic entirely credibly. In this study, Arabic lemmatised frequency lists generated from a large Web-based corpus have been used to calculate coverage. Results show that Zipf&#x2019;s law does apply in Arabic, and the findings suggest that the most frequent 9,000 lemmatised words provide approximately 95% coverage, and 14,000 words give nearly 98% coverage. These results suggest that the relationship between word frequency and coverage in Arabic is comparable, to a certain degree, to English and Greek, but not to French. However, the definition of the lemma used in this study is probably more relevant to European languages than to Arabic and if this was changed it would significantly change the results.</abstract><type>Journal Article</type><journal>Journal of Applied Linguistics and Language Research</journal><volume>3</volume><journalNumber>1</journalNumber><paginationStart>15</paginationStart><paginationEnd>35</paginationEnd><publisher/><issnPrint>2376-760X</issnPrint><keywords>Arabic corpus, lexical coverage, word frequency, vocabulary, Zipf&#x2019;s law</keywords><publishedDay>31</publishedDay><publishedMonth>12</publishedMonth><publishedYear>2016</publishedYear><publishedDate>2016-12-31</publishedDate><doi/><url>http://www.jallr.com/index.php/JALLR/article/view/213</url><notes/><college>COLLEGE NANME</college><department>Culture and Communications School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>CACS</DepartmentCode><institution>Swansea University</institution><apcterm/><lastEdited>2017-08-21T09:57:54.5857275</lastEdited><Created>2016-01-29T10:36:29.1643406</Created><path><level id="1">Faculty of Humanities and Social Sciences</level><level id="2">School of Culture and Communication - English Language, Tesol, Applied Linguistics</level></path><authors><author><firstname>Ahmed</firstname><surname>Masrai</surname><order>1</order></author><author><firstname>Jim</firstname><surname>Milton</surname><orcid>0000-0003-0446-1149</orcid><order>2</order></author></authors><documents><document><filename>0026020-29012016103822.pdf</filename><originalFilename>Wordfrequencyandlexicalcoveragev3.pdf</originalFilename><uploaded>2016-01-29T10:38:22.6530000</uploaded><type>Output</type><contentLength>826740</contentLength><contentType>application/pdf</contentType><version>Version of Record</version><cronfaStatus>true</cronfaStatus><embargoDate>2016-01-29T00:00:00.0000000</embargoDate><copyrightCorrect>true</copyrightCorrect></document></documents><OutputDurs/></rfc1807>
title How Different Is Arabic from Other Languages?
spellingShingle How Different Is Arabic from Other Languages?
Jim Milton
title_short How Different Is Arabic from Other Languages?
title_full How Different Is Arabic from Other Languages?
title_fullStr How Different Is Arabic from Other Languages?
title_full_unstemmed How Different Is Arabic from Other Languages?
title_sort How Different Is Arabic from Other Languages?
author_id_str_mv 7d251e1952cec9d77ed4fc21346fec8d
author_id_fullname_str_mv 7d251e1952cec9d77ed4fc21346fec8d_***_Jim Milton
author Jim Milton
author2 Ahmed Masrai
Jim Milton
format Journal article
container_title Journal of Applied Linguistics and Language Research
container_volume 3
container_issue 1
container_start_page 15
publishDate 2016
institution Swansea University
issn 2376-760X
college_str Faculty of Humanities and Social Sciences
hierarchytype
hierarchy_top_id facultyofhumanitiesandsocialsciences
hierarchy_top_title Faculty of Humanities and Social Sciences
hierarchy_parent_id facultyofhumanitiesandsocialsciences
hierarchy_parent_title Faculty of Humanities and Social Sciences
department_str School of Culture and Communication - English Language, Tesol, Applied Linguistics{{{_:::_}}}Faculty of Humanities and Social Sciences{{{_:::_}}}School of Culture and Communication - English Language, Tesol, Applied Linguistics
url http://www.jallr.com/index.php/JALLR/article/view/213
document_store_str 1
active_str 0
description This study examines Zipf’s law as a predictor of the relationship between word frequency and lexical coverage in Arabic. Zipf’s law has been applied in a number of languages, such as English, French and Greek, and revealed useful information. However, word derivation processes are far more regular and extensive in Arabic than they are in English and it is suspected that how words are defined may significantly affect the outcome of this kind of analysis. The concept of the lemma as applied to English could be redrawn for Arabic entirely credibly. In this study, Arabic lemmatised frequency lists generated from a large Web-based corpus have been used to calculate coverage. Results show that Zipf’s law does apply in Arabic, and the findings suggest that the most frequent 9,000 lemmatised words provide approximately 95% coverage, and 14,000 words give nearly 98% coverage. These results suggest that the relationship between word frequency and coverage in Arabic is comparable, to a certain degree, to English and Greek, but not to French. However, the definition of the lemma used in this study is probably more relevant to European languages than to Arabic and if this was changed it would significantly change the results.
published_date 2016-12-31T18:50:52Z
_version_ 1821341964489981952
score 11.04748