Journal article
How Different Is Arabic from Other Languages?
Journal of Applied Linguistics and Language Research, Volume: 3, Issue: 1, Pages: 15 - 35
Swansea University Author: Jim Milton
PDF | Version of Record
Abstract
This study examines Zipf’s law as a predictor of the relationship between word frequency and lexical coverage in Arabic. Zipf’s law has been applied in a number of languages, such as English, French and Greek, and revealed useful information. However, word derivation processes are far more regular and...
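The relationship between word frequency and lexical coverage that the abstract refers to can be illustrated by cumulating a ranked frequency list. The following is a minimal sketch, not the authors' procedure: the function names (`zipf_frequencies`, `coverage_at`, `words_needed`) are hypothetical, and the frequency list is an idealised Zipfian toy in which the item at rank r has weight 1/r, not data from the study's corpus.

```python
def zipf_frequencies(vocab_size, exponent=1.0):
    """Idealised Zipfian weights: the item at rank r gets 1 / r**exponent.

    This is an assumed toy distribution, not a real lemmatised frequency list.
    """
    return [1.0 / rank ** exponent for rank in range(1, vocab_size + 1)]


def coverage_at(frequencies, top_n):
    """Share of all running words accounted for by the top_n most frequent items."""
    ordered = sorted(frequencies, reverse=True)
    total = sum(ordered)
    return sum(ordered[:top_n]) / total


def words_needed(frequencies, target):
    """Smallest number of top-ranked items whose cumulative coverage reaches target."""
    ordered = sorted(frequencies, reverse=True)
    total = sum(ordered)
    running = 0.0
    for rank, freq in enumerate(ordered, start=1):
        running += freq
        if running / total >= target:
            return rank
    return len(ordered)


if __name__ == "__main__":
    # Hypothetical vocabulary of 50,000 items following the toy distribution.
    toy = zipf_frequencies(50_000)
    for n in (1_000, 9_000, 14_000):
        print(f"top {n:>6} items: {coverage_at(toy, n):.1%} coverage")
```

With a real lemmatised frequency list in place of the toy weights, `coverage_at` gives the proportion of corpus tokens covered by the most frequent items, and `words_needed` gives the list size required for a chosen coverage level. How items are grouped into lemmas changes the input list and therefore the resulting figures.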
Published in: | Journal of Applied Linguistics and Language Research |
---|---|
ISSN: | 2376-760X |
Published: | 2016 |
Online Access: | Check full text |
URI: | https://cronfa.swan.ac.uk/Record/cronfa26020 |
first_indexed | 2016-02-02T01:54:45Z |
---|---|
last_indexed | 2018-02-09T05:07:36Z |
id | cronfa26020 |
recordtype | SURis |
fullrecord |
<?xml version="1.0"?><rfc1807><datestamp>2017-08-21T09:57:54.5857275</datestamp><bib-version>v2</bib-version><id>26020</id><entry>2016-01-29</entry><title>How Different Is Arabic from Other Languages?</title><swanseaauthors><author><sid>7d251e1952cec9d77ed4fc21346fec8d</sid><ORCID>0000-0003-0446-1149</ORCID><firstname>Jim</firstname><surname>Milton</surname><name>Jim Milton</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2016-01-29</date><deptcode>CACS</deptcode><abstract>This study examines Zipf’s law as a predictor of the relationship between word frequency and lexical coverage in Arabic. Zipf’s law has been applied in a number of languages, such as English, French and Greek, and revealed useful information. However, word derivation processes are far more regular and extensive in Arabic than they are in English and it is suspected that how words are defined may significantly affect the outcome of this kind of analysis. The concept of the lemma as applied to English could be redrawn for Arabic entirely credibly. In this study, Arabic lemmatised frequency lists generated from a large Web-based corpus have been used to calculate coverage. Results show that Zipf’s law does apply in Arabic, and the findings suggest that the most frequent 9,000 lemmatised words provide approximately 95% coverage, and 14,000 words give nearly 98% coverage. These results suggest that the relationship between word frequency and coverage in Arabic is comparable, to a certain degree, to English and Greek, but not to French. However, the definition of the lemma used in this study is probably more relevant to European languages than to Arabic and if this was changed it would significantly change the results.</abstract><type>Journal Article</type><journal>Journal of Applied Linguistics and Language Research</journal><volume>3</volume><journalNumber>1</journalNumber><paginationStart>15</paginationStart><paginationEnd>35</paginationEnd><publisher/><issnPrint>2376-760X</issnPrint><keywords>Arabic corpus, lexical coverage, word frequency, vocabulary, Zipf’s law</keywords><publishedDay>31</publishedDay><publishedMonth>12</publishedMonth><publishedYear>2016</publishedYear><publishedDate>2016-12-31</publishedDate><doi/><url>http://www.jallr.com/index.php/JALLR/article/view/213</url><notes/><college>COLLEGE NANME</college><department>Culture and Communications School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>CACS</DepartmentCode><institution>Swansea University</institution><apcterm/><lastEdited>2017-08-21T09:57:54.5857275</lastEdited><Created>2016-01-29T10:36:29.1643406</Created><path><level id="1">Faculty of Humanities and Social Sciences</level><level id="2">School of Culture and Communication - English Language, Tesol, Applied Linguistics</level></path><authors><author><firstname>Ahmed</firstname><surname>Masrai</surname><order>1</order></author><author><firstname>Jim</firstname><surname>Milton</surname><orcid>0000-0003-0446-1149</orcid><order>2</order></author></authors><documents><document><filename>0026020-29012016103822.pdf</filename><originalFilename>Wordfrequencyandlexicalcoveragev3.pdf</originalFilename><uploaded>2016-01-29T10:38:22.6530000</uploaded><type>Output</type><contentLength>826740</contentLength><contentType>application/pdf</contentType><version>Version of Record</version><cronfaStatus>true</cronfaStatus><embargoDate>2016-01-29T00:00:00.0000000</embargoDate><copyrightCorrect>true</copyrightCorrect></document></documents><OutputDurs/></rfc1807> |
title | How Different Is Arabic from Other Languages? |
author_id_str_mv | 7d251e1952cec9d77ed4fc21346fec8d |
author_id_fullname_str_mv | 7d251e1952cec9d77ed4fc21346fec8d_***_Jim Milton |
author | Jim Milton |
author2 | Ahmed Masrai, Jim Milton |
format | Journal article |
container_title | Journal of Applied Linguistics and Language Research |
container_volume | 3 |
container_issue | 1 |
container_start_page | 15 |
publishDate | 2016 |
institution | Swansea University |
issn | 2376-760X |
college_str | Faculty of Humanities and Social Sciences |
hierarchytype | |
hierarchy_top_id | facultyofhumanitiesandsocialsciences |
hierarchy_top_title | Faculty of Humanities and Social Sciences |
hierarchy_parent_id | facultyofhumanitiesandsocialsciences |
hierarchy_parent_title | Faculty of Humanities and Social Sciences |
department_str | School of Culture and Communication - English Language, Tesol, Applied Linguistics{{{_:::_}}}Faculty of Humanities and Social Sciences{{{_:::_}}}School of Culture and Communication - English Language, Tesol, Applied Linguistics |
url | http://www.jallr.com/index.php/JALLR/article/view/213 |
document_store_str | 1 |
active_str | 0 |
description | This study examines Zipf’s law as a predictor of the relationship between word frequency and lexical coverage in Arabic. Zipf’s law has been applied in a number of languages, such as English, French and Greek, and revealed useful information. However, word derivation processes are far more regular and extensive in Arabic than they are in English and it is suspected that how words are defined may significantly affect the outcome of this kind of analysis. The concept of the lemma as applied to English could be redrawn for Arabic entirely credibly. In this study, Arabic lemmatised frequency lists generated from a large Web-based corpus have been used to calculate coverage. Results show that Zipf’s law does apply in Arabic, and the findings suggest that the most frequent 9,000 lemmatised words provide approximately 95% coverage, and 14,000 words give nearly 98% coverage. These results suggest that the relationship between word frequency and coverage in Arabic is comparable, to a certain degree, to English and Greek, but not to French. However, the definition of the lemma used in this study is probably more relevant to European languages than to Arabic and if this was changed it would significantly change the results. |
published_date | 2016-12-31T18:50:52Z |
_version_ | 1821341964489981952 |
score | 11.04748 |