Policy briefing report

Using Artificial Intelligence and Machine Learning to Identify Terrorist Content Online

Stuart Macdonald, Ashley Mattheis, David Wells

Swansea University Authors: Stuart Macdonald, Ashley Mattheis, David Wells

Abstract

The focus of this report is the use of automated content-based tools – in particular those that use artificial intelligence (AI) and machine learning – to detect terrorist content online. In broad terms, such tools follow either a matching-based or a classification-based approach. Matching-based app...

Published: 2024
Online Access: https://tate.techagainstterrorism.org/news/tcoaireport
URI: https://cronfa.swan.ac.uk/Record/cronfa65450
first_indexed 2024-01-15T17:06:58Z
last_indexed 2024-01-15T17:06:58Z
id cronfa65450
recordtype SURis
fullrecord <?xml version="1.0" encoding="utf-8"?><rfc1807 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><bib-version>v2</bib-version><id>65450</id><entry>2024-01-15</entry><title>Using Artificial Intelligence and Machine Learning to Identify Terrorist Content Online</title><swanseaauthors><author><sid>933e714a4cc37c3ac12d4edc277f8f98</sid><ORCID>0000-0002-7483-9023</ORCID><firstname>Stuart</firstname><surname>Macdonald</surname><name>Stuart Macdonald</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>20bd641e721999fbea309db74f2d60c5</sid><firstname>Ashley</firstname><surname>Mattheis</surname><name>Ashley Mattheis</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>d3eb40ca96e1df1931ef054d32fbc4cf</sid><firstname>David</firstname><surname>Wells</surname><name>David Wells</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2024-01-15</date><deptcode>LAWD</deptcode><abstract>The focus of this report is the use of automated content-based tools – in particular those that use artificial intelligence (AI) and machine learning – to detect terrorist content online. In broad terms, such tools follow either a matching-based or a classification-based approach. Matching-based approaches rely on a technique known as hashing. The report explains the distinction between cryptographic hashing and perceptual hashing, explaining that tech companies have tended to rely on the latter for the purposes of content moderation. Classification-based approaches typically involve using a large corpus of texts, which have been manually annotated by human reviewers, to train algorithms to predict whether a new item of content belongs to a particular category (e.g., terrorist content). 
This approach also raises important issues, including the difficulties compiling a dataset to train the algorithms, the temporal, contextual and cultural limitations of machine learning algorithms, and the resultant danger of incorrect outcomes. In the light of this discussion, the report concludes that human input remains necessary and that oversight mechanisms are essential to correct errors and ensure accountability. It also considers capacity-building measures, including off-the-shelf content moderation solutions and collaborative initiatives, as well as potential future development of AI to address some of the challenges identified.</abstract><type>Policy briefing report</type><journal/><volume/><journalNumber/><paginationStart/><paginationEnd/><publisher/><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic/><keywords>Terrorism, counterterrorism, AI, machine learning, content moderation, social media</keywords><publishedDay>15</publishedDay><publishedMonth>1</publishedMonth><publishedYear>2024</publishedYear><publishedDate>2024-01-15</publishedDate><doi/><url>https://tate.techagainstterrorism.org/news/tcoaireport</url><notes>https://tate.techagainstterrorism.org/news/tcoaireport</notes><college>COLLEGE NANME</college><department>Law</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>LAWD</DepartmentCode><institution>Swansea University</institution><apcterm>Not Required</apcterm><funders>European Union</funders><projectreference>ISFP-2021-AG-TCO-101080101</projectreference><lastEdited>2024-03-23T12:06:52.3067121</lastEdited><Created>2024-01-15T17:02:07.1410064</Created><path><level id="1">Faculty of Humanities and Social Sciences</level><level id="2">Hilary Rodham Clinton School of 
Law</level></path><authors><author><firstname>Stuart</firstname><surname>Macdonald</surname><orcid>0000-0002-7483-9023</orcid><order>1</order></author><author><firstname>Ashley</firstname><surname>Mattheis</surname><order>2</order></author><author><firstname>David</firstname><surname>Wells</surname><order>3</order></author></authors><documents/><OutputDurs/></rfc1807>
title Using Artificial Intelligence and Machine Learning to Identify Terrorist Content Online
title_short Using Artificial Intelligence and Machine Learning to Identify Terrorist Content Online
title_full Using Artificial Intelligence and Machine Learning to Identify Terrorist Content Online
title_fullStr Using Artificial Intelligence and Machine Learning to Identify Terrorist Content Online
title_full_unstemmed Using Artificial Intelligence and Machine Learning to Identify Terrorist Content Online
title_sort Using Artificial Intelligence and Machine Learning to Identify Terrorist Content Online
author_id_str_mv 933e714a4cc37c3ac12d4edc277f8f98
20bd641e721999fbea309db74f2d60c5
d3eb40ca96e1df1931ef054d32fbc4cf
author_id_fullname_str_mv 933e714a4cc37c3ac12d4edc277f8f98_***_Stuart Macdonald
20bd641e721999fbea309db74f2d60c5_***_Ashley Mattheis
d3eb40ca96e1df1931ef054d32fbc4cf_***_David Wells
author Stuart Macdonald
Ashley Mattheis
David Wells
author2 Stuart Macdonald
Ashley Mattheis
David Wells
format Policy briefing report
publishDate 2024
institution Swansea University
college_str Faculty of Humanities and Social Sciences
hierarchytype
hierarchy_top_id facultyofhumanitiesandsocialsciences
hierarchy_top_title Faculty of Humanities and Social Sciences
hierarchy_parent_id facultyofhumanitiesandsocialsciences
hierarchy_parent_title Faculty of Humanities and Social Sciences
department_str Hilary Rodham Clinton School of Law{{{_:::_}}}Faculty of Humanities and Social Sciences{{{_:::_}}}Hilary Rodham Clinton School of Law
url https://tate.techagainstterrorism.org/news/tcoaireport
document_store_str 0
active_str 0
description The focus of this report is the use of automated content-based tools – in particular those that use artificial intelligence (AI) and machine learning – to detect terrorist content online. In broad terms, such tools follow either a matching-based or a classification-based approach. Matching-based approaches rely on a technique known as hashing. The report explains the distinction between cryptographic hashing and perceptual hashing, explaining that tech companies have tended to rely on the latter for the purposes of content moderation. Classification-based approaches typically involve using a large corpus of texts, which have been manually annotated by human reviewers, to train algorithms to predict whether a new item of content belongs to a particular category (e.g., terrorist content). This approach also raises important issues, including the difficulties compiling a dataset to train the algorithms, the temporal, contextual and cultural limitations of machine learning algorithms, and the resultant danger of incorrect outcomes. In the light of this discussion, the report concludes that human input remains necessary and that oversight mechanisms are essential to correct errors and ensure accountability. It also considers capacity-building measures, including off-the-shelf content moderation solutions and collaborative initiatives, as well as potential future development of AI to address some of the challenges identified.
published_date 2024-01-15T12:06:49Z
_version_ 1794318647265591296
score 11.036815