No Cover Image

Conference Paper/Proceeding/Abstract 578 views 314 downloads

Evaluation of a Permutation-Based Evolutionary Framework for Lyndon Factorizations

Lily Major Orcid Logo, Amanda Clare Orcid Logo, Jacqueline W. Daykin Orcid Logo, Benjamin Mora Orcid Logo, Leonel Jose Peña Gamboa Orcid Logo, Christine Zarges Orcid Logo

Parallel Problem Solving from Nature – PPSN XVI, Pages: 390 - 403

Swansea University Author: Benjamin Mora Orcid Logo

Abstract

String factorization is an important tool for partitioning data for parallel processing and other algorithmic techniques often found in the context of big data applications such as bioinformatics or compression. Duval’s well-known algorithm uniquely factors a string over an ordered alphabet into Lyn...

Full description

Published in: Parallel Problem Solving from Nature – PPSN XVI
ISBN: 9783030581114 9783030581121
ISSN: 0302-9743 1611-3349
Published: Cham Springer International Publishing 2020
Online Access: Check full text

URI: https://cronfa.swan.ac.uk/Record/cronfa54801
Tags: Add Tag
No Tags, Be the first to tag this record!
first_indexed 2020-09-30T11:31:21Z
last_indexed 2022-06-16T03:12:39Z
id cronfa54801
recordtype SURis
fullrecord <?xml version="1.0"?><rfc1807><datestamp>2022-06-15T15:23:15.8847059</datestamp><bib-version>v2</bib-version><id>54801</id><entry>2020-07-24</entry><title>Evaluation of a Permutation-Based Evolutionary Framework for Lyndon Factorizations</title><swanseaauthors><author><sid>557f93dfae240600e5bd4398bf203821</sid><ORCID>0000-0002-2945-3519</ORCID><firstname>Benjamin</firstname><surname>Mora</surname><name>Benjamin Mora</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2020-07-24</date><deptcode>SCS</deptcode><abstract>String factorization is an important tool for partitioning data for parallel processing and other algorithmic techniques often found in the context of big data applications such as bioinformatics or compression. Duval&#x2019;s well-known algorithm uniquely factors a string over an ordered alphabet into Lyndon words, i.e., patterned strings which arestrictly smaller than all of their cyclic rotations. While Duval&#x2019;s algorithm produces a pre-determined factorization, modern applications motivate the demand for factorizations with specific properties, e.g., those that minimize the number of factors or consist of factors with similar lengths. In this paper, we consider the problem of finding an alphabet ordering that yields a Lyndon factorization with such properties. We introduce a flexible evolutionary framework and evaluate it on biological sequence data. For the minimization case, we also propose a new problem-specific heuristic, Flexi-Duval, and a problem-specific mutation operator for Lyndon factorization. Our results show that our framework is competitive with Flexi-Duval for minimization and yields high quality and robust solutions for balancing where no problem-specific algorithm is available.</abstract><type>Conference Paper/Proceeding/Abstract</type><journal>Parallel Problem Solving from Nature &#x2013; PPSN XVI</journal><volume/><journalNumber/><paginationStart>390</paginationStart><paginationEnd>403</paginationEnd><publisher>Springer International Publishing</publisher><placeOfPublication>Cham</placeOfPublication><isbnPrint>9783030581114</isbnPrint><isbnElectronic>9783030581121</isbnElectronic><issnPrint>0302-9743</issnPrint><issnElectronic>1611-3349</issnElectronic><keywords>Alphabet Ordering; Biosequences; Duval&#x2019;s Algorithm; Lyndon words; Permutations; String Factorization</keywords><publishedDay>31</publishedDay><publishedMonth>8</publishedMonth><publishedYear>2020</publishedYear><publishedDate>2020-08-31</publishedDate><doi>10.1007/978-3-030-58112-1_27</doi><url/><notes/><college>COLLEGE NANME</college><department>Computer Science</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>SCS</DepartmentCode><institution>Swansea University</institution><apcterm/><lastEdited>2022-06-15T15:23:15.8847059</lastEdited><Created>2020-07-24T11:17:37.9307366</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer Science</level></path><authors><author><firstname>Lily</firstname><surname>Major</surname><orcid>0000-0002-5783-8432</orcid><order>1</order></author><author><firstname>Amanda</firstname><surname>Clare</surname><orcid>0000-0001-8315-3659</orcid><order>2</order></author><author><firstname>Jacqueline W.</firstname><surname>Daykin</surname><orcid>0000-0003-1123-8703</orcid><order>3</order></author><author><firstname>Benjamin</firstname><surname>Mora</surname><orcid>0000-0002-2945-3519</orcid><order>4</order></author><author><firstname>Leonel Jose Pe&#xF1;a</firstname><surname>Gamboa</surname><orcid>0000-0002-7034-8438</orcid><order>5</order></author><author><firstname>Christine</firstname><surname>Zarges</surname><orcid>0000-0002-2829-4296</orcid><order>6</order></author></authors><documents><document><filename>54801__18115__865b7f643488407eb508b4b3c6f81283.pdf</filename><originalFilename>PPSN_2020___Lyndon_Factorization(7).pdf</originalFilename><uploaded>2020-09-04T10:16:38.3683944</uploaded><type>Output</type><contentLength>1991706</contentLength><contentType>application/pdf</contentType><version>Accepted Manuscript</version><cronfaStatus>true</cronfaStatus><copyrightCorrect>true</copyrightCorrect><language>English</language></document></documents><OutputDurs/></rfc1807>
spelling 2022-06-15T15:23:15.8847059 v2 54801 2020-07-24 Evaluation of a Permutation-Based Evolutionary Framework for Lyndon Factorizations 557f93dfae240600e5bd4398bf203821 0000-0002-2945-3519 Benjamin Mora Benjamin Mora true false 2020-07-24 SCS String factorization is an important tool for partitioning data for parallel processing and other algorithmic techniques often found in the context of big data applications such as bioinformatics or compression. Duval’s well-known algorithm uniquely factors a string over an ordered alphabet into Lyndon words, i.e., patterned strings which arestrictly smaller than all of their cyclic rotations. While Duval’s algorithm produces a pre-determined factorization, modern applications motivate the demand for factorizations with specific properties, e.g., those that minimize the number of factors or consist of factors with similar lengths. In this paper, we consider the problem of finding an alphabet ordering that yields a Lyndon factorization with such properties. We introduce a flexible evolutionary framework and evaluate it on biological sequence data. For the minimization case, we also propose a new problem-specific heuristic, Flexi-Duval, and a problem-specific mutation operator for Lyndon factorization. Our results show that our framework is competitive with Flexi-Duval for minimization and yields high quality and robust solutions for balancing where no problem-specific algorithm is available. Conference Paper/Proceeding/Abstract Parallel Problem Solving from Nature – PPSN XVI 390 403 Springer International Publishing Cham 9783030581114 9783030581121 0302-9743 1611-3349 Alphabet Ordering; Biosequences; Duval’s Algorithm; Lyndon words; Permutations; String Factorization 31 8 2020 2020-08-31 10.1007/978-3-030-58112-1_27 COLLEGE NANME Computer Science COLLEGE CODE SCS Swansea University 2022-06-15T15:23:15.8847059 2020-07-24T11:17:37.9307366 Faculty of Science and Engineering School of Mathematics and Computer Science - Computer Science Lily Major 0000-0002-5783-8432 1 Amanda Clare 0000-0001-8315-3659 2 Jacqueline W. Daykin 0000-0003-1123-8703 3 Benjamin Mora 0000-0002-2945-3519 4 Leonel Jose Peña Gamboa 0000-0002-7034-8438 5 Christine Zarges 0000-0002-2829-4296 6 54801__18115__865b7f643488407eb508b4b3c6f81283.pdf PPSN_2020___Lyndon_Factorization(7).pdf 2020-09-04T10:16:38.3683944 Output 1991706 application/pdf Accepted Manuscript true true English
title Evaluation of a Permutation-Based Evolutionary Framework for Lyndon Factorizations
spellingShingle Evaluation of a Permutation-Based Evolutionary Framework for Lyndon Factorizations
Benjamin Mora
title_short Evaluation of a Permutation-Based Evolutionary Framework for Lyndon Factorizations
title_full Evaluation of a Permutation-Based Evolutionary Framework for Lyndon Factorizations
title_fullStr Evaluation of a Permutation-Based Evolutionary Framework for Lyndon Factorizations
title_full_unstemmed Evaluation of a Permutation-Based Evolutionary Framework for Lyndon Factorizations
title_sort Evaluation of a Permutation-Based Evolutionary Framework for Lyndon Factorizations
author_id_str_mv 557f93dfae240600e5bd4398bf203821
author_id_fullname_str_mv 557f93dfae240600e5bd4398bf203821_***_Benjamin Mora
author Benjamin Mora
author2 Lily Major
Amanda Clare
Jacqueline W. Daykin
Benjamin Mora
Leonel Jose Peña Gamboa
Christine Zarges
format Conference Paper/Proceeding/Abstract
container_title Parallel Problem Solving from Nature – PPSN XVI
container_start_page 390
publishDate 2020
institution Swansea University
isbn 9783030581114
9783030581121
issn 0302-9743
1611-3349
doi_str_mv 10.1007/978-3-030-58112-1_27
publisher Springer International Publishing
college_str Faculty of Science and Engineering
hierarchytype
hierarchy_top_id facultyofscienceandengineering
hierarchy_top_title Faculty of Science and Engineering
hierarchy_parent_id facultyofscienceandengineering
hierarchy_parent_title Faculty of Science and Engineering
department_str School of Mathematics and Computer Science - Computer Science{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Mathematics and Computer Science - Computer Science
document_store_str 1
active_str 0
description String factorization is an important tool for partitioning data for parallel processing and other algorithmic techniques often found in the context of big data applications such as bioinformatics or compression. Duval’s well-known algorithm uniquely factors a string over an ordered alphabet into Lyndon words, i.e., patterned strings which arestrictly smaller than all of their cyclic rotations. While Duval’s algorithm produces a pre-determined factorization, modern applications motivate the demand for factorizations with specific properties, e.g., those that minimize the number of factors or consist of factors with similar lengths. In this paper, we consider the problem of finding an alphabet ordering that yields a Lyndon factorization with such properties. We introduce a flexible evolutionary framework and evaluate it on biological sequence data. For the minimization case, we also propose a new problem-specific heuristic, Flexi-Duval, and a problem-specific mutation operator for Lyndon factorization. Our results show that our framework is competitive with Flexi-Duval for minimization and yields high quality and robust solutions for balancing where no problem-specific algorithm is available.
published_date 2020-08-31T04:08:34Z
_version_ 1763753605941166080
score 11.016771