No Cover Image

Conference Paper/Proceeding/Abstract 547 views 311 downloads

Evaluation of a Permutation-Based Evolutionary Framework for Lyndon Factorizations

Lily Major Orcid Logo, Amanda Clare Orcid Logo, Jacqueline W. Daykin Orcid Logo, Benjamin Mora Orcid Logo, Leonel Jose Peña Gamboa Orcid Logo, Christine Zarges Orcid Logo

Parallel Problem Solving from Nature – PPSN XVI, Pages: 390 - 403

Swansea University Author: Benjamin Mora Orcid Logo

Abstract

String factorization is an important tool for partitioning data for parallel processing and other algorithmic techniques often found in the context of big data applications such as bioinformatics or compression. Duval’s well-known algorithm uniquely factors a string over an ordered alphabet into Lyn...

Full description

Published in: Parallel Problem Solving from Nature – PPSN XVI
ISBN: 9783030581114 9783030581121
ISSN: 0302-9743 1611-3349
Published: Cham Springer International Publishing 2020
Online Access: Check full text

URI: https://cronfa.swan.ac.uk/Record/cronfa54801
Tags: Add Tag
No Tags, Be the first to tag this record!
Abstract: String factorization is an important tool for partitioning data for parallel processing and other algorithmic techniques often found in the context of big data applications such as bioinformatics or compression. Duval’s well-known algorithm uniquely factors a string over an ordered alphabet into Lyndon words, i.e., patterned strings which arestrictly smaller than all of their cyclic rotations. While Duval’s algorithm produces a pre-determined factorization, modern applications motivate the demand for factorizations with specific properties, e.g., those that minimize the number of factors or consist of factors with similar lengths. In this paper, we consider the problem of finding an alphabet ordering that yields a Lyndon factorization with such properties. We introduce a flexible evolutionary framework and evaluate it on biological sequence data. For the minimization case, we also propose a new problem-specific heuristic, Flexi-Duval, and a problem-specific mutation operator for Lyndon factorization. Our results show that our framework is competitive with Flexi-Duval for minimization and yields high quality and robust solutions for balancing where no problem-specific algorithm is available.
Keywords: Alphabet Ordering; Biosequences; Duval’s Algorithm; Lyndon words; Permutations; String Factorization
College: Faculty of Science and Engineering
Start Page: 390
End Page: 403