Journal article
MetaPalette: a k-mer Painting Approach for Metagenomic Taxonomic Profiling and Quantification of Novel Strain Variation
mSystems, Volume: 1, Issue: 3, Start page: e00020-16
Swansea University Author: Daniel Falush
PDF | Version of Record
Distributed under the terms of a Creative Commons CC-BY License.
DOI (Published version): 10.1128/msystems.00020-16
Abstract
Metagenomic profiling is challenging in part because of the highly uneven sampling of the tree of life by genome sequencing projects and the limitations imposed by performing phylogenetic inference at fixed taxonomic ranks. We present the algorithm MetaPalette, which uses long k-mer sizes (k = 30, 5...
Published in: | mSystems |
---|---|
ISSN: | 2379-5077 |
Published: | American Society for Microbiology, 2016 |
URI: | https://cronfa.swan.ac.uk/Record/cronfa34179 |
first_indexed |
2017-07-21T15:19:22Z |
---|---|
last_indexed |
2020-10-28T03:45:46Z |
id |
cronfa34179 |
recordtype |
SURis |
fullrecord |
<?xml version="1.0"?><rfc1807><datestamp>2020-10-27T15:35:38.0018056</datestamp><bib-version>v2</bib-version><id>34179</id><entry>2017-06-07</entry><title>MetaPalette: a k-mer Painting Approach for Metagenomic Taxonomic Profiling and Quantification of Novel Strain Variation</title><swanseaauthors><author><sid>bcbf3802c87745d80f3e94cc66a0fcbf</sid><firstname>Daniel</firstname><surname>Falush</surname><name>Daniel Falush</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2017-06-07</date><deptcode>PMSC</deptcode><abstract>Metagenomic profiling is challenging in part because of the highly uneven sampling of the tree of life by genome sequencing projects and the limitations imposed by performing phylogenetic inference at fixed taxonomic ranks. We present the algorithm MetaPalette, which uses long k-mer sizes (k = 30, 50) to fit a k-mer “palette” of a given sample to the k-mer palette of reference organisms. By modeling the k-mer palettes of unknown organisms, the method also gives an indication of the presence, abundance, and evolutionary relatedness of novel organisms present in the sample. The method returns a traditional, fixed-rank taxonomic profile which is shown on independently simulated data to be one of the most accurate to date. Tree figures are also returned that quantify the relatedness of novel organisms to reference sequences, and the accuracy of such figures is demonstrated on simulated spike-ins and a metagenomic soil sample. The software implementing MetaPalette is available at: https://github.com/dkoslicki/MetaPalette. Pretrained databases are included for Archaea, Bacteria, Eukaryota, and viruses.</abstract><type>Journal Article</type><journal>mSystems</journal><volume>1</volume><journalNumber>3</journalNumber><paginationStart>e00020-16</paginationStart><paginationEnd/><publisher>American Society for Microbiology</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic>2379-5077</issnElectronic><keywords>Taxonomic profiling, Metagenomics, Quantitative methods</keywords><publishedDay>28</publishedDay><publishedMonth>6</publishedMonth><publishedYear>2016</publishedYear><publishedDate>2016-06-28</publishedDate><doi>10.1128/msystems.00020-16</doi><url/><notes>Author Video: An author video summary of this article is available</notes><college>COLLEGE NAME</college><department>Medicine</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>PMSC</DepartmentCode><institution>Swansea University</institution><degreesponsorsfunders>RCUK, MR/M501608/1</degreesponsorsfunders><apcterm/><lastEdited>2020-10-27T15:35:38.0018056</lastEdited><Created>2017-06-07T15:02:58.2359410</Created><path><level id="1">Faculty of Medicine, Health and Life Sciences</level><level id="2">Swansea University Medical School - Medicine</level></path><authors><author><firstname>David</firstname><surname>Koslicki</surname><order>1</order></author><author><firstname>Daniel</firstname><surname>Falush</surname><order>2</order></author></authors><documents><document><filename>0034179-07062017150718.pdf</filename><originalFilename>Falush.e00020-16.full.pdf</originalFilename><uploaded>2017-06-07T15:07:18.0270000</uploaded><type>Output</type><contentLength>5408399</contentLength><contentType>application/pdf</contentType><version>Version of Record</version><cronfaStatus>true</cronfaStatus><embargoDate>2017-06-07T00:00:00.0000000</embargoDate><documentNotes>Distributed under the terms of a Creative Commons CC-BY License.</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language></document></documents><OutputDurs/></rfc1807> |
title |
MetaPalette: a k-mer Painting Approach for Metagenomic Taxonomic Profiling and Quantification of Novel Strain Variation |
author_id_str_mv |
bcbf3802c87745d80f3e94cc66a0fcbf |
author_id_fullname_str_mv |
bcbf3802c87745d80f3e94cc66a0fcbf_***_Daniel Falush |
author |
Daniel Falush |
author2 |
David Koslicki, Daniel Falush |
format |
Journal article |
container_title |
mSystems |
container_volume |
1 |
container_issue |
3 |
container_start_page |
e00020-16 |
publishDate |
2016 |
institution |
Swansea University |
issn |
2379-5077 |
doi_str_mv |
10.1128/msystems.00020-16 |
publisher |
American Society for Microbiology |
college_str |
Faculty of Medicine, Health and Life Sciences |
hierarchytype |
|
hierarchy_top_id |
facultyofmedicinehealthandlifesciences |
hierarchy_top_title |
Faculty of Medicine, Health and Life Sciences |
hierarchy_parent_id |
facultyofmedicinehealthandlifesciences |
hierarchy_parent_title |
Faculty of Medicine, Health and Life Sciences |
department_str |
Swansea University Medical School - Medicine{{{_:::_}}}Faculty of Medicine, Health and Life Sciences{{{_:::_}}}Swansea University Medical School - Medicine |
document_store_str |
1 |
active_str |
0 |
description |
Metagenomic profiling is challenging in part because of the highly uneven sampling of the tree of life by genome sequencing projects and the limitations imposed by performing phylogenetic inference at fixed taxonomic ranks. We present the algorithm MetaPalette, which uses long k-mer sizes (k = 30, 50) to fit a k-mer “palette” of a given sample to the k-mer palette of reference organisms. By modeling the k-mer palettes of unknown organisms, the method also gives an indication of the presence, abundance, and evolutionary relatedness of novel organisms present in the sample. The method returns a traditional, fixed-rank taxonomic profile which is shown on independently simulated data to be one of the most accurate to date. Tree figures are also returned that quantify the relatedness of novel organisms to reference sequences, and the accuracy of such figures is demonstrated on simulated spike-ins and a metagenomic soil sample. The software implementing MetaPalette is available at: https://github.com/dkoslicki/MetaPalette. Pretrained databases are included for Archaea, Bacteria, Eukaryota, and viruses. |
published_date |
2016-06-28T03:42:23Z |
_version_ |
1763751959089643520 |
score |
11.037603 |