University of Cape Town

Data from: Worth the fuss? Maximising informativeness for target capture-based phylogenomics in Erica (Ericaceae)

dataset
posted on 2024-10-02, 07:09 authored by Seth Musker, Michael Pirie, Nicolai Nuerk

Plant phylogenetics has been revolutionised in the genomic era, with target capture acting as the primary workhorse of most recent research in the new field of phylogenomics. Target capture (aka Hyb-Seq) allows researchers to sequence hundreds of genomic regions (loci) of their choosing, at relatively low cost per sample, from which to derive phylogenetically informative data. Although this highly flexible and widely applicable method has rightly earned its place as the field’s de facto standard, it does not come without its challenges. In particular, users have to specify which loci to sequence. This is a surprisingly difficult task, especially when working with non-model groups, as it requires pre-existing genomic resources in the form of assembled genomes and/or transcriptomes. In the absence of taxon-specific genomic resources, target sets exist that are designed to work across broad taxonomic scales. However, the highly conserved loci that they target may lack informativeness for difficult phylogenetic problems, such as that presented by the rapid radiation of Erica in southern Africa. We designed a target set for Erica phylogenomics intended to maximise informativeness and minimise paralogy while maintaining universality by including genes from the widely used Angiosperms353 set. Comprising just over 300 genes, the targets had excellent recovery rates in roughly 90 Erica species as well as outgroups from Calluna, Daboecia, and Rhododendron, and had high information content as measured by parsimony informative sites and Quartet Internode Resolution Probability (QIRP) at shallow nodes. Notably, QIRP was positively correlated with intron content, while including introns in targets (rather than recovering them via exon-flanking “bycatch”) substantially improved intron recovery. Overall, our results show the value of building a custom target set, and we provide a suite of open-source tools that can be used to replicate our approach in other groups (github.com/SethMusker/TargetVet).

Funding

Deutsche Forschungsgemeinschaft (PI 1169/1-2)

Department/Unit

University of Cape Town Department of Biological Sciences