With the increased availability of de novo assembly algorithms, it has become feasible to study entire transcriptomes of non-model organisms. Although algorithms designed specifically for transcriptome assembly from high-throughput sequencing data are available, they are memory-intensive, which limits their application to small data sets with few libraries. The strategy used in the current project minimizes memory consumption while obtaining accuracy comparable to or better than that of existing algorithms. It also supports incremental updates of assemblies when new libraries become available. 6 figures, 4 tables, and 31 references (publisher abstract modified)
A Scalable and Memory-Efficient Algorithm for De Novo Transcriptome Assembly of Non-model Organisms
NCJ Number
255269
Journal
BMC Genomics Volume: 18 Dated: 2017
Date Published
2017
Length
26 pages
Annotation
This project developed a transcriptome assembly algorithm that recovers alternatively spliced isoforms and expression levels while using as many RNA-Seq libraries as possible, containing hundreds of gigabases of data. New techniques were developed so that the computations can be performed on a computing cluster with a moderate amount of physical memory.
Abstract