NCJRS Virtual Library


A Scalable and Memory-Efficient Algorithm for De Novo Transcriptome Assembly of Non-model Organisms

NCJ Number
255269
Journal
BMC Genomics, Volume: 18, Dated: 2017
Author(s)
Sing-Hoi Sze; Meaghan L. Pimsler; Jeffrey K. Tomberlin; Corbin D. Jones; Aaron M. Tarone
Date Published
2017
Length
26 pages
Annotation
This project developed a transcriptome assembly algorithm that recovers alternatively spliced isoforms and expression levels while using as many RNA-Seq libraries as possible that contain hundreds of gigabases of data. New techniques were developed so that the computations can be performed on a computing cluster with a moderate amount of physical memory.
Abstract

With the increased availability of de novo assembly algorithms, it is feasible to study entire transcriptomes of non-model organisms. Although algorithms designed specifically for transcriptome assembly from high-throughput sequencing data are available, they are memory-intensive, which limits their application to small data sets with few libraries. The strategy used in the current project minimizes memory consumption while obtaining accuracy comparable to or better than that of existing algorithms. It also supports incremental updates of assemblies when new libraries become available. 6 figures, 4 tables, and 31 references (publisher abstract modified)