NCJRS Virtual Library

The Virtual Library houses over 235,000 criminal justice resources, including all known OJP works.

MPrESS: An R-Package for Accurately Predicting Power for Comparisons of 16S rRNA Microbiome Taxa Distributions including Simulation by Dirichlet Mixture Modeling

NCJ Number
307126
Journal
Microorganisms, Volume 11, Issue 5 (April 2023)
Author(s)
Thomas H. Clarke; Chris Greco; Lauren Brinkac; Karen E. Nelson; Harinder Singh
Date Published
April 2023
Length
10 pages
Annotation

The authors present MPrESS, a novel R software package that enables researchers to determine the minimum number of samples required to address a given study hypothesis with sufficient power using 16S rRNA gene microbiome data. The package also allows users to base power calculations on only a subset of taxa identified by DESeq2.

Abstract

Deep sequencing has revealed that the 16S rRNA gene composition of the human microbiome can vary between populations. However, when existing data are insufficient to address the desired study questions due to limited sample sizes, Dirichlet mixture modeling (DMM) can simulate 16S rRNA gene predictions from experimental microbiome data. The authors examined the extent to which simulated 16S rRNA gene microbiome data can accurately reflect the diversity identified in experimental data and be used to calculate power. Even when experimental and simulated datasets differed by less than 10 percent, simulation by DMM consistently overestimated power, except when only highly discriminating taxa were used. Admixtures of DMM-simulated and experimental data performed poorly compared to pure simulation and did not show the same correlation with the p-values and power values obtained from experimental data. While multiple replications of random sampling remain the favored method of determining power, simulated samples based on DMM can be used when the estimated sample size required to achieve a given power exceeds the number of available samples. The authors introduce an R package, MPrESS, to assist in power calculation and sample size estimation for detecting a difference between populations in a 16S rRNA gene microbiome dataset. MPrESS can be downloaded from GitHub. (Publisher Abstract Provided)
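
The idea of determining power through multiple replications of random sampling can be illustrated with a minimal Python sketch. This is not the MPrESS implementation or its API. It assumes a single taxon's relative abundance per sample and swaps in a simple permutation test on the difference of group means as the significance test. The function names `permutation_p_value` and `estimate_power`, and all parameter names and defaults, are hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def permutation_p_value(a, b, n_perm, rng):
    """Two-sided permutation test on the difference of group means.

    Illustrative stand-in for a real significance test; not the test used by MPrESS.
    """
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        # Shuffle the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)


def estimate_power(group_a, group_b, n_per_group,
                   n_reps=200, n_perm=199, alpha=0.05, seed=0):
    """Estimate power as the fraction of random subsamples with p < alpha.

    group_a, group_b: lists of per-sample values (assumed here to be the
    relative abundance of one taxon) for the two populations.
    n_per_group: number of samples drawn without replacement from each group.
    """
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_reps):
        # random.sample raises ValueError if n_per_group exceeds the group size.
        sub_a = rng.sample(group_a, n_per_group)
        sub_b = rng.sample(group_b, n_per_group)
        if permutation_p_value(sub_a, sub_b, n_perm, rng) < alpha:
            significant += 1
    return significant / n_reps
```

Calling `estimate_power(group_a, group_b, n)` for increasing `n` and stopping at the first `n` whose estimate reaches the target power gives a rough sample size estimate. Because subsamples are drawn without replacement, the sketch cannot evaluate an `n` larger than the available data, which is the situation in which the abstract describes turning to DMM-simulated samples.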