NCJRS Virtual Library

Machine learning clustering and classification of human microbiome source body sites

NCJ Number
311033
Journal
Forensic Science International Volume: 328 Dated: November 2021
Author(s)
Antonio L. Tan-Torres; J. Paul Brooks; Baneshwar Singh; Sarah Seashols-Williams
Date Published
November 2021
Abstract

Distinct microbial signatures associated with specific human body sites can play a role in the identification of biological materials recovered from a crime scene, but at present, methods capable of predicting the origin of biological materials from such signatures are limited. Metagenomic sequencing and machine learning (ML) offer a promising enhancement to current identification protocols. We use ML for forensic source body site identification using shotgun metagenomic sequencing data to verify the presence of microbiomic signatures capable of discriminating between source body sites and then show that accurate prediction is possible. The consistency between cluster membership and actual source body site (purity) exceeded 99% at the genus level using off-the-shelf ML clustering algorithms. Similar results were obtained at the family level. Accurate predictions were observed for genus, family, and order taxonomies, as well as with a core set of 51 genera. The accurate outcomes from our replicable process should encourage forensic scientists to seriously consider integrating ML predictors into their source body site identification protocols.

(Publisher abstract provided.)
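
The abstract reports cluster purity, the consistency between cluster membership and actual source body site. As an illustration only, the sketch below shows one common way such a purity score can be computed: each cluster is credited with its most frequent true label, and the credited counts are divided by the total number of samples. This is not the authors' code. The function name `cluster_purity`, the cluster assignments, and the body site labels are hypothetical toy values chosen for the example, not data from the study.

```python
from collections import Counter, defaultdict


def cluster_purity(cluster_ids, true_labels):
    """Return the fraction of samples whose true label matches the
    majority true label of the cluster they were assigned to."""
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have equal length")
    if len(cluster_ids) == 0:
        raise ValueError("at least one sample is required")

    # Count how often each true label occurs inside each cluster.
    label_counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        label_counts[cluster][label] += 1

    # Credit each cluster with the size of its majority label.
    majority_total = sum(max(counts.values()) for counts in label_counts.values())
    return majority_total / len(true_labels)


# Hypothetical toy data: 10 samples, 3 clusters, 3 body site labels.
clusters = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
sites = ["oral", "oral", "oral", "skin", "skin", "gut", "gut", "gut", "gut", "oral"]

print(cluster_purity(clusters, sites))  # 0.8
```

In the toy data, the three clusters contribute 3, 2, and 3 majority-label samples, giving 8 of 10. A purity above 99%, as reported in the abstract, would mean that nearly every sample falls in a cluster dominated by its own source body site.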