
A quantitative reliability metric for querying large database

NCJ Number
Journal: Forensic Science International, Volume: 331, Dated: February 2022
Date Published
February 2022

This article presents the development of a quantitative reliability metric (QRM), which was validated by identifying opioids from standards and seized drug samples within a constructed mass spectral database (i.e., a library).


Specialized and custom libraries are often used by forensic laboratories. A redesigned quantitative reliability metric based on the F-distribution (QRMf) is reported for evaluating the reliability of library searches. The QRMf provides information orthogonal to the comparison metric (e.g., dot product) and yields a probabilistic result. An intralibrary search can be considered an idealized search because the top hit, i.e., the closest matching object, will match perfectly. If the search of an unknown object yields the same hit list as the intralibrary search, this indicates good reliability. For each object in the hit list, the QRMf compares the order of the intralibrary and interlibrary search results and calculates a variance of interlibrary similarity metrics between the records of the intralibrary search and the records in the corresponding positions of the interlibrary search. This variance, which measures the discordance between the intralibrary and interlibrary searches, can be compared directly to the variance of the similarity metrics within the interlibrary search results. The ratio of these variances follows an F-distribution, which can be used to determine whether the discordance is statistically significant and to generate a probability from the cumulative distribution function. The QRMf works for both similarity and dissimilarity metrics and can be used for any queried object and comparison metric that is searched against a database. In this work, the QRMf was used along with the dot product similarity to query the mass spectra of novel synthetic opioids measured by gas chromatography-mass spectrometry (GC/MS). An automated pipeline was devised that used a basis set correction to assist peak detection. The basis was constructed from mass spectra obtained from the blank measurement preceding the analytical run to remove interferences from column bleed and septum degradation.
After peak detection, the pipeline applied multivariate curve resolution to the chromatographic peak window to remove background components from the mass spectra. The corrected mass spectra were searched against a customized library for identification. The QRMf can be used along with the similarity metric to detect misidentifications and assist in finding the correct identification when it is not the closest match. (Published Abstract Provided)
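
The abstract does not give the QRMf formulas, so the Python sketch below shows only one possible reading of the variance-ratio procedure it describes. The function names (dot_similarity, ranked_hits, qrmf_sketch, f_cdf), the toy four-channel spectra, the mean-squared-difference form of the discordance variance, the degrees of freedom (n and n - 1), and the choice to report one minus the F cumulative distribution function as the reliability are all assumptions made for illustration, not the authors' published definitions.

```python
import math

def _clamp(v, tiny=1e-300):
    """Keep a denominator away from zero."""
    return v if abs(v) > tiny else tiny

def _betacf(a, b, x, max_iter=200, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    c = 1.0
    d = 1.0 / _clamp(1.0 - (a + b) * x / (a + 1.0))
    h = d
    for m in range(1, max_iter + 1):
        even = m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))
        for aa in (even, odd):
            d = 1.0 / _clamp(1.0 + aa * d)
            c = _clamp(1.0 + aa / c)
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def _betai(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    bt = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                  + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b

def f_cdf(f, d1, d2):
    """CDF of the F-distribution with (d1, d2) degrees of freedom."""
    if f <= 0.0:
        return 0.0
    return _betai(d1 / 2.0, d2 / 2.0, d1 * f / (d1 * f + d2))

def dot_similarity(a, b):
    """Normalized dot product between two intensity vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0

def ranked_hits(query, library, n):
    """Names of the n library records most similar to query, best first."""
    return sorted(library, key=lambda k: dot_similarity(query, library[k]),
                  reverse=True)[:n]

def qrmf_sketch(query, library, candidate, n=4):
    """Return (variance ratio, assumed reliability) for one candidate hit.

    Requires 2 <= n <= len(library).
    """
    inter = ranked_hits(query, library, n)               # unknown vs. library
    intra = ranked_hits(library[candidate], library, n)  # candidate vs. library
    s = {k: dot_similarity(query, library[k]) for k in library}
    # Assumed discordance variance: mean squared difference of interlibrary
    # similarities at matching hit-list positions.
    discord = sum((s[a] - s[b]) ** 2 for a, b in zip(intra, inter)) / n
    # Sample variance of the similarities within the interlibrary hit list.
    mean = sum(s[k] for k in inter) / n
    within = sum((s[k] - mean) ** 2 for k in inter) / (n - 1)
    if within == 0.0:
        return (0.0, 1.0) if discord == 0.0 else (math.inf, 0.0)
    ratio = discord / within
    # Assumption: reliability is the upper-tail probability of the ratio.
    return ratio, max(0.0, 1.0 - f_cdf(ratio, n, n - 1))

# Hypothetical toy library (not data from the paper).
library = {
    "A": [100, 10, 0, 5],
    "B": [90, 20, 5, 0],
    "C": [0, 100, 50, 0],
    "D": [5, 0, 100, 40],
    "E": [0, 5, 10, 100],
}
query = [98, 12, 1, 4]  # made-up unknown spectrum resembling record "A"
for name in ranked_hits(query, library, 3):
    print(name, qrmf_sketch(query, library, name))
```

In this reading, a candidate whose intralibrary hit list matches the order of the unknown's hit list gives a variance ratio near zero and a reliability near one, while a candidate whose own neighbors are ordered differently gives a larger ratio and a lower reliability.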

Date Published: February 1, 2022