NCJRS Virtual Library


Modeling Subpopulations for Hierarchically Structured Data

NCJ Number
310181
Author(s)
Andrew Simpson; Semhar Michael; Dylan Borchert; Christopher Saunders; Larry Tang
Date Published
February 2024
Annotation

This article proposes a semi-supervised mixture modeling approach for modeling subpopulation structure when each collection of samples is known to come from the same source.

Abstract

The field of forensic statistics offers a unique hierarchical data structure in which a population is composed of several subpopulations of sources and a sample is collected from each source. This subpopulation structure creates an additional layer of complexity. Hence, the data has a hierarchical structure in addition to the existence of underlying subpopulations. Finite mixtures are known for modeling heterogeneity; however, previous parameter estimation procedures assume that the data is generated through a simple random sampling process. The authors propose using a semi-supervised mixture modeling approach to model the subpopulation structure, which leverages the fact that each collection of samples is known to come from the same source, but from an unknown subpopulation. A simulation study and a real data analysis based on famous glass datasets and a keystroke dynamics typing dataset show that the proposed approach performs better than other approaches that have been used previously in practice. (Published Abstract Provided)
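
The core idea in the abstract, that all samples from one source must share a single unknown subpopulation, can be illustrated with a small EM sketch. This is not the authors' algorithm: it is a simplified illustration that assumes diagonal-covariance Gaussian components and ignores between-source variation within a subpopulation. The function name `fit_grouped_gmm` and all of its parameters are hypothetical. The only change from ordinary mixture EM is that component responsibilities are computed once per source, from the joint likelihood of that source's samples, rather than once per observation.

```python
import numpy as np

def fit_grouped_gmm(X, source_ids, n_components, n_iter=100, seed=0):
    """Illustrative EM for a Gaussian mixture (diagonal covariances) in which
    every observation from the same source shares one latent component.
    Hypothetical helper, not the published method."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    _, inv = np.unique(np.asarray(source_ids).ravel(), return_inverse=True)
    inv = inv.ravel()
    n_sources, d = inv.max() + 1, X.shape[1]

    # Source centroids, used only to initialise the component means.
    centroids = np.zeros((n_sources, d))
    np.add.at(centroids, inv, X)
    centroids /= np.bincount(inv, minlength=n_sources)[:, None]

    # Farthest-point initialisation over source centroids.
    chosen = [int(rng.integers(n_sources))]
    while len(chosen) < n_components:
        gaps = centroids[:, None, :] - centroids[chosen][None, :, :]
        chosen.append(int((gaps ** 2).sum(axis=2).min(axis=1).argmax()))
    means = centroids[chosen].copy()
    variances = np.tile(X.var(axis=0) + 1e-6, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)

    for _ in range(n_iter):
        # E-step: log density of each observation under each component.
        diff = X[:, None, :] - means[None, :, :]
        log_pdf = -0.5 * np.sum(
            diff ** 2 / variances + np.log(2 * np.pi * variances), axis=2
        )
        # Sum log densities within each source: the source-level constraint.
        log_joint = np.zeros((n_sources, n_components))
        np.add.at(log_joint, inv, log_pdf)
        log_joint += np.log(weights + 1e-300)
        log_joint -= log_joint.max(axis=1, keepdims=True)
        resp = np.exp(log_joint)
        resp /= resp.sum(axis=1, keepdims=True)  # one row per source

        # M-step: mixing weights are proportions of sources, not observations.
        weights = resp.mean(axis=0)
        obs_resp = resp[inv]  # broadcast source responsibilities to observations
        nk = obs_resp.sum(axis=0) + 1e-12
        means = (obs_resp.T @ X) / nk[:, None]
        diff = X[:, None, :] - means[None, :, :]
        variances = np.einsum("nk,nkd->kd", obs_resp, diff ** 2) / nk[:, None] + 1e-6

    return weights, means, variances, resp
```

Because responsibilities are shared within a source, a source with several samples contributes the product of their densities to the E-step, so its subpopulation assignment is typically much sharper than any single observation would give under a simple-random-sampling mixture.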
