NCJ Number
253077
Date Published
July 2017
Length
205 pages
Annotation
This is the Final Technical Report on the findings and methodology of a project that addressed the lack of appropriate reference data and database tools required for the routine application of forensic mitochondrial DNA (mtDNA) in criminal investigations.
Abstract
The objectives of the project were 1) to increase the large-scale availability of high-quality entire mitochondrial genome (mtGenome) reference population data and 2) to improve the information technology infrastructure required to access/search mtGenome data and use them in forensic casework. The first objective was addressed by developing a Sanger-based sequencing strategy that was performed in high-throughput fashion on robotic instrumentation. Using this strategy and an intensive, multi-step data review process, the project produced 588 full mitochondrial genome haplotypes from anonymized, randomly sampled blood serum specimens from three U.S. population groups (African-Americans, U.S. Caucasians, and U.S. Hispanics). Nearly complete resolution of the haplotypes was achieved with full mtGenome sequences for the three populations. Comparisons to published control region datasets showed that the databases developed are as representative as the reference data on which haplotype frequency estimates currently rely. The second objective was achieved by modifying the existing structure of the European DNA Profiling Group mtDNA Population database (EMPOP) to both store and query full mtGenome reference data. In addition, the utility of the database for forensic applications was improved by adding several new features. These included software that performs automated mtDNA haplogroup estimations for both full and partial mtGenome sequences, updated population structure schemes for all mtDNA data currently housed in the database, and various tools that permit both searches and visualization of the geographic distribution of mtDNA haplogroups, sequences, and individual sequence variants. Extensive appended data compiled in the study