Combining Probabilistic Genotyping and Kinship Analysis with DBLR™ Software

Written by Maarten Kruijver

Six years ago, the team that revolutionized forensic DNA interpretation by creating the probabilistic genotyping software STRmix™ introduced the first version of DBLR™, a highly configurable tool for the calculation of likelihood ratios (LRs) for forensic DNA evidence.

DBLR™ (Database Likelihood Ratios) is designed to quickly calculate likelihood ratios. It facilitates rapid searches of DNA profiles against databases of individuals. The tool also enables specialized testing of DNA samples, potentially including complex familial relationships.

Since its inception, DBLR™ has undergone numerous improvements – all of which have helped to make DBLR™ an essential tool for casework activities such as identifying unidentified human remains, investigating the expected value of evidence, and generating intelligence for cold cases. Recent improvements mean that both STR and SNP evidential profiles generated using Next Generation Sequencing (NGS) technology can be imported into DBLR™ with LRs assigned for various scenarios.

Most of the changes to DBLR™ have been to the Kinship module, which is extremely flexible and powerful and enables the user to create virtually any pedigree of interest. The prime purpose of the Kinship module is to test which pedigree best explains the observed DNA profiles. This applies to simple relationships such as paternity as well as to more complex relationships.

Due to the flexible nature of the software, the Kinship module can be applied to a wide range of problems beyond relationship testing. Below we discuss three ways in which the Kinship module can be used. The first application concerns identification of a missing person. The other two applications demonstrate the utility of the Kinship module for complex evidence evaluation when comparing a Person of Interest (POI) to a DNA mixture.

Database Search for Missing Persons

It is now possible to infer the probable genotypes of a pedigree member based on family references. This is useful in cases where the POI’s reference profile is unavailable. Using the genotypes of family members, a list of the missing person’s potential genotypes can be inferred, much as STRmix™ infers the genotypes of contributors to DNA mixtures through deconvolution. The inferred genotypes can then be used in a database search, which is a type of familial search. The Kinship module within DBLR™ is more powerful than traditional approaches to familial searching, which are limited to pairwise relationships with known individuals (parents, children, or siblings). In DBLR™, data from arbitrary known references in a pedigree can be used in a combined analysis.

The more distant the family members are, the less powerful identification through their DNA references becomes. DBLR™ implements a simulation tool to quickly estimate the range of expected LRs for the true pedigree members, as well as the rate of adventitious matching based on the inferred genotypes of the missing person. This enables the analyst to decide whether the family references are sufficient for identification or if further information is needed.
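
The idea behind such a simulation can be illustrated with a small Monte Carlo sketch. This is not how DBLR™ is implemented; it is a deliberately simplified illustration that assumes a single reference parent, ten independent loci sharing one set of hypothetical allele frequencies, no mutation, and a hypothetical LR threshold of 100. All names and values in the code are illustrative assumptions.

```python
import random

# Hypothetical allele frequencies, reused for every locus (illustrative only)
FREQS = {"12": 0.2, "13": 0.3, "14": 0.4, "15": 0.1}
N_LOCI = 10


def draw_allele(rng):
    """Draw one allele from the population frequencies."""
    return rng.choices(list(FREQS), weights=list(FREQS.values()))[0]


def draw_genotype(rng):
    """Draw an unrelated person's genotype (Hardy-Weinberg assumed)."""
    return tuple(sorted((draw_allele(rng), draw_allele(rng))))


def p_unrelated(g):
    a, b = g
    return FREQS[a] ** 2 if a == b else 2 * FREQS[a] * FREQS[b]


def p_child_given_parent(g, parent):
    """P(child genotype | one known parent), other parent unknown, no mutation."""
    a, b = g
    total = 0.0
    for passed in parent:  # each parental allele is passed on with probability 1/2
        if a == b:
            total += 0.5 * (FREQS[a] if a == passed else 0.0)
        else:
            total += 0.5 * ((FREQS[b] if a == passed else 0.0)
                            + (FREQS[a] if b == passed else 0.0))
    return total


def profile_lr(candidate, parent):
    """LR for 'candidate is a child of parent' versus 'unrelated', over all loci."""
    lr = 1.0
    for g, p in zip(candidate, parent):
        lr *= p_child_given_parent(g, p) / p_unrelated(g)
    return lr


def simulate(n=2000, threshold=100.0, seed=1):
    """Return approximate 5th, 50th and 95th percentile LRs for true children,
    and the fraction of unrelated candidates reaching the threshold."""
    rng = random.Random(seed)
    true_lrs, false_hits = [], 0
    for _ in range(n):
        parent = [draw_genotype(rng) for _ in range(N_LOCI)]
        child = [tuple(sorted((rng.choice(p), draw_allele(rng)))) for p in parent]
        stranger = [draw_genotype(rng) for _ in range(N_LOCI)]
        true_lrs.append(profile_lr(child, parent))
        if profile_lr(stranger, parent) >= threshold:
            false_hits += 1
    true_lrs.sort()
    return true_lrs[n // 20], true_lrs[n // 2], true_lrs[-(n // 20) - 1], false_hits / n


if __name__ == "__main__":
    low, median, high, rate = simulate()
    print(f"LR for true children: 5% {low:.3g}, median {median:.3g}, 95% {high:.3g}")
    print(f"Adventitious rate at LR >= 100: {rate:.4f}")
```

If the lower end of the LR range for true relatives sits close to the LRs that unrelated candidates reach by chance, the available references are unlikely to be sufficient on their own.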

Consider a child abduction case. You don’t have a suitable reference profile for the child to load to a missing persons database. You do have the reference profiles of the child’s biological mother and two full siblings. The genotype of the missing child can be inferred using the genotypes of the biological mother and the two biological siblings. The weighted list of genotypes can be exported and used in several ways. It can be compared directly to a recovered evidential profile or can be searched against a database of unidentified remains. If no match is obtained, it can be searched again automatically whenever new profiles are added to a database of unidentified remains.
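
For a single locus, the inference in this example can be sketched as follows. This is a minimal illustration rather than the DBLR™ algorithm: it assumes hypothetical allele frequencies, Hardy-Weinberg proportions for the untyped father, no mutation, and fully typed references. The function and variable names are made up for the example.

```python
from collections import defaultdict
from itertools import combinations_with_replacement

# Hypothetical allele frequencies at one STR locus (illustrative only)
FREQS = {"12": 0.2, "13": 0.3, "14": 0.4, "15": 0.1}


def genotype_prior(g, freqs):
    """Population probability of an unordered genotype (Hardy-Weinberg assumed)."""
    a, b = g
    return freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]


def child_dist(mother, father):
    """Mendelian distribution of a child's genotype given both parents."""
    dist = defaultdict(float)
    for m in mother:
        for f in father:
            dist[tuple(sorted((m, f)))] += 0.25
    return dist


def infer_missing_child(mother, siblings, freqs):
    """Weighted genotype list for a missing child, given the mother and full
    siblings, summing over every possible genotype of the untyped father."""
    posterior = defaultdict(float)
    for father in combinations_with_replacement(sorted(freqs), 2):
        dist = child_dist(mother, father)
        # Weight of this father: prior times likelihood of the typed siblings
        weight = genotype_prior(father, freqs)
        for s in siblings:
            weight *= dist.get(tuple(sorted(s)), 0.0)
        if weight == 0.0:
            continue
        for g, p in dist.items():
            posterior[g] += weight * p
    total = sum(posterior.values())
    if total == 0.0:
        raise ValueError("references are inconsistent with the pedigree")
    return {g: w / total for g, w in posterior.items()}


if __name__ == "__main__":
    mother = ("12", "13")
    siblings = [("12", "14"), ("13", "14")]
    weighted = infer_missing_child(mother, siblings, FREQS)
    for g, w in sorted(weighted.items(), key=lambda kv: -kv[1]):
        print(g, round(w, 4))
```

The resulting weighted list plays the same role as a deconvolution: each candidate genotype carries a weight, and a database candidate is scored by how much weight its genotype receives relative to how common that genotype is in the population.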

DNA Mixtures with Related Contributors

Probabilistic genotyping tools such as STRmix™ are widely used to compare a POI to a DNA mixture. The weight of evidence for contribution to the mixture is given by the LR. In this analysis, the different contributors to the mixture are traditionally assumed to be unrelated to each other. This assumption is not always met. Using the Kinship module within DBLR™, we can also assign LRs for mixtures where the donors are assumed to be related.

For example, say we have recovered a mixed DNA profile believed to originate from two contributors. In addition, we have a reference profile from one POI. The standard propositions used are:

H1: The DNA originated from the POI and one other individual, unrelated to the POI;

H2: The DNA originated from two unknown individuals who are unrelated to the POI and to each other.

If case circumstances indicate that the two donors are in fact full siblings, then two further propositions become of interest:

H1S: The DNA originated from the POI and a full sibling of the POI;

H2S: The DNA originated from two full siblings, who are unrelated to the POI.

This second proposition pair still evaluates support for the contribution of the POI to the mixed profile. Unlike the first proposition pair, however, it does not assume that the mixture donors are unrelated to each other. It has been shown that correctly assuming relatedness between contributors allows for more efficient discrimination of true contributors and non-contributors. The Kinship module within DBLR™ makes it straightforward to assign LRs for propositions involving relatives as contributors to DNA mixtures.
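
To make the difference between the two proposition pairs concrete, here is a single-locus sketch. It assumes that a deconvolution has already produced weights for ordered genotype pairs (contributor 1, contributor 2), that the POI is equally likely a priori to occupy either contributor position, and that allele frequencies are known. The weights, frequencies, and function names are hypothetical and do not reflect the STRmix™ or DBLR™ interfaces.

```python
# Hypothetical allele frequencies and deconvolution weights (illustrative only)
FREQS = {"A": 0.5, "B": 0.3, "C": 0.2}
WEIGHTS = {
    (("A", "B"), ("A", "C")): 0.6,
    (("A", "C"), ("A", "B")): 0.1,
    (("A", "A"), ("B", "C")): 0.2,
    (("B", "C"), ("A", "A")): 0.1,
}


def p_unrelated(g, f):
    """Population probability of a genotype (Hardy-Weinberg assumed)."""
    a, b = g
    return f[a] ** 2 if a == b else 2 * f[a] * f[b]


def p_one_ibd(g, rel, f):
    """P(genotype g | exactly one allele shared identical by descent with rel)."""
    a, b = g
    total = 0.0
    for shared in rel:
        if a == b:
            total += 0.5 * (f[a] if a == shared else 0.0)
        else:
            total += 0.5 * ((f[b] if a == shared else 0.0)
                            + (f[a] if b == shared else 0.0))
    return total


def p_sibling(g, rel, f):
    """P(genotype g | full sibling has genotype rel), IBD coefficients 1/4, 1/2, 1/4."""
    same = 1.0 if g == rel else 0.0
    return 0.25 * p_unrelated(g, f) + 0.5 * p_one_ibd(g, rel, f) + 0.25 * same


def prob_evidence(weights, f, poi=None, siblings=False):
    """Sum of weight times prior genotype-pair probability under one proposition."""
    total = 0.0
    for (g1, g2), w in weights.items():
        if poi is None:  # both donors unknown
            second = p_sibling(g2, g1, f) if siblings else p_unrelated(g2, f)
            prior = p_unrelated(g1, f) * second
        else:  # POI is one donor, in either position (assumed equally likely)
            prior = 0.0
            for own, other in ((g1, g2), (g2, g1)):
                if own == poi:
                    p = p_sibling(other, poi, f) if siblings else p_unrelated(other, f)
                    prior += 0.5 * p
        total += w * prior
    return total


if __name__ == "__main__":
    poi = ("A", "B")
    lr_unrelated = prob_evidence(WEIGHTS, FREQS, poi) / prob_evidence(WEIGHTS, FREQS)
    lr_siblings = (prob_evidence(WEIGHTS, FREQS, poi, siblings=True)
                   / prob_evidence(WEIGHTS, FREQS, siblings=True))
    print("LR (H1 vs H2):  ", lr_unrelated)
    print("LR (H1S vs H2S):", lr_siblings)
```

The only thing that changes between the two pairs is the prior probability assigned to the second donor's genotype; the deconvolution weights themselves are reused unchanged.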

To Condition or Not to Condition Should Not Be the Question

Conditioning on the presence of an assumed contributor to a DNA mixture allows more efficient discrimination of mixture contributors and non-contributors. In some cases it is obvious that the DNA of a known donor can be assumed to be present in a DNA mixture, for instance because the DNA of a complainant is expected to be present in evidence obtained from a sexual assault kit. In other cases, it is possible but not certain that the DNA from a person relevant to the case is present on an item. Traditionally, such cases require a binary decision to either assume contribution or assume non-contribution of this person. With DBLR™ it is possible to combine both alternatives in a single proposition where each alternative is assigned a prior weight.

This method can be used to evaluate the probability of the evidence given complex propositions that consider multiple sub-propositions with different assumptions. These assumptions are specified through so-called probabilistic links specifying DNA contributions of persons to samples. Probabilistic links can be useful in situations where there is ambiguity, uncertainty or even contention about assumptions made in the hypotheses, such as assuming a contributor’s DNA is present.

For example, if the evidence submitted for examination is a swab from the steering wheel of a recovered vehicle and it results in a mixed DNA profile, it may be contentious to condition on the contribution of the vehicle owner’s DNA. Using probabilistic links, we can set up sub-hypotheses under both H1 and H2: one in which the owner is assumed to be a contributor and one in which the owner is not. A single LR is reported that considers the probability of the evidence given all scenarios, with the most likely scenarios contributing the most weight to the combined statistic.
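
The arithmetic behind combining the sub-hypotheses is the law of total probability. The sketch below assumes the same prior weight for the owner's presence under H1 and H2 and uses made-up probabilities; the function names are illustrative and are not part of DBLR™.

```python
def total_probability(p_e_given_present, p_e_given_absent, prior_present):
    """P(E | H) when H mixes 'owner present' and 'owner absent' sub-propositions."""
    return (prior_present * p_e_given_present
            + (1.0 - prior_present) * p_e_given_absent)


def combined_lr(h1, h2, prior_present):
    """Single LR over all scenarios.

    h1 and h2 are pairs (P(E | owner present), P(E | owner absent))
    evaluated under H1 and H2 respectively.
    """
    return (total_probability(*h1, prior_present)
            / total_probability(*h2, prior_present))


if __name__ == "__main__":
    # Hypothetical probabilities of the evidence under each sub-proposition
    h1 = (1e-10, 1e-12)
    h2 = (1e-14, 1e-15)
    for prior in (0.0, 0.5, 0.9, 1.0):
        print(f"prior weight {prior}: LR = {combined_lr(h1, h2, prior):.4g}")
```

With a prior weight of 1 the result reduces to the fully conditioned LR, and with 0 to the unconditioned one. For intermediate weights, the sub-proposition under which the evidence is more probable dominates each sum.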

Conclusion

When used in conjunction with STRmix™ and FaSTR™ DNA, DBLR™ forms part of a comprehensive forensic workflow, from DNA analysis to interpretation, evaluation, and database searching. The Kinship module of the DBLR™ software leverages STRmix™ deconvolutions to enable complex DNA evidence evaluation involving relatives.

Maarten Kruijver is a forensic statistician who has published widely on the application of statistical methods to legal and forensic contexts, particularly in the context of DNA evidence. His work focuses on the use of DNA in kinship testing, missing persons identification, and criminal investigations. Since 2016, he has been part of the STRmix™ team, where he developed the DBLR™ software for evidence evaluation of single-source and mixed DNA profiles. He is also involved in research and development related to products such as FaSTR™ for profile analysis and STRmix™ for mixture deconvolution. For more information, visit http://www.strmix.com.