Abstract

A statistical model is proposed for determining whether a pair of documents, one known and one questioned, were written by the same individual. The model has four components: (i) discriminating elements, e.g., global features and characters, are extracted from each document; (ii) differences between corresponding elements of the two documents are computed; (iii) using conditional probability estimates of each difference, the log-likelihood ratio (LLR) is computed for the hypotheses that the documents were written by the same writer or by different writers, where the conditional probability estimates are determined from labeled samples using either Gaussian or gamma estimates for the differences, assuming their statistical independence; and (iv) the distributions of LLRs for same-writer and different-writer pairs are analyzed to calibrate the strength of evidence onto a standard nine-point scale used by questioned document examiners. The model is illustrated with experimental results for a specific set of discriminating elements.
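Component (iii) can be illustrated with a minimal sketch. It assumes the Gaussian option only, and it uses invented toy numbers for the labeled difference samples; the function names (`fit_gaussian`, `gaussian_logpdf`, `log_likelihood_ratio`) and the data are hypothetical and are not taken from the paper. Under the independence assumption stated in the abstract, the LLR is the sum over elements of the log density of the observed difference under the same-writer estimate minus its log density under the different-writer estimate. The gamma option and the mapping to the nine-point scale are not shown.

```python
import math
from statistics import mean, stdev

def fit_gaussian(samples):
    """Estimate (mean, standard deviation) from labeled difference samples."""
    return mean(samples), stdev(samples)

def gaussian_logpdf(x, mu, sigma):
    """Log density of a Gaussian with mean mu and standard deviation sigma."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def log_likelihood_ratio(diffs, same_params, diff_params):
    """Sum per-element log ratios, assuming the differences are independent."""
    llr = 0.0
    for d, (mu_s, sd_s), (mu_d, sd_d) in zip(diffs, same_params, diff_params):
        llr += gaussian_logpdf(d, mu_s, sd_s) - gaussian_logpdf(d, mu_d, sd_d)
    return llr

# Toy labeled data (assumed for illustration): one list of observed
# differences per discriminating element, for each hypothesis.
same_train = [[0.10, 0.20, 0.15, 0.12], [0.30, 0.25, 0.35, 0.28]]
diff_train = [[0.80, 0.90, 0.70, 1.00], [1.10, 0.90, 1.20, 1.00]]

same_params = [fit_gaussian(s) for s in same_train]
diff_params = [fit_gaussian(s) for s in diff_train]

# Small differences resemble the same-writer samples; large ones do not.
llr_close = log_likelihood_ratio([0.14, 0.30], same_params, diff_params)
llr_far = log_likelihood_ratio([0.85, 1.05], same_params, diff_params)
print(llr_close, llr_far)
```

In this sketch a positive LLR favors the same-writer hypothesis and a negative LLR favors the different-writer hypothesis; with the toy data, the small differences give a positive value and the large differences a negative one.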
