Publication detail

Fusing linguistic and acoustic information for automated forensic speaker comparison

SERGIDOU, E.; YPMA, R.; ROHDIN, J.; WORRING, M.; GERADTS, Z.; BOSMA, W.

Original title

Fusing linguistic and acoustic information for automated forensic speaker comparison

Type

journal article indexed in Web of Science, Jimp

Language

English

Original abstract

Verifying the speaker of a speech fragment can be crucial in attributing a crime to a suspect. The question can be addressed given disputed and reference speech material, adopting the recommended and scientifically accepted likelihood ratio framework for reporting evidential strength in court. In forensic practice, usually, auditory and acoustic analyses are performed to carry out such a verification task considering a diversity of features, such as language competence, pronunciation, or other linguistic features. Automated speaker comparison systems can also be used alongside those manual analyses. State-of-the-art automatic speaker comparison systems are based on deep neural networks that take acoustic features as input. Additional information, though, may be obtained from linguistic analysis. In this paper, we aim to answer if, when and how modern acoustic-based systems can be complemented by an authorship technique based on frequent words, within the likelihood ratio framework. We consider three different approaches to derive a combined likelihood ratio: using a support vector machine algorithm, fitting bivariate normal distributions, and passing the score of the acoustic system as additional input to the frequent-word analysis. We apply our method to the forensically relevant dataset FRIDA and the FISHER corpus, and we explore under which conditions fusion is valuable. We evaluate our results in terms of log likelihood ratio cost (Cllr) and equal error rate (EER). We show that fusion can be beneficial, especially in the case of intercepted phone calls with noise in the background.
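As a rough illustration of two ideas named in the abstract, the sketch below fits one bivariate normal distribution to same-speaker score pairs and another to different-speaker score pairs, takes the ratio of the two densities as a fused likelihood ratio, and scores a set of likelihood ratios with the log likelihood ratio cost (Cllr). It is not the authors' implementation: the function names, the pairing of one acoustic score with one frequent-word score, and the toy score values are all assumptions made for this example.

```python
import math

def fit_bivariate_normal(points):
    """Estimate means, variances and covariance from (x, y) score pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return mx, my, sxx, syy, sxy

def log_density(point, params):
    """Log density of a bivariate normal with the given parameters."""
    mx, my, sxx, syy, sxy = params
    dx, dy = point[0] - mx, point[1] - my
    det = sxx * syy - sxy ** 2
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad

def fused_lr(point, same_params, diff_params):
    """Likelihood ratio: density under same-speaker over different-speaker."""
    return math.exp(log_density(point, same_params) - log_density(point, diff_params))

def cllr(lrs_same, lrs_diff):
    """Log likelihood ratio cost; lower is better, 1.0 for an uninformative system."""
    penalty_same = sum(math.log2(1.0 + 1.0 / lr) for lr in lrs_same) / len(lrs_same)
    penalty_diff = sum(math.log2(1.0 + lr) for lr in lrs_diff) / len(lrs_diff)
    return 0.5 * (penalty_same + penalty_diff)

# Made-up (acoustic score, frequent-word score) pairs, for illustration only.
same_pairs = [(2.0, 1.0), (3.0, 1.5), (2.5, 0.5), (3.5, 1.2)]
diff_pairs = [(-2.0, -1.0), (-3.0, -0.5), (-2.5, -1.5), (-1.5, -0.8)]

same_params = fit_bivariate_normal(same_pairs)
diff_params = fit_bivariate_normal(diff_pairs)

lrs_same = [fused_lr(p, same_params, diff_params) for p in same_pairs]
lrs_diff = [fused_lr(p, same_params, diff_params) for p in diff_pairs]
print("Cllr on the toy training pairs:", cllr(lrs_same, lrs_diff))
```

Evaluating on the same pairs used for fitting, as done here for brevity, gives an optimistic Cllr; a realistic evaluation would use held-out comparisons.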

Keywords

Forensic speaker comparison; Frequent-word analysis; Likelihood ratio framework; Multi-modal analysis; Information fusion

Authors

SERGIDOU, E.; YPMA, R.; ROHDIN, J.; WORRING, M.; GERADTS, Z.; BOSMA, W.

Published

1 September 2024

Publisher

ELSEVIER SCI LTD

Place

London

ISSN

1355-0306

Journal

SCIENCE & JUSTICE

Volume

64

Issue

5

Country

United Kingdom of Great Britain and Northern Ireland

Pages from

485

Pages to

497

Number of pages

13

URL

BibTeX

@article{BUT197614,
  author="Eleni Konstantina {Sergidou} and Rolf {Ypma} and Johan Andréas {Rohdin} and Marcel {Worring} and Zeno {Geradts} and Wauter {Bosma}",
  title="Fusing linguistic and acoustic information for automated forensic speaker comparison",
  journal="SCIENCE \& JUSTICE",
  year="2024",
  volume="64",
  number="5",
  pages="485--497",
  doi="10.1016/j.scijus.2024.07.001",
  issn="1355-0306",
  url="https://pdf.sciencedirectassets.com/274162/1-s2.0-S1355030624X00040/1-s2.0-S135503062400056X/main.pdf?X-Amz-Security-Token=IQoJb3JpZ2luX2VjEJj%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaCXVzLWVhc3QtMSJIMEYCIQD8rjrZi%2FhPcL5bX04sCPiZ7t1gOY2P2HhckbMwnNI%2BowIhAPSL1qk91pZeBYKl64ulkkxTVtGNw3%2FhcIqIe%2FEsECyAKrIFCBAQBRoMMDU5MDAzNTQ2ODY1IgxJGUcIfesue7uRz0EqjwUbwoISPHhLCgSSGV1GRJPKCbpWqK%2FjTNoJ8dlrM8OaDty8YcGiIu6bGFuunKMWP8xvOPfBMwuBvUFIGCkck2SA1BZ3bw6jWH8Yw%2Bi5jxtOK3MUMpsz5jwX8iDTWAhPR2ZNsKsAQiR9dL%2Bw3BCfgX8UOh5Y6S66bfh"
}