Forensic Speaker Recognition: Where are we heading?

Since the National Academy of Sciences (NAS) of the United States published its report “Strengthening Forensic Science in the United States: A Path Forward” in 2009, considerable controversy has surrounded the reliability of forensic reports in courts around the world. The NAS was unsparing in its conclusions, finding many problems, including the absence of mandatory standardization, issues relating to the interpretation of forensic evidence, the lack of established limits and measures of performance, and disparate criteria for admissibility in court. That debate continues and affects every forensic discipline, from fingerprints (Brandon Mayfield) to DNA (The case against DNA).

As one might imagine, Forensic Speaker Recognition is not free from the weaknesses identified by the NAS in its report, nor from the controversy that followed. Several prominent cases have recently been in the spotlight in which deficient voice analysis came dangerously close to causing, or did in fact cause, critical mistakes in judicial rulings. Cases like the Trayvon Martin case (Judge blocks audio expert testimony in Trayvon Martin case) in the US or the Oscar Sanchez case (El misterio de Oscar) in Italy have clearly shown how dangerous and misleading a forensic report can be when the analysis is performed by pseudo-experts. Recently, a new investigative initiative (Hearing voices) created by a group of European journalists has begun an exhaustive review of the “infamy track” of cases in which forensic voice reports were based on non-rigorous analysis: pseudo-scientific methods with no proven foundation in the body of scientific knowledge.

Given this context, it is legitimate to ask: Are all Forensic Speaker Identification methods created equal? Is there a way to distinguish the charlatans from the practitioners firmly rooted in science? We believe that the answer is a clear “YES”.

It is true that there is still a lot of room for improvement in the Forensic Speaker Identification community and that there are issues which must be faced sooner rather than later. According to a survey done by Interpol this year, more than 40% of the labs performing Forensic Speaker Recognition today rely solely on “auditory” approaches (methods that use human listening as the basis of decision and do not include any specific tool for measurements or objective comparisons). That percentage is still much too high. While we acknowledge the great skill and proficiency of many linguists and phonetic experts, reports based exclusively on any such expert’s subjective experience should disappear from the scientific forum.

In this scenario, automated tools like BATVOX, combined with proper training, adequate certification and experience (and the enforcement of such credentials), are critical to making those headlines disappear and turning a new page by performing true forensic science. Automatic forensic recognition systems introduce repeatability, the ability to repeat a scientific experiment and achieve the same result, which is essential to any scientific process. In addition, the availability of automatic systems to experts opens the door to more systematic evaluation of methods, helping the community better understand the expected accuracy rates and the limits that forensic experts should not cross.
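To make the notions of repeatability and expected accuracy rates a little more concrete, here is a minimal Python sketch of how the error rates of any score-producing comparison system might be evaluated. It is an illustration under stated assumptions, not a description of how BATVOX or any particular product works: the scores are invented toy numbers, and the function names `error_rates` and `approximate_eer` are hypothetical. It assumes the system outputs a numeric score for each comparison, with higher scores meaning "more likely the same speaker".

```python
from typing import Sequence, Tuple


def error_rates(genuine: Sequence[float], impostor: Sequence[float],
                threshold: float) -> Tuple[float, float]:
    """Return (false rejection rate, false acceptance rate) at a threshold.

    A comparison is accepted as "same speaker" when its score >= threshold.
    """
    frr = sum(s < threshold for s in genuine) / len(genuine)
    far = sum(s >= threshold for s in impostor) / len(impostor)
    return frr, far


def approximate_eer(genuine: Sequence[float],
                    impostor: Sequence[float]) -> Tuple[float, float, float]:
    """Scan the observed scores as candidate thresholds and return
    (threshold, frr, far) where the two error rates are closest to equal."""
    candidates = sorted(set(genuine) | set(impostor))

    def gap(t: float) -> float:
        frr, far = error_rates(genuine, impostor, t)
        return abs(frr - far)

    best = min(candidates, key=gap)
    frr, far = error_rates(genuine, impostor, best)
    return best, frr, far


# Toy scores, invented for illustration only (not real case data).
same_speaker_scores = [2.1, 1.8, 0.4, 2.5, 1.2]
different_speaker_scores = [-1.0, 0.6, -0.3, -2.2, 0.1]

threshold, frr, far = approximate_eer(same_speaker_scores,
                                      different_speaker_scores)
print(f"threshold={threshold}, FRR={frr:.0%}, FAR={far:.0%}")
```

Because the procedure is deterministic, anyone who runs it on the same scores obtains the same error rates, which is the kind of repeatability a purely auditory judgment cannot offer. A real evaluation would of course need far more comparisons, drawn from conditions representative of casework.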

