Is Speaker Verification Reliable?

The first question to ask about a new biometric technology is whether it is reliable. In other words, are the error rates low enough that we would leave our bank account, for example, protected by our voice alone? Two types of error rate are always needed to measure the effectiveness of a security system: false acceptance and false rejection. False acceptance refers to the percentage of times that the system erroneously admits an impostor. False rejection refers to the percentage of times that the system erroneously rejects a legitimate, authorized user. Both numbers have to be taken into account simultaneously, since we can always improve one by relaxing the other.
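To make the two definitions concrete, here is a minimal sketch in Python. It assumes the system produces a similarity score for each attempt and accepts the speaker when the score reaches a threshold; the function name `error_rates` and the scores are made up for illustration and do not come from any real system.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_acceptance_rate, false_rejection_rate) as fractions."""
    # An impostor is falsely accepted when the score reaches the threshold.
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    # A legitimate user is falsely rejected when the score falls below it.
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))


# Hypothetical similarity scores, invented purely for illustration.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]
impostor = [0.12, 0.35, 0.58, 0.70, 0.22]

for threshold in (0.5, 0.75):
    far, frr = error_rates(genuine, impostor, threshold)
    print(f"threshold={threshold}: FAR={far:.0%}, FRR={frr:.0%}")
```

With these invented scores, raising the threshold from 0.5 to 0.75 lowers false acceptance from 40% to 0% but raises false rejection from 0% to 20%, which is the trade-off described above.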

Let’s put the problem in perspective: what numbers should we expect from traditional methods? In 2003, according to a study by the Economic Crime Institute in the U.S., 4.6% of the population had suffered identity fraud involving their bank accounts. Let us suppose that thieves had attempted to steal the identity of, at most, half the population (an exaggerated assumption, to say the least, given the size of the population in question). Even in this scenario, 4.6% of the population out of the 50% attacked is about 9.2%, so traditional methods of verification (cards, PINs, passwords, etc.) would have a false acceptance rate on the order of 10%. False rejection of legitimate users is more difficult to quantify, but it is related to the number of times the ATM does not accept a card because it is damaged or its magnetic strip is dirty, or the times we cannot access our account because we cannot remember the right PIN. In other words, with traditional methods these percentages are high and depend on numerous factors.

In practice, the conditions surrounding the application, the setting of thresholds, and the decision method employed can affect the error rates significantly. Capturing voices in an office environment over a land line is not the same as capturing the same voices from a mobile (cellular) phone where there is a lot of background noise, such as in a car or a public place. It is therefore risky to quote statistics and error rates without knowing the concrete application. We can say, however, that for certain real applications the threshold can be set so that the false acceptance rate is below 0.5% with a false rejection rate of less than 5%.
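One way such a threshold might be chosen is sketched below in Python: given scores collected under the target conditions, pick the lowest threshold whose false acceptance rate stays within a target, then read off the false rejection rate that results. The function `pick_threshold`, the score lists and the targets are all hypothetical; a real deployment would use far more trials recorded under its own conditions.

```python
def pick_threshold(genuine_scores, impostor_scores, max_far):
    """Return (threshold, far, frr) for the lowest observed score that
    keeps the false acceptance rate at or below max_far, or None."""
    # Only observed scores need checking: the rates change nowhere else.
    candidates = sorted(set(genuine_scores) | set(impostor_scores))
    for threshold in candidates:
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        if far <= max_far:
            # The lowest qualifying threshold also gives the lowest
            # false rejection rate, since FRR never falls as it rises.
            frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
            return threshold, far, frr
    return None  # no observed score meets the target


# Hypothetical scores, invented purely for illustration.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]
impostor = [0.12, 0.35, 0.58, 0.70, 0.22]

print(pick_threshold(genuine, impostor, max_far=0.2))
print(pick_threshold(genuine, impostor, max_far=0.0))
```

Running the same procedure on scores recorded in a noisy setting would generally yield a different threshold and different error rates, which is why figures quoted without the application context are hard to interpret.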