24. August 2018

Speech Intelligibility in PA systems

Article from the vdt-Magazine


“Sorry, I didn’t catch that.” If a thought like that is going through a listener’s head, chances are that the sound reinforcement is not working as intended, regardless of whether it’s in a concert hall or at a railway station. Measuring speech intelligibility not only provides clear information about the current state of a PA system, it also gives important clues for optimizing it.

On December 24th, 1915, the world’s first large-scale PA system was used to amplify speech and music at a public event in San Francisco. Approximately 100,000 listeners were provided with clear, intelligible speech. One might think that the past 100 years would have led to the ultimate perfection of sound reinforcement systems and that poor speech intelligibility is a relic of days long gone, but daily PA practice shows otherwise. Hard-to-understand speech and generally unpleasant sound are still part of the repertoire of some installed or touring sound systems. It’s annoying when you have paid full price to see your favourite band or play and the vocals are barely intelligible, but it is outright dangerous, and even illegal, if spoken warnings are not comprehensible.

What is Speech Intelligibility?
The term speech intelligibility describes how much of the clarity and distinctness of spoken information is preserved when it is transmitted over a PA system. Early in the history of audio, manually conducted tests were used to rate the speech intelligibility of sound systems: a speaker at a lectern read out meaningless words and syllables, which the audience then wrote down as accurately as possible. The result was given as the percentage of correctly understood items, with 100 percent being a perfect score. In the 1940s, Bell Laboratories developed the first electronic measurement methods for speech intelligibility. ALCons (Articulation Loss of Consonants) refers to the loss of consonant articulation; its percentage stands for the incorrectly understood words or consonants. An ALCons value of 0.00% would thus mean error-free transmission (no lost consonants).
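The scoring of such a manual listening test can be sketched in a few lines of Python. This is only an illustration: the function name and the nonsense syllables are made up, and the comparison is a simple case-insensitive match rather than any standardised scoring procedure. Note that the loss percentage computed here counts whole items, whereas ALCons refers to consonants specifically.

```python
def word_score(read_out, written_down):
    """Percentage of test items that a listener wrote down correctly.

    read_out      -- items spoken at the lectern (hypothetical test list)
    written_down  -- what one listener noted, in the same order
    """
    if not read_out or len(read_out) != len(written_down):
        raise ValueError("both lists must be non-empty and of equal length")
    correct = sum(
        1
        for spoken, noted in zip(read_out, written_down)
        if spoken.strip().lower() == noted.strip().lower()
    )
    return 100.0 * correct / len(read_out)


# Hypothetical example: one of four nonsense syllables was misheard.
score = word_score(["bap", "tik", "mof", "lun"], ["bap", "tik", "nof", "lun"])
loss = 100.0 - score  # share of items lost in transmission
```

In this made-up example, three of four items arrive intact, so the score is 75 percent and the loss is 25 percent.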

XL2 STIPA Analyzer
Measurement of speech intelligibility with NTi Audio XL2

Modern measurement methods such as STI (Speech Transmission Index) or STIPA (Speech Transmission Index for Public Address Systems) try to express the result as a single number while taking into account as many interfering factors as possible, in other words the real-world conditions. The result is a numerical value (STI) between 0.00 (no intelligibility) and 1.00 (perfect speech intelligibility). Good sound reinforcement achieves 0.45 to 0.65 even in acoustically unfavourable rooms, and 0.70 to 0.90 in acoustically good rooms. For comparison: a good studio transmission chain typically achieves STI values between 0.90 and 0.97 (as measured by the author) between the microphone in the voice booth and the studio monitor in the control room.

Friends and enemies of Speech Intelligibility
Technically speaking, good speech intelligibility is given whenever the modulation depth of the speech or test signal is preserved unchanged: the signal is transmitted without distortion and without masking in amplitude, in the spectrum and on the time axis.
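How preserved modulation depth turns into a score between 0 and 1 can be sketched as follows. The mapping below is the one commonly described for the STI method: the modulation transfer value m is converted to an apparent signal-to-noise ratio, limited to ±15 dB and scaled linearly. It is a simplified sketch, not a complete STI calculation; the octave-band weighting and the masking and threshold corrections of the full method are left out, and the function name is chosen for illustration only.

```python
import math


def transmission_index(m: float) -> float:
    """Map a modulation transfer value m (0..1) to a transmission index (0..1).

    m = 1 means the modulation depth arrives unchanged,
    m = 0 means the modulation has been filled in completely.
    """
    if m <= 0.0:
        return 0.0
    if m >= 1.0:
        return 1.0
    # Apparent signal-to-noise ratio implied by the remaining modulation.
    snr_apparent_db = 10.0 * math.log10(m / (1.0 - m))
    # Values outside +/-15 dB are treated as fully bad or fully good.
    snr_apparent_db = max(-15.0, min(15.0, snr_apparent_db))
    return (snr_apparent_db + 15.0) / 30.0


# Half of the original modulation depth lands exactly mid-scale.
half = transmission_index(0.5)
```

Under these assumptions, a channel that preserves only half of the modulation depth lands in the middle of the scale, which illustrates why even moderate losses in modulation are clearly visible in the result.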

Good speech intelligibility has many enemies, but also powerful friends. Here is an overview of the most common pitfalls, from which the solutions follow implicitly.

Room acoustics is one of the most important influencing factors. Long reverberation times (RT60) as well as late reflections (around 80 to 150 ms) act like a “filler” that fills in the useful modulation depth of the speech signal and covers it, making the information harder to recognise.
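The effect of reverberation on modulation depth can be estimated with the classic approximation for an ideal diffuse sound field with exponential decay, m(F) = 1 / sqrt(1 + (2πF·RT60 / 13.8)²), where F is the modulation frequency. The sketch below assumes exactly that idealised case: it ignores the direct sound, discrete late reflections and background noise, so it should be read as a rough indication rather than a prediction for a real room. The function name and the example reverberation times are chosen for illustration.

```python
import math


def reverb_modulation_factor(mod_freq_hz: float, rt60_s: float) -> float:
    """Remaining modulation depth (0..1) after an idealised reverberant field.

    mod_freq_hz -- modulation frequency of the speech envelope in Hz
    rt60_s      -- reverberation time RT60 in seconds
    """
    x = 2.0 * math.pi * mod_freq_hz * rt60_s / 13.8
    return 1.0 / math.sqrt(1.0 + x * x)


if __name__ == "__main__":
    # Compare a dry room, a lively hall and a very reverberant space
    # (hypothetical RT60 values) at slow, medium and fast modulations.
    for rt60 in (0.5, 2.0, 5.0):
        factors = [reverb_modulation_factor(f, rt60) for f in (1.0, 4.0, 12.5)]
        print(f"RT60 = {rt60} s:", [round(v, 2) for v in factors])
```

The faster the modulation and the longer the reverberation time, the more of the modulation depth is filled in, which matches the “filler” picture used above.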

Badly positioned loudspeakers with inappropriate directional characteristics may well radiate sound into the room, for example onto opposite walls and glass fronts, but not directly to the listener. As a result, the direct sound from the loudspeakers at the audience position may be low relative to the disturbing sound (reflections, interference), and speech intelligibility decreases.

Undersized public-address systems that cannot rise above the noise of a loud audience, or that drown in the noise of an incoming train, are another problem. Again, the signal-to-noise ratio suffers and speech intelligibility decreases. Sufficiently sized sound systems should be able to provide at least 10 to 15 dB more level than the loudest background noise, although natural limits (hearing protection) and disturbing masking effects must be taken into account.
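The 10 to 15 dB rule of thumb can be put into numbers with a small sketch. The first function simply adds the margin to the noise level; the second uses the standard relation between signal-to-noise ratio and remaining modulation depth for stationary noise, m = 1 / (1 + 10^(−SNR/10)). Both function names and the 85 dB platform noise figure are assumptions for illustration, and the sketch deliberately ignores reverberation, masking and hearing-protection limits.

```python
def required_speech_level(noise_level_db: float, margin_db: float = 10.0) -> float:
    """Speech level (dB) needed to stay margin_db above the background noise."""
    return noise_level_db + margin_db


def noise_modulation_factor(snr_db: float) -> float:
    """Remaining modulation depth (0..1) for a given signal-to-noise ratio.

    Assumes stationary noise that simply adds to the speech signal.
    """
    return 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))


# Hypothetical platform with 85 dB of train noise.
target_level_db = required_speech_level(85.0, margin_db=10.0)

# Modulation depth left at 0, 10 and 15 dB signal-to-noise ratio.
depths = [noise_modulation_factor(snr) for snr in (0.0, 10.0, 15.0)]
```

With speech and noise at the same level, only half of the modulation depth remains; at a 10 dB margin roughly nine tenths survive, which is one way to see why the rule of thumb starts at that value.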

Psychoacoustic masking effects also contribute to the loss of speech intelligibility. Not only external noise sources can cover the spoken word, but also speech itself, especially at very high voice levels. Low-frequency components in speech can mask quieter, higher-frequency sounds and thus prevent their perception. Further disturbances are linear and non-linear distortions in the transmission path. These are caused not only by clipping amplifiers and bad loudspeakers, but often by well-intentioned yet badly implemented signal processing. Too much dynamic compression, overdriven limiters, unnecessary boosts or cuts in the frequency response and overly aggressive data reduction can all worsen speech intelligibility.

Meaning of STI-values in practice

So beware! Musically accompanied readings in former aircraft hangars, improvisational theatre in emptied swimming pools, platform announcements during the arrival of a train: all of these cases call for very precise planning and greatly increased technical effort to achieve reasonable, or at least tolerable, speech intelligibility.

© 2018 by Karl M. Slavik
Text & Translation: Karl M. Slavik and Adrian Slavik

All rights for this article including the teaser picture belong to vdt-Magazin. The content of this article as well as the pdf download are not for commercial use and permission must be obtained from vdt-Magazin.
