STI according to DIN/IEC 60268-16: the direct method
Measurement of speech intelligibility through listening tests
Traditionally, speech intelligibility is measured with a listening test. A speaker reads a list of rhyming words, several listeners note which words they heard, and speech intelligibility is obtained by statistical evaluation of their answers. This is called the subjective method. In practice it is very laborious and is therefore rarely used. Listening tests nevertheless remain the reference, the gold standard against which all other methods must be compared. In the following sections we deal only with a different class of methods, ones that are not based on listening tests.
An objective method of measuring speech intelligibility
This is about reproducible procedures that are evaluated automatically. One sensible approach would be to play back a standardized speech signal from a recording and have an automatic speech recognition system assess it and calculate speech intelligibility. Such a method would be feasible with today's computers and may well be standardized in the future.
The state of the art, however, consists of much simpler methods that do not require a complex speech recognizer. The basic idea is to use a special test signal that the transmission path, in particular the room acoustics, alters in a way that is easy to measure. The measured change is then converted into a speech intelligibility value. These methods were developed by Steeneken et al. in the late 1970s and are standardized in DIN/IEC 60268-16. The standard defines two methods, the direct and the indirect method.
The direct method according to IEC 60268-16
The direct method is somewhat easier to follow and is implemented as the STIPA method in various measuring devices. The indirect method is significantly more efficient and accurate. In this part, however, we cover only the direct method.
First, let's look at the STIPA test signal. It consists of a noise signal that is amplitude modulated (AM).
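To make the idea of a modulated noise signal concrete, here is a minimal sketch in Python. It is not the standardized STIPA signal: the function name `am_noise`, the white-noise carrier, the sample rate and the single modulation frequency are all assumptions chosen for illustration. The intensity of the noise is modulated sinusoidally, so the amplitude envelope is the square root of the intensity envelope.

```python
import math
import random

def am_noise(duration_s, fs, f_mod, depth, seed=0):
    """Return white noise whose intensity is sinusoidally modulated.

    Illustrative only: a real STIPA signal uses band-limited noise
    in seven octave bands, each with its own modulation frequencies.
    """
    rng = random.Random(seed)
    samples = []
    for n in range(int(duration_s * fs)):
        t = n / fs
        # Intensity envelope 1 + depth*cos(2*pi*f_mod*t), clamped at zero.
        intensity = max(0.0, 1.0 + depth * math.cos(2.0 * math.pi * f_mod * t))
        # The amplitude envelope is the square root of the intensity envelope.
        samples.append(math.sqrt(intensity) * rng.gauss(0.0, 1.0))
    return samples

# Two seconds of noise, modulated at 2 Hz with full modulation depth.
signal = am_noise(duration_s=2.0, fs=8000, f_mod=2.0, depth=1.0)
```

With `depth=1.0` the noise fades to silence twice per second, which roughly mimics the rhythm of syllables in speech.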
This signal initially has nothing to do with human speech. However, many of its statistical properties correspond to those of speech.
The basic idea is that interference, especially reverberation, reduces the modulation depth of the signal.
Ultimately, this is what reverberation does to human speech: the modulation is lost and speech intelligibility is reduced.
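How strongly reverberation flattens the modulation can be estimated with a well-known approximation for an ideal, purely exponential decay. The sketch below assumes such a decay and no background noise; the function name `mtf_reverb` and the example reverberation time of 2 s are illustrative choices, not values from a real room.

```python
import math

def mtf_reverb(f_mod, rt60):
    """Modulation reduction factor for an ideal exponential decay.

    f_mod: modulation frequency in Hz
    rt60:  reverberation time in seconds
    """
    return 1.0 / math.sqrt(1.0 + (2.0 * math.pi * f_mod * rt60 / 13.8) ** 2)

# Compare a slow, a medium and a fast modulation in a room with 2 s reverberation.
for f_mod in (0.63, 2.0, 12.5):
    print(f"{f_mod:5.2f} Hz: m = {mtf_reverb(f_mod, rt60=2.0):.2f}")
```

Slow modulations survive almost unchanged, while fast modulations are largely smeared out by the reverberant tail.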
It is precisely this reduction in modulation that can be measured automatically in the STI signal, and speech intelligibility is calculated from it. The STI method evaluates only seven octave bands with centre frequencies from 125 Hz to 8000 Hz. (An octave is always a doubling of frequency.)
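The measurement of the modulation reduction can be sketched as follows. The code assumes that the intensity envelope of one octave band has already been extracted (band-pass filtering and squaring are not shown) and that the record covers a whole number of modulation periods. The function name `modulation_depth` and the synthetic envelopes are assumptions for illustration.

```python
import math

def modulation_depth(intensity, fs, f_mod):
    """Estimate the modulation depth of an intensity envelope at f_mod."""
    s = c = total = 0.0
    for n, value in enumerate(intensity):
        phase = 2.0 * math.pi * f_mod * n / fs
        s += value * math.sin(phase)   # correlation with a sine at f_mod
        c += value * math.cos(phase)   # correlation with a cosine at f_mod
        total += value                 # mean intensity times number of samples
    return 2.0 * math.sqrt(s * s + c * c) / total

fs = 100       # envelope sample rate in Hz (assumed)
f_mod = 2.0    # modulation frequency in Hz
n_samples = 10 * fs

# Envelope as sent (fully modulated) and as received (flattened and delayed).
sent = [1.0 + 1.0 * math.cos(2.0 * math.pi * f_mod * n / fs) for n in range(n_samples)]
received = [1.0 + 0.4 * math.cos(2.0 * math.pi * f_mod * n / fs - 0.7) for n in range(n_samples)]

# The ratio of received to sent modulation depth is the modulation transfer value.
ratio = modulation_depth(received, fs, f_mod) / modulation_depth(sent, fs, f_mod)
```

Correlating with both a sine and a cosine makes the estimate independent of the delay between loudspeaker and microphone.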
An STI measurement takes a long time
14 modulation frequencies between 0.63 Hz and 12.5 Hz are used per octave band. Because the modulation frequencies are so low, the measurement for each modulation frequency takes about 20 s. At least all 7 octave bands can in principle be measured simultaneously, so a complete STI measurement takes 14 × 20 s = 280 s, or about 5 minutes. This is quite unwieldy in practice. Therefore, there is a simplified variant, the STI-PA method.
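The size of a full STI measurement can be worked out in a few lines. The modulation frequencies listed are the nominal one-third-octave values from 0.63 Hz to 12.5 Hz; the 20 s per modulation frequency is the approximate figure from the text, and the variable names are illustrative.

```python
# Octave band centre frequencies: seven doublings starting at 125 Hz.
octave_bands_hz = [125 * 2 ** k for k in range(7)]

# The 14 modulation frequencies in one-third-octave steps (nominal values).
modulation_freqs_hz = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                       3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]

seconds_per_modulation_freq = 20  # approximate figure from the text

# Every band is measured at every modulation frequency.
matrix_size = len(octave_bands_hz) * len(modulation_freqs_hz)

# All bands are measured in parallel, the modulation frequencies one after another.
total_seconds = len(modulation_freqs_hz) * seconds_per_modulation_freq

print(octave_bands_hz)
print(matrix_size, "modulation values,", round(total_seconds / 60, 1), "minutes")
```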
The STI-PA method
The STI-PA method uses only two modulation frequencies per octave band. The modulation frequencies are chosen so that all values can be measured in a single pass. An STI-PA measurement therefore takes only about 20 s. You will find this measuring method in various hand-held sound level meters, e.g. the NTi XL2 or the Bedrock SM50/90 and AM100.
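As an illustration, the assignment of modulation frequencies to octave bands might look like the table below. The values are the ones commonly quoted for STIPA and are reproduced here from memory as an assumption; please check them against the standard before relying on them. The variable names are illustrative.

```python
# Assumed STIPA assignment: octave band centre (Hz) -> two modulation frequencies (Hz).
stipa_modulation_hz = {
    125: (1.6, 8.0),
    250: (1.0, 5.0),
    500: (0.63, 3.15),
    1000: (2.0, 10.0),
    2000: (1.25, 6.25),
    4000: (0.8, 4.0),
    8000: (2.5, 12.5),
}

# Flatten the table to count the modulation values measured in one pass.
all_freqs = [f for pair in stipa_modulation_hz.values() for f in pair]

print(len(all_freqs), "modulation values from one measurement")
```

Each band carries both of its modulations at the same time, so 7 × 2 = 14 values come out of a single measurement instead of 98 values from 14 consecutive ones.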
What does a practical STIPA measurement look like?
The STIPA signal is played continuously through a loudspeaker. The measuring microphone is set up at each measuring position in the room in turn and the measurement is started. The measurement runs fully automatically and delivers a result after approx. 20 s. Please note that the STIPA method is level dependent. The optimal level of the speech signal is 65 dB(A). Both very loud and very quiet signals impair speech intelligibility. The measurement chain must therefore be calibrated.
Consideration of background noise
Speech intelligibility at the listening position is influenced on the one hand by parameters of the transmission system, such as room acoustics, sound level, frequency response or distortion, and on the other hand by background noise.
It is therefore not sufficient to measure the speech intelligibility of an evacuation system in an empty department store, for example. Such a system must remain intelligible above all in the noise of an emergency situation.
In many cases it is unfortunately not practical to carry out a measurement while the public is present, as the test signal causes considerable annoyance. Furthermore, the background noise can strongly falsify the STI calculation, especially if it contains tonal or impulsive components. It therefore makes sense to carry out the measurement without an audience and to account for the expected background noise separately by means of a correction factor for each octave band.
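Such a per-band correction can be sketched as follows. The code assumes that the modulation value was measured in quiet and that the expected signal and noise levels in the band are known. The function name `noise_corrected_m` and all levels in the example are hypothetical.

```python
def noise_corrected_m(m_measured, signal_db, noise_db):
    """Scale a modulation value measured in quiet by the expected SNR in that band."""
    snr_db = signal_db - noise_db
    # Stationary noise adds unmodulated intensity and so reduces the modulation depth.
    return m_measured / (1.0 + 10.0 ** (-snr_db / 10.0))

# Hypothetical example: octave band centre (Hz) -> (m in quiet, signal dB, noise dB).
bands = {
    500: (0.80, 70.0, 60.0),
    1000: (0.75, 68.0, 62.0),
    2000: (0.70, 65.0, 65.0),
}

for centre_hz, (m, signal_db, noise_db) in bands.items():
    corrected = noise_corrected_m(m, signal_db, noise_db)
    print(f"{centre_hz:5d} Hz: m = {m:.2f} -> {corrected:.2f}")
```

When signal and noise are equally loud in a band, the modulation value is halved; when the noise is far below the signal, the correction has almost no effect.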
The signal-to-noise ratio (SNR) has an important influence on speech intelligibility. The SNR is the level difference between the useful signal and the interfering noise, specified in dB. For STI measurements, the SNR should be determined per octave band. The worse the SNR, the worse the speech intelligibility. For the STI, only SNR values in the range of -15 dB to +15 dB are relevant. This means that once the SNR has reached +15 dB, speech intelligibility can no longer be improved by increasing the level. On the contrary, speech intelligibility can deteriorate at high levels.
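The limited SNR range shows up directly in the way a modulation value is converted into an index between 0 and 1. The sketch below follows the usual STI conversion as far as we know it; the function names are illustrative, and the weighting and summation over bands that follow in a full STI calculation are not shown.

```python
import math

def effective_snr_db(m):
    """Convert a modulation transfer value m into an effective SNR, clipped to +/-15 dB."""
    if m <= 0.0:
        return -15.0
    if m >= 1.0:
        return 15.0
    snr = 10.0 * math.log10(m / (1.0 - m))
    return max(-15.0, min(15.0, snr))

def transmission_index(m):
    """Map the clipped effective SNR linearly onto the range 0 to 1."""
    return (effective_snr_db(m) + 15.0) / 30.0

for m in (0.05, 0.5, 0.9, 0.99):
    print(f"m = {m:.2f}: SNR = {effective_snr_db(m):6.1f} dB, TI = {transmission_index(m):.2f}")
```

Once the effective SNR reaches +15 dB the index is already 1, so a further increase in level brings no gain.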
Level dependence of speech intelligibility
Speech intelligibility depends on the level of the speech signal. At very low levels, noise predominates and speech intelligibility is poor. The human ear is remarkably capable, however: speech can still be understood to some extent even when its level is below the noise level. As the signal level increases, the signal-to-noise ratio and speech intelligibility improve. Above a certain level, speech intelligibility decreases again due to masking effects. The STI method models this effect, so high levels lower the calculated speech intelligibility. For this reason, the measurement chain must be calibrated so that STI measurements capture the absolute sound level.