7. Engineering Tradeoffs


  1. Speech Quality
  2. Complexity
  3. Delay
  4. Bit Rate


7.1 Speech Quality

The quality of the output speech from a coding system should be high. Ideally, a human listener should not perceive any difference between the original (input) speech and the encoded-decoded speech.
There are two kinds of measures of speech quality:

1. Subjective Quality Metrics:

  • A-B discrimination test
  • Diagnostic Rhyme Test
  • Mean Opinion Score (MOS)
  • Speech quality is usually evaluated on MOS:

    1. Hire several highly trained listeners from an expensive consulting company
    2. Encode & decode a standard set of sentences
    3. Each listener rates each sentence with one of the following labels:

      • 1 = Bad;
      • 2 = Poor;
      • 3 = Fair;
      • 4 = Good;
      • 5 = Excellent;
    4. Average the ratings across sentences and across listeners
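
The averaging in step 4 can be sketched in a few lines of Python. This is only an illustration: the listener names and ratings below are made up, and the function name mean_opinion_score is a hypothetical helper, not part of any standard toolkit.

```python
from statistics import mean

# Hypothetical ratings: one list of per-sentence scores for each listener,
# each score an integer from 1 (Bad) to 5 (Excellent).
ratings = {
    "listener_a": [4, 5, 3, 4],
    "listener_b": [3, 4, 4, 4],
    "listener_c": [5, 4, 3, 3],
}

def mean_opinion_score(ratings):
    """Average the 1-5 ratings across sentences and across listeners."""
    all_scores = [s for scores in ratings.values() for s in scores]
    if not all_scores:
        raise ValueError("no ratings supplied")
    if any(not 1 <= s <= 5 for s in all_scores):
        raise ValueError("ratings must lie between 1 and 5")
    return mean(all_scores)

mos = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f}")
```

Pooling all scores before averaging gives the same result as averaging per listener first only when every listener rates the same number of sentences, which the sketch assumes.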

2. Objective Quality Metrics:

    1. Signal-to-Noise Ratio (SNR)

      • easy to compute
      • not a good measure of the perceived distortion
      • no speech coders use this metric

    2. Segmental SNR (SEGSNR)

      • easy to compute, but at the coder end
      • represents the short-time character of speech and hearing
      • used by waveform coders (DPCM, ADPCM, ...)

    3. Perceptually-Weighted SEGSNR (PSNR)

      • the signal can mask quantization noise which occurs at the same time and the same frequency
      • used by hybrid vocoders
      • used by LPC analysis-by-synthesis coders

    4. Spectral Amplitude Distortion

      • high frequencies
      • the signal can mask quantization noise which occurs at the same time, at the same frequencies, and at nearby frequencies
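
To make the difference between the first two objective metrics concrete, here is a minimal Python sketch of SNR and segmental SNR. The function names, the 160-sample frame (20 ms at an assumed 8 kHz sampling rate), and the synthetic test signal are illustrative assumptions, not taken from any particular coder. The perceptually weighted and spectral measures are not attempted here.

```python
import math

def _energy(samples):
    return sum(v * v for v in samples)

def snr_db(original, decoded):
    """Overall SNR in dB: total signal energy over total error energy."""
    signal = _energy(original)
    noise = _energy([x - y for x, y in zip(original, decoded)])
    if noise == 0:
        return math.inf
    if signal == 0:
        return -math.inf
    return 10.0 * math.log10(signal / noise)

def segmental_snr_db(original, decoded, frame_len=160):
    """Mean of per-frame SNRs in dB (frame_len is an assumed value)."""
    frame_snrs = []
    for start in range(0, len(original) - frame_len + 1, frame_len):
        x = original[start:start + frame_len]
        y = decoded[start:start + frame_len]
        value = snr_db(x, y)
        if math.isfinite(value):  # skip silent or perfectly coded frames
            frame_snrs.append(value)
    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)

# Synthetic signal: one loud frame followed by one quiet frame,
# both corrupted by the same small alternating error.
tone = [math.sin(2 * math.pi * 8 * n / 160) for n in range(160)]
original = tone + [0.01 * v for v in tone]
decoded = [x + (0.001 if n % 2 == 0 else -0.001) for n, x in enumerate(original)]

overall = snr_db(original, decoded)
segmental = segmental_snr_db(original, decoded)
print(f"SNR = {overall:.2f} dB, SEGSNR = {segmental:.2f} dB")
```

The overall SNR is dominated by the loud frame, while the segmental figure also reflects how badly the quiet frame is coded, which is why it tracks the short-time character of speech more closely.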


7.2 Complexity

The complexity of a coding algorithm is the processing effort required to implement the algorithm. It is typically measured in terms of arithmetic capability and memory requirement, or equivalently in terms of cost. High complexity can result in high power consumption in the hardware.


7.3 Coding Delay

The coding delay of a speech transmission system is a factor closely related to the quality requirements. Coding delay includes algorithmic, computational, and transmission factors.
Communication delay is irrelevant for one-way communication, such as voice mail.


7.4 Bit Rate

"high quality at low bit rates and low cost"

Applications of wideband speech coding include high-quality audioconferencing with 7 kHz-bandwidth speech at bit rates on the order of 16 to 32 kbps, and high-quality stereoconferencing and dual-language programming over a basic ISDN link.
Finally, the compression of 20 kHz-bandwidth audio to rates on the order of 64 kbps will create new opportunities in audio transmission and networking, electronic publishing, teleteaching, multimedia memos, and database storage.
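
To get a feel for what these target rates imply, the following Python sketch compares them with uncompressed PCM. The sampling rates and word length are assumptions made for illustration: 16 kHz sampling for 7 kHz-bandwidth speech, 44.1 kHz sampling for 20 kHz-bandwidth audio, and 16 bits per sample on a single channel. The text above does not specify any of them.

```python
def pcm_bit_rate_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Bit rate of uncompressed PCM in kbps."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0

def compression_ratio(uncompressed_kbps, coded_kbps):
    """How many times smaller the coded stream is than the PCM stream."""
    return uncompressed_kbps / coded_kbps

# Assumed PCM formats (not given in the text).
wideband_pcm = pcm_bit_rate_kbps(16_000, 16)   # 7 kHz-bandwidth speech
audio_pcm = pcm_bit_rate_kbps(44_100, 16)      # 20 kHz-bandwidth audio

for coded in (16, 32):
    ratio = compression_ratio(wideband_pcm, coded)
    print(f"wideband speech at {coded} kbps: {ratio:.1f}:1")

audio_ratio = compression_ratio(audio_pcm, 64)
print(f"20 kHz audio at 64 kbps: {audio_ratio:.1f}:1")
```

Under these assumptions the wideband targets correspond to roughly 8:1 to 16:1 compression, and the 64 kbps audio target to about 11:1.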