Speech Synthesis

Advanced Signal Processing Seminar on the topic of Speech Synthesis, held in the summer term 2008.


This seminar will focus on the two dominant state-of-the-art corpus-based methods for text-to-speech synthesis, namely unit-selection based speech synthesis and the more recently developed Hidden Markov Model (HMM) based speech synthesis. Today's commercial systems mostly employ the unit-selection method.
In unit-selection synthesis, a large speech corpus is recorded and segmented into units. During synthesis, the system selects and concatenates the sequence of units that minimizes both the distance between adjacent units (concatenation cost) and the distance to the target units (target cost).
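As a rough illustration of this search, the following minimal sketch selects one unit per target position with a Viterbi-style dynamic program. The function name `select_units`, the cost functions, and the toy data are assumptions made for this example only. They are not taken from any particular synthesis system, and real systems use far richer unit descriptions and costs.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")  # target specification (hypothetical type)
U = TypeVar("U")  # candidate unit from the corpus (hypothetical type)


def select_units(
    targets: Sequence[T],
    candidates: Sequence[Sequence[U]],
    target_cost: Callable[[T, U], float],
    concat_cost: Callable[[U, U], float],
) -> List[U]:
    """Return the unit sequence with minimal summed target + concatenation cost.

    candidates[i] holds the (non-empty) list of corpus units available for
    targets[i]. This is an assumption of the sketch.
    """
    if not targets:
        return []

    # best[i][j]: minimal cost of any path that ends in candidates[i][j]
    best = [[target_cost(targets[0], u) for u in candidates[0]]]
    # back[i][j]: index of the predecessor unit on that minimal path
    back = [[-1] * len(candidates[0])]

    for i in range(1, len(targets)):
        row, ptr = [], []
        for u in candidates[i]:
            costs = [
                best[i - 1][k] + concat_cost(prev, u)
                for k, prev in enumerate(candidates[i - 1])
            ]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            row.append(costs[k_min] + target_cost(targets[i], u))
            ptr.append(k_min)
        best.append(row)
        back.append(ptr)

    # Trace back from the cheapest final unit.
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    path: List[U] = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = back[i][j]
    path.reverse()
    return path


# Toy costs: each unit and each target is a single pitch value in Hz.
def toy_target_cost(target: float, unit: float) -> float:
    return abs(target - unit)


def toy_concat_cost(prev: float, unit: float) -> float:
    return abs(prev - unit)
```

With targets `[100.0, 120.0]` and candidates `[[90.0, 105.0], [100.0, 130.0]]`, the toy costs favour the pair `105.0, 100.0`: the smooth join outweighs the larger target mismatch of the second unit.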
In HMM-based speech synthesis, HMMs are trained on a corpus of speech data. During synthesis, a sequence of features (spectral, pitch, and duration features) is generated from the HMMs and used to synthesize the signal.
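The generation step can be pictured with a deliberately simplified sketch: each HMM state contributes its mean feature vector for as many frames as its duration. The `State` class and `generate_features` function are hypothetical names for this example. The sketch ignores variances and dynamic features, which actual HMM-based systems use to produce smooth trajectories, so it is only a first approximation of the idea.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class State:
    """One HMM state, reduced to what this sketch needs (an assumption)."""
    mean: List[float]  # mean feature vector, e.g. spectral coefficients and log F0
    duration: int      # number of frames the state is held


def generate_features(states: List[State]) -> List[List[float]]:
    """Emit one feature vector per frame by repeating each state's mean."""
    frames: List[List[float]] = []
    for state in states:
        for _ in range(state.duration):
            frames.append(list(state.mean))  # copy so frames stay independent
    return frames
```

For two states with means `[1.0, 2.0]` and `[3.0, 4.0]` and durations 2 and 1, the result is three frames: the first mean twice, then the second mean once. A vocoder or synthesis filter would then turn such a frame sequence into a waveform.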
The following list suggests topics for presentation. It is not exhaustive and you can also use different papers to present a topic.

The first meeting (preliminary discussion, "Vorbesprechung") will be on Tuesday, 11.3.2008, 16:00-18:00, SR-INW.

General topics

Automatic speech segmentation

Conversational speech

Synthesis of singing

Unit-selection synthesis related topics

Basics and history of unit selection speech synthesis

Concatenation costs and target costs

HMM synthesis related topics

Basics of HMM-based speech synthesis

Speaker interpolation

Speaker adaptation

Signal generation

Context clustering

  • HMM-based speech synthesis system
  • The Festival Speech Synthesis System
  • Viennese Sociolect and Dialect Synthesis project


    Date           Time           Topic                                 Presenter(s)            Materials
    Tue 11.3.2008  16:00 - 18:00  Preliminary meeting (Vorbesprechung)  M. Pucher               Presentation
    Tue 15.4.2008  16:00 - 19:00  Signal generation                     C. Caruncho             Presentation, Paper
    Tue 29.4.2008  16:00 - 19:00  Basics of HMM-based speech synthesis  P. Gampp, A. Sereinig   Presentation1, Presentation2, Paper1, Paper2
                                  Speaker interpolation                 S. Rexeis, M. Stracka   Presentation, Paper
    Tue 10.6.2008  16:00 - 19:00  Synthesis of singing                  R. Peharz, P. Meissner  Presentation1, Presentation2, Paper
                                  Conversational speech                 J. Luig                 Presentation, Paper
    Tue 24.6.2008  16:00 - 19:00  VSDS/ftw. Presentation                M. Pucher, F. Neubarth, C. Kranzler, M. Bruss, D. Schabus, G. Schuchmann


    Michael Pucher

    Telecommunications Research Center Vienna (FTW), Tech Gate Vienna, Donau-City-Strasse 1, 3rd floor, A-1220 Vienna, Austria

    Phone: +43 1 505 2830-46

    Fax: +43 1 505 2830-99

    E-mail: pucher at


