In this assignment you will learn what works and what does not work in text-to-speech systems. You will also learn the basic stages in building a synthetic voice.
Your tasks are:
Download the source, voices and lexicons from: http://www.speech.cs.cmu.edu/15-492/homework/hw1/packed2009/
Follow the commands for unpacking and compilation in the script http://www.speech.cs.cmu.edu/15-492/homework/hw1/packed2009/build_them . You may wish to go through these commands one by one rather than running the whole script, in case errors occur along the way.
Once Festival is installed you should be able to run it from the command line. You will need to give an explicit pathname to the festival executable (which is in ..whatever../build/festival/bin/festival)
festival
Festival Speech Synthesis System 1.96:beta July 2004
Copyright (C) University of Edinburgh, 1996-2004. All rights reserved.
For details type `(festival_warranty)'
festival>
Here are some useful commands:
festival> (SayText "Hello world.")
festival> (set! utt1 (SayText "Hello world."))
festival> (utt.save.wave utt1 "example.wav")
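When looking for errors, typing each sentence at the prompt can become tedious. The following is a minimal sketch, assuming Python 3 is available, that writes out the two commands shown above (SayText and utt.save.wave) for each test sentence as a Scheme script. The helper names, the file prefix, and the example sentences are illustrative only and are not part of the assignment.

```python
def scheme_string(text):
    """Quote text as a Scheme string literal, escaping backslashes and double quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_festival_script(sentences, prefix="test"):
    """Return Scheme source that speaks each sentence and saves it to a numbered WAV file."""
    lines = []
    for i, sentence in enumerate(sentences, start=1):
        wav_name = f"{prefix}_{i:03d}.wav"  # hypothetical naming scheme
        lines.append(f"(set! utt{i} (SayText {scheme_string(sentence)}))")
        lines.append(f"(utt.save.wave utt{i} {scheme_string(wav_name)})")
    return "\n".join(lines) + "\n"


# Example sentences chosen to probe abbreviations and numbers (illustrative only).
script = build_festival_script(["Hello world.", "Dr. Smith lives at 123 Elm St."])
print(script)
```

The printed text can be saved to a file and loaded into Festival, for example by passing it to the festival executable in batch mode (the `-b` option); check your own build if that option behaves differently.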
You can check for pronunciation problems, non-standard word errors, etc. Higher grades will be given for finding a greater variety of error types.
You should try to find a commercial system that offers a type-in text box, so that you can test it.
Before starting you must set two environment variables:
export ESTDIR=...pathto../speech_tools
export FESTVOXDIR=...pathto../festvox
You can follow the instructions for building a talking clock here. You should ignore the parts about power normalization.
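A forgotten or mistyped ESTDIR or FESTVOXDIR can cause confusing failures later in the build. As a rough sanity check, the sketch below (assuming Python 3; the function name and the messages are made up for illustration) reports whether each variable is set and points at an existing directory.

```python
import os
from pathlib import Path


def check_tool_dirs(env=os.environ, names=("ESTDIR", "FESTVOXDIR")):
    """Return a list of problems with the given environment variables (empty if none)."""
    problems = []
    for name in names:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not Path(value).is_dir():
            problems.append(f"{name}={value} is not a directory")
    return problems


# Report anything suspicious in the current environment.
for problem in check_tool_dirs():
    print("WARNING:", problem)
```

Run it from the same shell in which you issued the export commands, since the variables are only visible to processes started from that shell.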
To build a CLUSTERGEN voice, follow the instructions here and use the first 100 utterances (or more) of the ARCTIC prompt set, which you can find here: arctic_prompts.data
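Taking the first 100 utterances can be done by hand, but a small helper avoids off-by-one mistakes. The sketch below assumes Python 3 and assumes the prompt file holds one parenthesized entry per line (an utterance id followed by the quoted text). The sample lines are invented placeholders in that style, not real ARCTIC prompts, and the function name is hypothetical.

```python
from itertools import islice


def first_prompts(lines, count=100):
    """Return the first `count` non-blank lines, unchanged."""
    non_blank = (line for line in lines if line.strip())
    return list(islice(non_blank, count))


# Placeholder entries in the assumed one-entry-per-line style, plus a blank line to skip.
sample = [
    '( example_0001 "First placeholder prompt." )\n',
    "\n",
    '( example_0002 "Second placeholder prompt." )\n',
    '( example_0003 "Third placeholder prompt." )\n',
]
for line in first_prompts(sample, count=2):
    print(line, end="")
```

With the real file, the same function can be applied to an open file object for arctic_prompts.data, and the resulting lines written to whichever prompt file the build instructions expect.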
If you use your own recording mechanism, please ensure that you record in mono at 16 kHz.
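One way to confirm that recordings meet this requirement is to read the WAV header. The sketch below uses Python's standard `wave` module. The function names are illustrative, and the silent test files exist only to demonstrate the check; they are not part of the assignment.

```python
import tempfile
import wave
from pathlib import Path


def check_recording(path, expected_rate=16000, expected_channels=1):
    """Return a list of problems found in a WAV file's header (empty if none)."""
    with wave.open(str(path), "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
    problems = []
    if channels != expected_channels:
        problems.append(f"{channels} channels, expected {expected_channels}")
    if rate != expected_rate:
        problems.append(f"{rate} Hz, expected {expected_rate} Hz")
    return problems


def write_silence(path, rate, channels, seconds=0.1):
    """Write a short silent 16-bit WAV file, used here only to demonstrate the check."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * channels * int(rate * seconds))


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.wav"  # mono, 16 kHz
    bad = Path(tmp) / "bad.wav"    # stereo, 44.1 kHz
    write_silence(good, rate=16000, channels=1)
    write_silence(bad, rate=44100, channels=2)
    print(check_recording(good))
    print(check_recording(bad))
```

In practice you would call check_recording on each of your own recorded files and re-record or resample any file for which the returned list is not empty.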