15-492/18-492:
Speech Processing -- Fall 2011

Description: Speech Processing offers a practical and theoretical understanding of how human speech can be processed by computers. It covers speech recognition, speech synthesis and spoken dialog systems. The course involves practicals in which students will build working speech recognition systems, build their own synthetic voice and build a complete telephone spoken dialog system. This work will be based on existing toolkits. Details of the algorithms, techniques and limitations of state-of-the-art speech systems will also be presented. This course is designed for students wishing to understand how to process real data for real applications, applying statistical and machine learning techniques as well as working with limitations in the technology.
Instructor(s): Alan W Black
Prerequisites: 15-211 for SCS undergraduates; exemption from this requirement requires the instructor's permission.
Availability: Open to juniors and seniors in the SCS undergraduate program and the ECE undergraduate program. Open to other students with the consent of an instructor.
Materials: The required text for the course is "Spoken Language Processing" by Xuedong Huang, Alex Acero and Hsiao-Wuen Hon, Prentice Hall (ISBN 0-13-022616-5). This book will be used for reading assignments and as background reading for homework and exams.

Homework: Homework consists of two components: occasional brief reading assignments and four programming projects (Speech Recognition, Speech Synthesis, Spoken Dialog Systems, and one other).
Grading: 10% class participation, 60% programming projects, 10% readings homework, 20% final.
Course policies: Late homework, Cheating
Time: MWF 3:30-4:20
Location: DH 1209
Final exam: Mon Dec 12th, 8:30am-11:30am, closed book, DH 1209 (example question)
Syllabus:
Date      Topic                                                        Slides         Notes
Aug 29th  Course Overview                                              slides
Aug 31st  Human Speech                                                 slides
Sep 2nd   Computer Speech                                              slides
Sep 5th   No lecture (Labor Day)
Sep 7th   ASR: Signal Processing                                       slides mfcc
Sep 9th   ASR: Template matching                                       slides
Sep 12th  ASR: HMMs                                                    slides         Reading 1 due Mon 19th September
Sep 14th  ASR: Acoustic Modeling                                       slides
Sep 16th  ASR: Language Modeling                                       slides
Sep 19th  ASR: Systems                                                 slides         Homework1 due before class Wednesday 5th Oct
Sep 21st  ASR: Language Modeling 2                                     slides
Sep 23rd  TTS: Text Analysis                                           slides
Sep 26th  TTS: Pronunciation                                           slides
Sep 28th  TTS: Prosody                                                 slides
Sep 30th  TTS: Waveform I                                              slides
Oct 3rd   TTS: Waveform II                                             slides
Oct 5th   TTS: Voice building                                          slides         Homework2 due before class Friday 28th Oct
Oct 7th   TTS: Evaluation                                              slides
Oct 10th  TTS: Signal Processing                                       slides
Oct 12th  TTS: Talking Heads and Singing                               slides
Oct 14th  Multilingual Speech Processing                               slides
Oct 17th  SPICE                                                        slides
Oct 19th  Speech to Speech Translation I                               slides
Oct 21st  Mid-semester -- no lecture
Oct 24th  Speech to Speech Translation II                              slides
Oct 26th  No lecture
Oct 28th  Spoken Dialog Systems: Intro                                 slides
Oct 31st  Spoken Dialog Systems: Components                            slides
Nov 2nd   Spoken Dialog Systems: VoiceXML                              slides
Nov 4th   Spoken Dialog Systems: beyond simple dialogs; Olympus intro  slides slides
Nov 7th   Spoken Dialog Systems: Olympus II                            slides         Homework3 due before class Wed 22nd Oct
Nov 9th   Spoken Dialog Systems: deployment                            slides
Nov 11th  Spoken Dialog Systems: Personal Digital Assistants           slides
Nov 14th  Spoken Dialog Systems: Evaluation                            slides
Nov 16th  Voice Conversion I                                           slides
Nov 18th  Speaker ID                                                   slides
Nov 21st  Voice Conversion/Deidentification                            slides
Nov 28th  Computer Aided Language Learning                             slides
Nov 30th  Present and Future Speech Problems                           slides         Homework4 due 3:30pm Friday Dec 9th