15-492/18-492:
Speech Processing

Description: Speech Processing offers a practical and theoretical understanding of how human speech can be processed by computers. It covers speech recognition, speech synthesis and spoken dialog systems. The course involves practicals in which students will build working speech recognition systems, build their own synthetic voice and build a complete telephone spoken dialog system. This work will be based on existing toolkits. Details of the algorithms, techniques and limitations of state-of-the-art speech systems will also be presented. This course is designed for students wishing to understand how to process real data for real applications, applying statistical and machine learning techniques as well as working with limitations in the technology.
Instructor(s): Alan W Black
Teaching Assistant: David Huggins
Prerequisites: 15-211 for SCS undergraduates; exemption from this requirement requires the instructor's permission.
Availability: Open to juniors and seniors in the SCS and ECE undergraduate programs. Open to other students with the instructor's consent.
Materials: The required text for the course is "Spoken Language Processing" by Xuedong Huang, Alex Acero and Hsiao-Wuen Hon, Prentice Hall (ISBN 0-13-022616-5). This book will be used for reading assignments and as background reading for homeworks and exams.

Homework: Homework consists of two components: brief weekly reading assignments and four programming projects (Speech Recognition, Speech Synthesis, Spoken Dialog Systems, and one other).
Grading: 10% class participation, 40% programming projects, 10% readings homework, 20% midterm, 20% final.
Course policies: Late homework, Cheating
Time: MWF 3:30-4:20
Location: DH 1117
Final exam: 16th Dec, 1pm-4pm, WEH 6423 (example question)
Syllabus:
Date Topic Slides
Aug 25th Course Overview slides
Aug 27th Human Speech slides
Aug 29th Computer Speech slides
Sep 3rd ASR: Signal Processing slides mfcc
Sep 5th ASR: Template matching slides
Sep 8th ASR: HMMs slides Reading 1 due 15th September
Sep 10th ASR: Acoustic Modeling slides
Sep 12th ASR: Language Modeling slides
Sep 15th ASR: Language Modeling 2 slides
Homework1 due before class Monday 29th September
Sep 17th ASR: Systems slides
Sep 19th Mobile Speech slides
Sep 22nd TTS: Text analysis slides
Sep 24th TTS: Pronunciation slides
Sep 26th TTS: Prosody slides
Sep 29th TTS: Waveform slides
Oct 1st TTS: Waveform2 slides
Oct 3rd TTS: Building voices slides
Oct 6th TTS: Signal Processing slides
Homework2 due before class Monday 20th October
Oct 8th TTS: Evaluation slides
Oct 10th TTS: Talking Heads slides
Oct 13th Multilingual Processing slides
Oct 15th SPICE: Multilingual Processing slides
Oct 17th Mid-semester break no lecture
Oct 20th SPICE: Multilingual Processing (continued) slides
Oct 22nd No lecture
Oct 24th Spoken Dialog Systems: intro slides
Oct 27th Spoken Dialog Systems: components slides
Oct 29th Spoken Dialog Systems: VoiceXML slides
Oct 31st Spoken Dialog Systems: Beyond simple dialogs slides
Nov 3rd Spoken Dialog Systems: Olympus 1 slides
Homework3 due before class Monday 17th November
Nov 5th Spoken Dialog Systems: Olympus 2 slides
Nov 7th Spoken Dialog Systems: Olympus Advanced slides
Nov 10th Spoken Dialog Systems: deployment slides
Nov 12th Speech to Speech Translation: intro slides
Nov 14th Speech to Speech Translation: details slides
Nov 17th Voice Conversion: 1 slides
Nov 19th Voice Conversion: 2 slides
Nov 21st Speaker ID slides
Nov 24th Speaker ID (cont) slides
Homework4 due 3:30pm Monday 8th December
Dec 1st Computer Aided Language Learning slides
Dec 3rd Review slides
Dec 5th Present and Future slides