          Festival Speech Synthesis System version 1.4.1
              (and Edinburgh Speech Tools 1.2.1)
                      30th November 1999

The Festival Speech Synthesis System is a general multi-lingual text to
speech system for Unix platforms. It is written in C++, and includes a
Scheme-based scripting language. Included with Festival are lexicons
and voices that together form a whole text to speech system.

Festival was developed at the Centre for Speech Technology Research,
University of Edinburgh. The system is copyright (C) University of
Edinburgh 1996-1999, all rights reserved. Festival is free software
and is distributed under an X11-type licence allowing unrestricted
commercial and non-commercial use alike.

This release is available from
    ftp://ftp.cstr.ed.ac.uk/pub/festival/1.4.1/
and a US mirror at CMU
    http://www.speech.cs.cmu.edu/festival/download.html

==================
DISTRIBUTION FILES
==================

README-1.4.1.RELEASE
    This file
festival-1.4.1.tar.gz
    The source and Scheme library of the Festival Speech Synthesis
    System.
speech_tools-1.2.1.tar.gz
    The source of some basic library utilities and some basic programs
    used by Festival.
festdoc-1.4.0.tar.gz
    Generated postscript, dvi, info and HTML files for the Festival
    Manual and Speech Tools Manual. This can be built from the source
    but is pre-built for your convenience.
festlex_POSLEX.tar.gz
    Part of speech lexicons and ngram for English. Required by all
    British and American English voices.
festlex_CMU.tar.gz
    CMU dict (0.4) in Festival form, required for American English
    voices.
festlex_OALD.tar.gz
    Computer User's Version of Oxford Advanced Learners' Dictionary of
    Current English, in Festival form, required for British English
    voices.
    **THIS FILE HAS COMMERCIAL RESTRICTIONS**
festvox_rablpc16k.tar.gz
    British English RP male speaker using residual excited LPC diphone
    database,
    requires festlex_POSLEX.tar.gz and festlex_OALD.tar.gz
festvox_rablpc8k.tar.gz
    Same as festvox_rablpc16k.tar.gz but at 8KHz sampling.
    requires festlex_POSLEX.tar.gz and festlex_OALD.tar.gz
festvox_don.tar.gz
    British English RP male speaker using spike excited LPC diphone
    database. The small database gives poorer quality but runs fast.
    requires festlex_POSLEX.tar.gz and festlex_OALD.tar.gz
festvox_kallpc16k.tar.gz
    American English male speaker (KAL) using residual excited LPC
    diphone database,
    requires festlex_POSLEX.tar.gz and festlex_CMU.tar.gz
festvox_kallpc8k.tar.gz
    Same as festvox_kallpc16k.tar.gz but at 8KHz sampling.
    requires festlex_POSLEX.tar.gz and festlex_CMU.tar.gz
festvox_kedlpc16k.tar.gz
    American English male speaker using residual excited LPC diphone
    database,
    requires festlex_POSLEX.tar.gz and festlex_CMU.tar.gz
festvox_kedlpc8k.tar.gz
    Same as festvox_kedlpc16k.tar.gz but at 8KHz sampling.
    requires festlex_POSLEX.tar.gz and festlex_CMU.tar.gz
festvox_ellpc11k.tar.gz
    Castilian Spanish male speaker using residual excited LPC diphone
    database. Requires no further lexicons. Not complete but adequate.
festvox_en1.tar.gz
    British English RP male speaker (same db as rablpc16k) but uses
    MBROLA. This requires the MBROLA program and en1 English database
    available from http://tcts.fpms.ac.be/synthesis/mbrola.html
    This also requires festlex_POSLEX.tar.gz and festlex_OALD.tar.gz
festvox_us1.tar.gz
    American English female speaker using the MBROLA us1 database.
    This requires the MBROLA program and us1 English database (~6Meg)
    available from http://tcts.fpms.ac.be/synthesis/mbrola.html
    This also requires festlex_POSLEX.tar.gz and festlex_CMU.tar.gz
festvox_us2.tar.gz
    American English male speaker using the MBROLA us2 database.
    This requires the MBROLA program and us2 English database (~6Meg)
    available from http://tcts.fpms.ac.be/synthesis/mbrola.html
    This also requires festlex_POSLEX.tar.gz and festlex_CMU.tar.gz
festvox_us3.tar.gz
    American English male speaker using the MBROLA us3 database.
    This is not as good as us1 and us2.
    This requires the MBROLA program and us3 English database (~6Meg)
    available from http://tcts.fpms.ac.be/synthesis/mbrola.html
    This also requires festlex_POSLEX.tar.gz and festlex_CMU.tar.gz

============
NEW FEATURES
============

New in 1.4.1 version (since 1.4.0 20th June 1999)
    * many small bug fixes
    * SSFF track support (for emulabel)
    * updated support for gcc-2.95.[12], gcc-2.7.2.[23], and AIX
    * Support for new JDKs

New in Festival 1.4.0 (since 1.3.1 26th January 1999)
    * distributed under a free X11-type licence
    * generalization of stats modules, ngram, CART, wfst with viterbi
      so they can be shared more easily
    * Tidy up of Utterance/Relation/Item architecture
    * Initial JSAPI support
    * Three new us voices using MBROLA databases
    * Tilt code overhaul
    * XML load for Relations
    * Fringe graphic display (ALPHA) released separately
      http://www.cstr.ed.ac.uk/projects/fringe.html
    * "Building Voices in Festival" document describing process of
      building new voices in the system
      http://www.cstr.ed.ac.uk/projects/festival/docs/festvox

New in Festival 1.3.1 (since 1.3.0 24th August 1998)
    * Many small but important bug fixes
    * egcs-1.1.1 support
    * tobi_rules update (GM)
    * replace readline interface with editline (+ extensions)
    * cluster code tidied up
    * new KAL voice, US male
    * improved ked by power normalization
    * updated lexicons with addenda for US and UK
    * New LTS models for US and UK English

New in Festival 1.3.0 (since 1.2.1 5th October 1997)
    * New (improved) diphone synthesizer
    * Better letter to sound rules (trained) for lexicons
    * New utterance architecture
    * Probabilistic parser
    * Improvements in training tools
    * Improvements in signal processing routines (can now build
      diphone databases from our tools alone)
    * Tilt intonation model support
    * Time and space efficiency
    * Sable/XML markup language support (optionally built in)
    * Basic Java bindings (loading Speech Tools into Java and java
      client access to festival server)
    * Many other fixes

New in Festival 1.2.1 (since
1.2.0 5th September 1997)
    * preliminary Visual C++ support
    * Use path-append rather than string-append (in buckets of places)
    * Minor bug fixes throughout the code (end silences are now
      *always* inserted at end of utterance in tts)
    * Linux socket bug fixed (get_url didn't work)
    * native irix audio support

New in Festival 1.2 (since 1.1.1 release in January 1997)
    * An American English voice, diphones, lexicon (CMU-based),
      durations, f0 etc.
    * New homograph disambiguation: including support for numbers,
      money, roman numerals etc.
    * New POS tagger and Phrase break models
    * Integrated support for OGI's CSLU toolkit
    * Support for TCL and example PERL interface
    * client can now receive text as well as wave data from server
    * Postlexical rules (reduction, etc) and programmable
    * Full control of utterance structure from Scheme
    * Diphone support for consonant clusters and other special phone
      types
    * STML support (still being modified)
    * ToBI-by-rule implementation
    * Castilian Spanish example voice

New in Speech Tools 1.0 (since 0.96.1 in January)
    * Complete overhaul, new names (prefixed by EST_) for most classes
    * Include our own string class, removing dependence on libg++
    * New ngram class with binary file format, much faster to use.
    * Wagon CART builder integrated (and documented)
    * Intonation labelling code: pitch tracker, RFC labeller
    * Native support for 16 bit linear PCM on Suns, FreeBSD and Linux
    * Testsuite to help ensure ports are successful
    * Lots of consolidation of names and methods

Overall
    Should be much more portable to other C++ compilers. The test
    suites should point to problems in ports more easily.

    Windows NT/95 ports now exist using the Cygnus GNU win32 suite,
    and to a certain extent Visual C++. They seem to work but
    probably require more work.

===================
SUPPORTED PLATFORMS
===================

We have successfully tested the system on the following systems.
Support is also available for shared libraries unless otherwise stated.

Sparc Solaris 2.5.1, 2.6, 2.7
    gcc-2.7.2, gcc-2.8.1, sunCC 4.1 (static only),
    egcs-1.1.1, egcs-1.1.2
Sparc SunOS 4.1.3
    gcc-2.7.2
Intel Solaris 2.5.1
    gcc-2.7.2 (shared ?)
Linux (2.0.30) for Intel (RedHat 4.0,4.1,4.2,5.0,5.1,5.2,6.0)
    gcc-2.7.2, egcs-2.90.27, egcs-1.1.1, egcs-1.1.2
FreeBSD 2.2.1 (aout), 3.1 (elf)
    gcc-2.7.2 (static only)
Windows NT 4.0
    Cygnus' gnu-win32-b19 (&20) plus egcs
    Visual C++ V5.0
Windows 95
    Cygnus' gnu-win32-b19 (&20) plus egcs
    Visual C++ V5.0

We only recommend the NT/95 ports to people with significant experience
in writing and installing C++ under those platforms. We do not intend
to distribute binary versions for these platforms in the foreseeable
future, though binary versions of Festival for Windows may become
available with OGI's CSLU toolkit (http://www.cse.ogi.edu/CSLU/).

We are likely to release binary distributions for Sun Sparc Solaris
and Linux on Intel platforms in the very near future.

============
REQUIREMENTS
============

In addition to the above sources and databases you will need

    GNU make
    A C++ compiler

There are other systems and programs that may not be available on your
machine. All other systems that may be needed are available from
    ftp://ftp.cstr.ed.ac.uk/pub/festival/extras/

Previous versions of Festival mistakenly included support for GNU
readline, but due to copyright conflicts it has been removed and
replaced by editline (plus extensions). Most users shouldn't see any
difference.

============
INSTALLATION
============

Unpack speech_tools-1.2.1.tar.gz and festival-1.4.1.tar.gz in a new
directory.
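
When deciding which archives to fetch and unpack, the dependencies
listed under DISTRIBUTION FILES can be written down as a small lookup
table. The following is only a sketch in Python: the names
CORE_ARCHIVES, VOICE_LEXICONS and archives_for are made up for this
illustration and are not part of Festival, and the table simply
restates the "requires" notes from the file list above.

```python
# Illustrative only: these names are hypothetical, not part of Festival.
# The data restates the "requires" notes under DISTRIBUTION FILES.

CORE_ARCHIVES = ["speech_tools-1.2.1.tar.gz", "festival-1.4.1.tar.gz"]

# Lexicon sets shared by the British and American English voices.
# Note that festlex_OALD has commercial restrictions.
_UK = ["festlex_POSLEX", "festlex_OALD"]
_US = ["festlex_POSLEX", "festlex_CMU"]

VOICE_LEXICONS = {
    "festvox_rablpc16k": _UK,
    "festvox_rablpc8k": _UK,
    "festvox_don": _UK,
    "festvox_kallpc16k": _US,
    "festvox_kallpc8k": _US,
    "festvox_kedlpc16k": _US,
    "festvox_kedlpc8k": _US,
    "festvox_ellpc11k": [],  # Spanish voice needs no further lexicons
    # The MBROLA voices also need the MBROLA program and the matching
    # database, which are obtained separately and are not listed here.
    "festvox_en1": _UK,
    "festvox_us1": _US,
    "festvox_us2": _US,
    "festvox_us3": _US,
}


def archives_for(voices):
    """Return the archives to unpack for the given voices.

    The two core source archives come first, then each voice's
    lexicons followed by the voice itself, without duplicates.
    """
    needed = list(CORE_ARCHIVES)
    for voice in voices:
        for package in VOICE_LEXICONS[voice] + [voice]:
            archive = package + ".tar.gz"
            if archive not in needed:
                needed.append(archive)
    return needed


if __name__ == "__main__":
    for archive in archives_for(["festvox_kallpc16k", "festvox_ellpc11k"]):
        print(archive)
```

Asking for an unknown voice name raises KeyError, which is deliberate
in this sketch: it is better to stop than to guess at dependencies.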
See the files called INSTALL in speech_tools/ and festival/

================
MORE INFORMATION
================

The Festival Home Page is
    http://www.cstr.ed.ac.uk/projects/festival.html

This contains information, examples and an on-line demo of the system,
as well as pointers to related research papers and an on-line version
of the manual.

News, including bug fixes, new voices etc. will be posted occasionally
through the file
    ftp://ftp.cstr.ed.ac.uk/pub/festival/1.4.0/LATEST.NEWS

Bugs may be reported to
    festival-bug@cstr.ed.ac.uk
Help may be available through
    festival-help@cstr.ed.ac.uk

If the number of users warrants it, a mailing list will be set up. This
will be announced through the Festival Home Page.

================
ACKNOWLEDGEMENTS
================

The system was primarily written by Alan W Black, Richard Caley and
Paul Taylor. We wish to acknowledge support and contributions from the
following people and organisations.

Alistair Conkie
    various low level code points and some design work,
    Spanish synthesis, diphone module, recording Roger
Steve Isard
    design of diphone schema, old LPC diphone code, and directorship
EPSRC
    who fund awb and pault
Sun Microsystems Laboratories
    For believing in us and their generosity.
AT&T Research Labs
    For providing funding and using our work
Paradigm Assoc. and George Carrette
    for Scheme In One Defun
Simmule Turner and Rich Salz
    for command line editor (editline)
The beta testers
    Thanks for wanting to use the system, you make it worth doing.
    (And thanks for helping me debug my code.) You all responded to
    my requests fast and accurately, thanks, even when I dumped last
    minute changes on you.

See file ACKNOWLEDGEMENTS for full list