The CMU Sphinx Group Open Source Speech Recognition Engines




Check out the code:

https://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/trunk/cmuclmtk
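A minimal sketch of the checkout, assuming a Subversion client (`svn`) is installed. The local directory name `cmuclmtk` and the `DRY_RUN` switch are illustrative choices, not part of the toolkit; by default the command is only printed so it can be inspected first.

```shell
REPO_URL="https://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/trunk/cmuclmtk"
WORK_DIR="cmuclmtk"   # local directory name; an arbitrary choice

# With DRY_RUN=1 (the default here) the command is printed, not executed.
checkout_cmuclmtk() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "svn checkout $REPO_URL $WORK_DIR"
    else
        svn checkout "$REPO_URL" "$WORK_DIR"
    fi
}

checkout_cmuclmtk
```

Setting `DRY_RUN=0` before calling `checkout_cmuclmtk` performs the actual checkout.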

The repository also contains prebuilt binaries for Windows and Linux. They are stored by OS, in bin/x86-nt/ (Windows) and bin/x86-linux/ (Linux).
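To pick the matching binary directory from a script, something like the following could work. It is only a sketch: the function name `clmtk_bin_dir` is hypothetical, and it assumes that `uname -s` reports a name starting with `Linux`, `CYGWIN`, `MINGW`, or `MSYS` on the machines in question.

```shell
# Map a kernel name (as printed by `uname -s`) to the binary directory named above.
clmtk_bin_dir() {
    case "$1" in
        Linux*)               echo "bin/x86-linux" ;;
        CYGWIN*|MINGW*|MSYS*) echo "bin/x86-nt" ;;
        *)                    echo "unsupported platform: $1" >&2; return 1 ;;
    esac
}

# Resolve the directory for the current machine, if it is a supported one.
if BIN_DIR=$(clmtk_bin_dir "$(uname -s)"); then
    echo "prebuilt binaries: $BIN_DIR"
fi
```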

Configure and compile on Linux:

A 32-bit word ID space is now the default when building with configure.
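For a sense of scale, the arithmetic below compares how many distinct word IDs fit in 16 and 32 bits. It assumes unsigned integer IDs and assumes the earlier width was 16 bits; this page does not state the previous width, so treat the comparison as illustrative. The helper name `max_word_id` is made up for this example.

```shell
# Largest value an unsigned integer of the given bit width can hold.
max_word_id() {
    echo $(( (1 << $1) - 1 ))
}

echo "16-bit word IDs go up to: $(max_word_id 16)"
echo "32-bit word IDs go up to: $(max_word_id 32)"
```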

aclocal && autoheader && automake --add-missing --copy && autoconf
./configure CFLAGS="-g -Wall -O0" CXXFLAGS="-g -Wall -O0"
make && make install
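The command chain above stops at the first tool that is missing, so it can help to check for the tools up front. This is a small sketch, not part of the toolkit; `missing_tools` is a hypothetical helper that relies only on the standard `command -v` lookup.

```shell
# Print each command from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# The tools invoked by the build commands above.
MISSING=$(missing_tools aclocal autoheader automake autoconf make)
if [ -n "$MISSING" ]; then
    echo "install these before building:" $MISSING
else
    echo "all build tools found"
fi
```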

Configure and compile for Windows:

You should do your development under Cygwin, since the build is autotools-based (the makefiles are generated by automake and configure). Be sure to download the developer and MinGW components. Compile with MinGW: this gives you Windows-native binaries, which are therefore portable from computer to computer.

aclocal && autoheader && automake --add-missing --copy && autoconf
./configure --enable-mingw CFLAGS="-g -Wall -O0" CXXFLAGS="-g -Wall -O0"
make && make install


For Development on Linux

  1. build with autoconf
  2. commit
  3. do an update from a Windows box with Cygwin
  4. compile under Cygwin
  5. commit again

For Development on Windows

  1. build with Cygwin
  2. commit
  3. do an update on a Linux box
  4. build with autoconf
  5. commit again

If you're working on a filesystem shared by Linux and Windows, you can skip the middle commit and update steps.
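The two checklists, including the shared-filesystem shortcut, can be summarized as a dry-run script. This is only a sketch: `dev_cycle` and `SHARED_FS` are hypothetical names, `make` stands in for whichever build commands apply on each platform, and nothing is executed; the steps are just printed in order.

```shell
# Print the steps of one cross-platform round trip, starting on the given platform.
# Set SHARED_FS=1 to skip the middle commit and update.
dev_cycle() {
    case "$1" in
        linux)   first="Linux";          second="Windows/Cygwin" ;;
        windows) first="Windows/Cygwin"; second="Linux" ;;
        *)       echo "usage: dev_cycle linux|windows" >&2; return 1 ;;
    esac
    echo "[$first] make"
    if [ "${SHARED_FS:-0}" != "1" ]; then
        echo "[$first] svn commit"
        echo "[$second] svn update"
    fi
    echo "[$second] make"
    echo "[$second] svn commit"
}

dev_cycle linux
```

Running `SHARED_FS=1 dev_cycle windows` prints the shortened three-step sequence for a shared filesystem.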

Future Release Plans

There has not been an official release of the toolkit since the one put out by Cambridge. We intend to make one very soon. Since the toolkit itself has not been fundamentally updated aside from the 32-bit word ID change, the upcoming release can be considered version 2.1 of the toolkit. Nonetheless, it will include several significant additions:

  • Perl scripts for easy language model construction from various source texts (DONE)
  • Dictionary building support for English using Festival (DONE)
  • Chinese segmentation
  • Chinese pronunciation generation

Subsequent releases will contain more fundamental changes to the toolkit. Mainly, we intend to add support for modified Kneser-Ney smoothing.

CMUCLMTKDevelopment (last edited 2009-10-30 19:33:23 by DavidHugginsDaines)

This page is maintained by David Huggins-Daines
CMUSphinx is a project within the Sphinx Group at Carnegie Mellon