We tested CMU Sphinx-4 for automatically transcribing speech in Annodex files. The current approach is not ready for use: the recognition rate of 10-70% is too low, with specialised terms, acronyms and slang causing particular trouble.
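The tests above do not say how the recognition rate was measured. As a rough sketch, one common measure is word accuracy: one minus the word-level edit distance between a reference transcript and the recogniser output, divided by the reference length. The function names and the sample sentences below are hypothetical and not part of Sphinx-4 or Annodex.

```python
def word_errors(reference, hypothesis):
    """Count the minimum word substitutions, insertions and deletions
    needed to turn the hypothesis into the reference (Levenshtein distance)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the distance between the reference words seen so far
    # (one row earlier) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Return word accuracy in [0, 1]; assumes a non-empty reference."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference transcript must not be empty")
    return max(0.0, 1.0 - word_errors(reference, hypothesis) / ref_len)


if __name__ == "__main__":
    # Hypothetical example: two of six words are misrecognised.
    reference = "the annodex file contains cmml markup"
    hypothesis = "the antics file contains camel markup"
    print(f"{word_accuracy(reference, hypothesis):.0%}")  # prints 67%
```

Scoring each test recording this way and comparing the results shows quickly which kinds of vocabulary (acronyms, slang) lower the rate the most.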

Links

CMU Sphinx-4

• Sphinx-4 Project home: http://cmusphinx.sourceforge.net/sphinx4/

Contains many useful links: General Information, Installation, Installation using the Eclipse or NetBeans IDE, Whitepaper, FAQ

• Tutorial: How to write a Sphinx Application

http://cmusphinx.sourceforge.net/sphinx4/doc/ProgrammersGuide.html

• Javadoc Sphinx-4 source:

http://cmusphinx.sourceforge.net/sphinx4/javadoc/index.html

• Sphinx-4 configuration management:

http://cmusphinx.sourceforge.net/sphinx4/javadoc/edu/cmu/sphinx/util/props/doc-files/ConfigurationManagement.html

• Sphinx-4 Wiki:

http://www.speech.cs.cmu.edu/cgi-bin/cmusphinx/twiki/view/Sphinx4/WebHome

• Sphinx Forums on Sourceforge:

http://sourceforge.net/forum/?group_id=1904

• General Sphinx Project home:

http://cmusphinx.sourceforge.net

• Comparison of the different Sphinx versions:

http://cmusphinx.sourceforge.net/html/compare.php

• Language & acoustic models available for Sphinx:

http://www.speech.cs.cmu.edu/sphinx/models/

• Sphinx lexical and language model creation:

http://www.speech.cs.cmu.edu/SLM_info.html

http://www.speech.cs.cmu.edu/tools/factory.html

• Sphinx full training:

http://fife.speech.cs.cmu.edu/sphinxman/fr4.html

http://www.speech.cs.cmu.edu/sphinx/tutorial.html

• Training your own acoustic and language models:

http://www.cs.cmu.edu/~archan/10CommonPitfalls_ST.html

• Sphinx Demos: see README.html files in

http://cvs.sourceforge.net/viewcvs.py/cmusphinx/sphinx4/demo/sphinx/

• The “Live” demo allows the configuration to be changed at runtime:

http://cvs.sourceforge.net/viewcvs.py/cmusphinx/sphinx4/tests/live/

Open Source

• The Annodex project:

http://annodex.net, http://www.annodex.org

• Annodex Content Sites:

http://www.annodex.net/taxonomy_menu/2/27

• CmmlWiki:

http://www.annodex.net/software/cmmlwiki/index.html

• Audacity, a free audio recording tool for Windows, Mac OS X and Linux:

http://audacity.sourceforge.net/?lang=en

• Open Source Speech Recognition Toolkit Julius:

http://julius.sourceforge.jp/en/julius.html

Commercial products

• Nuance Dragon NaturallySpeaking:

http://www.nuance.com/naturallyspeaking/

• Nuance ViaVoice:

http://www.nuance.com/viavoice

• Products based on IBM technology:

http://www.wizzardsoftware.com

• Microsoft Speech Recognition:

http://www.microsoft.com/speech

• Speech in MS Vista:

http://msdn.microsoft.com/msdnmag/issues/06/01/speechinWindowsVista/

More Resources

• SpeechTEK Conference:

http://www.speechtek.com

• Collection of links and articles:

http://speech.even-zohar.com/

• Hidden Markov Model Toolkit:

http://htk.eng.cam.ac.uk/

• Wikipedia on Speech Recognition:

http://en.wikipedia.org/wiki/Speech_recognition

• Speech Wiki:

http://speechwiki.org