We tested CMU Sphinx-4 for automatically transcribing speech in Annodex files. The current approach is not ready for use: the recognition rate of 10-70% is too low, and specialist terms, acronyms and slang cause the most trouble.
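The notes above do not say how the recognition rate was measured. As a minimal sketch, assuming it is scored by comparing a hand-made reference transcript with the recognizer output word by word, the usual word error rate can be computed as below. The function names `word_errors` and `word_error_rate` are hypothetical and are not part of Sphinx-4.

```python
def word_errors(reference, hypothesis):
    """Return the word-level edit distance between two transcripts."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the edits needed to turn the reference words seen
    # so far into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from output
                curr[j - 1] + 1,     # extra word in output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Return word errors divided by the number of reference words."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference transcript is empty")
    return word_errors(reference, hypothesis) / ref_len
```

For example, `word_error_rate("play the annodex file", "play the annex file")` counts one substituted word out of four. Under this assumption a "recognition rate" would correspond roughly to one minus the word error rate, although the two are not identical when the recognizer inserts extra words.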
Links
CMU Sphinx-4
• Sphinx-4 Project home: http://cmusphinx.sourceforge.net/sphinx4/
contains many useful links: General Information, Installation, Installation using IDE Eclipse or NetBeans, Whitepaper, FAQ
• Tutorial: How to write a Sphinx Application
http://cmusphinx.sourceforge.net/sphinx4/doc/ProgrammersGuide.html
• Javadoc Sphinx-4 source:
• Sphinx-4 Configuration management:
• Sphinx-4 Wiki:
http://www.speech.cs.cmu.edu/cgi-bin/cmusphinx/twiki/view/Sphinx4/WebHome
• Sphinx Forums on Sourceforge:
• General Sphinx Project home:
• Including a comparison of the different Sphinx versions
• Language & acoustic models available for Sphinx:
• Sphinx lexical and language model creation:
http://www.speech.cs.cmu.edu/SLM_info.html http://www.speech.cs.cmu.edu/tools/factory.html
• Sphinx full Training: http://fife.speech.cs.cmu.edu/sphinxman/fr4.html
• Efforts at training custom acoustic and language models:
• Sphinx Demos: see README.html files in
http://cvs.sourceforge.net/viewcvs.py/cmusphinx/sphinx4/demo/sphinx/
• The “Live” demo allows the configuration to be changed at runtime:
http://cvs.sourceforge.net/viewcvs.py/cmusphinx/sphinx4/tests/live/
Open Source
• The Annodex project:
• Annodex Content Sites:
• CmmlWiki:
• Audacity free audio recording tool for Windows, Mac OS X and Linux:
• Open Source Speech Recognition Toolkit Julius:
Commercial products
• Nuance Naturally Speaking:
• Nuance ViaVoice:
• Products based on IBM technology:
• Microsoft Speech Recognition:
• Speech in MS Vista:
http://msdn.microsoft.com/msdnmag/issues/06/01/speechinWindowsVista/
More Resources
• SpeechTEK Conference:
• Collection of links and articles:
• Hidden Markov Model Toolkit:
• Wikipedia on Speech Recognition:
• Speech Wiki: