Cool Mac Gear



Sphinx: Open Source Speech Recognition In Java

Carnegie Mellon University has developed a cross-platform, Java-based freeware speech recognition application called Sphinx, which is offered under an Open Source BSD license.

Sphinx can reportedly be used for general purpose speech recognition tasks, or be integrated with a particular application. There are several freely downloadable versions available, Sphinx-4 being the version written entirely in Java, plus PocketSphinx - a "lite" version designed to run in real time on handhelds or integrated with live applications.

Sphinx is a general term describing a group of speech recognition systems developed at Carnegie Mellon University. For Mac users interested in trying Sphinx for their own use, Sphinx-4, the all-Java version, appears to be the way to go.

Be forewarned, however, that Sphinx is far from being a mature application, and a substantial degree of user savvy will be required to get it up and running on your Mac. As the developers emphasize, "Those with a certain level of expertise can achieve great results with the versions of Sphinx available, but a naive user will certainly need further help. In other words, the software available... is not meant for users with no experience in speech, but for expert users."

The project's SourceForge site hosts fora for bug tracking and discussion. Go there for help, to ask questions, to report bugs, and to see the latest work. The work is currently pre-version 1.0, so there is a lot yet to be done.

There is also an IRC channel (#cmusphinx on irc.freenode.net) for real-time discussion.

The Sphinx Group at Carnegie Mellon University is committed to releasing the long-running, DARPA-funded Sphinx projects widely, in order to stimulate the creation of speech-using tools and applications, and to advance the state of the art both in speech recognition itself and in related areas, including dialog systems and speech synthesis.

Sphinx has been under open source development since 2000, when the Sphinx group at Carnegie Mellon committed to open sourcing several speech recognizer components, including Sphinx 2 and later Sphinx 3 (in 2001). More generally, CMU Sphinx refers to a group of open source systems related to speech recognition. The available resources also include software for acoustic model training and language model compilation, plus a public-domain pronunciation dictionary, cmudict.

The Sphinx Group has been supported for many years by funding from the Defense Advanced Research Projects Agency, and the recognition engines to be released are those that the group used for the various DARPA projects and their respective evaluations.

The licensing terms for the Sphinx engines and tools are derived from BSD, and based, in particular, upon the license for the Apache web server. There is no restriction against commercial use or redistribution. See the project Web page for details and instruction manuals.

CMU Sphinx is, as far as I know, the only open source, large vocabulary, continuous speech recognition project that consistently releases its work under the liberal BSD license, and definitely the only one that can work with the Mac.

Platforms supported
* GNU/Linux, Unix variants, and Windows NT or later

For more information, visit:
http://ostatic.com/173386-software-opensource/sphinx

You can download a Sphinx package here:
http://cmusphinx.sourceforge.net/html/cmusphinx.php#download
