Daily Archives: March 15, 2005


Voice control 3

A good number of people who were at the meeting yesterday (14th March 2005, for those reading this some time after posting) will no doubt have discussed this in the break. It got me thinking a bit, since IBM ViaVoice has been discontinued on the Linux platform and I couldn’t think of any current projects in the field. I had a vague feeling in the back of my mind that I had read something related to one of the desktops, probably Gnome as that’s the one I’ve had most interest in (although I use XFCE myself), but this could easily have been a discussion that there should be something!

Anyway, I couldn’t imagine that there was nothing at all on the subject, even if there was nothing usable, so I’ve delved into Google and my bookmarks and come up with a few useful links – and since I said I’d post anything I found on the site, here’s a new forum as well.

OK, starting with the more generic stuff:

First up there’s an old article in the Linux Gazette, although this doesn’t get into anything technical and is far too high level to be of any real use: http://linuxgazette.net/issue87/lodato.html

There’s also an article in Linux Journal on using ViaVoice with XVoice. I’ve not read it yet, but since ViaVoice is no longer available it seems of little use: http://www.linuxjournal.com/article/6383

There’s some talk of integrating ViaVoice with KDE as well, but so far I’ve not found more than some basic discussion of whether it is a good idea. I’ve also come across comments on GVoice for Gnome, but nothing particularly useful on it yet, and I think it is basically dictation based.

There’s a links page with various speech related sites, some of which no longer exist, here: http://www.linux-sound.org/speech.html (not all voice recognition though, much is synthesis).

Getting into the more specific implementation based sites, I have the already mentioned XVoice:

http://www.zachary.com/w/XVoice
http://xvoice.sourceforge.net/

This looks to me more dictation based, unfortunately, and also relies on ViaVoice.

The Open-Source Speech Recognition Initiative site looks pretty dead, but the list appears to still be active and may be worth a look: http://www.ossri.org/

The FreeSpeech project has renamed itself to Open Mind Speech and looks promising, but is still in the fairly early stages of development: http://freespeech.sourceforge.net/

There’s a site on Automated Speech Recognition that looks to be research based with some code available, although I’ve not quite managed to get my head around exactly what is going on there yet: http://www.isip.msstate.edu/projects/speech/software/index.html

CVoiceControl appears to have taken over from KVoiceControl and then stalled, and is now looking for someone to take over the project: http://www.kiecza.net/daniel/linux/index.html

There are a couple of sites on CMU Sphinx, which looks interesting, but I’m not sure whether it is able to work with desktop/application control or not – it probably depends how much development work you’re willing/able to put in. There are two links:

http://cmusphinx.sourceforge.net/html/cmusphinx.php
http://www.speech.cs.cmu.edu/sphinx/Sphinx.html

Most promising of the lot looks to be PerlBox, which acts as a front end to the above CMU Sphinx system (amongst others) and, going by a reference article, looks to be able to control the desktop to some extent with PerlBox Voice. It is customisable, but looks to be mainly application launching based, so I’m not sure what is involved in getting more application control. It also looks to be of most use if you are using KDE.
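To give a rough idea of what "application launching based" voice control boils down to, here is a minimal Python sketch. It assumes some recogniser has already turned speech into a text phrase, and simply maps that phrase to a program to start. It is not PerlBox's code or configuration format: the phrases, the programs and the function names `resolve` and `launch` are all made up for illustration.

```python
import shlex
import subprocess

# Hypothetical phrase-to-command table. These phrases and programs are
# illustrative assumptions only and are not taken from PerlBox.
COMMANDS = {
    "open terminal": "xterm",
    "open browser": "firefox",
    "open editor": "gedit --new-window",
}


def resolve(phrase, table=COMMANDS):
    """Return the argv list for a recognised phrase, or None if unknown."""
    # Normalise case and whitespace so "  Open   Terminal " still matches.
    key = " ".join(phrase.lower().split())
    command = table.get(key)
    return shlex.split(command) if command else None


def launch(phrase, table=COMMANDS):
    """Start the program mapped to the phrase; return True if one was started."""
    argv = resolve(phrase, table)
    if argv is None:
        return False
    subprocess.Popen(argv)
    return True
```

Calling `launch("open terminal")` would start xterm if it is installed, while an unknown phrase is simply ignored. Anything beyond launching (driving menus or typing into a running application) needs a good deal more than a lookup table like this, which is presumably where the extra work would come in.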

Hopefully the above links will be a good starting point for further investigation. I’ve not delved far into any of them yet, but given time (more elusive than the Scarlet Pimpernel, that commodity!) I may. I’ve just got to get sound working on my system first. I’m afraid I’ve never seen it as a high priority and I’m not a boxed Linux user, so it’s not thrown on by default – my systems are mainly CLI only or a somewhat customised Debian desktop!