SUBSIM Radio Room Forums - voice recognition?

Page 2 of 3


SUBSIM Radio Room Forums (https://www.subsim.com/radioroom/index.php)

- SH4 Mods Workshop (https://www.subsim.com/radioroom/forumdisplay.php?f=219)

- - voice recognition? (https://www.subsim.com/radioroom/showthread.php?t=119039)

minsc_tdp

07-25-07 11:21 AM

Thanks for the encouragement. I'll be checking this thread periodically. I might start a different one when the first version is released, but will also notify here.

I've finished a reference file with all of the keyboard shortcuts that I will use, next is the mouse coordinate work. Here's the key file for anyone interested:

http://knepfler.com/sh4speech/commands.csv

corleonedk

07-25-07 12:34 PM

Well, you will have a fan here, that's for sure. Keep up the work :up:

minsc_tdp

07-25-07 03:02 PM

There is one thing y'all could do to help. The only thing I'm skeptical on is how accurate the voice recognition will really be after it is fully trained. If you google the Microsoft Speech SDK 5.1, download that, and inside you'll find reco.exe and talkback.exe. This is a basic recognition example that I'll be using the core of. To use reco.exe, you just click the checkboxes called "Create Recognition Context", "Load Dictation" and "Activate Dictation". Talkback.exe you just run. Throw some commands at it and see how it does. Then, go into your control panel / Speech and train it. I'd suggest to do the initial session and then the additional training.

In Control Panel / Speech, you might have a 5.1 Recognizer and a 6.1 Recognizer. Honestly I'm not sure which one these utilities (and mine) will use. If anyone can help me figure that out, great. Otherwise I'll probably end up recommending that people just train both fully (can't hurt) unless I stumble on the answer.

So that's what I'll be doing tonight mostly, training training training to see how accurate I can get it.

If it always hears the word "bananaphone" when you say "gyroscope", it's not too big of a deal since you will be able to just add that as an alias. But if it comes up with 5 or 10 different things every time you say the same word, that's going to be harder for users to deal with. Ideally we want it to only hear the word you say or, at the very least, a "consistent mistake".
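The alias idea could look something like the sketch below. This is only an illustration and not the actual script: the table contents, the command list, and the `resolve` helper are all made-up names. A "consistent mistake" gets one alias entry, while anything that is neither a known command nor an alias is rejected.

```python
# Hypothetical alias table: what the recognizer consistently mishears -> what was meant.
ALIASES = {
    "bananaphone": "gyroscope",
}

# Hypothetical set of commands the tool understands.
KNOWN_COMMANDS = {"gyroscope", "torpedo", "rudder"}


def resolve(heard):
    """Return the intended command for the recognized text, or None if unknown."""
    text = heard.strip().lower()
    # Swap a known consistent mistake for the word the user actually said.
    text = ALIASES.get(text, text)
    return text if text in KNOWN_COMMANDS else None


if __name__ == "__main__":
    for heard in ("Bananaphone", "gyroscope", "pineapple"):
        print(heard, "->", resolve(heard))
```

A table like this only helps with consistent mistakes. If the same spoken word comes back as 5 or 10 different things, every variant would need its own entry.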

Let me know what y'all find out, I'll do the same! I think I need a new headset/mic though, mine kind of sucks :)

UPDATE: I went and bought a cheap mic at Guitar Center (which means a $100 microphone) to eliminate all bad sound input and ensure that the Microsoft SAPI is working to its full potential. I've done three training sessions, plus the intro, so I've been training it for 40 straight minutes. It's not bad, but it has trouble with certain terms. "Rudder amidships" is damn near impossible for it to figure out; a lot of the trouble is with these short phrases. Instead of saying "bridge" it's more likely to understand you if you say "let's go to the bridge". This might not be a bad thing, as more natural speech would be more fun than belting out single words and very short phrases. It also has a lot of trouble with "port" and "starboard"... which really sucks. I'd like to hear how other people's speech recognition fares.

panthercules

07-25-07 11:26 PM

Wow - with just the initial training plus one more module (Aesop's Fables), this really stinks at recognizing my voice commands - both in an absolute sense of thinking I'm saying something else, and in the sense of not being very consistent with what it thinks I'm saying from one time to the next (when I repeat the same commands).

I'll do some more training and see if it gets better - seems I was having much better luck with Shoot back in the SH3 days - of course, there I had a particular, limited set of commands I gave it and was basically manually telling it what I was saying, but once I had done that it did a great job of recognizing it correctly when I said the same thing again.

[edit] Fascinating - when I went back in and just used the reco.exe this time (I was using both it and the talkback one before), before I did any more training, I was getting much more accurate and more consistent results - not perfect, but a lot better. Weird - gonna do some training and see what happens.

[edit] After further training, it seems to be doing better, but it still seems to be suffering from the method/concept that it needs to be trying to recognize a virtually infinite number of things I might be saying out of a virtually infinite number of possible things to say (i.e., the entire English language), and thus is trying to use logical/grammatical algorithms to try to "make sense" out of what it is hearing. This has got to be much tougher, and more prone to errors (particularly when what's being said are short nautical/military phrases that aren't what "normal" people would say) than the approach I believe Shoot (and some other/older speech recognition applications) takes. In that approach, you give it a limited set of commands (probably not more than 20 or 30, since at least for me that's about all I can remember, although you could always create a crib sheet list of commands to help you remember more I guess), and all it has to do is try to tell which one of those commands it's hearing. Since it's only dealing with a very limited set of possibilities and you've basically trained it by speaking those commands and telling/confirming that it's heard you correctly, the recognition accuracy for any given level of computing power can be extremely high compared to trying to do this "figure out what I might be saying out of all possible combinations in the English language" approach this reco.exe thing seems to be using.

Is there some way to give this reco.exe thing this sort of limited vocabulary/command set, and then test it to see if it's working well?

CaptainKobuk

07-26-07 01:09 AM

Game Commander 3 is what I've been using for years, especially on games like this involving a crew. It's an older program that's not being updated anymore but it still works flawlessly in all games.

I've tried "SHOOT" and its voice recognition was very spotty in my opinion. But that was years ago. It could be fine now.

Being set free of a keyboard greatly relaxes gaming in my opinion.

CaptainCox

07-26-07 01:14 AM

My wife would divorce me for real if I started to talk to my PC. She thinks I am nuts with my "PC" love affair as it is anyway.

CaptainKobuk

07-26-07 01:24 AM

At 3am I do wonder what my neighbors think.

"Dive dive dive!!!"

They gotta wonder what is going on eh? :rotfl:

skwasjer

07-26-07 03:05 AM

Nothing compared to my Quake III addiction years back. That was swearing and yelling pretty much non-stop every night on TeamSpeak. :roll:

minsc_tdp

07-26-07 03:23 AM

Panther, you've nailed the problem with generalized recognition (reco.exe method) exactly. I've been mulling this since the start. The SAPI5 SDK has a system where you can supply a limited set of words/phrases that it should compare against and I believe it will do exactly what you're talking about. Unfortunately, the SAPI5 SDK is extremely complicated and convoluted.

I thought it was just my amateur perl programming skills, but I talked to a highly paid programmer/co-worker who groaned at the mention of the SAPI5 SDK. But what you're talking about may be exactly what needs to be done.

I'm going to try to recruit his help but failing that, I might try to muscle through it using a language close to the provided samples - Visual Basic (ugh) - and with the proper IDE helping with object methods and whatnot, it should be a little easier. So all this little utility would do is try to match a phrase from a list, and when it matches, returns it to my perl script where I can do the remainder of the heavy lifting with ease.

If anyone wants to tackle this part, chime in!

kaliber7

07-26-07 01:06 PM

I have been using 'Voice Buddy 3.0' for both SH3 and SH4 and have had no problems at all. The mouse-click commands can be key-mapped and these mapped keys can be used in the voice profile and trained.

I only have to train each command 2-3 times and the more I use each command within the game, the easier it becomes.

panthercules

07-26-07 10:36 PM

Quote:

Originally Posted by kaliber7

Based on my experience with Shoot in SH3, I'm convinced that it could be used here just fine to handle any of the game commands that can already be mapped to key strokes. I haven't played around with setting up a profile in Shoot to do that yet, as I have been waiting to get the patch and mods situation stabilized and really start playing. The fascinating part of this thread has been the idea that you might be able to use voice commands to do things that, as far as I know, have to be done by mouse in game - principal ones being changing course (and actually, since I always play with the mouse rather than the keyboard whenever possible, I haven't yet really studied the key map capabilities of the game to know what can be mapped to keys and what can't) - are there any course change commands at all that can have keys assigned to them? (besides rudder settings, which I assume can be to some degree at least). What about depth?

The more I thought about what I posted above about unlimited/natural language recognition versus limited specified-command recognition, I started getting worried because I couldn't really see any good way other than natural language approach to be able to do the course and depth changes stuff. However, then it dawned on me that maybe you could set it up so you would say the commands one number at a time (which I've heard done actually) - like "make your course one five zero", or "make your depth two one five" - that way you'd only have to specify the ten numbers 0-9, rather than every number between 0 and 359 (or more for depth).

Thus, I think the specified command recognition method could probably work as far as getting your commands recognized - the tricky part to me would be how to translate the fact that the program recognized that you said "make your course one five zero" into some form of instruction to the game to actually set your course to 150 degrees unless that can somehow be set to equate to a key stroke/combination. How does that Voice Buddy thing go about mapping the mouse clicks to keys? How does it know how to tell SH4 that you want SH4 to do something that it doesn't have a key assigned to do?
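The recognition half of this can be sketched in a few lines of Python. This is only an illustration under some assumptions: the "make your course/depth" wording is taken from the post above, while the `parse_order` name, the 0-359 course check, and accepting "niner" for nine are choices made for the example. It turns a recognized phrase into a number and deliberately stops short of the hard part, which is passing that number to the game.

```python
# Spoken digit -> value. "niner" is accepted as an alternative for nine.
DIGITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9, "niner": 9,
}


def parse_order(phrase):
    """Turn 'make your course one five zero' into ('course', 150).

    Returns None when the phrase is not a recognizable course/depth order.
    """
    words = phrase.lower().split()
    if len(words) < 4 or words[:2] != ["make", "your"]:
        return None
    kind = words[2]
    if kind not in ("course", "depth"):
        return None
    value = 0
    for word in words[3:]:
        if word not in DIGITS:
            return None
        # Build the number one spoken digit at a time.
        value = value * 10 + DIGITS[word]
    if kind == "course" and value > 359:
        return None  # a compass course only runs from 0 to 359
    return kind, value


if __name__ == "__main__":
    print(parse_order("make your course one five zero"))
    print(parse_order("make your depth two one five"))
```

Only the ten digit words (plus "niner") need to be recognized reliably for this to cover every course and depth.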

minsc_tdp

07-27-07 12:52 AM

Quote:

Originally Posted by kaliber7

Panther is right. Let me know when you can key-map "gyroscope right 5 degrees" :D Moving the mouse has already been tested and works. Moving dials is trivial.

Quote:

However, then it dawned on me that maybe you could set it up so you would say the commands one number at a time (which I've heard done actually) - like "make your course one five zero", or "make your depth two one five" - that way you'd only have to specify the ten numbers 0-9, rather than every number between 0 and 359 (or more for depth).

That's what I've had in mind the whole time. However, since strings of words must be recognized as a known string of words, I might have to plug in all 360 spoken degree combinations. But that's not a big deal. If not you just have to speak a little slower so it recognizes every word - the recognition API tends to wait before deciding on a match as it uses strings of words to make that determination. That means you'd have to pause, so putting in all the combinations would help recognition a lot.
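Plugging in all 360 combinations does not have to be done by hand. Here is a rough sketch of generating them, assuming courses are always spoken as three digits with leading zeros ("zero zero five") and using the "make your course" wording from the quote. The function names are made up for the example.

```python
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def spoken_course(degrees):
    """Spell a course digit by digit, e.g. 150 -> 'one five zero'."""
    # Assumption: always three digits, so 5 becomes 'zero zero five'.
    return " ".join(DIGIT_WORDS[int(d)] for d in "%03d" % degrees)


def course_phrases(prefix="make your course"):
    """Map every full spoken phrase to its course in degrees (0-359)."""
    return {"%s %s" % (prefix, spoken_course(d)): d for d in range(360)}


if __name__ == "__main__":
    phrases = course_phrases()
    print(len(phrases), "phrases generated")
    print(phrases["make your course one five zero"])
```

The keys of the resulting dictionary would be the phrase list handed to the recognizer, and the values say which course each phrase stands for once it matches.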

So, for those who are hoping for voice control over more than just key-mappable functions, I have good news. I've adapted a Python script I found and it works flawlessly. As panther suggested, getting SAPI to focus on trying to match only certain words makes a world of difference, versus the entire English dictionary. Here's an example - as soon as I got it working, I entered these words into the list and tried it - it nailed every word the first time without repeating.

C:\sh4speech>python hear.py
Zero
One
Two
Three
Four
Five
Six
Seven
Eight
Nine
Niner
Gyroscope
Torpedo
Rudder
Right
Left
Minsc rules
Shoot drools

C:\sh4speech>

Play with it here if you like:
http://knepfler.com/hear.py

You'll probably want ActivePython, which includes the necessary win32 extensions, and possibly wxPython as well.

I also worked today on my long forgotten middle school geometry. To move a dial 3 degrees, figuring out all those from-to pixels on multiple resolutions would be a pain. Now, all I need to define for each resolution is the x/y coordinate of the center of the dial, and the radius in pixels. With the angle(a) you supply, radius times sin(a) and radius times cos(a) give me the horizontal and vertical pixel offsets from the center to the target point. Then I just click at top dead center, hold, move to the new coordinates, and release. Cool eh? I had forgotten how useful basic math could be. :)
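For anyone who wants to see the dial math spelled out, here is a minimal sketch. It assumes the angle is measured clockwise from top dead center and that screen y coordinates grow downward. The function names and the example center/radius are made up, and no actual mouse movement is performed here.

```python
import math


def dial_point(cx, cy, radius, angle_deg):
    """Pixel on the dial rim at angle_deg, measured clockwise from the top."""
    a = math.radians(angle_deg)
    x = cx + radius * math.sin(a)   # horizontal offset from the center
    y = cy - radius * math.cos(a)   # screen y grows downward, so subtract
    return int(round(x)), int(round(y))


def drag_path(cx, cy, radius, angle_deg):
    """Start and end pixels for a click-hold-drag from top dead center."""
    return dial_point(cx, cy, radius, 0), dial_point(cx, cy, radius, angle_deg)


if __name__ == "__main__":
    # Hypothetical dial: centered at (100, 100) with a 50 pixel radius.
    print(drag_path(100, 100, 50, 3))
    print(drag_path(100, 100, 50, 90))
```

With this, each screen resolution only needs three numbers per dial: center x, center y, and radius.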

kaliber7

07-27-07 02:28 AM

As far as diving to any particular depth is concerned, I just use the command 'Dive the boat', keep my eye on the depth gauge and when approaching the required depth, give the command 'Maintain depth'.

As far as changing course is concerned, I use the command/s 'Hard to port/starboard' and when approaching the required direction, I give the command 'Rudder amidships'. Or, if I have a course plotted on the map and have strayed from it during combat, I use the command 'Return to course', once the combat situation has been resolved successfully.

Presently, I have not been able to find a way to map keys to give specific depths or compass points (using Voice Buddy 3.0), but I hope to eventually solve that problem some time in the not too distant future.

minsc_tdp

07-27-07 07:48 PM

Quote:

Originally Posted by kaliber7

You make some very good points here. You're probably right that, with voice, the dive/maintain method is perfectly fine. I have to admit that perhaps the voice control of precision maneuvering, gyroscope rotation, AOB, etc. might be a bit overkill. I recall reading in some manual somewhere that you can be more easily detected if you turn your rudder too fast hard to port, and small steps were encouraged when silently avoiding a DD. Anyone know if this is true in SH4?

This conversation about key vs. mouse stuff has been very helpful, thanks a lot!

I've decided I will focus first on version 1.0 which will strictly do keyboard stuff, and consider whether to do 1.3 (lol) which would include precision dial controls. I really like the idea of setting up a torpedo shot vocally, and tweaking my rudder a few degrees vocally, as some of the most tense moments in the sub movies were during these moments.

minsc_tdp

07-27-07 09:57 PM

Panthercules, I'm very curious how well hear.py works for you. Can you give it a try for me?

All times are GMT -5. The time now is 10:21 PM.
