SUBSIM Radio Room Forums (https://www.subsim.com/radioroom/index.php)
-   SH4 Mods Workshop (https://www.subsim.com/radioroom/forumdisplay.php?f=219)
-   -   voice recognition? (https://www.subsim.com/radioroom/showthread.php?t=119039)

minsc_tdp 07-22-07 04:06 PM

Voice Command development underway
 
I'm thinking of working on a voice recognition mod for SH4. As with many projects I consider starting, I think it would be pretty easy and have a rough idea of how to do it, but I'd first like to get thoughts from people here.

It would probably be a Perl app, since that's the only language I know, but I know it well and I know how to use the Win32 API with it. I'd take a similar approach to Enigma: use the existing voice recognition libraries in Windows and translate the recognized phrases into keystrokes or other events. Things like "rudder right 10 degrees" and other mouse-click-only actions with no equivalent key might be tricky.

I'd be doing this mostly for myself but I'd like to hear if others would be interested, and whether folks here might help if I run into trouble?

skwasjer 07-22-07 04:22 PM

There are already some tools out there. One I've been using for SH3 pretty much since I found it is called Shoot.

You can probably just use the same app (it's not related to SH, just a generic speech-to-key-command app), but you'd have to modify the config files that map the words/commands to keys. I haven't gotten to it yet, but I have thought about looking into this for SH4...

See here (or Google):
http://forums.ubi.com/eve/forums/a/t...7401057692/p/1

Digital_Trucker 07-22-07 06:48 PM

I, too, used Shoot quite a bit when I played SH3. It is a nice little prog for the price ($0). As soon as I get the keyboard commands set up to my liking in SH4, I'll be doing a profile and commands.cfg setup for it.

If you download it, make sure to download the profile editor that I wrote for it, too. It will save you tons of time working with the XML profiles.

Both the voice command program and the editor are available at http://clans.gameclubcentral.com/shoot/

minsc_tdp 07-22-07 07:34 PM

I get a crash trying to load a simple profile with one command in Shoot:

An unexpected error occurred. Full error message:
Object reference not set to an instance of an object:
at shoot.config.Configuration.load(String filename)
at shoot.MainForm.loadProfile(String file)
at shoot.MainForm.openDialog_FilOk(Object sender, CancelEventArgs e)
at System.Windows.Forms.FileDialog.OnFileOk(CancelEventArgs e)
at System.Windows.Forms.FileDialog.DoFileOk(IntPtr lpOFN)

Any help with that?

In the meantime, I've been experimenting with sending mouse input and keystrokes to SH4. Both work well. Here's an example of keyboard input:

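use strict;
use warnings;
use Win32;
use Win32::API;

# keybd_event(bVk, bScan, dwFlags, dwExtraInfo) from user32.dll
my $keybd_event = Win32::API->new("user32", "keybd_event", [qw(I I N P)], 'V')
    or die "keybd_event: " . Win32::FormatMessage(Win32::GetLastError());

# Virtual-key codes and the matching hardware scan codes
use constant F1    => 0x70;
use constant SC_F1 => 0x3B;
use constant X     => 0x58;
use constant SC_X  => 0x2D;
use constant C     => 0x43;
use constant SC_C  => 0x2E;

# keybd_event flags. EXTENDEDKEY is only for extended keys (arrow keys etc.),
# so it is left off for plain letter keys like C.
use constant KEYEVENTF_EXTENDEDKEY => 0x0001;
use constant KEYEVENTF_KEYUP       => 0x0002;

while (1) {    # keep pressing the key so I can switch to SH4
    sleep(2);
    press_key(C, SC_C);    # Crash dive!
}

sub press_key {
    my ($bVk, $bScan) = @_;
    $keybd_event->Call($bVk, $bScan, 0, 0);                  # press key
    sleep(1);
    $keybd_event->Call($bVk, $bScan, KEYEVENTF_KEYUP, 0);    # release key
}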

And here is an example mouse movement:

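use Win32::GuiTest qw/:ALL/;

MouseMoveAbsPix(693, 692);    # move the pointer (absolute screen pixels)
sleep(1);
MouseMoveAbsPix(723, 693);    # move onto the click target
SendMouse("{LEFTDOWN}");      # press the left button
sleep(1);
SendMouse("{LEFTUP}");        # release it
sleep(1);
# Speed is now set to Full. The pixel coordinates are hard-coded, so this will break at other resolutions.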
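
On the resolution problem noted in that last comment: one possible workaround is to record each coordinate against a reference resolution and scale it at run time. This is only a sketch, and it assumes the SH4 interface scales linearly with screen size, which has not been checked here. The helper name scale_point and the 1024x768 reference size are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: map a point recorded at a reference resolution onto
# the current screen, assuming the UI scales linearly in both directions.
sub scale_point {
    my ($x, $y, $ref_w, $ref_h, $cur_w, $cur_h) = @_;
    return (
        int($x * $cur_w / $ref_w + 0.5),    # round to the nearest pixel
        int($y * $cur_h / $ref_h + 0.5),
    );
}

# Example: a point recorded at an assumed 1024x768, replayed at 1280x1024.
my ($sx, $sy) = scale_point(723, 693, 1024, 768, 1280, 1024);
print "$sx,$sy\n";
```

The scaled pair could then be handed to MouseMoveAbsPix in place of the hard-coded numbers. If the game anchors some panels to a screen edge instead of scaling them, those controls would need their own offsets.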


So, assuming I can get the voice recognition code to work properly, which doesn't look too hard, I have everything I need for precise SH4 control, including specific depth levels and all the keys! Exciting... now to see if I actually finish this :)

If someone can give me a basic Shoot profile for SH4 that does even a few keys, I'll prefer that route and I'll probably go to the trouble of finishing the profile. I've heard it has trouble with certain keys like the backtick ` but we'll have to see.

minsc_tdp 07-22-07 07:57 PM

BTW here is the shoot profile XML that crashes when I load it:

<?xml version="1.0" encoding="utf-8"?>
<!--- - - - -This profile created with Digital_Trucker's Shoot Profile Editor- - - - - - -->
<!--- - - please address all correspondence to Digital.Trucker@Yahoo.com - - - - - -->
<shoot-config>
<command-list key-delay="0.10">
<command name="C" phrase="dive" />
</command-list>
<push-to-talk initial-state="" />
</shoot-config>

I tried dropping in the new shoot.exe.config file at http://www.gameclubcentral.com/modul...ewtopic&t=3945 but it didn't work. When I load the EXE nothing happens, no process loads or anything. It looks like it only adds a new section, something about SupportedRuntime with a version number. Maybe they haven't updated this properly for .NET 3.0 yet?

UPDATE: I think it's because I only have .NET 2.0 installed. And you can have more than one at the same time, so I'm installing 1.1 and 3.0 (for thoroughness).

Digital_Trucker 07-22-07 08:19 PM

Check out the sticky at the top of the game club central shoot support forum at http://www.gameclubcentral.com/modul...viewforum&f=39

It explains what you have to do to get Shoot to work with any .NET after 1.1. The author of the software hasn't updated it, but you can make it work as long as you have .NET 1.1 installed alongside 2.0 or 3.0. I hear he is in the process of rewriting it, but haven't seen it yet.

You can download my SH3 profile for Shoot at http://www.gameclubcentral.com/modul...r_III_EXPANDED

The only problem with that is that they changed the format of the commands.cfg file with SH4 and the key bindings file supplied with the Shoot profile will not work with SH4. You'll have to do the same thing I've been putting off (hoping that we'd get a nice tool like Setkeys for SH4) and that is map all the keys manually.

I may break down and go ahead and do it and get it over with. I think that's the last thing I need to do before I can start playing the game for real :rotfl: .

However, if you are going to write something that will handle the things that Shoot can't do, then it's all a moot point.

minsc_tdp 07-22-07 08:58 PM

Thanks, I got Shoot working by simply installing .NET 1.1.

It looks like it would be REALLY easy to finish this up for SH4 but the obvious drawback is that it does keyboard only.

I would really prefer to be able to, say, "set depth 150" but that would involve doing everything Shoot does from scratch to get the mouse support in.

Even if I had Shoot ignore "set depth x" and handled that with my own app running in parallel, given the amount of work to get that going and that we would both be using the Microsoft SAPI, I might as well do the whole thing.

What do you think? I could do the following voice actions which have no keyboard equivalents:

set depth 150
set rudder left 10 degrees
set course north by northeast
set time compression 1024
fuel
batteries
compressed air
co2

Version 1.0 would probably be those commands plus all the keyboard commands.

Version 1.1 could do a complete firing solution almost entirely hands-free (after locking a target):

targeting computer
stadimeter
target angle 30 degrees right (I'd have to adjust it from its current position, since there's probably no way to know where it is; probably better to leave this to the mouse)
lock it in (or) solution! (sends to the computer)
start the position keeper
(repeat above as necessary for multiple readings)
estimate target speed (crew would do it, or: )
increase target speed point 5 knots (for manual adjustment)

torpedo speed slow
gyro angle 10 degrees right
magnetic detonator
open tubes 1 and 2 (I see no reason I couldn't intelligently parse this into multiple commands)
fire tube 1!
gyro angle 15 degrees left
fire tube 2!

which would of course be immediately followed by:

crash dive! ahead flank! right full rudder! damage report! blow ballast! abandon ship! :p
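
For the "open tubes 1 and 2" idea in the list above, splitting one spoken phrase into several commands could look roughly like this. It is a minimal sketch in plain Perl: the helper name expand_tubes and the exact phrase pattern are assumptions, and it only handles the "open" and "fire" verbs used in the list.

```perl
use strict;
use warnings;

# Hypothetical helper: expand "open tubes 1 and 2" into one command per tube.
# Returns an empty list when the phrase is not a tube command.
sub expand_tubes {
    my ($phrase) = @_;
    my ($verb, $rest) = $phrase =~ /^\s*(open|fire)\s+tubes?\s+(.+?)\s*$/i
        or return;
    my @tubes = $rest =~ /(\d+)/g;    # every number mentioned after "tube(s)"
    return map { lc($verb) . " tube $_" } @tubes;
}

print "$_\n" for expand_tubes("open tubes 1 and 2");
```

This prints "open tube 1" and "open tube 2" on separate lines, and each of those could then be looked up like any other single command.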

panthercules 07-23-07 12:55 AM

I had a blast using Shoot and yelling out commands in German to control my U-boat in SH3 - I too was planning to update/create a profile to use it in SH4, once all the patching and modding got to a reasonably stable point so I could play it again. I think it would be awesome if you could come up with a way to not only replicate the keyboard commands but to actually do mouse driven stuff like setting course to a certain compass reading (or by a certain number of degrees - such as "come right 20 degrees", or "left full rudder - steer 345 degrees"). Just be sure if you can to let the user customize the voice control phrase/triggers to whatever they want, rather than make them use/remember some arbitrary syntax that's hardcoded in - I know it was a lot easier for me to remember the commands in my Shoot/SH3 setup because I had decided what I wanted to say for the voice commands, rather than trying to memorize some commands someone else had decided on.

If you've got the skills to do this, I'd say GO FOR IT :yep:

I'll be waiting in line to download - just let me know when it's ready :up:

minsc_tdp 07-23-07 01:46 AM

It would use plaintext definition files for all voice commands, keyboard input and mouse coordinates, as well as being entirely open source (perl) so there's no chance that someone couldn't make it do whatever they want. If it all works well then perhaps the community could localize it into German, Swahili, whatever.
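
To make the plaintext definition file idea concrete, here is one way such a file could be read. The format is invented for this sketch (one "phrase, type, value" entry per line, with type either key or mouse) and is not the layout of any file from this project. The 723;693 pair is the pixel position from the mouse test earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical definition format, one command per line:
#   phrase, type, value
# type is "key" (value = key name) or "mouse" (value = "x;y").
# Blank lines and lines starting with # are ignored.
sub parse_definitions {
    my ($text) = @_;
    my %commands;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*(?:#|$)/;
        my ($phrase, $type, $value) = map {
            my $field = $_;
            $field =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
            $field;
        } split /,/, $line, 3;
        next unless defined $value;      # skip malformed lines
        $commands{ lc $phrase } = { type => $type, value => $value };
    }
    return \%commands;
}

my $defs = parse_definitions(<<'END');
# phrase, type, value
crash dive, key, C
ahead full, mouse, 723;693
END

for my $phrase (sort keys %$defs) {
    print "$phrase => $defs->{$phrase}{type} $defs->{$phrase}{value}\n";
}
```

Keeping phrases as the lookup key means a user can rename or translate a command by editing one line, without touching the script.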

At this point, from the tests I've done tonight, I know I can do all of it. I have yet to actually test the perl SAPI code but I'm optimistic that it will work. I'm a little worried about some of the click-and-drag functions, like rotating the target angle on bow but I see no reason it won't work. The precise mouseclicks have also already been tested OK.

I'll probably do it not only because I want to use it, and others would too, but also to have a nice project under my belt that I can cite in case I lose my job. :cool: "Personal: Developed a voice recognition software from scratch for use with a game using MS SAPI 5 and Win32 API" would look good on a resume don't you think? :know:

minsc_tdp 07-23-07 07:32 PM

Well I should have known better than to say something is easy before starting it.

I've spent quite a bit of time today trying to get a simple speech-to-text recognizer written in Perl using Win32::SAPI5. There are few decent examples out there and the SAPI5 SDK is so convoluted. I'm surprised there's not a simple example of "here's a valid word list, here's how you start listening, here's how you store what it hears, here's how you compare that to the list of good words, and then you print the word if it recognizes it." That's all I'm trying to do. I've got something close to working but not quite there yet.

panthercules 07-23-07 07:36 PM

Quote:

Originally Posted by minsc_tdp
"Personal: Developed a voice recognition software from scratch for use with a game using MS SAPI 5 and Win32 API" would look good on a resume don't you think? :know:

Hey - it would impress me - if I had any money I'd hire you as my personal modder :yep:

And when you get this one done, I've got another suggestion for you - we could really use an automated captain's log parser/reader/whatever that would allow us to type up log entries (kinda like IL-2 Stab did for the Sturmovik series with its War Diary function) and then output/format it into a form that the game would recognize and be able to display them in the in-game log. See this thread for a discussion about how this works manually: http://www.subsim.com/radioroom/show...220#post562220

Hey, maybe you could even combine this with your voice-recognition utility, so we could be like James Kirk and dictate our entries .... "Captains Log, star date 194112.07..." :D

minsc_tdp 07-23-07 09:59 PM

Quote:

Originally Posted by panthercules
Hey - it would impress me

It would impress me if I make this work at all at the current rate. I had no idea getting basic speech recognition in Perl would be so difficult! Help! I've blown hours on this already today.. can't get it to recognize a single word. It's close... there's a sample EXE that shows 7 or 8 events fire every time I speak and shows the words, and my script does the same thing but just shows a bunch of junk for the events themselves, no way to extract the words out!

minsc_tdp 07-24-07 06:08 PM

Some good news. I've got a simple C++ app working and modded to taste that can feed perl the recognized phrases. From this and what I've done so far, I'm certain I can do this entire mod, so it's full speed ahead.

The only questionable bit is, will people bother to train their voice recognition systems for an hour or more before trying this to ensure good recognition... it's very boring. :) But once it's done the recognition improves a lot and that will be very important, since certain words are hard to get right ("gyroscope right 10 degrees" is not easy for even the most advanced speech recognizers out there since it's a sentence fragment, let alone something it may not see combined in that way very often.)

Fortunately I'll allow people to change it, so if every time you say "gyroscope right 10 degrees" it hears "jerry scope light 10 daires" then you could add "jerry scope" as an alias for "gyroscope" and it would just work even when the speech engine gets confused (assuming it always gets confused the same way)!
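
A minimal sketch of that alias idea, in plain Perl. The alias table and the helper name apply_aliases are made up for illustration, using the mishearings from the example above. A real table would come from the user's own definition file.

```perl
use strict;
use warnings;

# Hypothetical alias table: what the engine tends to hear => what was meant.
my %alias = (
    'jerry scope' => 'gyroscope',
    'light'       => 'right',
    'daires'      => 'degrees',
);

# Replace every known mishearing in a recognized phrase. Longer aliases are
# applied first so multi-word aliases win over shorter ones.
sub apply_aliases {
    my ($heard) = @_;
    my $text = lc $heard;
    for my $wrong (sort { length($b) <=> length($a) } keys %alias) {
        $text =~ s/\b\Q$wrong\E\b/$alias{$wrong}/g;
    }
    return $text;
}

print apply_aliases("jerry scope light 10 daires"), "\n";
```

This prints "gyroscope right 10 degrees". Note that a blanket alias such as "light" for "right" would also rewrite a genuine "light", so aliases for common words are probably best kept to whole phrases.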

Anyway, I'm excited about this project. I'll probably go silent running for a week or two while I pound the pavement on the first version.

Digital_Trucker 07-24-07 07:04 PM

Cool, good luck with it. I guess that's a bit late since you won't see this 'til after the 2 weeks of working on it :rotfl:

panthercules 07-24-07 07:34 PM

Just in case you do check back before you go deep and silent - the training issue would not be a problem for me if in the end the thing actually works well - I know I've had to train the Naturally Speaking program I used for general voice recognition, and I can't remember (been a while since I set up my Shoot for SH3) but I think that might have had to be trained as well.

Sounds great that you think you've cracked the code on this one - can't wait till you get this one finished off . I'll be glad to be testing it out while you turn your attention to the captain's log program I suggested in a previous post :D

minsc_tdp 07-25-07 11:21 AM

Thanks for the encouragement. I'll be checking this thread periodically. I might start a different one when the first version is released, but will also notify here.

I've finished a reference file with all of the keyboard shortcuts that I will use, next is the mouse coordinate work. Here's the key file for anyone interested:

http://knepfler.com/sh4speech/commands.csv

corleonedk 07-25-07 12:34 PM

Well, you will have a fan here, that's for sure. Keep up the work :up:

minsc_tdp 07-25-07 03:02 PM

There is one thing y'all could do to help. The only thing I'm skeptical on is how accurate the voice recognition will really be after it is fully trained. If you google the Microsoft Speech SDK 5.1, download that, and inside you'll find reco.exe and talkback.exe. This is a basic recognition example that I'll be using the core of. To use reco.exe, you just click the checkboxes called "Create Recognition Context", "Load Dictation" and "Activate Dictation". Talkback.exe you just run. Throw some commands at it and see how it does. Then, go into your control panel / Speech and train it. I'd suggest to do the initial session and then the additional training.

In Control Panel / Speech, you might have a 5.1 Recognizer and a 6.1 Recognizer. Honestly I'm not sure which one these utilities (and mine) will use. If anyone can help me figure that out, great. Otherwise I'll probably end up recommending that people just train both fully (can't hurt) unless I stumble on the answer.

So that's what I'll be doing tonight mostly, training training training to see how accurate I can get it.

If it always hears the word "bananaphone" when you say "gyroscope", it's not too big of a deal since you will be able to just add that as an alias. But if it comes up with 5 or 10 different things every time you say the same word, that's going to be harder for users to deal with. Ideally we want it to only hear the word you say or, at the very least, a "consistent mistake".
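
To put a number on "consistent mistake", the results of saying the same phrase several times could be tallied like this. A rough sketch only: the helper name consistency and the sample results are invented for illustration.

```perl
use strict;
use warnings;

# Given what the engine returned over repeated attempts at one phrase,
# return the most frequent result and the share of attempts it accounts for.
sub consistency {
    my (@heard) = @_;
    return (undef, 0) unless @heard;
    my %count;
    $count{ lc $_ }++ for @heard;
    # Most frequent first; ties broken alphabetically so the result is stable.
    my ($top) = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
    return ($top, $count{$top} / @heard);
}

my ($top, $share) = consistency(
    'bananaphone', 'bananaphone', 'gyroscope', 'bananaphone',
);
printf "%s heard %.0f%% of the time\n", $top, 100 * $share;
```

A high share for a wrong word means a single alias would fix it. A low share spread over many different results is the harder case described above.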

Let me know what y'all find out, I'll do the same! I think I need a new headset/mic though, mine kind of sucks :)

UPDATE: I went and bought a cheap mic at Guitar Center (which means a $100 microphone) to eliminate bad sound input and ensure that the Microsoft SAPI is working to its full potential. I've done three training sessions plus the intro, so I've been training it for 40 straight minutes. It's not bad, but it has trouble with certain terms. "Rudder amidships" is damn near impossible for it to figure out; a lot of the trouble is with these short phrases. Instead of saying "bridge", it's more likely to understand you if you say "let's go to the bridge". This might not be a bad thing, as more natural speech would be more fun than belting out single words and very short phrases. It also has a lot of trouble with "port" and "starboard"... which really sucks. I'd like to hear how other people's speech recognition fares.

panthercules 07-25-07 11:26 PM

Wow - with just the initial training plus one more module (Aesop's Fables), this really stinks at recognizing my voice commands - both in an absolute sense of thinking I'm saying something else, and in the sense of not being very consistent with what it thinks I'm saying from one time to the next (when I repeat the same commands).

I'll do some more training and see if it gets better - seems I was having much better luck with Shoot back in the SH3 days - of course, there I had a particular, limited set of commands I gave it and was basically manually telling it what I was saying, but once I had done that it did a great job of recognizing it correctly when I said the same thing again.

[edit] Fascinating - when I went back in and just used the reco.exe this time (I was using both it and the talkback one before), before I did any more training, I was getting much more accurate and more consistent results - not perfect, but a lot better. Weird - gonna do some training and see what happens.

[edit] After further training, it seems to be doing better, but it still seems to be suffering from the method/concept that it needs to be trying to recognize a virtually infinite number of things I might be saying out of a virtually infinite number of possible things to say (i.e., the entire English language), and thus is trying to use logical/grammatical algorithms to try to "make sense" out of what it is hearing. This has got to be much tougher, and more prone to errors (particularly when what's being said are short nautical/military phrases that aren't what "normal" people would say) than the approach I believe Shoot (and some other/older speech recognition applications) takes.

In that approach, you give it a limited set of commands (probably not more than 20 or 30, since at least for me that's about all I can remember, although you could always create a crib sheet list of commands to help you remember more I guess), and all it has to do is try to tell which one of those commands it's hearing - since it's only dealing with a very limited set of possibilities and you've basically trained it by speaking those commands and telling/confirming that it's heard you correctly, the recognition accuracy for any given level of computing power can be extremely high compared to trying to do this "figure out what I might be saying out of all possible combinations in the English language" approach this reco.exe thing seems to be using.

Is there some way to give this reco.exe thing this sort of limited vocabulary/command set, and then test it to see if it's working well?
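
One rough way to approximate a limited command set, even when the engine itself is running in open dictation mode, is to snap whatever it returns to the nearest entry in a short command list. The sketch below does that with plain edit distance. It is not how Shoot or SAPI work internally (SAPI 5 has its own command-and-control grammars for restricting the vocabulary), and the names edit_distance and closest_command, the command list, and the 0.3 cutoff are all assumptions for illustration.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Classic Levenshtein edit distance between two strings.
sub edit_distance {
    my ($s, $t) = @_;
    my @prev = (0 .. length $t);
    for my $i (1 .. length $s) {
        my @cur = ($i);
        for my $j (1 .. length $t) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            push @cur, min($prev[$j] + 1, $cur[$j - 1] + 1, $prev[$j - 1] + $cost);
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Return the command closest to what was heard, or undef if even the best
# candidate needs more than $max_ratio edits per character of the command.
sub closest_command {
    my ($heard, $commands, $max_ratio) = @_;
    my ($best, $best_d);
    for my $cmd (@$commands) {
        my $d = edit_distance(lc $heard, lc $cmd);
        ($best, $best_d) = ($cmd, $d) if !defined $best_d || $d < $best_d;
    }
    return unless defined $best;
    return $best_d <= $max_ratio * length($best) ? $best : undef;
}

my @commands = ('crash dive', 'rudder amidships', 'periscope depth');
my $match = closest_command('rudder a midships', \@commands, 0.3);
print defined $match ? "$match\n" : "no match\n";
```

Here "rudder a midships" is one edit away from "rudder amidships", so that command is chosen. Anything too far from every command is rejected rather than guessed at, which matters when a wrong guess could mean an unplanned crash dive.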

CaptainKobuk 07-26-07 01:09 AM

Game Commander 3 is what i've been using for years on especially games like this involving a crew. It's an older program that's not being updated anymore but it still works flawlessly in all games.

I've tried "SHOOT" and its voice recognition was very spotty in my opinion. But that was years ago. It could be fine now.

Being set free of a keyboard greatly relaxes gaming in my opinion.


All times are GMT -5.