SUBSIM Radio Room Forums

voice recognition? (https://www.subsim.com/radioroom/showthread.php?t=119039)

CaptainCox 07-26-07 01:14 AM

My wife would divorce me for real if I started to talk to my PC. She thinks I am nuts with my "PC" love affair as it is anyway.

CaptainKobuk 07-26-07 01:24 AM

At 3am I do wonder what my neighbors think.

"Dive dive dive!!!"

They gotta wonder what is going on eh? :rotfl:

skwasjer 07-26-07 03:05 AM

Nothing compared to my Quake III addiction years back. That was swearing and yelling pretty much non-stop every night on TeamSpeak. :roll:

minsc_tdp 07-26-07 03:23 AM

Panther, you've nailed the problem with generalized recognition (reco.exe method) exactly. I've been mulling this since the start. The SAPI5 SDK has a system where you can supply a limited set of words/phrases that it should compare against and I believe it will do exactly what you're talking about. Unfortunately, the SAPI5 SDK is extremely complicated and convoluted.

I thought it was just my amateur Perl programming skills, but I talked to a highly paid programmer/co-worker who groaned at the mention of the SAPI5 SDK. But what you're talking about may be exactly what needs to be done.

I'm going to try to recruit his help but failing that, I might try to muscle through it using a language close to the provided samples - Visual Basic (ugh) - and with the proper IDE helping with object methods and whatnot, it should be a little easier. So all this little utility would do is try to match a phrase from a list and, when it matches, return it to my Perl script, where I can do the remainder of the heavy lifting with ease.

If anyone wants to tackle this part, chime in!

kaliber7 07-26-07 01:06 PM

I have been using 'Voice Buddy 3.0' for both SH3 and SH4 and have had no problems at all. The mouse-click commands can be key-mapped and these mapped keys can be used in the voice profile and trained.

I only have to train each command 2-3 times and the more I use each command within the game, the easier it becomes.

panthercules 07-26-07 10:36 PM

Quote:

Originally Posted by kaliber7
I have been using 'Voice Buddy 3.0' for both SH3 and SH4 and have had no problems at all. The mouse-click commands can be key-mapped and these mapped keys can be used in the voice profile and trained.

Based on my experience with Shoot in SH3, I'm convinced that it could be used here just fine to handle any of the game commands that can already be mapped to key strokes. I haven't played around with setting up a profile in Shoot to do that yet, as I have been waiting to get the patch and mods situation stabilized and really start playing. The fascinating part of this thread has been the idea that you might be able to use voice commands to do things that, as far as I know, have to be done by mouse in game, the principal one being changing course. (Since I always play with the mouse rather than the keyboard whenever possible, I haven't yet really studied the key map capabilities of the game to know what can be mapped to keys and what can't.) Are there any course change commands at all that can have keys assigned to them, besides rudder settings, which I assume can be to some degree at least? What about depth?

The more I thought about what I posted above about unlimited/natural language recognition versus limited specified-command recognition, I started getting worried because I couldn't really see any good way other than natural language approach to be able to do the course and depth changes stuff. However, then it dawned on me that maybe you could set it up so you would say the commands one number at a time (which I've heard done actually) - like "make your course one five zero", or "make your depth two one five" - that way you'd only have to specify the ten numbers 0-9, rather than every number between 0 and 359 (or more for depth).

Thus, I think the specified command recognition method could probably work as far as getting your commands recognized - the tricky part to me would be how to translate the fact that the program recognized that you said "make your course one five zero" into some form of instruction to the game to actually set your course to 150 degrees unless that can somehow be set to equate to a key stroke/combination. How does that Voice Buddy thing go about mapping the mouse clicks to keys? How does it know how to tell SH4 that you want SH4 to do something that it doesn't have a key assigned to do?
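
The digit-by-digit idea above can be sketched in a few lines of Python. This is only an illustration of turning an already-recognized phrase into a number; it does not touch the speech engine or the game, and the function name `parse_spoken_number` is made up for this example.

```python
# Spoken digit words mapped to digit characters.
# "niner" is the common radio pronunciation of nine.
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "niner": "9",
}


def parse_spoken_number(phrase, prefix):
    """Return the integer spoken digit by digit after `prefix`, or None."""
    words = phrase.lower().split()
    prefix_words = prefix.lower().split()
    # The phrase must start with the command prefix, e.g. "make your course".
    if words[:len(prefix_words)] != prefix_words:
        return None
    digit_words = words[len(prefix_words):]
    # Reject phrases with no digits or with any word that is not a digit.
    if not digit_words or any(w not in DIGIT_WORDS for w in digit_words):
        return None
    return int("".join(DIGIT_WORDS[w] for w in digit_words))


print(parse_spoken_number("make your course one five zero", "make your course"))  # 150
print(parse_spoken_number("make your depth two one five", "make your depth"))     # 215
```

Only the ten digit words (plus "niner") need to be known; how the resulting number is then sent to the game is the separate problem raised in the post.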

minsc_tdp 07-27-07 12:52 AM

Quote:

Originally Posted by kaliber7
I have been using 'Voice Buddy 3.0' for both SH3 and SH4 and have had no problems at all. The mouse-click commands can be key-mapped and these mapped keys can be used in the voice profile and trained.

Panther is right. Let me know when you can key-map "gyroscope right 5 degrees" :D Moving the mouse has already been tested and works. Moving dials is trivial.

Quote:

However, then it dawned on me that maybe you could set it up so you would say the commands one number at a time (which I've heard done actually) - like "make your course one five zero", or "make your depth two one five" - that way you'd only have to specify the ten numbers 0-9, rather than every number between 0 and 359 (or more for depth).
That's what I've had in mind the whole time. However, since strings of words must be recognized as a known string of words, I might have to plug in all 360 spoken degree combinations. But that's not a big deal. If not, you just have to speak a little slower so it recognizes every word - the recognition API tends to wait before deciding on a match, as it uses strings of words to make that determination. That means you'd have to pause, so putting in all the combinations would help recognition a lot.

So, for those who are hoping for voice control over more than just key-mappable functions, I have good news. I've adapted a Python script I found and it works flawlessly. As panther suggested, getting SAPI to focus on trying to match only certain words makes a world of difference, versus the entire English dictionary. Here's an example - as soon as I got it working, I entered these words into the list and tried it - it nailed every word the first time without repeating.

C:\sh4speech>python hear.py
Zero
One
Two
Three
Four
Five
Six
Seven
Eight
Nine
Niner
Gyroscope
Torpedo
Rudder
Right
Left
Minsc rules
Shoot drools

C:\sh4speech>

Play with it here if you like:
http://knepfler.com/hear.py

You'll probably want ActivePython, which includes the necessary win32 extensions, and possibly wxPython as well.
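
Why a short word list helps can be illustrated without any speech engine at all. The sketch below is not SAPI and is not what hear.py does; it swaps in plain text matching with Python's standard `difflib` module to show how a noisy guess gets snapped onto the nearest entry of a small vocabulary. The function name `snap_to_vocabulary` and the cutoff value are assumptions made for this example.

```python
import difflib

# Word list taken from the sample session above (lowercased).
VOCABULARY = [
    "zero", "one", "two", "three", "four", "five", "six", "seven",
    "eight", "nine", "niner", "gyroscope", "torpedo", "rudder",
    "right", "left",
]


def snap_to_vocabulary(heard, vocabulary=VOCABULARY, cutoff=0.6):
    """Return the vocabulary entry closest to `heard`, or None if nothing is close."""
    matches = difflib.get_close_matches(heard.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(snap_to_vocabulary("torpedoe"))    # torpedo
print(snap_to_vocabulary("Gyro scope"))  # gyroscope
print(snap_to_vocabulary("xyzzy"))       # None
```

With only sixteen candidates, a slightly wrong guess still has one obvious nearest neighbour, which is the same intuition as limiting the recognizer to a fixed phrase list.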

I also worked today on my long forgotten middle school geometry. To move a dial 3 degrees, figuring out all those from-to pixels on multiple resolutions would be a pain. Now, all I need to define for each resolution is the x/y coordinate of the center of the dial, and the radius in pixels. With the angle (a) you supply, radius*sin(a) and radius*cos(a) give me the horizontal and vertical number of pixels from the center to the point. Then I just click at top dead center, hold, move to the new coordinates, and release. Cool eh? I had forgotten how useful basic math could be. :)
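
A minimal sketch of that dial geometry, assuming angles are measured clockwise from top dead center and that screen y coordinates grow downward. The function names and the example center and radius are invented for illustration, and the drag is only described as a list of steps rather than sent to the mouse.

```python
import math


def dial_point(center_x, center_y, radius, angle_deg):
    """Pixel on the dial rim, `angle_deg` clockwise from top dead center.

    Screen y grows downward, so the cosine term is subtracted from center_y.
    """
    a = math.radians(angle_deg)
    x = center_x + radius * math.sin(a)
    y = center_y - radius * math.cos(a)
    return round(x), round(y)


def drag_plan(center_x, center_y, radius, angle_deg):
    """Steps for the drag: press at the top, move to the target, release."""
    start = dial_point(center_x, center_y, radius, 0)
    end = dial_point(center_x, center_y, radius, angle_deg)
    return [("press", start), ("move", end), ("release", end)]


# Hypothetical dial centered at (400, 300) with a 100-pixel radius.
print(dial_point(400, 300, 100, 90))  # (500, 300)
print(drag_plan(400, 300, 100, 90))
```

Only the center and radius change between resolutions; the trigonometry stays the same.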

kaliber7 07-27-07 02:28 AM

As far as diving to any particular depth is concerned, I just use the command 'Dive the boat', keep my eye on the depth gauge and when approaching the required depth, give the command 'Maintain depth'.

As far as changing course is concerned, I use the command/s 'Hard to port/starboard' and when approaching the required direction, I give the command 'Rudder amidships'. Or, if I have a course plotted on the map and have strayed from it during combat, I use the command 'Return to course', once the combat situation has been resolved successfully.

Presently, I have not been able to find a way to map keys to give specific depths or compass points (using Voice Buddy 3.0), but I hope to eventually solve that problem some time in the not too distant future.

minsc_tdp 07-27-07 07:48 PM

Quote:

Originally Posted by kaliber7
As far as diving to any particular depth is concerned, I just use the command 'Dive the boat', keep my eye on the depth gauge and when approaching the required depth, give the command 'Maintain depth'.

As far as changing course is concerned, I use the command/s 'Hard to port/starboard' and when approaching the required direction, I give the command 'Rudder amidships'. Or, if I have a course plotted on the map and have strayed from it during combat, I use the command 'Return to course', once the combat situation has been resolved successfully.

Presently, I have not been able to find a way to map keys to give specific depths or compass points (using Voice Buddy 3.0), but I hope to eventually solve that problem some time in the not too distant future.

You make some very good points here. You're probably right that, with voice, the dive/maintain method is perfectly fine. I have to admit that perhaps the voice control of precision maneuvering, gyroscope rotation, AOB, etc. might be a bit overkill. I recall reading in some manual somewhere that you can be more easily detected if you turn your rudder too fast hard to port, and small steps were encouraged when silently avoiding a DD. Anyone know if this is true in SH4?

This conversation about key vs. mouse stuff has been very helpful, thanks a lot!

I've decided I will focus first on version 1.0 which will strictly do keyboard stuff, and consider whether to do 1.3 (lol) which would include precision dial controls. I really like the idea of setting up a torpedo shot vocally, and tweaking my rudder a few degrees vocally, as some of the most tense moments in the sub movies were during these moments.

minsc_tdp 07-27-07 09:57 PM

Panthercules, I'm very curious how well hear.py works for you. Can you give it a try for me?

CaptainKobuk 07-27-07 10:18 PM

Even with Game Commander 3, I still greatly appreciate having a tilt scroll wheel on a mouse for steering left/right. The setting I've used most is hard left or right.

Where a command is very often repeated the mouse is best. What's coolest about voice commands is ordering the crew to battle station or emergency dive, etc. These "captain" type games can get very immersive that way.

Armed Assault (Operation Flashpoint 2), where you order soldiers around, is another fantastic game to play as a "captain".

panthercules 07-28-07 01:14 AM

Quote:

Originally Posted by minsc_tdp
Panthercules, I'm very curious how well hear.py works for you. Can you give it a try for me?

Will be happy to, but I'm going to be spending most of the weekend building a new PC and moving bits around between about 3 others, so it may be a couple of days before I can check into this stuff again, even if all goes well with the PC construction job (and something always seems to go wrong somewhere along the way, so even that time line may prove to be optimistic, but I'll check it out as soon as I can).

This still sounds very promising, but I really hope that you can figure out how to make it work to command a specific depth or course rather than have to go with just the dive/maintain and hard rudder/amidships approach - I'd really like to be able to order a depth and go do something else without worrying that I'll forget to say "level off" and send us all to the bottom (or forget to say rudder amidships and sail around in circles).

Your comment about the string of words maybe having to include all the possible degree combinations is probably right on target, much as it pains me to realize it. I remember when I was training Shoot for my German profile for SH3, I entered each separate word into the custom dictionary (e.g., "kleine", "langsame", "fahrt" and "voraus"), and then it would recognize each word separately as I put them together in various phrases. However, in order to tell it what to listen for when I wanted it to send the particular speed command, I did have to tell it to look for each specific combination of words (e.g., "kleine fahrt voraus", "langsame fahrt voraus"), so if you are going to be able to give granular, degree-by-degree course commands it looks like you might well have to define a separate command for all 360 degree phrases, which I was hoping to maybe avoid by calling out the numbers one by one. Too bad :(

Still, even if it proves to be too much to be able to command course changes to specific degree courses or in 1-degree increments, being able to do it in maybe 5 or 10-degree increments ("come right 10 degrees", or "come left 45 degrees", etc.) would still be really cool. Same thing with depth - even if you can't order depth in 1-foot increments, being able to do it in 10 or even 20 foot increments would still be pretty cool. You could still use either the mouse or the dive/maintain or hard rudder/amidships approaches as well if you wanted some intermediate results, since the latter are already key-mappable.
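
The arithmetic behind relative commands like "come right 10 degrees" is small enough to sketch. This assumes headings run 0-359 and that depths are whole feet; the function names are invented for this example and nothing here talks to the game.

```python
def come_about(current_course, direction, degrees):
    """New heading (0-359) after turning `degrees` to the "left" or "right"."""
    if direction not in ("left", "right"):
        raise ValueError("direction must be 'left' or 'right'")
    delta = degrees if direction == "right" else -degrees
    # Modulo keeps the result on the compass when passing through north.
    return (current_course + delta) % 360


def snap_depth(depth_feet, step=10):
    """Round a whole-number depth to the nearest multiple of `step` (halves round up)."""
    return step * ((depth_feet + step // 2) // step)


print(come_about(355, "right", 10))  # 5
print(come_about(30, "left", 45))    # 345
print(snap_depth(214))               # 210
```

With 5 or 10-degree steps, the list of phrases the recognizer has to know shrinks to a few dozen instead of 360.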

Keep up the good work :up:

minsc_tdp 07-28-07 01:39 AM

FYI
 
On Monday and Tuesday I will be on a business trip, and then I will be visiting family for the rest of the week. I will have a laptop and might do some work here and there but I doubt I'll do much on this next week at all. Please don't think I've abandoned the project! I'm quite obsessed with this, and while I'm fickle with such things at times, I'm committed to getting this done! The more I work on it, the cooler it gets.

I just finished creating voice command aliases for all of the major key commands. Some just have one, but others have many variants, and it's so easy to add them, I've thrown in some fun ones, and you're free to add your own when you start using it. "Let's have a look" for the observation periscope, for example. It really sucks when there's just one command, like Enigma, that sounds so unnatural. This will rock.
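
One way such an alias table might look, as a rough sketch only. The command labels are hypothetical names chosen for this example, and the actual key each command sends is deliberately left out because it depends on the game's key map.

```python
# Each command lists the phrases that should trigger it.
# Command labels are hypothetical; add your own phrases to any list.
COMMAND_ALIASES = {
    "observation_periscope": ["observation periscope", "let's have a look"],
    "silent_running": ["silent running", "quiet", "silence"],
}


def build_phrase_index(aliases):
    """Invert the alias table into a phrase -> command lookup."""
    index = {}
    for command, phrases in aliases.items():
        for phrase in phrases:
            key = phrase.lower()
            # Two commands sharing one phrase would be ambiguous, so refuse it.
            if key in index:
                raise ValueError(f"phrase {phrase!r} is assigned twice")
            index[key] = command
    return index


PHRASE_INDEX = build_phrase_index(COMMAND_ALIASES)


def command_for(phrase):
    """Return the command label for a recognized phrase, or None."""
    return PHRASE_INDEX.get(phrase.lower().strip())


print(command_for("Let's have a look"))  # observation_periscope
```

Adding a new variant is then a one-line change to the list for that command.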

I realized today there's going to be a lot more mouse-driven events than I thought, since many MANY functions don't have default key mappings (um, battle stations? Depth below keel? hellooo?) so I'll probably work these in to the first version. The dials will probably wait and getting these simple click based ones in will be a good opportunity to tune in the click delays and such to ensure the mouse commands are reliable before moving onto the trickier dials.

BTW, I'm thinking on the "come right three four zero", I've got a pretty solid plan on how to do this. Since each possibly speakable phrase must be known to the voice rec when it loads, I'm going to write a routine that takes all voice commands that require numbers (like indicating "come right" has property=numeric argument, range=1-360) and simply generates all the additional voice combinations for 1-360 from a lookup list, and it will just magically load them all in. However, when you rotate the gyro dial 90 degrees, it only changes the gyro angle like 15 "gyro degrees" though, so I'll have to account for that somehow too (should be a simple multiplier, ie, "gyro right 15 degrees" secretly means "gyro right [15*6] degrees").
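
A sketch of the phrase-generation routine described above, under a few assumptions: numbers are spoken digit by digit with no leading zeros, the helper names are invented, and the factor of 6 is just the rough 90-to-15 ratio mentioned in the post, not a measured value.

```python
DIGIT_NAMES = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def spoken_digits(number):
    """Spell a non-negative integer digit by digit, e.g. 340 -> "three four zero"."""
    return " ".join(DIGIT_NAMES[int(d)] for d in str(number))


def expand_numeric_command(prefix, low, high):
    """Map every speakable phrase for `prefix` to its numeric argument."""
    return {f"{prefix} {spoken_digits(n)}": n for n in range(low, high + 1)}


# Approximate ratio from the post: 90 degrees of dial travel is about
# 15 "gyro degrees", so one gyro degree is roughly 6 dial degrees.
GYRO_DIAL_FACTOR = 6


def gyro_to_dial_degrees(gyro_degrees):
    """Dial rotation needed for a requested gyro angle change."""
    return gyro_degrees * GYRO_DIAL_FACTOR


phrases = expand_numeric_command("come right", 1, 360)
print(len(phrases))                           # 360
print(phrases["come right three four zero"])  # 340
print(gyro_to_dial_degrees(15))               # 90
```

The generated dictionary doubles as the lookup table: the recognized phrase is the key and the number to act on is the value.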

Payoff 07-28-07 09:15 AM

This sounds very interesting. Too bad they didn't include in-game voice commands, as "Sub Command" did years ago, for all the incremental stuff. Keep up the good work.

minsc_tdp 07-28-07 05:31 PM

Version 1.0 is done and out the door!

http://www.subsim.com/radioroom/show...880#post606880

No mouse stuff yet.
Full support of all default key commands.
Several good additional voice commands entered for certain commands (like "quiet" and "silence" in addition to "silent running" for silent running mode)
Excellent voice recognition! I am ecstatic about how well it is working. Be sure to read the guide on tuning your mic and configuring and training speech!

This thread is officially closed, please move all discussion to the other. Thanks everyone!

Mods, please lock this thread?

skwasjer 07-28-07 06:36 PM

I have not tested your release, too busy with my own project :p

I did think of one thing: with mouse related commands, try to keep in mind the different resolutions people will be using. You could use screen offsets instead of absolute values in the config, so that users would only need to specify their resolution, and the tool calculates where the actual click must take place. Like for rudder: bottom - n pixels, right - n pixels. Otherwise, it would be a serious pain to recreate configs for all resolutions (don't forget widescreen!).
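
The edge-relative idea can be sketched like this. It assumes a control keeps the same pixel distance from a screen edge at every resolution, which may not hold for this game; the function name and the sample offsets are made up.

```python
def resolve_click(resolution, anchor, offset):
    """Turn an edge-relative offset into absolute screen coordinates.

    resolution: (width, height) in pixels
    anchor:     ("left" or "right", "top" or "bottom")
    offset:     (pixels from that horizontal edge, pixels from that vertical edge)
    """
    width, height = resolution
    h_edge, v_edge = anchor
    dx, dy = offset
    if h_edge not in ("left", "right") or v_edge not in ("top", "bottom"):
        raise ValueError("anchor must be (left|right, top|bottom)")
    x = dx if h_edge == "left" else width - dx
    y = dy if v_edge == "top" else height - dy
    return x, y


# Hypothetical control 120 px from the right edge and 80 px from the bottom.
print(resolve_click((1024, 768), ("right", "bottom"), (120, 80)))   # (904, 688)
print(resolve_click((1680, 1050), ("right", "bottom"), (120, 80)))  # (1560, 970)
```

Users would then only need to enter their resolution once instead of a full set of absolute coordinates.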

minsc_tdp 07-28-07 11:06 PM

Quote:

Originally Posted by skwasjer
I have not tested your release, too busy with my own project :p

I did think of one thing: with mouse related commands, try to keep in mind the different resolutions people will be using. You could use screen offsets instead of absolute values in the config, so that users would only need to specify their resolution, and the tool calculates where the actual click must take place. Like for rudder: bottom - n pixels, right - n pixels. Otherwise, it would be a serious pain to recreate configs for all resolutions (don't forget widescreen!).

Yeah, I'm keenly aware of that. It's not as simple as an offset, since the controls are completely moved for the different resolutions. The x/y coordinates of everything would need to be done for all resolutions no matter what.
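
If every resolution really needs its own coordinates, a plain lookup table keyed by resolution is probably the simplest shape for the config. All numbers below are placeholders, not measured from the game, and the names are invented for this sketch.

```python
# Placeholder values only: center and radius would have to be measured
# per resolution from the actual game screen.
DIAL_LAYOUTS = {
    (1024, 768): {"rudder": {"center": (100, 700), "radius": 40}},
    (1280, 1024): {"rudder": {"center": (130, 940), "radius": 52}},
}


def lookup_control(resolution, control):
    """Return the stored geometry for `control` at `resolution`."""
    try:
        return DIAL_LAYOUTS[resolution][control]
    except KeyError:
        raise LookupError(
            f"no layout for {control!r} at {resolution[0]}x{resolution[1]}"
        ) from None


print(lookup_control((1024, 768), "rudder"))
```

An unsupported resolution fails loudly here rather than clicking in the wrong place.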


All times are GMT -5.
