SUBSIM Radio Room Forums

[REL] sh4speech - voice command for SH4

minsc_tdp 08-22-07 02:21 AM

Well BioShock crashes like mad so I went back to working on sh4speech. You might not believe it, but 1.7 is done. And with this version I cannot foresee adding any new major features - everything you can control in the game is now controllable, at any resolution. It is done! Any other updates would be minor from here.

Since this update is so major, I'm calling it a BETA for now.

Changelog is in the ReadMe.
Here's the download link for 1.7 BETA.

New commands:
"torpedo settings" or "PK status" (both do the same thing: toggle the mode. Be sure the torp settings mode is up before using the following commands.)
"[increase or decrease] torpedo depth [0-50] feet" (relative to current setting)
"exploder contact"
"exploder influence"
"torpedo speed low" or "low speed torpedoes"
"torpedo speed high" or "high speed torpedoes"
"torpedo gyro [right or left] [0-19] degrees" (relative to current setting! So if it's 5 right now, and you want it 5 left, say "left 10 degrees")

"range finder" or "range tool" (sorry, "stadimeter" just won't recognize)
"[increase or decrease] angle on bow [0-360] degrees" (auto-sends to computer)
"[increase or decrease] target speed [0-40] knots" (auto-sends to computer)
"calculate target speed" (auto-sends to computer)

"toggle PK" (clicks once)
"update PK" (clicks twice to update after new readings are sent in)
(you can also use the full phrase "position keeper" for "PK")
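
Since the gyro commands are relative, working out what to say takes a little arithmetic. Here is a minimal Python sketch of that calculation. The function name and the signed-angle convention (right positive, left negative) are assumptions made for illustration and are not taken from sh4speech itself.

```python
def gyro_phrase(current, desired):
    """Return the spoken command that moves the gyro from current to desired.

    Angles are signed degrees: positive = right, negative = left
    (a convention assumed for this sketch).
    """
    delta = desired - current
    if delta == 0:
        return None  # already at the requested setting
    direction = "right" if delta > 0 else "left"
    amount = abs(delta)
    if amount > 19:
        # The command list above only covers 0-19 degrees per command.
        raise ValueError("change of %d degrees needs more than one command" % amount)
    return "torpedo gyro %s %d degrees" % (direction, amount)


print(gyro_phrase(5, -5))  # torpedo gyro left 10 degrees
```

This matches the example in the list: starting at 5 right and wanting 5 left means a 10 degree change to the left.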

I've tested resolution compatibility in two additional resolutions, and it works great, but I certainly haven't tried every single command. If a command doesn't work, don't panic! Let me know. There could be a few where I simply marked the control as being in the wrong quadrant of the screen. I doubt there are any that don't follow the "rules" of where things end up when the res changes.
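
To give a rough idea of how a quadrant-based rule could work, here is a Python sketch. It assumes each control keeps a fixed pixel distance from the screen corner of its quadrant and that positions were recorded at 1024x768. Both are assumptions for illustration, as are the names `translate`, `BASE_W` and `BASE_H`; the actual rules in sh4speech may differ.

```python
BASE_W, BASE_H = 1024, 768  # assumed reference resolution


def translate(x, y, quadrant, width, height):
    """Map a point recorded at the base resolution to another resolution.

    quadrant is one of "TL", "TR", "BL", "BR" and names the screen corner
    the control is assumed to stay anchored to.
    """
    if quadrant not in ("TL", "TR", "BL", "BR"):
        raise ValueError("unknown quadrant: %r" % quadrant)
    # Keep the pixel distance from the anchoring edge constant.
    new_x = x if quadrant[1] == "L" else width - (BASE_W - x)
    new_y = y if quadrant[0] == "T" else height - (BASE_H - y)
    return new_x, new_y


print(translate(1000, 740, "BR", 1280, 1024))  # (1256, 996)
print(translate(1000, 740, "TL", 1280, 1024))  # (1000, 740)
```

The two calls show why a wrong quadrant matters: the same recorded point lands in a very different place depending on which corner it is anchored to.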

At this point I'm going to stop doing any more work on this except for handling reported bugs and maybe working on improving the installer and compiling to EXEs to eliminate having to install Perl/Python. Beyond that, I'm just going to play the hell out of it to really see what needs doing, if anything!

Digital_Trucker 08-22-07 09:29 AM

Congrats on the completion!
Feels good to finish a project and get to play with it, doesn't it? Looking forward to being able to use all the functions at my screen resolution.

Thanks for all the hard work:rock:

minsc_tdp 08-22-07 06:28 PM

I just realized why there's a minor bug I saw late last night after release. If you use a command that clicks and drags the speed, torp depth, or torp angle dial more than 180 degrees, it won't go in the correct direction. It takes the shortest route between angle 0 and your requested angle, so right 270 degrees will actually go left 90 degrees. That may not be correct because of the way these dials work: even though you end up at the same point, the direction you twist the dial might matter.

The simplest fix for now is to remove any commands that twist it more than 179 degrees. Or maybe I'll just have it drag along a few extra points between the start and finish points to ensure it spins in the right direction. I'll post that fix tonight.
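
The shortest-route problem and the "extra points" idea can be sketched in a few lines of Python. This is an illustration only, not the code in sh4speech; the function names and the default step are assumptions.

```python
def shortest_delta(degrees):
    """Signed shortest rotation equivalent to a clockwise turn of `degrees`."""
    return (degrees + 180) % 360 - 180


def drag_angles(start, turn, step=45):
    """Angles to drag through so the dial turns `turn` degrees from `start`.

    Positive turn = clockwise (right), negative = counter-clockwise (left).
    Intermediate points are spaced at most `step` degrees apart.
    """
    sign = 1 if turn >= 0 else -1
    offsets = list(range(step, abs(turn), step)) + [abs(turn)]
    return [(start + sign * off) % 360 for off in offsets]


print(shortest_delta(270))           # -90
print(drag_angles(0, 270, step=90))  # [90, 180, 270]
```

As long as each hop is well under 180 degrees, the shortest route between consecutive points is also the intended direction, so the dial keeps spinning the right way.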

UPDATE: Posted this fix, no version number change. It's pretty cool to see the dials clicked and dragged all the way to the requested position instead of just sorta zooming there instantly. :) I just set up some shots in the training session, pretty cool. I sank that ship, and never touched the keyboard or mouse except to ID the ship, set the range finder (stadimeter), and move the periscope left and right.

minsc_tdp 08-23-07 02:51 AM

I just posted some videos at:

The videos show the entire installation process, as well as a demo video at the end of me sinking a sitting duck almost entirely by voice. Control over these functions is brand spanking new and I haven't practiced any of the commands.

So, in that vid, you might notice one or two missed recognitions - angle on bow is especially tricky for some reason, but it's the only one I've seen behave like that. "toggle PK" was just my throat needing to be cleared. Also, when using calculate speed, it didn't auto-send to TDC, but that's already fixed and posted.

UPDATE: Having lowered my Pronunciation Sensitivity a little, I get much better recognition on the Angle on Bow commands, as expected. You have to retune that once in a while as the ambient conditions change, along with microphone distance and gain (mine is on a stand on the table). So use the test mode now and then to make sure it recognizes this command and a few others well, and that it doesn't produce a false match when you say something nonsensical into the microphone.

UPDATE: Updated 1.7b (no version change) with a minor fix that speeds up the dial twisting action by using larger steps, instead of hitting every degree on the dial between 0 and the requested setting, which could be kind of slow (watch the AOB turn in my video to see what I mean.)
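
For a feel of how much the step size matters, here is a Python sketch that turns dial angles into screen points to drag through. The dial center, radius, and the convention that 0 degrees sits at the top of the dial are all made up for the example; only the idea of using larger steps comes from the fix described above.

```python
import math


def dial_points(center, radius, end_angle, step):
    """Screen points to drag through from 0 to end_angle degrees (clockwise).

    Assumes 0 degrees is at the top of the dial and screen y grows downward.
    """
    cx, cy = center
    angles = list(range(0, end_angle, step)) + [end_angle]
    points = []
    for a in angles:
        rad = math.radians(a)
        x = cx + radius * math.sin(rad)
        y = cy - radius * math.cos(rad)
        points.append((round(x), round(y)))
    return points


fine = dial_points((500, 400), 40, 90, step=1)
coarse = dial_points((500, 400), 40, 90, step=10)
print(len(fine), len(coarse))  # 91 10
print(coarse[0], coarse[-1])   # (500, 360) (540, 400)
```

A 90 degree turn needs 91 mouse positions at one-degree steps but only 10 at ten-degree steps, and both end on the same final point.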

I'm also considering changing the voice commands from "torpedo gyro left 5 degrees" and "increase angle on bow 90 degrees" to simpler phrases, such as "gyro right 5 degrees" or "spread right 5 degrees", and "increase target angle 90 degrees" (since "bow" seems hard for it to recognize for some strange reason.)

I'll probably release 1.7 Final tonight so I hope to get some feedback on all these new changes before then!

panthercules 08-23-07 11:03 PM

Watched the video - looks pretty cool, even with the occasional glitch. Still haven't gotten out to sea with this to try it out (had my version 1.63 install ready to go, only to see you had already released 1.7 - plus still trying to tweak leo's latest LBO to work with Kriller's update and a few other things - plus, had to spend a couple of days traveling to a neighboring state for a funeral and deal with some other RL issues) - seems like I just can't keep up with all the amazing progress being made by all the modders :o

Anyway, got stuck in traffic on the way home tonight, and started thinking about trying to adapt SH4Speech to use with IL-2 (got tired of waiting for Oleg's BOB to come out, so I picked up the IL-2 1946 expansion a few days ago and thought I'd give it a whirl with my new PC and projector setup). Now that you've got it so that SH4Speech will run from anywhere, I'm assuming that it should be OK and pretty easy to set up multiple instances of SH4Speech - one for SH4 and another one in a different folder for IL-2, for example - is that correct?

Although this may change once I get into it, all I really want to do with it in IL-2 for now is program it to let me speak some radio commands (simple stuff like asking my wingman or other flights to cover me, or asking the tower for clearance to land, etc.) - I hate having to type those and scan the menu trying to figure out how to send those messages, and they seem like a real natural thing to use a speech-recognition program for. IIRC, those require a two-step process like hitting the tab key and then a number key - I assume from what you've said about your macro ability in SH4Speech that it can handle that sort of key combination, right?

I assume that with only needing those few sorts of commands, for the IL-2 install of SH4Speech I can get by with a much-pared down version of the voice_commands.csv and key_commands.csv files, use the same key_codes.csv file and just use a blank/empty copy of the tubes.csv and variables.csv files (IIRC, you said it needs to have those files, even though they're empty, as opposed to just deleting them entirely?), and otherwise just copy everything in the SH4-related instance into the folder for the IL-2-related instance - any reason that sort of approach shouldn't work? I assume that each instance of SH4Speech would be able to use the same instances of perl, python etc that are already installed and being used by my SH4-related instance (not at the same time, of course), so I wouldn't need to have duplicate instances of those things installed - is that correct?

(BTW - my .csv file-related comments above are based on how you had 1.63 structured - I haven't really digested what changes you might have made to those file relationships in 1.7 yet, but I'll be plowing into that in the next couple of days unless RL gets in the way again - still, I'm sure you can see what I'm getting at as far as what parts it seems I would need and what parts it seems I could do without for the IL-2 instance).

panthercules 08-24-07 12:47 AM

Finally got out to sea, at least momentarily, just to give 1.63 a try before updating to 1.7. Just a couple of observations so far -

(1) this is really cool - setting the depth, speed and course by voice is especially nice.

(2) took me a minute to figure out one glitch-issue though - not being able to give the mouse command orders while looking around is a bit of a drag. I suspect eventually it will become second nature to click out of look-around mouse mode so that the mouse-movement voice commands can work, but it was rather weird for a while there.

(3) this may be an actual problem - the program seems to be having a problem with the command bar commands when the command/button bar is in "unlocked" mode. This appeared when I issued the command for "resume course" and wound up at the attack scope - I finally figured out that SH4Speech was moving the mouse down to where the "resume course" icon/button would normally be when the bar was "locked", but since the bar was "unlocked" it was actually pressing the upper-tier button for attack scope, which had slid down to the bottom row position because the bar was "unlocked". Maybe you need to add in some sort of delay after the mouse moves down there and before it clicks, in order to give the bar time to move back up into the normal position?

(4) another small point/issue - when issuing the mouse-moving commands on the speed/course/depth dials, it leaves the mouse on the dial, which has the effect of leaving the dial in the "enlarged" mode, which takes up more space in the HUD view and looks odd - is there any chance of adding something after the "click" that would quickly and automatically move the mouse over to a parking location at the edge of the screen or something, just to get it out of the way and let the dials return to their normal, unenlarged state? Just a thought.
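
The two suggestions in (3) and (4), a short delay before the click and a parking spot afterwards, could look something like this Python sketch. The function name, the 0.3 second delay and the (0, 0) parking spot are placeholders picked for the example, and the mouse functions are passed in rather than tied to any particular library, so nothing here reflects how sh4speech actually drives the mouse.

```python
import time


def click_control(target, move, click, settle=0.3, park=(0, 0), sleep=time.sleep):
    """Move to target, wait, click, then park the pointer out of the way.

    move(x, y) and click() are supplied by the caller; settle and park
    are illustrative defaults, not values taken from sh4speech.
    """
    move(*target)
    sleep(settle)   # give a sliding command bar time to return to position
    click()
    move(*park)     # leave the dial so it shrinks back to normal size


# Dry run with recording stand-ins instead of real mouse control.
log = []
click_control(
    (512, 740),
    move=lambda x, y: log.append(("move", x, y)),
    click=lambda: log.append(("click",)),
    sleep=lambda s: log.append(("sleep", s)),
)
print(log)
```

The dry run just records the order of operations: move, wait, click, park.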

So far though, this is really impressive - the recognition worked very well and was lots of fun. Great work :up:

minsc_tdp 08-24-07 12:59 AM

Your ideas about different instances are valid now. It should run fine.

I just posted 2.0b, which is basically just 1.7b with the files reorganized a lot, as well as the scripts compiled to EXEs. It was a hell of a lot of work to get that done, and should help a lot of people, hence the major version number increase.

The compatibility with previous CSVs still remains of course, there's just some new stuff in dials especially. There are many IDs now that have special triggers in the code, so you can't necessarily assume what the behavior will be based on the CSV content except for simple commands. There's a base behavior, but if id == xxx then special things happen for certain ones. So if you copy a command and give it a new ID but with all the same info, it could behave differently.

For IL-2, therefore, I would suggest you try not to re-use any existing IDs, or at least just change the ID if the behavior is not what you expect.
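
A small Python sketch of why a copied command can behave differently under a new ID. The IDs 900 and 901, the field names and both handler functions are hypothetical stand-ins; the real special cases live inside sh4speech and are not shown in this thread.

```python
def base_action(command):
    """Default behavior driven only by the CSV-style fields."""
    return "press %s" % command["key"]


def drag_dial(command):
    """Example of a hard-coded special case."""
    return "drag dial to %s" % command["key"]


# Hypothetical table of IDs with special handling in code.
SPECIAL = {"900": drag_dial}


def run(command):
    handler = SPECIAL.get(command["id"], base_action)
    return handler(command)


original = {"id": "900", "key": "45"}
copy = dict(original, id="901")  # same fields, new ID

print(run(original))  # drag dial to 45
print(run(copy))      # press 45
```

The copy carries identical data but falls through to the base behavior because its ID has no special trigger.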

I'll look into the unlocked command bar issue when I can. Right now I'm very curious about how 2.0b works for people who have never installed perl/python, only the speech SDK and just try to run it.

panthercules 08-24-07 12:26 PM

Well, at least I can skip version 1.7 and go straight to 2.0 - maybe that will help me catch up :D

I did note one small thing while reading through your updated first post above about the new version 2.0 - there seems to be a broken link to the .csv file structure document - I keep getting 404 errors when I try to click that link. Not a huge issue since the document is included in the download, but just thought you might want to know about it since some folks will probably try the link in your post as well.

[edit] - another small question - if I currently have version 1.63, with SDK 5.1, perl and python installed as that required, and I want to move to 2.0 -- I plan on just renaming my 1.63 folder to set it aside as backup, and then unzip the 2.0 files into a new c:\SH4Speech folder for now. Now, the question - do I need to remove or uninstall either the perl or python instances so you can get a good idea as to whether 2.0 works well without them, or can I just leave them where they are and 2.0 just won't use them?

minsc_tdp 08-24-07 01:45 PM


Originally Posted by panthercules
[edit] - another small question - if I currently have version 1.63, with SDK 5.1, perl and python installed as that required, and I want to move to 2.0 -- I plan on just renaming my 1.63 folder to set it aside as backup, and then unzip the 2.0 files into a new c:\SH4Speech folder for now. Now, the question - do I need to remove or uninstall either the perl or python instances so you can get a good idea as to whether 2.0 works well without them, or can I just leave them where they are and 2.0 just won't use them?

That's a good plan. Get your existing folder out of the way. Unzip 2.0 to anywhere. Merge in any customizations from your old CSVs into the new 2.0 CSV files (don't just overwrite them all of course or you'll lose some new controls.) The CSVs haven't changed much - some new stuff in Dials.csv and Voice_commands.csv.

2.0 won't use Perl/Python, so you can uninstall them if you like. They're both great tools to learn and don't hurt just sitting there, though, so you might as well keep them around :)

BTW, PantherCules, a special thanks to you - you've caught so many bugs, and quickly, it's like having a personal one-man QA department. That's one man more than Ubi uses for most projects, so we're doing pretty well! heh

Can you believe we've accomplished all this in under 30 days of development?

panthercules 08-24-07 09:45 PM

Well, you're doing all the heavy lifting - I'm just playing around with this stuff. Wish I had more time (and skill) to really help you.

BTW - While I was tweaking/merging my custom commands into the .csv files, I spotted a few other small things you might want to check. In voice_commands.csv, you've got a few instances where you have included both key commands and mouse commands for the same function (e.g., deploy decoys by key in ID #17 and by mouse in ID #507, next unit/previous unit in IDs 45/46 and 576/577, and (maybe) damage control in IDs 55 and 572) - not sure, but that doesn't sound like a good idea. Also, I noticed that the same ID #558 seems to be used for several different commands (e.g., Send range to TDC, Follow Nearest Warship (TM) and Normal Sweep (TM)) - not sure if that's intentional for some reason, but thought maybe not.

Have you noticed any slowdown, increased recognition problems or other issues as you've expanded the command set so radically? As part of my tweaking I'm winding up creating a somewhat slimmer command set - for example, other than from surface to about 70 feet, I don't think I really need to be able to command depth in 1-foot increments (something just seems a little wrong/too precise under the circumstances to hear myself ordering "make your depth 251 feet"), so I'm just doing 5-foot increments below roughly periscope depth - that gives me the chance to add commands to say it two ways (I also like saying "make your depth two five zero") and still pare down to fewer commands overall. Just wondering if there was any discernible upper limit this might eventually hit from a performance standpoint as the number of commands increases.
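
The slimmed-down depth scheme described above can be sketched in Python as follows. The 450-foot maximum and the exact phrase templates are assumptions for the example; the 70-foot cutoff, the 5-foot increments and the digit-by-digit phrasing come from the post.

```python
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]


def depth_values(fine_limit=70, max_depth=450, coarse_step=5):
    """1-foot steps up to fine_limit, coarse_step-foot steps below that."""
    fine = list(range(0, fine_limit + 1))
    coarse = list(range(fine_limit + coarse_step, max_depth + 1, coarse_step))
    return fine + coarse


def phrases(depth):
    """Two ways to say the same depth order."""
    spelled = " ".join(DIGITS[int(d)] for d in str(depth))
    return ["make your depth %d feet" % depth,
            "make your depth %s" % spelled]


values = depth_values()
print(len(values))   # 147
print(phrases(250))
```

With these assumed limits there are 147 depth values instead of 451, so even with two phrasings per depth the total stays well under the one-foot-everywhere count.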

This is an amazing piece of work - really appreciate your contribution to the community here :up:

minsc_tdp 08-24-07 10:47 PM

I'll check on all those weird IDs you mention. I haven't noticed any slowdown or bad recs due to the additional commands. It's really not that many, even though it looks huge. A lot of them are really just the same command said with 360 different minor variations. As long as the gist of that command is sufficiently different from all others, there's little chance of a bad recognition.

The worst I've seen it do is hear "nine" instead of "nineteen" or "ninety" which is a good case for slimming down the set.

But I like the precise control. When you're changing your speed to match a ship, or rotating AOB or changing target speed, you'll want that. Maybe not for depth control but, who knows.

UPDATE: 55 and 572 do different things - one goes to damage station and the other tells the damage control team to get in gear, I'm pretty sure those are totally different.

558 is send range to TDC, but the other two are 558.1 (Follow Warship) and 558.2 (Normal Sweep).
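
If you want to check for genuinely duplicated IDs, comparing them as exact text keeps 558, 558.1 and 558.2 apart. Here is a Python sketch; the column names and sample rows are invented for the example and the real voice_commands.csv layout may differ.

```python
import csv
import io
from collections import Counter

# Made-up sample rows; the real file layout may differ.
SAMPLE = """id,description
558,Send range to TDC
558.1,Follow Nearest Warship
558.2,Normal Sweep
17,Deploy decoys
"""


def duplicate_ids(text):
    """Return IDs that appear more than once, comparing them as exact text."""
    rows = csv.DictReader(io.StringIO(text))
    counts = Counter(row["id"].strip() for row in rows)
    return sorted(i for i, n in counts.items() if n > 1)


print(duplicate_ids(SAMPLE))                               # []
print(duplicate_ids(SAMPLE + "558,Some other command\n"))  # ['558']
```

Reading the IDs as strings rather than numbers avoids any rounding or truncation that could make the decimal variants look identical.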

panthercules 08-24-07 10:59 PM


Originally Posted by minsc_tdp
But I like the precise control. When you're changing your speed to match a ship, or rotating AOB or changing target speed, you'll want that. Maybe not for depth control but, who knows.

Yeah - I like it for speed/AOB as you say, and for course too (since there's no good way yet to order "set course for xx degrees", you need the precision to get to the exact course you want). It just didn't seem quite right to me for depth and rudder (couldn't see myself ordering "right 13 degrees rudder" either - so I just use 5 degree increments there as well) - purely a personal taste thing.


Originally Posted by minsc_tdp
UPDATE: 55 and 572 do different things - one goes to damage station and the other tells the damage control team to get in gear, I'm pretty sure those are totally different.

558 is send range to TDC, but the other two are 558.1 (Follow Warship) and 558.2 (Normal Sweep).

LOL - well, I knew it was getting late and I was getting tired - turns out I was looking at the old and new files tiled vertically next to each other, so I had squeezed down the ID column widths to where I couldn't see the decimal places, so they all looked like 558 to me - sorry for the false alarm :oops:

minsc_tdp 08-24-07 11:04 PM

btw i just adjusted the torp depth from a max of 30 to 50 (will be part of 2.0 final), so watch out for that on your next merge... sorry for making that so difficult on ya by changing stuff so much. I also kept the decoy command #17 but deleted #507 since keys are preferred

you should get a utility like BeyondCompare which can intelligently show the differences between these files. would be of immense help when you have a lot of customizations
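
If you don't have a dedicated compare tool handy, Python's standard difflib module can give a rough line-by-line diff of two CSV versions. This is only a sketch with made-up sample rows and file names, not a replacement for a proper merge tool.

```python
import difflib


def csv_diff(old_text, new_text, old_name="old.csv", new_name="new.csv"):
    """Return unified-diff lines between two versions of a CSV file."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",
    ))


# Made-up rows standing in for an old and a new command file.
old = "17,Deploy decoys\n507,Deploy decoys (mouse)\n"
new = "17,Deploy decoys\n"
for line in csv_diff(old, new):
    print(line)
```

Lines starting with "-" exist only in the old file and lines starting with "+" only in the new one, which makes removed or added commands easy to spot before merging customizations.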

Digital_Trucker 08-25-07 11:05 AM

I guess I'm the only person with the problem (or everyone else using the compiled version is running at 1024x768), but I can't get it to recognize my resolution. I tried making the change in the batch file and I've even gone so far as to attempt to recompile with the resolution change (not having much luck there, gonna try some more today).

Any thoughts on why it wouldn't work by making the change in the batch file?

Anti_Ship_Fella 08-25-07 11:20 AM

well sounds great but i dont have a microphone:cry:

All times are GMT -5.
