SUBSIM Radio Room Forums (https://www.subsim.com/radioroom/index.php)
-   SH4 Mods Workshop (https://www.subsim.com/radioroom/forumdisplay.php?f=219)
-   -   [REL] sh4speech - voice command for SH4 (https://www.subsim.com/radioroom/showthread.php?t=119430)

minsc_tdp 09-14-07 06:58 PM

Regarding the missed keys, try setting a ridiculously high delay via the BAT file, like 5. If that solves it, it's a performance issue with your PC.

It could be a problem with the keycodes if you are using a foreign version of Windows or have some international regional settings in the Control Panel. If that's the case, the hits will always hit and the misses will always miss, since some codes happen to be valid by chance and others are not. But I'd guess it's a delay problem.

DigitalTrucker, I've never seen what you're talking about. That's a really bizarre problem. Check your CSVs to be sure they haven't been altered - they hold information about the heading dial's rotation and size, and if those values are messed up it might explain this. Try re-extracting them from the original zip. It's a long shot, I know. Also, please explain exactly what you see with the mouse when you do "heading right ninety degrees" and "heading left ninety degrees". When I say exactly, I mean for every mouse movement/click in the sequence, tell me exactly where it is landing for each command. It could be a resolution issue. Have you triple-checked to ensure you really are in 1280x1024 mode? Have you edited that in both BATs? Does the resolution show up in the sh4speech debug window on startup?

I'm going on a few days camping trip to Mt. Whitney this weekend, back next week. Good luck until then :)

Digital_Trucker 09-14-07 08:40 PM

My bad
 
:oops: Shoulda looked a little closer at the original files. Evidently, when I used my spreadsheet program (OpenOffice) to edit the file once upon a time, it decided that the text fields in the CSV file needed quotation marks (") around them. I never noticed the change because all the other commands worked fine. The problem evidently only arises with those commands that can have a negative sign (hyphen,-) in front of their entries to designate direction. Everything should be hunky-dory after I strip out the quotes:damn:

Don't know if it's even worth thinking about, but if this could be a common problem with spreadsheet programs, maybe a mention in the readme would help some other poor slob (i.e. me) not make the same mistake.

As for the answers to all your above questions: the same exact thing,yes,no,yes.

Enjoy your camping trip (and if you see this afterwards, I hope you enjoyed it):rock:

billko 09-14-07 09:01 PM

Quote:

Originally Posted by Digital_Trucker
:oops: Shoulda looked a little closer at the original files. Evidently, when I used my spreadsheet program (OpenOffice) to edit the file once upon a time, it decided that the text fields in the CSV file needed quotation marks (") around them. I never noticed the change because all the other commands worked fine. The problem evidently only arises with those commands that can have a negative sign (hyphen,-) in front of their entries to designate direction. Everything should be hunky-dory after I strip out the quotes:damn:

Don't know if it's even worth thinking about, but if this could be a common problem with spreadsheet programs, maybe a mention in the readme would help some other poor slob (i.e. me) not make the same mistake.

As for the answers to all your above questions: the same exact thing,yes,no,yes.

Enjoy your camping trip (and if you see this afterwards, I hope you enjoyed it):rock:

Digital Trucker:

Apparently, OpenOffice is treating the negative number as text (because, strictly speaking, the "minus" sign isn't a numeric character) and putting it in quotation marks. Putting text in quotes is standard CSV convention because the text itself may contain a comma, which would mess up the way the file is decoded. The quotation marks tell the CSV decoder to treat everything in between as one field. As for the reason *why* OpenOffice is treating a negative number as a string, well... I guess you gotta ask the authors of the project about that one...

Bill

NefariousKoel 09-14-07 09:17 PM

Those comma-delimited bastards!

minsc_tdp 09-14-07 10:59 PM

Oh, that doesn't surprise me in the least. The behavior of Excel with the negatives is bizarre as all hell. OpenOffice is probably doing it technically right!

The problem is that I wanted Excel to treat those fields as numeric so that I could do a series fill to easily type in the extra 360 numbers, to avoid keying them in by hand. So the fields just said 1 or -1. Then, I used the Format Cells feature to apply the following custom label:

"Heading" 0

I expected that to turn 1 into Heading 1 and -1 into Heading -1. But oh no. Excel puts the - symbol before the word Heading so you get -Heading 1! I was shocked and annoyed but due to laziness and wanting to keep those as numeric fields, I coded the script to recognize the - at the very start of the field and then pluck out the number and invert it. For positive ones (where it doesn't detect the - symbol) it just plucks the number and leaves it alone.

The quotes made it not detect the - symbol, and so it treated it as a positive number, which explains why all commands were starboard. It also explains why the voice command showed up correctly in the debug window, though if you were to look closer, you would have seen "heading right 90" with "Desired angle: 90 degrees", and on "heading left 90" you would have seen that same 90 instead of the correct -90. I don't blame you, since I'm currently spewing quite a bit of debug output there.
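For anyone curious what that sign check looks like, here is a minimal Python sketch of the behavior described above. It is not the actual sh4speech code; the function names `parse_heading` and `parse_heading_tolerant` are made up for illustration, and the field text is assumed to look like "-Heading 90".

```python
import re

def parse_heading(field: str) -> int:
    """Return a signed angle from a label such as 'Heading 90' or '-Heading 90'.

    Hypothetical re-creation of the logic described in the post: the sign is
    only looked for at the very start of the field.
    """
    negative = field.startswith("-")
    number = int(re.search(r"\d+", field).group())
    return -number if negative else number

def parse_heading_tolerant(field: str) -> int:
    """Same idea, but strip surrounding whitespace and quote marks first."""
    cleaned = field.strip().strip("\"'").strip()
    return parse_heading(cleaned)

print(parse_heading("-Heading 90"))             # -90
print(parse_heading('"-Heading 90"'))           # 90, the leading quote hides the minus sign
print(parse_heading_tolerant('"-Heading 90"'))  # -90
```

Stripping the quotes before the check is one way a script could tolerate files saved by either spreadsheet program.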

I guess another aspect of the problem was the default behavior of Excel is to not quote wrap CSV output while OpenOffice does. It's not treating a negative number as a string - it really is a string - the string is "- Heading 90". So it's really correct to put quotes, and I'm relying on the fact that Excel doesn't.
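The difference between the two export styles can be reproduced with Python's standard csv module. This is only an illustration: the row contents are invented, and which style a given spreadsheet program uses by default is taken from the posts above, not verified here.

```python
import csv
import io

# Invented example row: an ID number, a dial label, and a spoken phrase.
row = [12, "-Heading 90", "heading left ninety degrees"]

def write_row(quoting) -> str:
    """Serialize the row with the given csv quoting style."""
    buf = io.StringIO()
    csv.writer(buf, quoting=quoting, lineterminator="\n").writerow(row)
    return buf.getvalue()

unquoted = write_row(csv.QUOTE_MINIMAL)    # quotes only where strictly needed
quoted = write_row(csv.QUOTE_NONNUMERIC)   # quotes every text field

print(unquoted, end="")  # 12,-Heading 90,heading left ninety degrees
print(quoted, end="")    # 12,"-Heading 90","heading left ninety degrees"

# A naive split on commas keeps the quote marks inside the field ...
naive_fields = quoted.strip().split(",")
print(naive_fields[1])   # "-Heading 90"

# ... while a real CSV reader removes them again.
parsed_fields = next(csv.reader(io.StringIO(quoted)))
print(parsed_fields[1])  # -Heading 90
```

Both outputs are valid CSV. The trouble only appears when the reading side splits on commas by hand instead of using a CSV-aware parser.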

In the future, just post the command window debug output. I can probably spot the problem in the midst of all that junk better than y'all and get to the root of it right away. I really need to clean that stuff up and boil it down to the essentials. Balancing that is always hard though: too little and you risk not showing the critical info you need to solve it, too much and it's hard to find the key bit. I prefer the latter, because otherwise I'd have to give you another release that shows more debug output, or code in multiple verbosity levels, which is a pain. At some point I'll probably trim the visible output to the bare essentials and have it write a more verbose log file that you can examine or send me in the event of a problem.
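The "terse window, verbose log file" idea can be sketched with Python's standard logging module. This is an assumption about how such a split might be done, not how sh4speech does it; the logger name, the file name and the sample messages are all placeholders.

```python
import logging
import os
import tempfile

def make_logger(log_path: str) -> logging.Logger:
    """Build a logger that shows essentials on screen and everything in a file."""
    logger = logging.getLogger("sh4speech_sketch")  # placeholder name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()
    logger.propagate = False

    console = logging.StreamHandler()
    console.setLevel(logging.INFO)    # command window: essentials only
    console.setFormatter(logging.Formatter("%(message)s"))

    log_file = logging.FileHandler(log_path, mode="w", encoding="utf-8")
    log_file.setLevel(logging.DEBUG)  # log file: full detail
    log_file.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

    logger.addHandler(console)
    logger.addHandler(log_file)
    return logger

log_path = os.path.join(tempfile.gettempdir(), "sh4speech_sketch.log")
log = make_logger(log_path)
log.info("Heard: heading left 90")      # shown on screen and written to the file
log.debug("Desired angle: -90 degrees")  # written to the file only
```

With a setup like this, a user would only need to attach the log file to a post instead of copying text out of the command window.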

Digital_Trucker 09-15-07 09:57 AM

All is well that ends well
 
@ billko As minsc says in the above post, it actually put quotes around all text, not just the negative sign. However, nothing would surprise me when it comes to software differences. Standardization seems to be a bad thing because it implies a monopoly of sorts.

@NefariousKoel :rotfl: Yes, how dare they?:arrgh!:

@minsc Yep, it was my bad all around. Not looking at the original file, not noticing the lack of negativity in the debug output and not noticing that there is a choice in OpenOfficeCalc for the text delimiter (either single or double quotes, not that it would have made any difference). I'm usually pretty good at troubleshooting, but what threw me off was that everything else worked fine so I wasn't even thinking file discrepancies. Lesson learned, from now on I'll remember to strip the quotes out after I use Calc on it (better yet, just don't use Calc on it).

I think the idea of reducing the debug output in the command window and creating a log file is a good one. Much easier to deal with questions and troubleshooting that way, I think.

minsc_tdp 09-17-07 08:39 PM

Howdy all, I just got back from my camping/hiking trip to Mt. Whitney. Unlike the 13 other fools that did the 18-hour, 22-mile death march, I did a leisurely 8-mile hike to the lake, where I relaxed reading a book. I actually bought the Dangerous Waters manual hoping to dig into that, but didn't have time to even crack it.

I'm glad to come back to see no major problems developing. I didn't put much thought into sh4speech while I was gone, so no Revelations From The Mountain so to speak. :) I think the only immediate goals I have are to clean up the debug output, and do some general cleanup that would enable better support of SH3. But I have no idea how interested those players might be as I've posted nothing about this in the SH3 forums.

panthercules 09-18-07 12:02 AM

Quote:

Originally Posted by minsc_tdp
... and do some general cleanup that would enable better support of SH3. But I have no idea how interested those players might be as I've posted nothing about this in the SH3 forums.

Welcome back - sounds like you had a good trip - I'm sure it was nice to get away from all this for a while.

Been gone a few days myself, but was thinking about going back to SH3 for a bit (while waiting for Leo et al. to wrap up and release all the new graphics mods) and thinking about whether it would be worth trying to adapt SH4Speech to use it for that. I'd love to have the depth and course/mouse capability, but the problem I was getting stuck on was the language thing - with Shoot (what I've been using for SH3 voice command) it was really easy to add my German commands into the custom dictionary function, because that program doesn't seem to care or rely on any sort of English language recognition. Since SH4Speech relies on the Microsoft speech engine thingy, is it going to be able to recognize the German commands, or will you have to switch to a German version of the Microsoft speech engine or something? How hard do you think it would be to issue commands in German and have them work with SH4Speech?

mountainmanUK 09-18-07 02:10 AM

Quote:

I think the only immediate goals I have are to clean up the debug output, and do some general cleanup that would enable better support of SH3. But I have no idea how interested those players might be as I've posted nothing about this in the SH3 forums

Welcome back......and glad to hear you had a good time in the mountains!:D

So far as support for SH3 is concerned, I am 100% SURE that there are a heck of a lot of people that would really appreciate it, if you could come up with a couple of "basic" SH3Speech configurations......say, one for standard SH3 1.4b (patched Vanilla), and also a GWX v1.03 version.

As a dedicated SH3 user, and active in the Wolves at War Campaign, to be able to do almost everything in SH3 by voice alone would really make for possibly THE greatest aid to realism possible!!

I appreciate that all this takes time, and I'm pretty sure that you, like the rest of us, have a real life outside of SubSimming.....so it would be OK whenever you want to get into it!!!

When I do play SHIV these days (a lot less nowadays, since Wolves at War 3 started), I always use your SH4Speech setup. It is something that I really miss now, when I fire up my SH3!

cheers,

Dave

Digital_Trucker 09-18-07 08:10 AM

Panthercules, if I am not mistaken, you can spell the German words phonetically in the command files and the speech engine should recognize them. For instance, to use my redneck dialect as an example, I had extreme difficulty getting the word "fire" recognized. But as soon as I changed the spelling to fierr it recognized my hillbilly pronunciation every time.

I would think that the same could be applied to the German language. I.E. Deutsch (forgive spelling if I got it wrong, it's been 30 years since I used the language) would become doitsh, Mein Herr would become mine hair (even though I don't have much :rotfl: ). I don't think a German speech engine would be necessary, just some creativity in spelling.

minsc_tdp 09-18-07 03:16 PM

I believe getting proper German support in is two parts. First is to install and configure the MS Speech control panel settings appropriately to use a German language recognizer. While phonetic spellings might work, I don't think that approach takes into account certain idiosyncrasies of the language that the alternate recognizers are designed to handle better.

Next would be figuring out whether umlauts and any other German-specific characters are properly supported in the files and scripts. As it stands, I believe that an umlaut in the file gets mangled by the time it passes through hear.exe and ultimately down to voice.exe, so the character just becomes a standard a instead of ä, and this causes an inability to match. It'll take some work to sort it all out.
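To picture the symptom, here is a small Python sketch. It does not reproduce whatever hear.exe or voice.exe actually do (the cause of the mangling is unknown); it only simulates an umlaut losing its dots somewhere along the way, and shows one possible workaround of folding both sides before comparing. The phrase "kurs ändern" and the function name `fold_to_ascii` are made up for the example.

```python
import unicodedata

def fold_to_ascii(text: str) -> str:
    """Drop combining marks so that 'ä' becomes 'a' (simulates the mangling)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

command_in_file = "kurs \u00e4ndern"         # "kurs ändern" as typed in the CSV
after_pipeline = fold_to_ascii(command_in_file)  # what arrives downstream in this simulation

print(after_pipeline)                     # kurs andern
print(after_pipeline == command_in_file)  # False: exact matching fails

# Possible workaround: fold both strings before comparing them.
print(fold_to_ascii(after_pipeline) == fold_to_ascii(command_in_file))  # True
```

Folding both sides hides the problem rather than fixing it, since two German words that differ only by an umlaut would then collide, so carrying the characters through intact would still be the better fix.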

By far, the hardest work in supporting SH3 is defining the replacement entries for dials.csv. But perhaps it would be enough to get the three main dials in and not bother with the TDC and torpedo settings dial stuff until later.

panthercules 09-19-07 01:51 AM

Quote:

Originally Posted by minsc_tdp
I believe getting proper German support in is two parts. First is to install and configure the MS Speech control panel settings appropriately to use a German language recognizer. While phonetic spellings might work, I don't think that approach takes into account certain idiosyncrasies of the language that the alternate recognizers are designed to handle better.

Next would be figuring out whether umlauts and any other German-specific characters are properly supported in the files and scripts. As it stands, I believe that an umlaut in the file gets mangled by the time it passes through hear.exe and ultimately down to voice.exe, so the character just becomes a standard a instead of ä, and this causes an inability to match. It'll take some work to sort it all out.

By far, the hardest work in supporting SH3 is defining the replacement entries for dials.csv. But perhaps it would be enough to get the three main dials in and not bother with the TDC and torpedo settings dial stuff until later.

Yeah - I would think that it would be enough to get the main 3 dials working (course/rudder, speed and depth) - that would already put you light-years ahead of the plain keystroke jobs like Shoot. I'll have to play around with the German version of the MS speech thingy (assuming there is one) and see - would SH4Speech work with the German version (or the English version tweaked to handle German, or however it works) of the MS speech thingy without you having to mess with it, or does it tie into the MS speech thingy in a way that would get broken if I tried to substitute the German version for the English one? Do you know whether the MS speech thingy has a custom dictionary function like Shoot does? That worked great for getting the Shoot engine to recognize German words, and I know some of the other (commercial) speech programs I've played with before had such a function - I think it was so you could add some technical jargon terms and stuff that the "mainstream" dictionary the programs used wouldn't normally have in it. I wonder how the MS speech thingy is setup in this regard - guess it's readme time :)

panthercules 09-19-07 10:26 PM

Quote:

Originally Posted by Digital_Trucker
Panthercules, if I am not mistaken, you can spell the German words phonetically in the command files and the speech engine should recognize them. For instance, to use my redneck dialect as an example, I had extreme difficulty getting the word "fire" recognized. But as soon as I changed the spelling to fierr it recognized my hillbilly pronunciation every time.

I would think that the same could be applied to the German language. I.E. Deutsch (forgive spelling if I got it wrong, it's been 30 years since I used the language) would become doitsh, Mein Herr would become mine hair (even though I don't have much :rotfl: ). I don't think a German speech engine would be necessary, just some creativity in spelling.

Well, I tried this and it does work, sort of, to some extent. For example, you can use "cline a fart for ows" as a reasonable facsimile for "kleine fahrt voraus", and "buy duh machine in stop" substitutes reasonably well for "beide machinen stopp" and it seems to recognize these ersatz German phrases most of the time. However, some of the German commands are just not that easy to simulate with English words/sounds, and even with these reasonably close matches recognition performance suffers quite noticeably.

I think you'd have to either (1) figure out how to use a German-based recognition engine (the readme for the MS speech thingy hinted that this might be possible but it didn't explain how to do that, and it appears that the application programmer (i.e., minsc_tdp) would probably have to build that capability into the Speech program); or (2) figure out how to add the capability Shoot has to be able to basically define your own words. Many speech programs allow you to do this, and they don't care whether you're adding some obscure English words (for legal or technical jargon terms, for example) or, in this case, some German words. In short, you just "train" the speech program so it knows when it hears "buy duh" you're actually saying "beide", and then it performs whatever key stroke or other command you told it to perform when it hears you say "beide" - it doesn't have to know or care that "beide" is German and not English.

At least, that's the way Shoot seems to do it, and it seems like it's probably the best way to approach this problem for making SH4Speech work with SH3 (easy for me to say - I don't have to program it :lol: ) Based on my experience with Shoot/SH3, you don't really have to add all that many German words to the program's lexicon/dictionary through "training" to be able to create the commands you need to speak, so I hope something like this could be done - I'd love to use this with SH3 too.

minsc_tdp 09-20-07 12:04 AM

Nein! I would strongly recommend that nobody pursue the route of phonetic equivalents unless they're really desperate. It is very likely a dead end.

sh4speech should use whatever recognizer (aka, an MS SAPI compatible speech engine) you have currently selected in Control Panel > Speech. Mine is "Microsoft English Recognizer 5.1". Presumably there's a way to install a "German Recognizer"!!?

According to the internet tubes (sorry, lost the link!):

"The speech recognition engine may be installed with the operating system or at a later time with other software. During the installation process, speech-enabled packages such as word processors and web browsers, may install their own engines or they may use existing engines. Additional engines are also available through third-party manufacturers. These engines often use a certain jargon or vocabulary; for example, they may use a vocabulary that specializes in medical or legal terminology. They can also use different voices allowing for regional accents such as British English, or use a different language altogether such as German, French, or Russian."

So you may have to buy a Microsoft SAPI compatible German recognizer. You probably want this:
http://www.nuance.de/naturallyspeaking/

frenzied 09-20-07 01:23 AM

I've been having a fairly strange, but minor, problem with this - whenever the program opens, or resets itself, it turns my microphone volume down to 0.
Any ideas on what could be causing this?

minsc_tdp 09-20-07 04:27 PM

Quote:

Originally Posted by frenzied
I've been having a fairly strange, but minor, problem with this - whenever the program opens, or resets itself, it turns my microphone volume down to 0.
Any ideas on what could be causing this?

Have you been through the Microphone Tuning Wizard in Control Panel > Speech? That might be governing the mic volume, with the level resetting to the wizard's value each time SAPI gets instantiated (which happens when sh4speech starts, and again when it restarts every 10 minutes). If so, that is by design.

panthercules 09-23-07 11:56 AM

Wahoo! Fix for stupid stopwatch at high TC
 
After getting tired of having to speak the command to "secure the stopwatch" every time I called for TC 512 or higher, it finally dawned on me that the same macro string I created to order the TC 512 in the first place could be easily adapted to include the command key to remove the stopwatch, and Voila! - no more stopwatch :yep:

Minsc_tdp - gotta hand it to you - the power of what you've done here just keeps revealing itself the more one plays around with it. Now you've made possible a fix for what has to be one of the most annoying little stupidities in the game - Way to go man :rock:

For those interested, adding these commands is really easy - for ordering TC 512, you use the macro capability built into SH4Speech to add the following line in the "keys" column (column E, I believe) to your key_commands.csv file (use whatever ID # you want/have available and call it whatever you want - I call it simply "TC 512" in mine):
SHIFT-Numpad -&Numpad +&Numpad +&Numpad +&Numpad +&Numpad +&Numpad +&Numpad +&Numpad +&Numpad +&X

You'll also need to add the key code for "SHIFT-Numpad -" (which is the key for setting TC=1, which is used at the start of the sequence to make sure the macro is increasing TC by the right number of times to get to 512) to your key_codes.csv file - it's easy to do, and the code is "0x10+0x6D"

Then you just add a line to your voice_commands.csv file so you can speak whatever you want to say for TC 512 (I just use "T C five twelve") and voila! The time compression goes to 512, the stopwatch pops up and immediately goes away :D

You can do basically the same thing for the higher TC settings, just by stringing together longer series of "&Numpad +" and making sure that the "&X" is the last thing in the string, and the stopwatch will always disappear when you order TC of 512 or higher.
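For anyone who would rather generate those long strings than type them, here is a small Python sketch. It is not part of sh4speech; the function names are invented, and the number of "Numpad +" presses needed for each higher TC setting is left as a parameter, since only the nine-press TC 512 case is given above.

```python
RESET_TC = "SHIFT-Numpad -"  # sets TC back to 1, as described above
RAISE_TC = "Numpad +"        # one step up in time compression
HIDE_STOPWATCH = "X"         # dismisses the stopwatch, must come last

def build_tc_macro(presses: int) -> str:
    """Build a macro string for the "keys" column with the given number of TC steps."""
    if presses < 1:
        raise ValueError("need at least one Numpad + press")
    return "&".join([RESET_TC] + [RAISE_TC] * presses + [HIDE_STOPWATCH])

def parse_key_code(code: str) -> list:
    """Split a key_codes.csv style entry such as '0x10+0x6D' into integers."""
    return [int(part, 16) for part in code.split("+")]

print(build_tc_macro(9))            # the TC 512 line from this post
print(parse_key_code("0x10+0x6D"))  # [16, 109]
```

The two values 0x10 and 0x6D are the standard Windows virtual-key codes for SHIFT and the numpad minus key, which matches the "SHIFT-Numpad -" name.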

Enjoy!

minsc_tdp 10-24-07 05:58 PM

E for All
 
I recently attended E for All, a mini-E3 video game conference in Los Angeles. I tossed around my idea of modifying sh4speech to work with any game, to have a generic SDK and I got some really good feedback. I got some interesting ideas as to other types of software this might be applicable to. I'm still skeptical that there's any value here and I still need to follow up with the contacts I made, but there might be more in store for sh4speech yet. It might become a sort of generic "GameVoice" app that can be licensed by game developers for inclusion with their software, or a generic app that the communities use to tune for each game.

One of the most interesting problems is that in games like flight sims, the HUD is not static like sh4's. It moves around, so the location of dials and buttons tends to change in realtime. Locating them would require a datastream from the game so that their location can be pinned down, or even better, a strong API that would allow me to set their values without worrying about where they are and without having to take over the mouse. So there's a lot to think about. This post was really just a weak excuse to bump this thread I love so much back into the main page. :)

minsc_tdp 10-24-07 07:32 PM

linkedin
 
Hey I'm messing around with this LinkedIn site. Here's my public profile:
http://www.linkedin.com/pub/5/984/45a

I'd appreciate it if anyone who has used sh4speech, likes it, and has an account there (or is willing to create one) would add me as a friend/connection/whatever and put in a recommendation with glowing reviews of my work. :) Thanks!

Hawk_345 12-09-07 11:48 AM

I think I might give this thing a go, but first I need to know if it's 1.4 compatible, and if you need to be a computer genius to install it and make it work. Also, on a side note, if I do manage to install this thing, will it affect the use of the Teamspeak program? And if I want to uninstall it, will I have to reinstall the whole game?

