Topic: Vista speech recognition
|
CoffeeSmack
Mini Geek
Member # 3937
|
posted February 16, 2007 06:41
Perhaps this is the wrong forum, more like "I hate my computer", but this clip of the Windows Vista Speech Recognition at work is downright sad:
http://www.youtube.com/watch?v=KyLqUf4cdwc
Although I must admit, the guy is trying to code, which really shouldn't be done with a microphone.
Posts: 62 | From: Mount Joy, PA | Registered: May 2005
|
|
uilleann
Discontinued
|
posted February 16, 2007 06:55
LMAO!!
|
|
Black Widow
Uber Geek
Member # 3046
|
posted February 16, 2007 07:07
I think I just died from the funny.
Posts: 931 | From: Missouri | Registered: Oct 2004
|
|
SpazGirl
Assimilated
Member # 4915
|
posted February 16, 2007 08:58
It's kind of like my iBook when I try to get it to tell me the time, although that is about seven years old now...
-------------------- Things, and things.
Posts: 465 | From: Ypsilanti, MI | Registered: Feb 2006
|
|
WinterSolstice
 Solid Nitrozanium SuperFan
Member # 934
|
posted February 16, 2007 09:16
I have ok success with my voice recognition on my MBP, but in general voice recognition sucks.
Only Dragon NaturallySpeaking (Dragon Dictate) ever actually worked for me. Worst: voice recognition built into OS/2 Warp. OMFG that was funny.
I've been working on getting good voice recognition since Win 3.1 and OS/2, and it's never worked. I don't even mean stuff like coding ('cause that's just stupid). I mean stuff like basic OS navigation. Yay for remotes.
-------------------- An operating system should be like a light switch... simple, effective, easy to use, and designed for everyone.
Posts: 1192 | From: Los Angeles | Registered: Oct 2001
|
|
uilleann
Discontinued
|
posted February 16, 2007 09:53
I do not believe that coding was a bad example. For example, if he cannot capitalise "INFO", how do you capitalise acronyms? If you can't get it to give you a capital letter, it's only going to be of use for the sad modern era where proper punctuation means nothing. What if you wanted to write "To Mr F G Bloggs" and it failed to understand what a capital 'F' was?
Yes, he gave it a serious work-out, but exposed problems that would be as annoying in everyday use. Speech recognition has to be robust and able to deal with the complexity of human language. And not insert "fuck" into your text at random or, when Smeg was trying it earlier, "sex" (OK, it misheard 'sucks' as 'sex' but still, what sort of rating does this have? ;)
|
|
WinterSolstice
 Solid Nitrozanium SuperFan
Member # 934
|
posted February 16, 2007 10:39
quote: Originally posted by uilleann: I do not believe that coding was a bad example. ... And not insert "fuck" into your text at random or, when Smeg was trying it earlier, "sex" (OK, it misheard 'sucks' as 'sex' but still, what sort of rating does this have?
I think coding was a bad example - not because the speech program shouldn't be able to do it (it should) but just because it's so slow to "speak" code. I can type a simple line of code much faster than I can say it... for example
code:
void sgenrand(unsigned long seed) /* seed should not be 0 */
{
    int k;

    ptgfsr[0] = seed & 0xffffffff;
    for (k = 1; k < N; k++)
        ptgfsr[k] = (69069 * ptgfsr[k-1]) & 0xffffffff;
}
That sample from Don Knuth's The Art of Computer Programming took just a moment to type, but would take forever to say.
But I agree that it should be possible. I also agree that there should be a language rating.
What a real speech recognition program needs is modes.
Corporate mode: Strictly G/PG rated words, college level grammar, buzzword compliant.
Coding mode: Language sensitive (brackets automatch and such) with verbal shortcuts... like "Select statement" or whatever.
Geek mode: Emoticon support, UBB support, etc :D
-------------------- An operating system should be like a light switch... simple, effective, easy to use, and designed for everyone.
Posts: 1192 | From: Los Angeles | Registered: Oct 2001
|
|
uilleann
Discontinued
|
posted February 16, 2007 11:10
I was not suggesting that you should use it to code with, but he gave the system a very hefty workout. And it failed, hilariously. There are too many cases where it won't understand something perfectly reasonable that you said, and you'll need to explicitly state what you mean, e.g. acronyms that are also normal words, simple mathematical equations, and common symbols. You don't want to fight a war with it over this.
|
|
stevenback7
SuperBlabberMouth!
Member # 5114
|
posted February 16, 2007 13:06
Until I can turn off the computer monitor, recite a whole letter, then turn the monitor back on and see the whole letter 99% perfect, it is not worth the effort, because it would probably be easier to just type it.
Now code would be hard to do even with good speech recognition, not because it might make a mistake you didn't notice, but because it is usually a lot quicker to type code than to speak it.
-------------------- Comic Book Guy: There is no emoticon for what i'm feeling.
Posts: 1199 | From: Canada eh? | Registered: May 2006
|
|
uilleann
Discontinued
|
posted February 16, 2007 13:43
I dunno. I think the fact that it cannot tell "M" from "N" is pretty bad, or "equals" from "eagles" :P Funniest thing I've seen in a long time - on a par with "Star War (just the one)"
|
|
dragonman97
 SuperFan!
Member # 780
|
posted February 16, 2007 14:33
stevenback7: A far better example than turning off your monitor would be to use a computer dictation service on your cell phone! No one in their right mind who has good dexterity would use this steaming pile of excrement to write code, or anything serious, for that matter.
Honestly, this takes me back to the days when I set up a brand new IBM with Windows 98, and it came with a copy of Dragon NaturallySpeaking. It came with a headset, and naturally, I just had to try it out -- I have since grown far more cynical. The key difference is that I had to train it a bit to my voice, and honestly, it did a lot better than Vista.
The other thing is that it was new enough that I probably had to read some instructions. When you say something 'just works' and don't require users to learn the syntax, you have to account for a /lot/ of possibilities. The Vista speech recognition demoed in the video took 'upper case' to mean what Word calls 'title case.' I bet there is a way to get it to do it right, but he didn't know what it responded to. Maybe 'type in all caps, info, end caps.' Or...'select info, capitalize selection.' Using a verbal interface to Word: 'select info, format menu, change case, upper case, okay.' Way too verbose, but it's precise enough to figure things out.
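To make the 'upper case' versus 'title case' mix-up concrete, here is a minimal sketch in C of the two transformations. The function names are hypothetical and are not anything Vista or Word exposes; the sketch only assumes plain ASCII text with words separated by whitespace.

```c
#include <ctype.h>

/* "Upper case": every letter is capitalised, e.g. "info" -> "INFO". */
void to_upper_case(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}

/* "Title case": only the first letter of each word is capitalised,
   the rest are lowered, e.g. "info" -> "Info". */
void to_title_case(char *s)
{
    int at_word_start = 1;

    for (; *s != '\0'; s++) {
        if (isspace((unsigned char)*s)) {
            at_word_start = 1;
        } else if (at_word_start) {
            *s = (char)toupper((unsigned char)*s);
            at_word_start = 0;
        } else {
            *s = (char)tolower((unsigned char)*s);
        }
    }
}
```

The speaker in the video wanted the first behaviour and got the second, which is why "INFO" kept coming out as "Info".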
Personally, this amused me far more: http://isc.sans.org/diary.html?storyid=2148
P.S. Ooooh...I just thought of a perfect social engineering mechanism for the latter...damn white hat ethics restraint, though. ;P However, I will be getting my hands on a copy in a bit, and I'll definitely have to give it a shot. :D
-------------------- There are three things you can be sure of in life: Death, taxes, and reading about fake illnesses online...
Posts: 9037 | From: Westchester County, New York | Registered: May 2001
|
|
Richard Wolf VI
SuperBlabberMouth!
Member # 4993
|
posted February 16, 2007 21:01
I think voice recognition is useful only if you want to give simple orders, but not for writing. I would prefer handwriting recognition instead.
-------------------- The same old iWanToUseaMac... Who am I fooling? I'm getting a Wii now, iWanToUseaMac isn't :P Get Opera. The best web experience. Contest. Group. Success.
Posts: 1355 | From: Bogotá, Colombia | Registered: Mar 2006
|
|
The Famous Druid
 Gold Hearted SuperFan!
Member # 1769
|
posted February 16, 2007 21:27
There's a story from the old DOS days, of a marketing-droid demonstrating his shiny new speech recognition software at a user group.
He turned the machine on, plugged in the microphone, and...
...someone up the back yelled out "Format See Colon return Yes return"
And that was the end of the demo.
-------------------- If you watch 'The History Of NASA' backwards, it's about a space agency that has no manned spaceflight capability, then does low-orbit flights, then lands on the Moon.
Posts: 10312 | From: Melbourne, Australia | Registered: Oct 2002
|
|
dragonman97
 SuperFan!
Member # 780
|
posted February 16, 2007 22:40
*sigh* That never gets old. :D
-------------------- There are three things you can be sure of in life: Death, taxes, and reading about fake illnesses online...
Posts: 9037 | From: Westchester County, New York | Registered: May 2001
|
|
ScholasticSpastic
Highlie
Member # 6919
|
posted February 17, 2007 10:17
In defense of the voice-recognition software (and I type fast enough that voice-rec would do me no good), that guy didn't exactly follow reasonable protocols. He made lots of nonsense noises while dictating. He failed to think ahead as he structured his commands. His enunciation sucked. (subcategory of enunciation sucks: He ran his words together, failing to space them adequately to reduce comprehension errors.) He failed to refrain from making comments that he didn't want the computer to respond to while it was in listening mode.
Conclusion: Voice-rec is definitely not ready for the average user yet.
-------------------- "As in repeating a well-known song, so in instincts, one action follows another by a sort of rhythm; if a person be interrupted in a song, or in repeating anything by rote, he is generally forced to go back to recover the habitual train of thought..." (Darwin, The Origin of Species)
Posts: 540 | From: Vernal, UT | Registered: Jan 2007
|
|
uilleann
Discontinued
|
posted February 17, 2007 10:45
His enunciation sounded fine, very clear to me. The fact that "capital i" came out as "i" was a joke -- you'd expect the entire phrase "capital i" entered if Windows didn't recognise "capital" as a reserved word. But no, Windows is just stupid. This isn't a guessing game: "capital I" is the correct term and it should be recognised. It was, later, which is even weirder.
Yet when he rapidly rattled off a deletion for some of the crap it inserted, it knew what he meant even when he misread and mispronounced words and even I couldn't keep up with what he was saying.
Just seems very flaky to me, and pretty dumb. Granted, saying "thank you" to it is silly ...
When I was playing with Mac OS 9's recognition recently, there were certain words it was totally unable to recognise no matter how clearly I said them. Quite frustrating, gave up in the end.
|
|
ScholasticSpastic
Highlie
Member # 6919
|
posted February 17, 2007 10:49
About half of the errors seemed to stem from his over-dramatic sighing. I haven't heard sighing like that since I had a girlfriend (and that was a long time ago). I agree, though, about the 'i' issue.
-------------------- "As in repeating a well-known song, so in instincts, one action follows another by a sort of rhythm; if a person be interrupted in a song, or in repeating anything by rote, he is generally forced to go back to recover the habitual train of thought..." (Darwin, The Origin of Species)
Posts: 540 | From: Vernal, UT | Registered: Jan 2007
|
|
uilleann
Discontinued
|
posted February 17, 2007 11:07
Those only seemed to cause the introduction of random words at times. The major problems were things like ignoring attempts to capitalise, and no obvious way to insert acronyms.
Also, when he asked for "capital I, capital N" and got "IM", asking to delete "M" removed all of "IM" at once. I don't know of any English word called "im" (the only "im" I know is in German). Capitalised "IM" should be "eye emm", no? (When IM means instant messaging, do you say "eye emm", or "im"?)
I think Vista would drive me mad very quickly. I am not sure that it's conceptually possible to achieve good speech control yet, although to a degree, Vista did show reasonable comprehension between text and commands.
But we're not up to the comprehension level shown in The Conscience of the King yet.
|
|
WinterSolstice
 Solid Nitrozanium SuperFan
Member # 934
|
posted February 17, 2007 11:30
I know last time I tried I found that "Select all, delete" were my most common commands :D
-------------------- An operating system should be like a light switch... simple, effective, easy to use, and designed for everyone.
Posts: 1192 | From: Los Angeles | Registered: Oct 2001
|
|
hecateluna
Maximum Newbie
Member # 7568
|
posted March 13, 2007 12:11
Response from an NLP (natural language processing) researcher/grad student:
Actually, I think that it makes a lot of sense to use voice commands in coding. Programmers I talk to are constantly wishing they could work faster, and another reliable form of input (aside from keyboard and mouse) to our IDEs would be really useful (I wouldn't envision it as replacing keyboard and mouse, though). So the question is whether it's possible to make the speech recognition system reliable.
My answer? If it's specialized, absolutely. [lots of rambling about how speech recognition systems work deleted]
I have no experience with the Vista speech recognition system, but I wouldn't be surprised if it sucks a lot. However, we really shouldn't expect it to be doing well at a task it's not designed for. It is looking to recognize a word based on the previous couple of words, based on a statistical model which was trained on mind-numbing amounts of data, and I seriously doubt any of the training involved trying to write Perl code.
While it seems the same to _us_ to put capital "INFO" on the screen where the previous words are "open (", and to write some acronym on the screen in the context of some other sort of text, to the language model, it is (or I should hope it is) really a very different kind of thing.
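As a rough illustration of "recognize a word based on the previous words", here is a toy sketch in C. It is a simplification: it looks at only one previous word (a bigram), the counts are invented for the example, and the names `PROSE_COUNTS` and `pick_word` are hypothetical. It is not how Vista's recognizer is implemented.

```c
#include <stddef.h>
#include <string.h>

/* One row of a toy bigram table: how often `word` followed `prev`
   in some training text. */
struct bigram {
    const char *prev;
    const char *word;
    unsigned count;
};

/* Invented counts standing in for prose-like training data. Nothing
   was ever seen after a code-like token such as "x". */
static const struct bigram PROSE_COUNTS[] = {
    { "the", "eagles", 40 },
    { "the", "equals", 1 },
    { "it",  "equals", 25 },
};

static unsigned bigram_count(const struct bigram *table, size_t n,
                             const char *prev, const char *word)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(table[i].prev, prev) == 0 &&
            strcmp(table[i].word, word) == 0)
            return table[i].count;
    return 0; /* unseen pair */
}

/* Among acoustically similar candidates (at least one is required),
   return the one seen most often after `prev`. Ties, including
   "never seen", go to the first candidate, which stands in for the
   acoustic model's own best guess. */
const char *pick_word(const struct bigram *table, size_t n,
                      const char *prev,
                      const char *const *candidates, size_t n_candidates)
{
    const char *best = candidates[0];
    unsigned best_count = bigram_count(table, n, prev, best);

    for (size_t i = 1; i < n_candidates; i++) {
        unsigned c = bigram_count(table, n, prev, candidates[i]);
        if (c > best_count) {
            best = candidates[i];
            best_count = c;
        }
    }
    return best;
}
```

With a table like this, context the model has never seen gives it nothing to go on, so it falls back to whatever sounded closest. That is the situation a prose-trained model is in when someone dictates code.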
Um. Also, yes, it was funny. Ah, the fun of speech recognition system hijinks!
Posts: 17 | From: Ohio | Registered: Mar 2007
|
|
Stereo
 Solid Nitrozanium SuperFan!
Member # 748
|
posted March 13, 2007 12:38
Actually, I can see only one way speech recognition can get useful for coding. Beyond the use of a specialized dictionary to pick words from (either one per language, or it picks up the correct subset of restricted words and grammar rules based on the file extension), it must also recognize the declared variables. I would hate to have to spell with capitalization - or any other word separation - each time I want to write strFooBar. At worst, you should declare "int m-y-underscore-i-t-e-r-a-t-o-r-semi-colon" so you spell it once, then say "my iterator" when using the variable. If there isn't a clear difference between two choices, for example a variable name (fora = 0) vs. a loop start (for (a=0; [...])), then the speech recognition should offer the choice rather than make it on statistics only. (Or it should buffer up until it can make a decision: is "for?a?equals 0" followed by "semicolon" or by "to maxnum increase by one"? Then it would know whether it is fora = 0; or for (a=0; a<=maxNum; a++){}). Now, that's the kind of coding I would like to do.*
But my knowledge of natural language interpretation is limited, so it may not be worth much.
*Then again, in a room with many coders, imagine the cacophony! Maybe I should stick to regular coding.
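The "spell it once, then just say it" idea can be sketched in C as a lookup against the names already declared. This is only an illustration under simple assumptions: the function names are hypothetical, names are limited to 63 significant characters, and the list of declared identifiers is assumed to come from the editor.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Reduce a name or phrase to lowercase letters and digits only, so that
   "my_iterator", "myIterator" and the spoken "my iterator" all compare
   equal. Returns 0 if `out` is too small. */
static int squash(const char *in, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return 0;
    for (; *in != '\0'; in++) {
        if (!isalnum((unsigned char)*in))
            continue; /* drop spaces, underscores, punctuation */
        if (n + 1 >= out_size)
            return 0;
        out[n++] = (char)tolower((unsigned char)*in);
    }
    out[n] = '\0';
    return 1;
}

/* Find which declared identifier a spoken phrase refers to.
   Returns the identifier, or NULL when nothing matches or when more
   than one declared name matches (the case where the tool should
   offer the choice instead of guessing). */
const char *resolve_spoken(const char *phrase,
                           const char *const *declared, size_t n_declared)
{
    char want[64], have[64];
    const char *match = NULL;

    if (!squash(phrase, want, sizeof want))
        return NULL;
    for (size_t i = 0; i < n_declared; i++) {
        if (!squash(declared[i], have, sizeof have))
            continue;
        if (strcmp(want, have) == 0) {
            if (match != NULL)
                return NULL; /* ambiguous, e.g. my_iterator vs myIterator */
            match = declared[i];
        }
    }
    return match;
}
```

Saying "my iterator" with `my_iterator` declared would resolve to that name without any spelling, and two declarations that sound the same would come back as "ask the user".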
-------------------- Eppur, si muove!
Galileo Galilei
Posts: 2286 | From: Gatineau, Quebec, Canada | Registered: Apr 2001
|
|
hecateluna
Maximum Newbie
Member # 7568
|
posted March 13, 2007 14:02
quote: Originally posted by Stereo: *Then again, in a room with many coders, imagine the cacophony! Maybe I should stick to regular coding.
Oh, man, I hadn't... actually thought about that. THIS is why you run your crazy ideas by people OTHER than your crazy husband. (It could still be useful, though.)
Anyway, it shouldn't be a big deal to not have to worry about capitalization, etc. I mean, unless you want to declare StrFOObar and StrFooBar, which is terrible horrible bad coding practices anyway, so I'm _happy_ if the IDE discourages people from doing so (besides, giving a choice when it isn't sure, especially in a case like that, isn't a big deal). Also, you're probably right that for most of the things you'd want to do, a grammar would be sufficient (not necessarily best though, except because there's no training data), but some statistical modelling of things like "uh" and "um" insertion (for example) would probably be useful. And you'd still have to train an acoustic model statistically (probably using a standard dataset, and supplementing with as much mock data as you can trick your friends into recording).
I would ultimately want it to have several modes (including my-hands-are-cramped-full-voice mode, and a bare command-and-control mode). I'm not sure which I'd actually end up using most--I mean, if you're using long variable names, it's pretty difficult to type fast enough to be faster than saying them (so I'd want to be able to say "String foo bar" [it types "strFooBar "], type "=", and then say "elephant dot to string" (or whatever--yes I named my random variable elephant)).
In any case, my experience tells me that any of these things would not be a big deal to create (and to make function reliably--much much much much more reliably than general purpose speech recognizers), especially if programmers are willing to invest a little time training the application on their voices. ARGH! Need more time for personal projects. *dies*
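The "String foo bar" to strFooBar step is easy to sketch in C. This is a minimal illustration, not a real dictation engine: the `PREFIXES` table and the function name are hypothetical, words are assumed to be separated by spaces, and the recognizer is assumed to have already produced the words as text.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical prefix abbreviations: a spoken first word on the left
   is typed as the short form on the right. */
static const char *const PREFIXES[][2] = {
    { "string",  "str" },
    { "integer", "int" },
};

/* Turn a spoken phrase such as "String foo bar" into a camelCase
   identifier such as "strFooBar". Returns 1 on success, 0 if `out`
   is too small. */
int spoken_to_identifier(const char *phrase, char *out, size_t out_size)
{
    size_t n = 0;
    int word_index = 0;
    const char *p = phrase;

    if (out_size == 0)
        return 0;

    while (*p != '\0') {
        char word[32];
        size_t len = 0;
        const char *text;

        while (*p == ' ')
            p++;                       /* skip spaces between words */
        if (*p == '\0')
            break;
        while (*p != '\0' && *p != ' ') {
            if (len + 1 < sizeof word) /* overly long words are truncated */
                word[len++] = (char)tolower((unsigned char)*p);
            p++;
        }
        word[len] = '\0';

        text = word;
        if (word_index == 0) {         /* only the first word is abbreviated */
            for (size_t i = 0; i < sizeof PREFIXES / sizeof PREFIXES[0]; i++)
                if (strcmp(word, PREFIXES[i][0]) == 0)
                    text = PREFIXES[i][1];
        } else {
            word[0] = (char)toupper((unsigned char)word[0]);
        }

        len = strlen(text);
        if (n + len + 1 > out_size)
            return 0;
        memcpy(out + n, text, len);
        n += len;
        word_index++;
    }
    out[n] = '\0';
    return 1;
}
```

Under these assumptions the phrase "String foo bar" comes out as strFooBar, and a single word like "elephant" passes through unchanged.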
Posts: 17 | From: Ohio | Registered: Mar 2007
|
|
zesovietrussian
SuperBlabberMouth!
Member # 1177
|
posted March 14, 2007 18:28
open cmd format c enter y enter
'nuff said
Posts: 1092 | From: Boston | Registered: Mar 2002
|
|
Stereo
 Solid Nitrozanium SuperFan!
Member # 748
|
posted March 15, 2007 06:28
quote: Originally posted by zesovietrussian: open cmd format c enter y enter
'nuff said
(command window opens)
> format see
command not found
> why
command not found >:)
-------------------- Eppur, si muove!
Galileo Galilei
Posts: 2286 | From: Gatineau, Quebec, Canada | Registered: Apr 2001
|
|
HubmaN
Geek Larva
Member # 7554
|
posted March 19, 2007 04:21
Feh, I think speech recognition in Office '03 was much better. But that's because it's got fewer commands to remember... Isn't there a voice command function in Vista's speech recognition? Office '03 had one. What's best for me is a combination (on my Mac, Speakable Items suck, personally) of both: typing and speaking.
quote: Originally posted by WinterSolstice: I know last time I tried I found that "Select all, delete" were my most common commands
-------------------- "...they will see us waiting on such great heights, come down now, they'll say..."
Posts: 29 | From: Bangkok, Thailand | Registered: Mar 2007