
The Geek Culture Forums


» The Geek Culture Forums   » Love!   » I Love my Computer   » Vista speech recognition (Page 1)

This topic comprises 2 pages: 1  2 
 
Author Topic: Vista speech recognition
CoffeeSmack
Mini Geek
Member # 3937

Member Rated:
5
posted February 16, 2007 06:41
Perhaps this is the wrong forum, more like "I hate my computer", but this clip of the Windows Vista Speech Recognition at work is downright sad:

http://www.youtube.com/watch?v=KyLqUf4cdwc

Although I must admit, the guy is trying to code, which really shouldn't be done with a microphone.

Posts: 62 | From: Mount Joy, PA | Registered: May 2005  |  IP: Logged
uilleann
Discontinued


posted February 16, 2007 06:55
LMAO!!
IP: Logged
Black Widow
Uber Geek
Member # 3046

posted February 16, 2007 07:07
I think I just died from the funny. [crazy]
Posts: 931 | From: Missouri | Registered: Oct 2004  |  IP: Logged
SpazGirl
Assimilated
Member # 4915

Member Rated:
5
posted February 16, 2007 08:58
It's kind of like my iBook when I try to get it to tell me the time, although that is about seven years old now...

--------------------
Things, and things.

Posts: 465 | From: Ypsilanti, MI | Registered: Feb 2006  |  IP: Logged
WinterSolstice

Solid Nitrozanium SuperFan
Member # 934

Member Rated:
3
posted February 16, 2007 09:16
I have ok success with my voice recognition on my MBP, but in general voice recognition sucks.

Only Dragon NaturallySpeaking (Dragon Dictate) ever actually worked for me.
Worst: Voice recognition built into OS/2 Warp. OMFG that was funny.

I've been working on getting good voice recognition since Win 3.1/OS/2, and it's never worked. I don't even mean stuff like coding (cause that's just stupid). I mean stuff like basic OS navigation. Yay for remotes.

--------------------
An operating system should be like a light switch... simple, effective, easy to use, and designed for everyone.

Posts: 1192 | From: Los Angeles | Registered: Oct 2001  |  IP: Logged
uilleann
Discontinued


posted February 16, 2007 09:53
I do not believe that coding was a bad example. For example, if he cannot capitalise "INFO", how do you capitalise acronyms? If you can't get it to give you a capital letter, it's only going to be of use for the sad modern era where proper punctuation means nothing. What if you wanted to write "To Mr F G Bloggs" and it failed to understand what a capital 'F' was?

Yes, he gave it a serious work-out, but exposed problems that would be as annoying in everyday use. Speech recognition has to be robust and able to deal with the complexity of human language. And not insert "fuck" into your text at random or, when Smeg was trying it earlier, "sex" (OK, it misheard 'sucks' as 'sex' but still, what sort of rating does this have? ;)

IP: Logged
WinterSolstice

Solid Nitrozanium SuperFan
Member # 934

Member Rated:
3
posted February 16, 2007 10:39
quote:
Originally posted by uilleann:
I do not believe that coding was a bad example.
...
And not insert "fuck" into your text at random or, when Smeg was trying it earlier, "sex" (OK, it misheard 'sucks' as 'sex' but still, what sort of rating does this have? [Wink]

I think coding was a bad example - not because the speech program shouldn't be able to do it (it should) but just because it's so slow to "speak" code. I can type a simple line of code much faster than I can say it... for example
code:
void sgenrand(unsigned long seed) /* seed should not be 0 */
{
    int k;
    ptgfsr[0] = seed & 0xffffffff;
    for (k = 1; k < N; k++)
        ptgfsr[k] = (69069 * ptgfsr[k-1]) & 0xffffffff;
}

That sample from Don Knuth's The Art of Computer Programming took just a moment to type, but would take forever to say.

But I agree that it should be possible. I also agree that there should be a language rating [Smile]

What a real speech recognition program needs is modes.
Corporate Mode: Strictly G/PG rated words, college level grammar, buzzword compliant.

Coding mode: Language sensitive (brackets automatch and such) with verbal shortcuts... like "Select statement" or whatever.

Geek mode: Emoticon support, UBB support, etc [Big Grin]
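
The "modes" idea above can be sketched as a post-processing step that runs after recognition. This is only an illustration: the mode names, the blocked-word list, and every function name here are made up for the example and do not correspond to any real speech product or API.

```python
# Hypothetical sketch: per-mode post-processing of already-recognized words.
# Mode names, word lists, and function names are illustrative assumptions.

BLOCKED_WORDS = {"sucks", "smeg"}  # stand-in list for a "G/PG rated" filter
CLOSERS = {"(": ")", "[": "]", "{": "}"}


def corporate_mode(words):
    """Replace any blocked word with a bleep marker."""
    return ["[bleep]" if w.lower() in BLOCKED_WORDS else w for w in words]


def coding_mode(words):
    """Pass tokens through, then append closers for brackets left open."""
    open_stack = []
    out = []
    for w in words:
        out.append(w)
        if w in CLOSERS:
            open_stack.append(CLOSERS[w])
        elif open_stack and w == open_stack[-1]:
            open_stack.pop()
    # Close any unmatched brackets, innermost first.
    while open_stack:
        out.append(open_stack.pop())
    return out


MODES = {"corporate": corporate_mode, "coding": coding_mode}


def process(mode, words):
    """Dispatch the recognized words to the handler for the active mode."""
    return MODES[mode](words)
```

With this sketch, `process("coding", ["open", "(", "INFO"])` comes back with the closing parenthesis appended, while the same words in corporate mode pass through untouched.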

--------------------
An operating system should be like a light switch... simple, effective, easy to use, and designed for everyone.

Posts: 1192 | From: Los Angeles | Registered: Oct 2001  |  IP: Logged
uilleann
Discontinued


posted February 16, 2007 11:10
I was not suggesting that you should use it to code with, but he gave the system a very hefty workout. And it failed, hilariously. There are too many cases where it won't understand what you said within perfect reason, and you'll need to explicitly state what you mean, e.g. acronyms that are also normal words, simple mathematical equations, and common symbols. You don't want to fight a war with it over this.
IP: Logged
stevenback7
SuperBlabberMouth!
Member # 5114

Member Rated:
4
posted February 16, 2007 13:06
Until I can turn off the computer monitor, recite a whole letter, then turn the monitor back on and see the whole letter 99% perfect, it is not worth the effort, because it would probably be easier to just type it.

Code would be hard to do even with good speech recognition, not because it might make a mistake that you don't notice, but because it is usually a lot quicker to type code than to speak it.

--------------------
Comic Book Guy: There is no emoticon for what i'm feeling.

Posts: 1199 | From: Canada eh? | Registered: May 2006  |  IP: Logged
uilleann
Discontinued


posted February 16, 2007 13:43
I dunno. I think the fact that it cannot tell "M" from "N" is pretty bad, or "equals" from "eagles" :P Funniest thing I've seen in a long time - on a par with "Star War (just the one)"
IP: Logged
dragonman97

SuperFan!
Member # 780

Member Rated:
4
posted February 16, 2007 14:33
stevenback7: A far better example than turning off your monitor would be to use a computer dictation service on your cell phone! No one in their right mind who has good dexterity would use this steaming pile of excrement to write code, or anything serious, for that matter.

Honestly, this takes me back to the days when I set up a brand new IBM with Windows 98, and it came with a copy of Dragon NaturallySpeaking. It came with a headset, and naturally, I just had to try it out -- I have since grown far more cynical. [Wink] The key [ [Big Grin] ] difference is that I had to train it a bit to my voice, and honestly, it did a lot better than Vista.

The other thing is that it was new enough that I probably had to read some instructions - when you say something 'just works,' and don't require people to learn the syntax, you have to account for a /lot/ of possibilities. The Vista speech recognition demoed in the video took 'upper case' to mean what Word calls 'title case.' I bet there is a way to get it to do it right, but he didn't know what it responded to. Maybe 'type in all caps, info, end caps.' Or...'select info, capitalize selection.' Using a verbal interface to Word: 'select info, format menu, change case, upper case, okay.' Way too verbose, but it's precise enough to figure things out.
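
To make the 'upper case' versus 'title case' mix-up concrete, here is a minimal sketch of a command table that keeps the two apart and refuses anything it does not know rather than guessing. The command phrases are assumptions chosen for illustration; they are not commands that Vista or any other dictation product is known to accept.

```python
# Hypothetical command grammar: these phrases are illustrative assumptions,
# not the documented vocabulary of any real speech recognition product.

def apply_case_command(command, text):
    """Apply a spoken casing command to the given text."""
    command = command.lower().strip()
    if command in ("all caps", "upper case"):
        return text.upper()
    if command == "title case":
        return text.title()
    if command in ("no caps", "lower case"):
        return text.lower()
    # Fail loudly instead of silently picking the nearest command.
    raise ValueError(f"unrecognized casing command: {command!r}")
```

The point of raising an error on an unknown phrase is that the user finds out immediately that the vocabulary was wrong, instead of fighting a wrong guess.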

Personally, this amused me far more:
http://isc.sans.org/diary.html?storyid=2148

P.S. Ooooh...I just thought of a perfect social engineering mechanism for the latter...damn white hat ethics restraint, though. ;P However, I will be getting my hands on a copy in a bit, and I'll definitely have to give it a shot. [Big Grin]

--------------------
There are three things you can be sure of in life: Death, taxes, and reading about fake illnesses online...

Posts: 9331 | From: Westchester County, New York | Registered: May 2001  |  IP: Logged
Richard Wolf VI
SuperBlabberMouth!
Member # 4993

posted February 16, 2007 21:01
I think voice recognition is useful only if you want to give simple orders, but not for writing. I would prefer handwriting recognition instead.

--------------------
The same old iWanToUseaMac... Who am I fooling? I'm getting a Wii now, iWanToUseaMac isn't :P
Get Opera. The best web experience.
Contest. Group. Success.

Posts: 1356 | From: Bogotá, Colombia | Registered: Mar 2006  |  IP: Logged
The Famous Druid

Gold Hearted SuperFan!
Member # 1769

Member Rated:
4
posted February 16, 2007 21:27
There's a story from the old DOS days, of a marketing-droid demonstrating his shiny new speech recognition software at a user group.

He turned the machine on, plugged in the microphone, and...

...someone up the back yelled out "Format See Colon return Yes return"

And that was the end of the demo.

--------------------
If you watch 'The History Of NASA' backwards, it's about a space agency that has no manned spaceflight capability, then does low-orbit flights, then lands on the Moon.

Posts: 10669 | From: Melbourne, Australia | Registered: Oct 2002  |  IP: Logged
dragonman97

SuperFan!
Member # 780

Member Rated:
4
posted February 16, 2007 22:40
*sigh*
That never gets old. [Big Grin]

--------------------
There are three things you can be sure of in life: Death, taxes, and reading about fake illnesses online...

Posts: 9331 | From: Westchester County, New York | Registered: May 2001  |  IP: Logged
ScholasticSpastic
Highlie
Member # 6919

Member Rated:
5
posted February 17, 2007 10:17
In defense of the voice-recognition software (and I type fast enough that voice-rec would do me no good), that guy didn't exactly follow reasonable protocols. He made lots of nonsense noises while dictating. He failed to think ahead as he structured his commands. His enunciation sucked. (subcategory of enunciation sucks: He ran his words together, failing to space them adequately to reduce comprehension errors.) He failed to refrain from making comments that he didn't want the computer to respond to while it was in listening mode.

Conclusion: Voice-rec is definitely not ready for the average user yet.

--------------------
"As in repeating a well-known song, so in instincts, one action follows another by a sort of rhythm; if a person be interrupted in a song, or in repeating anything by rote, he is generally forced to go back to recover the habitual train of thought..." (Darwin, The Origin of Species)

Posts: 540 | From: Vernal, UT | Registered: Jan 2007  |  IP: Logged
uilleann
Discontinued


posted February 17, 2007 10:45
His enunciation sounded fine, very clear to me. The fact that "capital i" came out as "i" was a joke -- you'd expect the entire phrase "capital i" entered if Windows didn't recognise "capital" as a reserved word. But no, Windows is just stupid. This isn't a guessing game: "capital I" is the correct term and it should be recognised. It was, later, which is even weirder.

Yet when he rapidly rattled off a deletion for some of the crap it inserted, it knew what he meant even when he misread and mispronounced words and even I couldn't keep up with what he was saying.

Just seems very flaky to me, and pretty dumb. Granted, saying "thank you" to it is silly ...

When I was playing with Mac OS 9's recognition recently, there were certain words it was totally unable to recognise no matter how clearly I said them. Quite frustrating, gave up in the end.

IP: Logged
ScholasticSpastic
Highlie
Member # 6919

Member Rated:
5
posted February 17, 2007 10:49
About half of the errors seemed to stem from his over-dramatic sighing. I haven't heard sighing like that since I had a girlfriend (and that was a long time ago). I agree, though, about the 'i' issue.

--------------------
"As in repeating a well-known song, so in instincts, one action follows another by a sort of rhythm; if a person be interrupted in a song, or in repeating anything by rote, he is generally forced to go back to recover the habitual train of thought..." (Darwin, The Origin of Species)

Posts: 540 | From: Vernal, UT | Registered: Jan 2007  |  IP: Logged
uilleann
Discontinued


posted February 17, 2007 11:07
Those only seemed to cause the introduction of random words at times. The major problems were things like ignoring attempts to capitalise, and no obvious way to insert acronyms.

Also, when he asked for "capital I, capital N" and got "IM", asking to delete "M" removed all of "IM" at once. I don't know of any English word called "im" (the only "im" I know is in German). Capitalised "IM" should be "eye emm", no? (When IM means instant messaging, do you say "eye emm", or "im"?)

I think Vista would drive me mad very quickly. I am not sure that it's conceptually possible to achieve good speech control yet, although to a degree, Vista did show reasonable comprehension between text and commands.

But we're not yet up to the comprehension level shown in The Conscience of the King.

IP: Logged
WinterSolstice

Solid Nitrozanium SuperFan
Member # 934

Member Rated:
3
posted February 17, 2007 11:30
I know last time I tried I found that "Select all, delete" were my most common commands [Big Grin]

--------------------
An operating system should be like a light switch... simple, effective, easy to use, and designed for everyone.

Posts: 1192 | From: Los Angeles | Registered: Oct 2001  |  IP: Logged
hecateluna
Maximum Newbie
Member # 7568

posted March 13, 2007 12:11
Response from an NLP (natural language processing) researcher/grad student:

Actually, I think that it makes a lot of sense to use voice commands in coding. Programmers I talk to are constantly wishing they could work faster, and another reliable form of input (aside from keyboard and mouse) to our IDEs would be really useful (I wouldn't envision it as replacing keyboard and mouse, though). So the question is whether it's possible to make the speech recognition system reliable.

My answer? If it's specialized, absolutely. [lots of rambling about how speech recognition systems work deleted]

I have no experience with the Vista speech recognition system, but I wouldn't be surprised if it sucks a lot. However, we really shouldn't expect it to do well at a task it's not designed for. It is looking to recognize a word based on the previous couple of words, using a statistical model which was trained on mind-numbing amounts of data, and I seriously doubt any of the training involved trying to write Perl code.

While it seems the same to _us_ to put capital "INFO" on the screen where the previous words are "open (", and to write some acronym on the screen in the context of some other sort of text, to the language model, it is (or I should hope it is) really a very different kind of thing.
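
A toy version of that idea, for anyone who wants to see it in miniature: the sketch below counts which word follows which in a training corpus, then uses those counts to choose between two similar-sounding candidates. It is a deliberate simplification. It looks at only one previous word (a bigram model) rather than the previous couple of words, it has no acoustic model at all, and the two tiny corpora are invented for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows each previous word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, cur in zip(words, words[1:]):
            counts[prev][cur] += 1
    return counts


def pick_candidate(counts, prev_word, candidates):
    """Pick the candidate seen most often after prev_word in training."""
    return max(candidates, key=lambda w: counts[prev_word.lower()][w])


# Invented training data: one prose-like corpus, one code-talk corpus.
prose_model = train_bigrams(["the eagles flew south", "the eagles landed"])
code_model = train_bigrams(
    ["the equals sign assigns", "x equals zero", "the equals operator"]
)

# Same previous word, same two candidates, different training data.
print(pick_candidate(prose_model, "the", ["equals", "eagles"]))
print(pick_candidate(code_model, "the", ["equals", "eagles"]))
```

The same audio after the same word gets resolved differently depending on what the model was trained on, which is why a general-purpose dictation model stumbles over code.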

Um. Also, yes, it was funny. [Smile] Ah, the fun of speech recognition system hijinks!

Posts: 17 | From: Ohio | Registered: Mar 2007  |  IP: Logged
Stereo

Solid Nitrozanium SuperFan!
Member # 748

Member Rated:
5
posted March 13, 2007 12:38
Actually, I can see only one way speech recognition could become useful for coding. Beyond using a specialized dictionary to pick words from (either one per language, or one that picks the correct subset of restricted words and grammar rules based on the file extension), it must also recognize the declared variables. I would hate to have to spell out capitalization - or any other word separation - each time I want to write strFooBar. At worst, you would declare "int m-y-underscore-i-t-e-r-a-t-o-r-semi-colon" so you spell it once, then say "my iterator" when using the variable. If there isn't a clear difference between two choices, for example a variable name (fora = 0) vs. a loop start (for (a=0; [...])), then the speech recognition should offer the choice rather than decide on statistics alone. (Or it should buffer until it can make a decision: is "for?a?equals 0" followed by "semicolon" or by "to maxnum increase by one"? Then it would know whether it is fora = 0; or for (a=0; a<=maxNum; a++){}.) Now, that's the kind of coding I would like to do.*

But my knowledge of natural language interpretation is limited, so it may not be worth much.

*Then again, in a room with many coders, imagine the cacophony! Maybe I should stick to regular coding.
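
The "offer the choice" behaviour for the fora / for (a example can be sketched roughly as follows. Everything here is hypothetical: the keyword set, the function name, and the rule of matching a phrase against declared variables by ignoring spaces and underscores are assumptions made for the illustration, not how any existing tool works.

```python
# Hypothetical sketch: list every plausible reading of a spoken phrase.
# The keyword set and matching rule are illustrative assumptions.

KEYWORDS = {"for", "if", "while"}


def candidate_readings(phrase, declared_variables):
    """Return all readings of the phrase; more than one means 'ask the user'."""
    words = phrase.lower().split()
    joined = "".join(words)
    readings = []
    # Reading 1: the whole phrase names a variable that was declared earlier.
    for name in declared_variables:
        if name.lower().replace("_", "") == joined:
            readings.append(("variable", name))
    # Reading 2: the phrase starts a statement with a language keyword.
    if words and words[0] in KEYWORDS:
        readings.append(("keyword", words[0]))
    return readings
```

Saying "for a" with both `fora` and `a` declared yields two readings, so the editor would prompt (or keep buffering); saying "my iterator" after declaring `my_iterator` yields exactly one.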

--------------------
Eppur, si muove!

Galileo Galilei

Posts: 2289 | From: Gatineau, Quebec, Canada | Registered: Apr 2001  |  IP: Logged
hecateluna
Maximum Newbie
Member # 7568

posted March 13, 2007 14:02
quote:
Originally posted by Stereo:
*Then again, in a room with many coders, imagine the cacophony! Maybe I should stick to regular coding.

Oh, man, I hadn't... actually thought about that. THIS is why you run your crazy ideas by people OTHER than your crazy husband. (It could still be useful, though.)

Anyway, it shouldn't be a big deal to not have to worry about capitalization, etc. I mean, unless you want to declare StrFOObar and StrFooBar, which is terrible horrible bad coding practices anyway, so I'm _happy_ if the IDE discourages people from doing so (besides, giving a choice when it isn't sure, especially in a case like that, isn't a big deal). Also, you're probably right that for most of the things you'd want to do, a grammar would be sufficient (not necessarily best though, except because there's no training data), but some statistical modelling of things like "uh" and "um" insertion (for example) would probably be useful. And you'd still have to train an acoustic model statistically (probably using a standard dataset, and supplementing with as much mock data as you can trick your friends into recording).

I would ultimately want it to have several modes (including my-hands-are-cramped-full-voice mode, and a bare command-and-control mode). I'm not sure which I'd actually end up using most--I mean, if you're using long variable names, it's pretty difficult to type fast enough to be faster than saying them (so I'd want to be able to say "String foo bar" [it types "strFooBar "], type "=", and then say "elephant dot to string" (or whatever--yes I named my random variable elephant)).
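
As a rough sketch of the "String foo bar" and "elephant dot to string" examples: the code below turns a spoken phrase into identifier text. The type-prefix table, the spoken-symbol table, and the function names are all assumptions invented for this example; a real tool would need a far larger vocabulary and would work from recognizer output rather than clean strings.

```python
# Hypothetical sketch: spoken phrase -> code text.
# Both lookup tables are illustrative assumptions, not a real vocabulary.

TYPE_PREFIXES = {"string": "str", "integer": "int"}
SPOKEN_SYMBOLS = {"dot": ".", "equals": "="}


def camel_case(words):
    """Join words as camelCase: first word lower, the rest capitalized."""
    if not words:
        return ""
    head, *rest = words
    return head.lower() + "".join(w.capitalize() for w in rest)


def phrase_to_code(phrase):
    """Convert a spoken phrase into an identifier or a dotted expression."""
    words = phrase.lower().split()
    # A leading type word becomes a short prefix on a single identifier.
    if words and words[0] in TYPE_PREFIXES:
        return camel_case([TYPE_PREFIXES[words[0]]] + words[1:])
    # Otherwise, split on spoken symbols and camelCase each run of words.
    pieces, current = [], []
    for w in words:
        if w in SPOKEN_SYMBOLS:
            pieces.append(camel_case(current))
            pieces.append(SPOKEN_SYMBOLS[w])
            current = []
        else:
            current.append(w)
    pieces.append(camel_case(current))
    return "".join(pieces)
```

Here `phrase_to_code("String foo bar")` gives `strFooBar` and `phrase_to_code("elephant dot to string")` gives `elephant.toString`.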

In any case, my experience tells me that any of these things would not be a big deal to create (and to make function reliably--much much much much more reliably than general purpose speech recognizers), especially if programmers are willing to invest a little time training the application on their voices. ARGH! Need more time for personal projects. *dies*

Posts: 17 | From: Ohio | Registered: Mar 2007  |  IP: Logged
zesovietrussian
SuperBlabberMouth!
Member # 1177

posted March 14, 2007 18:28
open cmd
format c
enter
y
enter

'nuff said [Smile]

Posts: 1094 | From: Boston | Registered: Mar 2002  |  IP: Logged
Stereo

Solid Nitrozanium SuperFan!
Member # 748

Member Rated:
5
posted March 15, 2007 06:28
quote:
Originally posted by zesovietrussian:
open cmd
format c
enter
y
enter

'nuff said [Smile]

(command windows open)
> format see
command not found
> why
command not found
[evil]

--------------------
Eppur, si muove!

Galileo Galilei

Posts: 2289 | From: Gatineau, Quebec, Canada | Registered: Apr 2001  |  IP: Logged
HubmaN
Geek Larva
Member # 7554

posted March 19, 2007 04:21
Feh, I think speech recognition in Office '03 was much better. But that's because it's got fewer commands to remember... Isn't there a voice command function in Vista's speech recognition? Office '03 had one. What works best for me is a combination of both typing and speaking (on my Mac, Speakable Items sucks, personally).

quote:
Originally posted by WinterSolstice:
I know last time I tried I found that "Select all, delete" were my most common commands [Big Grin]



--------------------
"...they will see us waiting on such great heights, come down now, they'll say..."

Posts: 29 | From: Bangkok, Thailand | Registered: Mar 2007  |  IP: Logged


All times are Eastern Time

© 2015 Geek Culture
