LATEST VERSION 2.77 (08.03.2016)

Description

URecog is a URBI module for speech recognition built on the Microsoft Speech Platform. The platform consists of a Software Development Kit (SDK), a Runtime, and Runtime Languages (language packs that enable speech recognition or text-to-speech for a specific language).

Grammars are at the core of speech recognition and are perhaps the most important component under control of the speech application developer that affects the accuracy of speech recognition. Grammars work in conjunction with the speech recognition engine and its lexicons and speech models to define the factors that affect speech recognition performance.

The Microsoft Speech Platform SDK provides programmatic processes for authoring speech recognition grammars and also offers support for XML-format grammars authored in compliance with industry standards.

More about Microsoft speech recognition:

Microsoft Speech Platform LINK

Microsoft Speech Platform SDK LINK

Microsoft Speech Platform Runtime LINK

Microsoft Speech Platform Languages LINK

More about Microsoft Speech Platform Grammars LINK

Differences between MSP and SAPI LINK

 

Software requirements

To run the module:
  • Microsoft Speech Platform Runtime.
To compile the module, an additional library is required:
  • Microsoft Speech Platform SDK (tested with 11.0).
SAPI ships with every Windows installation; the Microsoft Speech Platform (MSP) must be installed separately.
 

Module functions

URecog.new("engine", recognizerNo, inputNo); - initialize the speech recognition (SR) engine,
engine - "InProc" for an in-process engine or "InShared" for a shared engine,
recognizerNo - number of the recognizer (language) to use,
inputNo - number of the audio input (mic, Kinect,...) to use,
URecog.AvailableRecognizers; - get all available recognizers,
URecog.AvailableInputs; - get all available inputs,
URecog.Poll(bool); - poll the speech recognition stream: pass true to wait for a result, or false to only check for events,
URecog.LoadGrammar("fileName.grxml"); - load a new grammar XML file,
URecog.AddPhraseGrammar(rule, word); - dynamically add a phrase (a word or a sentence) to the grammar,
URecog.ResetGrammar(); - clear all grammar rules,
URecog.SetDictationState(SPRULESTATE); - set the dictation topic state.
The SPRULESTATE flag indicates the new dictation state:
0 - SPRS_INACTIVE
1 - SPRS_ACTIVE
3 - SPRS_ACTIVE_WITH_AUTO_PAUSE
4 - SPRS_ACTIVE_USER_DELIMITED
URecog.pause; - pause flag of the recognition engine: true - paused, false - resumed,
URecog.result - recognition result,
URecog.resultTag - recognition result tag,
URecog.confidence - recognition result confidence value,
URecog.confidenceTreshold - recognition confidence threshold value,
URecog.isListening - recognition engine status flag.
 

Urbiscript examples

Example 1

loadModule("URecog");
var recog = URecog.new("InProc", 0, 0);
recog.AddPhraseGrammar("","my name is John");
recog.AddPhraseGrammar("","please start game");
recog.AddPhraseGrammar("","yes");
recog.AddPhraseGrammar("","no");
recog.AddPhraseGrammar("","stop game");
recog.AddPhraseGrammar("","turn left");
recog.AddPhraseGrammar("","turn right");
recog.AddPhraseGrammar("","check my email");
t:loop {
    recog.Poll(true); 
    echo(recog.result);
},
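
Example 1 echoes every result, whatever its confidence. The sketch below shows one way to ignore uncertain results by checking recog.confidence after each poll. It assumes that recog.confidence is a number that grows with certainty and that 0.5 is a reasonable cut-off; neither is confirmed by this page. The name minConfidence is made up for this sketch, so tune it against the values your recognizer actually reports.

```
loadModule("URecog");
var recog = URecog.new("InProc", 0, 0);
recog.AddPhraseGrammar("", "turn left");
recog.AddPhraseGrammar("", "turn right");
// Hypothetical cut-off; the scale of recog.confidence is an assumption.
var minConfidence = 0.5;
t:loop {
    // Wait until the engine reports a result.
    recog.Poll(true);
    if (recog.confidence >= minConfidence)
    {
        echo(recog.result);
    }
    else
    {
        echo("low confidence, ignored");
    };
},
```

The module also lists URecog.confidenceTreshold. Whether setting it makes the engine discard low-confidence results on its own is not documented here, which is why this sketch compares the values by hand.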

Example 2

loadModule("URecog");
var recog=URecog.new("InProc", 0, 0);
recog.LoadGrammar("speech.grxml");
t:loop { 
    recog.Poll(true); 
    echo(recog.result);
},
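
Example 2 loads speech.grxml without showing what the file contains. Below is a minimal sketch of such a file in the W3C SRGS XML grammar format, which the Microsoft Speech Platform accepts. The rule name command, the phrases (borrowed from Example 1) and the tag values are illustrative and are not the grammar shipped with the module. The tag elements follow the W3C semantic interpretation syntax; whether and how URecog.resultTag exposes them is an assumption to verify.

```
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative grammar: one public rule listing three alternative phrases. -->
<grammar version="1.0" xml:lang="en-US" root="command"
         tag-format="semantics/1.0"
         xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="command" scope="public">
    <one-of>
      <!-- Each item is a phrase to recognize; the tag sets its semantic value. -->
      <item>turn left <tag>out = "LEFT";</tag></item>
      <item>turn right <tag>out = "RIGHT";</tag></item>
      <item>stop game <tag>out = "STOP";</tag></item>
    </one-of>
  </rule>
</grammar>
```

The xml:lang value should match the language of the recognizer selected in URecog.new; en-US is only an example.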

Download

URBI module LINK

Microsoft Speech Platform Runtime v11.0 LINK


EMYS and FLASH are Open Source and distributed according to the GPL v2.0 © Rev. 0.8.0, 27.04.2016

FLASH Documentation