1180 datasheet
TRANSCRIPT
5/13/2018 1180 Datasheet - slidepdf.com
http://slidepdf.com/reader/full/1180-datasheet 1/9
Email: [email protected] or sunrom@gmail
Visit us at http://www.sunrom
Document: Datasheet Date: 2-Oct-09 Model #: 1180 Product’s Page: www.sunrom.com/p-762.html
Speech Recognition System
The speech recognition system is a completely assembledand easy to use programmable speech recognition circuit.Programmable, in the sense that you train the words (orvocal utterances) you want the circuit to recognize. Thisboard allows you to experiment with many facets of speechrecognition technology. It has 8 bit data out which can beinterfaced with any microcontroller for further development.Some of interfacing applications which can be made are
controlling home appliances, robotics movements, SpeechAssisted technologies, Speech to text translation, and manymore.
Features
• Self-contained stand alone speech recognition circuit
• User programmable
• Up to 20 word vocabulary of duration two second each• Multi-lingual
• Non-volatile memory back up with 3V battery onboard.Will keep the speech recognition data in memory even
after power off.• Easily interfaced to control external circuits &
appliances
Specification
Parameter Value Note
Input Voltage 9 to 15 V DC Use a commonly available 12V 500ma DC Adapter
Output Data 8 bits at 5VLogic Level
Any microcontroller like 8051, PIC or AVR can be interfaced to dataport to interpret and implement specialized applications
Applications
There are several areas for application of voice recognition technology.
• Speech controlled appliances and toys
• Speech assisted computer games
• Speech assisted virtual reality• Telephone assistance systems
• Voice recognition security
• Speech to speech translation
5/13/2018 1180 Datasheet - slidepdf.com
http://slidepdf.com/reader/full/1180-datasheet 2/9
Sunrom Technologies Your Source for Embedded Systems Visit us at www.sunrom.com
2
Introduction
Speech recognition will become the method of choice for controlling appliances, toys, tools acomputers. At its most basic level, speech controlled appliances and tools allow the user to perforparallel tasks (i.e. hands and eyes are busy elsewhere) while working with the tool or appliance.
The heart of the circuit is the HM2007 speech recognition IC. The IC can recognize 20 words, eachword a length of 1.92 seconds.
Complete Schematic of System
Using the System
The keypad and digital display are used to communicate with and program the HM2007 chip. Thekeypad is made up of 12 normally open momentary contact switches. When the circuit is turned on“00” is on the digital display, the red LED (READY) is lit and the circuit waits for a command.
Training Words for RecognitionPress “1” (display will show “01” and the LED will turn off) on the keypad, then press the TRAIN ke( the LED will turn on) to place circuit in training mode, for word one. Say the target word into theonboard microphone (near LED) clearly. The circuit signals acceptance of the voice input by
TitleCode RevDate: Sheet of
1180 1Speech Recognition
Sunrom Technologies http://www.sunrom.c
1 1Wednesday, February 11, 2009
SW1 SW2
DB2
SW3
SW5
DATA OUT
SW6SW4
SW7
SW11 SW12
U1HM2007
GND1
X22
X13
S14
S25
S36
RDY7
K1
8
K29
K310
K411
TEST12
WLEN13
CPUM14
WAIT15
DEN16
SA017
SA118
SA219
SA320
SA421
SA522
SA623
SA724
VDD25GND26SA827SA928SA1029SA1130SA1231NC32NC33ME34MR/MW35D036D137D238D339D440D5
41D642D743VREF44LINE45MICIN46VDD47AGND48
SW8SW9
SW10
DB1
Y13.579 Mhz
DB0
U4HY6264
A010
A19
A28
A37
A46
A55
A64
A7
3
A825
A924
A1021
A1123
A122
D011
D112
D213
D315
D416
D517
D618
D7
19
V C C
2 8
G N D
1 4
OE22
WE27
CS120
CS226
BT13V BATT
D1LED
VCC
R4470R
READY
K4
K3
K1
K2
R522K
VCCC3
100nF+M1MIC
21 3
R16.8K
VCC
D7
5 6
9
4
TRAIN
7 8
CLEAR 0
C2100nF
D6K1
DB2
DB5
DB1
DB4
D4
D5
ME
DB6
DB0
DB3
DB7
D0
D3D2D1
K2K3K4
S1S2S3
S1S3 S2
D0
C1100nF
D1D2D3
D7
D5D4
D6
VCC
VCC
C5100nF
C6100nF
VCC VCC
C10
100n
VCC
D5CN2DC SOCKET
9-12V Input
+C11
100uF 16V
Powe r Supply
D6
C8
100n
+
C91000uF 25V
D7
C4100nF
D4
VCC
VCC
U274HC573
D02 D13 D24 D35 D46 D57 D68 D79
LE11
OE1
Q019Q118Q217Q316Q415Q514Q613Q712
G N D
1 0
V C C
2 0
VCC
VCC
D0
DEN
D3D2D1
D5D4
D7D6
U3CD4511B
B1
C2
D6
A7
G N
D
8
V D D
1 6
e9d10c11b12a13
g14f15
LT3
BI4
LE5
VCC
1
2
4
5
7
9
10
a b
c
de
f g
SL
1 2 3 4 5
6 7 8 9 1 0
R2220R
R3220R
R6220R
R7220R
R8220R
R9220R
R10220R
U5CD4511B
B1
C2
D6
A7
G N D
8
V D D
1 6
e9d10c11b12a13
g14f15
LT3
BI4
LE5
VCC
1
2
4
5
7
9
10
a b
c
de
f g
SL
1 2 3 4 5
6 7 8 9 1 0
R11220R
R12220R
R13220R
R14220R
R15220R
R16220R
R17220R
DEN
CN1SIP10
123456789
10
SA0SA1SA2SA3SA4
SA7
SA5SA6
SA9
SA11
SA8
SA10
SA12
SA0SA1SA2SA3
SA5
SA7SA6
SA4
SA9
SA11SA10
SA8
SA12
R18100K
C7100nF
IN OUT
GND
U6LM7805
1 3
2
VCC
MEMR
MR
D3
BAT85
VCC
D2
BAT85
DB7DB6DB5
3V
3V
DB4DB3
5/13/2018 1180 Datasheet - slidepdf.com
http://slidepdf.com/reader/full/1180-datasheet 3/9
Sunrom Technologies Your Source for Embedded Systems Visit us at www.sunrom.com
3
blinking the LED off then on. The word (or utterance) is now identified as the “01” word. If the LEDdid not flash, startover by pressing “1” and then “TRAIN” key.
You may continue training new words in the circuit. Press “2” then TRN to train the second wordand so on. The circuit will accept and recognize up to 20 words (numbers 1 through 20). It is notnecessary to train all word spaces. If you only require 10 target words that’s all you need to train.
Testing Recognition:Repeat a trained word into the microphone. The number of the word should be displayed on thedigital display. For instance, if the word “directory” was trained as word number 20, saying theword “directory” into the microphone will cause the number 20 to be displayed.
Error Codes:The chip provides the following error codes.
55 = word to long66 = word to short77 = no match
Clearing MemoryTo erase all words in memory press “99” and then “CLR”. The numbers will quickly scroll by on thedigital display as the memory is erased.
Changing & Erasing WordsTrained words can easily be changed by overwriting the original word. For instances suppose wordsix was the word “Capital” and you want to change it to the word “State”. Simply retrain the wordspace by pressing “6” then the TRAIN key and saying the word “State” into the microphone.If one wishes to erase the word without replacing it with another word press the word number (inthis case six) then press the CLR key. Word six is now erased.
Simulated Independent RecognitionThe speech recognition system is speaker dependant, meaning that the voice that trained thesystem has the highest recognition accuracy. But you can simulate independent speech recognitioTo make the recognition system simulate speaker independence one uses more than one wordspace for each target word. Now we use four word spaces per target word. Therefore we obtain fodifferent enunciation’s of each target word. (speaker independent). The word spaces 01, 02, 03 an04 are allocated to the first target word. We continue do this for the remaining word space. Forinstance, the second target word will use the word spaces 05, 06, 07 and 08. We continue in thismanner until all the words are programmed.
If you are experimenting with speaker independence use different people when training a targetword. This will enable the system to recognize different voices, inflections and enunciation's of thetarget word. The more system resources that are allocated for independent recognition the morerobust the circuit will become.
If you are experimenting with designing the most robust and accurate system possible, train targetwords using one voice with different inflections and enunciation's of the target word.
5/13/2018 1180 Datasheet - slidepdf.com
http://slidepdf.com/reader/full/1180-datasheet 4/9
Sunrom Technologies Your Source for Embedded Systems Visit us at www.sunrom.com
4
HomonymsHomonyms are words that sound alike. For instance the words cat, bat, sat and fat sound alike.Because of their like sounding nature they can confuse the speech recognition circuit. Whenchoosing target words for your system do not use homonyms.
The Voice with Stress & ExcitementStress and excitement alters ones voice. This affects the accuracy of the circuit’s recognition. For
instance assume you are sitting at your workbench and you program the target words like fire, left,right, forward, etc., into the circuit. Then you use the circuit to control a flight simulator game, Doomor Duke Nukem. Well, when you’re playing the game you’ll likely be yelling “FIRE! …Fire! ...FIRE!!...LEFT …go RIGHT!”. In the heat of the action you’re voice will sound much different than whenyou were sitting down relaxed and programming the circuit. To achieve a higher accuracy wordrecognition one needs to mimic the excitement in ones voice when programming the circuit.
These factors should be kept in mind to achieve the high accuracy possible from the circuit. Thisbecomes increasingly important when the speech recognition circuit is taken out of the lab and putto work in the outside world.
Error CodesWhen interfacing the external circuit through its data bus, The decoding circuit must recognize theword numbers from error codes. So the circuit must be designed to recognize error codes 55, 66and 77 and not confuse them with word spaces 5, 6 and 7.
Voice Security SystemThis circuit isn’t designed for a voice security system in a commercial application, but that shouldnot prevent anyone from experimenting with it for that purpose. A common approach is to use threor four keywords that must be spoken and recognized in sequence in order to open a lock or allowentry.
Aural InterfacesIt’s been found that mixing visual and aural information is not effective. Products that require visuaconfirmation of an aural command grossly reduces efficiency. To create an effective AUI productsneed to understand (recognize) commands given in an unstructured and efficient methods. The wain which people typically communicate verbally.
Learning to ListenThe ability to listen to one person speak among several at a party is beyond the capabilities oftoday’s speech recognition systems. Speech recognition systems can not (as of yet) separate andfilter out what should be considered extraneous noise.
Speech recognition is not understanding speech. Understanding the meaning of words is a higherintellectual function. Because a circuit can respond to a vocal command doesn’t mean itunderstands the command spoken. In the future, voice recognition systems may have the ability todistinguish nuances of speech and meanings of words, to “Do what I mean, not what I say!”
Speaker Dependent / Speaker IndependentSpeech recognition is divided into two broad processing categories; speaker dependent andspeaker independent.
5/13/2018 1180 Datasheet - slidepdf.com
http://slidepdf.com/reader/full/1180-datasheet 5/9
Sunrom Technologies Your Source for Embedded Systems Visit us at www.sunrom.com
5
Speaker dependent systems are trained by the individual who will be using the system. Thesesystems are capable of achieving a high command count and better than 95% accuracy for wordrecognition. The drawback to this approach is that the system only responds accurately only to theindividual who trained the system. This is the most common approach employed in software forpersonal computers.
Speaker independent is a system trained to respond to a word regardless of who speaks. Therefo
the system must respond to a large variety of speech patterns, inflections and enunciation's of thetarget word. The command word count is usually lower than the speaker dependent however highaccuracy can still be maintain within processing limits. Industrial applications more often requirespeaker independent voice recognition systems.
Recognition StyleIn addition to the speaker dependent/independent classification, speech recognition also contendswith the style of speech it can recognize. They are three styles of speech: isolated, connected andcontinuous.
Isolated: Words are spoken separately or isolated. This is the most common speech recognition
system available today. The user must pause between each word or command spoken.
Connected: This is a half way point between isolated word and continuous speech recognition. Itpermits users to speak multiple words. The HM2007 can be set up to identify words or phrases 1.9seconds in length. This reduces the word recognition dictionary number to 20.
Continuous: This is the natural conversational speech we use to in everyday life. It is extremelydifficult for a recognizer to sift through the sound as the words tend to merge together. For instanc"Hi, how are you doing?" to a computer sounds like "Hi,.howyadoin" Continuous speech recognitiosystems are on the market and are under continual development.
More On The HM2007 ChipThe HM2007 is a CMOS voice recognition LSI (Large Scale Integration) circuit. The chip containsan analog front end, voice analysis, regulation, and system control functions. The chip may be usein a stand alone or CPU connected.
Features:
• Single chip voice recognition CMOS LSI
• Speaker dependent
• External RAM support• Maximum 40 word recognition (.96 second)
• Maximum word length 1.92 seconds (20 word)
• Microphone support• Manual and CPU modes available
• Response time less than 300 milliseconds
• 5V power supply
More information on the HM2007 chip is available in the HM2007 data booklet (DS-HM2007) whiccan be downloaded below.http://www.sunrom.com/files/HM2007.pdf
5/13/2018 1180 Datasheet - slidepdf.com
http://slidepdf.com/reader/full/1180-datasheet 6/9
Sunrom Technologies Your Source for Embedded Systems Visit us at www.sunrom.com
6
Interfacing external circuits through data bus
This sample project will show how a circuit can be interfaced through the data bus of speechrecognition circuit. It will show messages and error codes on LCD. It will also operate four relays aper data from speech circuit.
Schematic of interfacing project
TitleCode RevDate: Sheet of
1180A 1Demo Project of Speech Recognition
Sunrom Technologies http://www.sunrom.com
1 1Thursday, F ebruary 26, 2009
R21K
D2
LED
VDD
R41K
VDD
D9
LED
R71K
VDD
D10
LED
R81K
D11
LED
VDD
RLY4
CN 1PBT2
LS1RELAY
35
412RLY1
VDD
RLY3
LS3RELAY
35
412
VDD
LS2RELAY
35
412
CN 3PBT2
RLY2
VDD
LS4RELAY
35
412
VDD
CN 4PBT2
CN 7PBT2
RLY4
U2ULN2803
COM10
G N D
9
IN11
IN22
IN33IN4
4
IN55
IN66
IN77
IN88
OUT118
OUT217
OUT3 16OUT4
15
OUT514
OUT613
OUT712
OUT811
RLY3
RLY1
VDD
RLY2
D1LED
R1470R
VCC
R310K
U3AT89S52
P3.1/TXD11
P3.2/INT012
P3.3/INT113
P3.4/T014
P3.5/T115
P3.6/WR 16P3.7/RD
17
X T A L 2
1 8
X T A L 1
1 9
G N D
2 0
P2.0/A821
P2.1/A922
P2.2/A1023
P2.3/A1124
P2.4/A1225
P2.5/A1326
P2.6/A1427
P2.7/A1528
PSEN29
ALE/PROG30
EA/VPP31
P0.7/AD732
P0.6/AD633 P0.5/AD534 P0.4/AD435
P0.3/AD336 P0.2/AD237 P0.1/AD138
P0.0/AD039
V C C
4 0
P1.0/T21
P1.1/T2EX2
P1.23
P1.34
P1.4/SS5
P1.5/MOSI6
P1.6/MISO7P1.7/SCK
8
RST9
P3.0/RXD10
Y1
11.0592C1333p
C1433p
VCC
VCC
+
C610uF
RN110K R-ARRAY
9 8 7 6 5 4 3 2
1
VCC
PR150K PRESET
LCD
U1LCD 16x2
D 0
7
D 1
8
D 2
9
D 3
1 0
D 4
1 1
D 5
1 2
D 6
1 3
D 7
1 4
V l e d
1 5
E n a
b l e
6
R / W
5
R S
4
V L
3
V d d
2
V s s
1
G l e d
1 6
VCC
VCC
Display Contrast
C1100n
C2100n
DB3
DB5DB6
DB2
DB4
DB7
DB1
SPEECH BOARD
DB0
CN 6SIP10
12345678910
C10
100n
VCC
D5CN2DC SOCKET
9-12V Input
+C11
100uF 16V
Powe r Supply
D6
C8
100n
+
C91000uF 25V
D7
D4
VDD
DATA FROM
IN OUT
GND
U6
LM78051 3
2
5/13/2018 1180 Datasheet - slidepdf.com
http://slidepdf.com/reader/full/1180-datasheet 7/9
Sunrom Technologies Your Source for Embedded Systems Visit us at www.sunrom.com
7
Sample Code of interfacing project
//main.c
#include <REGX51.H> // standard 8051 defines
// -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
// -=-=-=-=- Include files -=-=-=-=-=-=-=
// -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
#include "lcd.h"
#include "utils.h"
// -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= // -=-=-=-=- Hardware Defines -=-=-=-=-=-=-=
// -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
sfr DATA = P0;
sbit OUT1 = P3^4;
sbit OUT2 = P3^5;
sbit OUT3 = P3^6;
sbit OUT4 = P3^7;
// -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
// -=-=-=-=- Variables -=-=-=-=-=-=-=
// -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
char buf[20];
char code M1[] ="SPEECH: ONE";
char code M2[] ="SPEECH: TWO";
char code M3[] ="SPEECH: THREE";
char code M4[] ="SPEECH: FOUR";
// -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
// -=-=-=-=- Main Program -=-=-=-=-=-=-=
// -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
void main()
{
unsigned char lastdata, datanow;
// -=-=- Intialize variables -=-=-=
OUT1 = 0;
OUT2 = 0;
OUT3 = 0;
OUT4 = 0;// -=-=- Intialise -=-=-=
lcdInit();
// -=-=- Welcome LCD Message -=-=-=
lcdClear();
lcdGotoXY(0,0); // 1st Line of LCD
lcdPrint("Speech Test");
lcdGotoXY(0,1); // 2nd Line of LCD
lcdPrint("System");
delayms(5000); // 5 sec
lcdClear();
lcdGotoXY(0,0); // 1st Line of LCD
lcdPrint("Train: 1-4 key >");
lcdGotoXY(0,1); // 2nd Line of LCDlcdPrint("Train>Speak Now");
// -=-=- Program Loop -=-=-=
lastdata=0xff;
while(1)
{
datanow=DATA; // read data from speech board
if(lastdata!=datanow) // if there is new data then,
{
5/13/2018 1180 Datasheet - slidepdf.com
http://slidepdf.com/reader/full/1180-datasheet 8/9
Sunrom Technologies Your Source for Embedded Systems Visit us at www.sunrom.com
8
lastdata=datanow;
switch(lastdata)
{
case 0x55:
lcdClear();
lcdGotoXY(0,0); // 1st Line of LCD
lcdPrint("Speech too Long");
lcdGotoXY(0,1); // 2nd Line of LCD
lcdPrint("Try Again!");
break;case 0x66:
lcdClear();
lcdGotoXY(0,0); // 1st Line of LCD
lcdPrint("Speech too Short");
lcdGotoXY(0,1); // 2nd Line of LCD
lcdPrint("Try Again!");
break;
case 0x77:
lcdClear();
lcdGotoXY(0,0); // 1st Line of LCD
lcdPrint("No Match");
lcdGotoXY(0,1); // 2nd Line of LCD
lcdPrint("Try Again!");
break;
case 0x01:
if(OUT1==1)
OUT1 = 0;
else
OUT1 = 1;
lcdClear();
lcdGotoXY(0,0); // 1st Line of LCD
lcdPrint(M1);
break;
case 0x02:
if(OUT2==1)
OUT2 = 0;
elseOUT2 = 1;
lcdClear();
lcdGotoXY(0,0); // 1st Line of LCD
lcdPrint(M2);
break;
case 0x03:
if(OUT3==1)
OUT3 = 0;
else
OUT3 = 1;
lcdClear();
lcdGotoXY(0,0); // 1st Line of LCD
lcdPrint(M3); break;
case 0x04:
if(OUT4==1)
OUT4 = 0;
else
OUT4 = 1;
lcdClear();
lcdGotoXY(0,0); // 1st Line of LCD
lcdPrint(M4);
5/13/2018 1180 Datasheet - slidepdf.com
http://slidepdf.com/reader/full/1180-datasheet 9/9
Sunrom Technologies Your Source for Embedded Systems Visit us at www.sunrom.com
9
break;
}
}
}
}