Speech

Essay by 24 • March 5, 2011 • 2,711 Words (11 Pages) • 1,157 Views

A Speech Recognition Project

Abstract

Voice recognition is a fascinating field spanning several areas of computer science and mathematics. Reliable speech recognition is a hard problem that requires a combination of many techniques; however, modern methods have achieved an impressive degree of accuracy. This project examines those techniques and applies them to build a simple voice recognition system. The project was started with three goals in mind: first, to distinguish 'yes' from 'no'; second, to recognize a vocabulary of 20 words, spoken individually; and third, to recognize combinations of two or more words from this vocabulary spoken in close succession. The project is implemented in Matlab and achieved the first goal: it differentiated between a spoken 'yes' and a spoken 'no' with 100% accuracy on 24 samples taken from 8 different people. The method is a simple one, a count of the frequency of zero crossings, but it is quite applicable to the voice recognition problem in general.

The Basic Steps

The process of voice recognition is typically divided into several well-defined steps. Systems differ in the nature of these steps and in how each step is implemented, but the most successful systems follow a similar methodology.

1. Divide the sound wave into evenly spaced blocks.

2. Process each block for important characteristics, such as strength across various frequency ranges, number of zero crossings, and total energy.

3. Using this characteristic vector, attempt to associate each block with a phone, which is the most basic unit of speech, producing a string of phones.

4. Find the word whose model is the most likely match to the string of phones that was produced.
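Steps 1 and 2 can be sketched in a few lines. This is only an illustration in Python, not the project's Matlab code: the function names, the block size, and the two frequency bands are assumptions chosen for the example, and a naive DFT stands in for a proper FFT so that the sketch stays dependency-free.

```python
import cmath
import math


def split_into_blocks(samples, block_size):
    """Step 1: cut the signal into evenly spaced, non-overlapping blocks.

    A trailing partial block is dropped.
    """
    n_blocks = len(samples) // block_size
    return [samples[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]


def total_energy(block):
    """Sum of squared sample values in the block."""
    return sum(x * x for x in block)


def band_strength(block, sample_rate, low_hz, high_hz):
    """Sum of squared DFT magnitudes for bins with frequency in [low_hz, high_hz).

    Uses a naive O(n^2) DFT for clarity; an FFT gives the same values faster.
    """
    n = len(block)
    strength = 0.0
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if low_hz <= freq < high_hz:
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(block)
            )
            strength += abs(coeff) ** 2
    return strength


def block_features(block, sample_rate):
    """Step 2: a small characteristic vector for one block.

    The band edges (0-1000 Hz, 1000-4000 Hz) are hypothetical choices.
    """
    return [
        total_energy(block),
        band_strength(block, sample_rate, 0, 1000),
        band_strength(block, sample_rate, 1000, 4000),
    ]
```

For a signal sampled at 8000 Hz, `split_into_blocks(samples, 80)` would give 10 ms blocks, and `block_features(block, 8000)` would return one vector per block. A zero crossing count can be appended to the vector in the same way.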

Step 2 typically involves performing a spectrum analysis of the block. This can be done with a Fast Fourier Transform (FFT) or with a bank of frequency filters, but the most successful technique to date has been Linear Predictive Coding (LPC). Additional important features include the total energy, the change in the features over time, and the number of zero crossings. Step 3 is often done via a decision tree. Each phone often has very prominent characteristics that narrow the field of consideration. Additional characteristics then separate similar-sounding phones. The final decisions are often mistaken, and these mistakes must be accounted for later. Step 4 has been implemented with a high degree of success using Hidden Markov Models (HMMs). An HMM is constructed for each word in the vocabulary, and the string of phones is then compared against each HMM to determine which model is the most likely match.
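The matching in step 4 can be illustrated with the standard forward algorithm, which computes the probability that a discrete HMM generates a given string of phones. The sketch below is in Python and is not taken from the project: the two toy word models, their probabilities, and the `floor` value for unlisted phones are all invented for illustration. The emission tables show one way that phone mistakes from step 3 (for example IH recognized in place of EH) can be tolerated.

```python
def forward_likelihood(phones, model, floor=1e-6):
    """Probability that `model` generates the phone string (forward algorithm).

    `model` holds "start" (initial state probabilities), "trans" (state
    transition matrix) and "emit" (per-state dict of phone probabilities).
    `floor` is a hypothetical small probability for phones a state omits.
    """
    if not phones:
        return 0.0
    start, trans, emit = model["start"], model["trans"], model["emit"]
    n = len(start)
    alpha = [start[s] * emit[s].get(phones[0], floor) for s in range(n)]
    for phone in phones[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s].get(phone, floor)
            for s in range(n)
        ]
    return sum(alpha)


def best_word(phones, models):
    """Return the vocabulary word whose HMM gives the highest likelihood."""
    return max(models, key=lambda word: forward_likelihood(phones, models[word]))


# Toy left-to-right models; every number here is made up for the example.
models = {
    "yes": {
        "start": [1.0, 0.0, 0.0],
        "trans": [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]],
        "emit": [
            {"Y": 0.8, "IY": 0.2},
            {"EH": 0.7, "IH": 0.2, "AE": 0.1},
            {"S": 0.8, "Z": 0.1, "SH": 0.1},
        ],
    },
    "no": {
        "start": [1.0, 0.0],
        "trans": [[0.5, 0.5], [0.0, 1.0]],
        "emit": [{"N": 0.8, "M": 0.2}, {"OW": 0.8, "AO": 0.2}],
    },
}
```

With these toy models, `best_word(["Y", "IH", "S", "S"], models)` selects 'yes' even though the vowel was misrecognized, because the 'yes' model still assigns that string a far higher probability than the 'no' model does. A practical system would work with log probabilities to avoid underflow on long phone strings.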

This project implements steps 1 and 2. In step 2 the program extracts the zero crossing count of each block. The maximum count over all blocks is then taken, which is sufficient to detect the presence or absence of an unvoiced consonant. Unvoiced fricatives such as 's' are noise-like, with energy concentrated at high frequencies, so blocks containing them cross zero far more often than voiced blocks. Because 'yes' contains the unvoiced consonant 's' and 'no' contains no unvoiced consonant, this measure distinguishes 'yes' from 'no' with a high degree of accuracy. See zerocross.m for the algorithm used to extract the zero crossing count in a given block.
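The decision rule described above can be sketched as follows. This is a Python illustration, not the contents of zerocross.m, which is not reproduced here: the function names, the treatment of exact zeros as non-negative, and the `threshold` parameter are assumptions. The source does not state what threshold value was used, so the caller must supply one.

```python
def zero_crossings(block):
    """Count sign changes between consecutive samples.

    Samples equal to zero are treated as non-negative (an assumption).
    """
    count = 0
    for prev, cur in zip(block, block[1:]):
        if (prev >= 0) != (cur >= 0):
            count += 1
    return count


def max_zero_crossings(samples, block_size):
    """Largest zero crossing count over all full blocks of the signal."""
    n_blocks = len(samples) // block_size
    return max(
        (
            zero_crossings(samples[i * block_size:(i + 1) * block_size])
            for i in range(n_blocks)
        ),
        default=0,
    )


def classify_yes_no(samples, block_size, threshold):
    """Return 'yes' if some block crosses zero more than `threshold` times.

    A high count suggests an unvoiced consonant such as 's'; otherwise 'no'.
    """
    if max_zero_crossings(samples, block_size) > threshold:
        return "yes"
    return "no"
```

Taking the maximum rather than the mean means a single noisy block is enough to flag an unvoiced consonant, which matches the description above but also makes the rule sensitive to background noise, so it would normally be applied only to the portion of the recording that contains speech.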

A List of Phones

Phone   Example

Vowels

IY      beat
IH      bit
EY      bait
EH      bet
AE      bat
AA      Bob
AH      but
AO      bought
OW      boat
UH      book
AX      about
IX      roses
ER      bird
AXR     butter
AW      down
AY      buy
OY      boy

Consonants

Y       you
W       wit
R       rent
L       let
M       met
N       net
NX      sing
P       pet
T       ten
K       kit
B       bet
D       debt
G       get
HH      hat
F       fat
TH      thing
S       sat
SH      shut
V       vat
DH      that
Z       zoo
ZH      azure
CH      church
JH      judge
WH      which
EL      battle
EM      bottom
EN      button
DX      batter

The Experiment

Numerous samples were taken of various people saying either 'yes' or 'no'. This method is somewhat artificial in that a real system first has to detect whether speech exists at all (the separate task of speech detection). Therefore I implemented a criterion for the detection of speech (explained later). This turned out to be useful for removing the empty header and trailer on each voice sample. The method is also somewhat artificial because in fluent speech words tend to run together and the word boundaries are not obvious. I intended to address this problem

...
