CA2216224A1 - Block algorithm for pattern recognition - Google Patents

Block algorithm for pattern recognition

Info

Publication number
CA2216224A1
CA2216224A1 CA002216224A CA2216224A CA2216224A1 CA 2216224 A1 CA2216224 A1 CA 2216224A1 CA 002216224 A CA002216224 A CA 002216224A CA 2216224 A CA2216224 A CA 2216224A CA 2216224 A1 CA2216224 A1 CA 2216224A1
Authority
CA
Canada
Prior art keywords
models
observations
blocks
comparing
subset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002216224A
Other languages
French (fr)
Inventor
Peter R. Stubley
Andre Gillet
Vishwa N. Gupta
Christopher K. Toulson
David B. Peters
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nortel Networks Ltd
Original Assignee
Northern Telecom Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northern Telecom Ltd
Priority to CA002216224A (CA2216224A1)
Priority to US09/119,621 (US6092045A)
Priority to EP98307555A (EP0903728A3)
Publication of CA2216224A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/285 Memory allocation or algorithm optimisation to reduce hardware requirements

Abstract

A series of observations representing unknown speech is compared to stored models representing known speech, in an order which makes better use of memory. The series of observations is divided into at least two blocks, each comprising two or more of the observations.
First, the observations in one of the blocks are compared (31) to a subset comprising one or more of the models, to determine a likelihood of a match to each of the one or more models. This step is repeated (33) for models other than those in the subset, and the whole process is repeated (34) for each block.
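The loop order described in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the function names, the scalar observations, the one-number "models" and the toy scoring function are all assumptions chosen to keep the example short. It only shows that visiting the data block by block, and subset by subset within a block, yields the same per-model totals as the conventional order in which every model is touched once per observation.

```python
from typing import Callable, Dict, List, Sequence


def chunks(items: Sequence, size: int) -> List[Sequence]:
    """Split a sequence into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def block_compare(
    observations: Sequence[float],
    models: Dict[str, float],
    block_size: int,
    subset_size: int,
    score_fn: Callable[[float, float], float],
) -> Dict[str, float]:
    """Accumulate a per-model score, visiting the data block by block."""
    totals = {name: 0.0 for name in models}
    model_names = list(models)
    for block in chunks(observations, block_size):        # repeat for each block
        for subset in chunks(model_names, subset_size):   # repeat for other models
            for name in subset:                           # one subset against one block
                for obs in block:
                    totals[name] += score_fn(obs, models[name])
    return totals


def frame_synchronous_compare(
    observations: Sequence[float],
    models: Dict[str, float],
    score_fn: Callable[[float, float], float],
) -> Dict[str, float]:
    """Baseline order: every model is touched once per observation."""
    totals = {name: 0.0 for name in models}
    for obs in observations:
        for name in models:
            totals[name] += score_fn(obs, models[name])
    return totals


def neg_sq_dist(obs: float, mean: float) -> float:
    """Toy stand-in for a log-likelihood: negative squared distance (assumption)."""
    return -(obs - mean) ** 2


# Hypothetical data: six scalar observations and three one-parameter models.
observations = [0.9, 1.1, 1.0, 1.2, 0.8, 1.0]
models = {"a": 0.0, "b": 1.0, "c": 2.0}

blocked = block_compare(observations, models, block_size=2, subset_size=1,
                        score_fn=neg_sq_dist)
baseline = frame_synchronous_compare(observations, models, neg_sq_dist)
best = max(blocked, key=blocked.get)
```

In the blocked order each model subset is reused across all observations of a block before the next subset is loaded, which is the property the abstract associates with better use of memory.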

Claims (21)

1. A method of comparing a series of observations representing unknown speech, to stored models representing known speech, the series of observations being divided into at least two blocks each comprising two or more of the observations, the method comprising the steps of:
a) comparing two or more of the observations in one of the blocks of observations, to a subset comprising one or more of the models, to determine a likelihood of a match to each of the one or more models;
b) repeating step a) for models other than those in the subset; and
c) repeating steps a) and b) for a different one of the blocks.
2. The method of claim 1 wherein the observations are represented as multidimensional vectors, for the comparison at step a).
3. The method of claim 1 wherein the comparison at step a) uses a Viterbi algorithm.
4. The method of claim 1 wherein the models are represented as finite state machines with probability distribution functions attached.
5. The method of claim 1 wherein the models comprise groups of representations of phonemes.
6. The method of claim 1 wherein the models comprise representations of elements of speech, and step a) comprises the step of:
comparing the block of observations to a predetermined sequence of the models in the subset.
7. The method of claim 1 wherein step a) comprises the steps of:
comparing the block of observations to a predetermined sequence of the models in the subset;
determining for each of the models in the sequence, a score which represents the likelihood of a match with the observations compared so far;
storing the score in a score buffer for use in determining scores of subsequent models in the sequence; and
determining when the score is no longer needed, then re-using the score buffer to store a subsequent score.
8. The method of claim 1 wherein step a) comprises the step of:
comparing the block of observations to a lexical graph comprising a predetermined sequence of the models in the subset, wherein the sequence comprises different types of models, and the comparison is dependent on the type;
and the method comprises the step of:
determining the types of the models before the block is compared.
9. The method of claim 1, the models comprising finite state machines, having multiple state sequences, wherein step a) comprises the steps of:
determining state scores for the matches between each respective observation and state sequences of the respective model; and
making an approximation of the state scores, for the observation, for storing to use in matching subsequent observations, the approximation comprising fewer state scores than were determined for the respective observation.
10. A method of recognising patterns in a series of observations, by comparing the observations to stored models, using a processing means having a main memory for storing the models and a cache memory, the cache memory being too small to contain all the models and observations, the series of observations being divided into blocks of at least two observations, the method comprising the steps of:
a) using the processor to compare a subset of the models to the observations in one of the blocks of observations, to recognise the patterns, the subset of the models being small enough to fit in the cache memory;
b) repeating step a) for a different subset of the models; and
c) repeating steps a) and b) for a different one of the blocks.
11. A method of recognising patterns in a series of observations by comparing the observations to stored models, the series of observations being divided into at least two blocks each comprising two or more of the observations, the models comprising finite state machines, having multiple state sequences, the method comprising the steps of:
a) comparing two or more of the observations in one of the blocks of observations, to a subset comprising one or more of the models, to determine a likelihood of a match to each of the one or more models, by determining which of the state sequences of the respective model is the closest match, and how close is the match;
b) repeating step a) for models other than those in the subset; and
c) repeating steps a) and b) for a different one of the blocks.
12. The method of claim 11 wherein the observations are speech signals, and the models are representations of elements of speech.
13. The method of claim 11 wherein the comparison at step a) uses the Viterbi algorithm.
14. The method of claim 11 wherein the models are represented as finite state machines with probability distribution functions attached.
15. A method of comparing a series of observations representing unknown speech, to stored models representing known speech, by comparing the observations to stored models, the series of observations being grouped into one or more blocks each comprising two or more of the observations, the models comprising finite state machines, having multiple state sequences, the method comprising, for each of the one or more blocks, the steps of:
a) comparing two or more of the observations in the respective block, to a subset comprising one or more of the models, to determine a likelihood of a match to each of the one or more models, by determining which of the state sequences of the respective model is the closest match, and how close is the match; and
b) repeating step a) for models other than those in the subset.
16. Software stored on a computer readable medium for comparing a series of observations representing unknown speech, to stored models representing known speech, the series of observations being divided into at least two blocks each comprising two or more of the observations, the software being arranged for carrying out the steps of:
a) comparing two or more of the observations in one of the blocks of observations, to a subset comprising one or more of the models, to determine a likelihood of a match to each of the one or more models;
b) repeating step a) for models other than those in the subset; and
c) repeating steps a) and b) for a different one of the blocks.
17. Software stored on a computer readable medium for recognising patterns in a series of observations by comparing the observations to stored models, the series of observations being divided into at least two blocks each comprising two or more of the observations, the models comprising finite state machines, having multiple state sequences, the software being arranged to carry out the steps of:
a) comparing two or more of the observations in one of the blocks of observations, to a subset comprising one or more of the models, to determine a likelihood of a match to each of the one or more models, by determining which of the state sequences of the respective model is the closest match, and how close is the match;
b) repeating step a) for models other than those in the subset; and
c) repeating steps a) and b) for a different one of the blocks.
18. Software stored on a computer readable medium for comparing a series of observations representing unknown speech, to stored models representing known speech, by comparing the observations to stored models, the series of observations being grouped into one or more blocks each comprising two or more of the observations, the models comprising finite state machines, having multiple state sequences, the software being arranged to carry out for each of the one or more blocks, the steps of:
a) comparing two or more of the observations in the respective block, to a subset comprising one or more of the models, to determine a likelihood of a match to each of the one or more models, by determining which of the state sequences of the respective model is the closest match, and how close is the match; and
b) repeating step a) for models other than those in the subset.
19. A speech recognition processor for comparing a series of observations representing unknown speech, to stored models representing known speech, the series of observations being divided into at least two blocks each comprising two or more of the observations, the processor being arranged to carry out the steps of:
a) comparing two or more of the observations in one of the blocks of observations, to a subset comprising one or more of the models, to determine a likelihood of a match to each of the one or more models;
b) repeating step a) for models other than those in the subset; and
c) repeating steps a) and b) for a different one of the blocks.
20. A speech recognition processor for recognising patterns in a series of observations by comparing the observations to stored models, the series of observations being divided into at least two blocks each comprising two or more of the observations, the models comprising finite state machines, having multiple state sequences, the processor being arranged to carry out the steps of:
a) comparing two or more of the observations in one of the blocks of observations, to a subset comprising one or more of the models, to determine a likelihood of a match to each of the one or more models, by determining which of the state sequences of the respective model is the closest match, and how close is the match;
b) repeating step a) for models other than those in the subset; and
c) repeating steps a) and b) for a different one of the blocks.
21. A speech recognition processor for comparing a series of observations representing unknown speech, to stored models representing known speech, by comparing the observations to stored models, the series of observations being grouped into one or more blocks each comprising two or more of the observations, the models comprising finite state machines, having multiple state sequences, the processor being arranged to carry out, for each of the one or more blocks, the steps of:
a) comparing two or more of the observations in the respective block, to a subset comprising one or more of the models, to determine a likelihood of a match to each of the one or more models, by determining which of the state sequences of the respective model is the closest match, and how close is the match; and
b) repeating step a) for models other than those in the subset.
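Several of the claims above refer to models that are finite state machines, to the Viterbi algorithm, and to finding the closest-matching state sequence of a model for the observations in a block. The following is a minimal sketch of how one model might be advanced over one block while carrying its state scores from block to block. The left-to-right topology, the unit-variance Gaussian-style emission term, the transition values and all names are illustrative assumptions, not details taken from the claims.

```python
import math
from typing import List, Sequence

NEG_INF = float("-inf")


def viterbi_block(
    entry_scores: Sequence[float],
    block: Sequence[float],
    means: Sequence[float],
    log_self: float,
    log_next: float,
) -> List[float]:
    """Advance a left-to-right model over one block of observations.

    entry_scores[i] is the best log score of any state sequence ending in
    state i before the block; the return value is the same quantity after
    the block, ready to be carried into the next block.
    """
    scores = list(entry_scores)
    for obs in block:
        updated = []
        for i, mean in enumerate(means):
            stay = scores[i] + log_self                              # remain in state i
            move = scores[i - 1] + log_next if i > 0 else NEG_INF    # arrive from state i-1
            emit = -0.5 * (obs - mean) ** 2                          # assumed emission score
            updated.append(max(stay, move) + emit)                   # keep the best path only
        scores = updated
    return scores


# Hypothetical three-state model and six scalar observations.
MEANS = [0.0, 1.0, 2.0]
LOG_SELF = LOG_NEXT = math.log(0.5)
observations = [0.1, -0.2, 1.0, 0.9, 2.1, 1.8]
start = [0.0, NEG_INF, NEG_INF]  # all paths begin in the first state

# One pass over the whole series.
one_pass = viterbi_block(start, observations, MEANS, LOG_SELF, LOG_NEXT)

# The same series split into two blocks, carrying state scores between them.
carried = start
for block in (observations[:3], observations[3:]):
    carried = viterbi_block(carried, block, MEANS, LOG_SELF, LOG_NEXT)
```

Because only the state scores at the block boundary are carried forward, the work for one model on one block can be completed before the next model is loaded, and the two-block result matches the single-pass result.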
CA002216224A 1997-09-19 1997-09-19 Block algorithm for pattern recognition Abandoned CA2216224A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CA002216224A CA2216224A1 (en) 1997-09-19 1997-09-19 Block algorithm for pattern recognition
US09/119,621 US6092045A (en) 1997-09-19 1998-07-21 Method and apparatus for speech recognition
EP98307555A EP0903728A3 (en) 1997-09-19 1998-09-17 Block algorithm for pattern recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CA002216224A CA2216224A1 (en) 1997-09-19 1997-09-19 Block algorithm for pattern recognition

Publications (1)

Publication Number Publication Date
CA2216224A1 (en) 1999-03-19

Family

ID=4161510

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002216224A Abandoned CA2216224A1 (en) 1997-09-19 1997-09-19 Block algorithm for pattern recognition

Country Status (3)

Country Link
US (1) US6092045A (en)
EP (1) EP0903728A3 (en)
CA (1) CA2216224A1 (en)

Families Citing this family (55)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7076102B2 (en) * 2001-09-27 2006-07-11 Koninklijke Philips Electronics N.V. Video monitoring system employing hierarchical hidden markov model (HMM) event learning and classification
US6134537A (en) * 1995-09-29 2000-10-17 Ai Ware, Inc. Visualization and self organization of multidimensional data through equalized orthogonal mapping
EP0979497A1 (en) * 1997-10-08 2000-02-16 Koninklijke Philips Electronics N.V. Vocabulary and/or language model training
US6725195B2 (en) * 1998-08-25 2004-04-20 Sri International Method and apparatus for probabilistic recognition using small number of state clusters
US7058573B1 (en) * 1999-04-20 2006-06-06 Nuance Communications Inc. Speech recognition system to selectively utilize different speech recognition techniques over multiple speech recognition passes
US6195639B1 (en) 1999-05-14 2001-02-27 Telefonaktiebolaget Lm Ericsson (Publ) Matching algorithm for isolated speech recognition
US6374221B1 (en) * 1999-06-22 2002-04-16 Lucent Technologies Inc. Automatic retraining of a speech recognizer while using reliable transcripts
US6789060B1 (en) * 1999-11-01 2004-09-07 Gene J. Wolfe Network based speech transcription that maintains dynamic templates
US6621834B1 (en) * 1999-11-05 2003-09-16 Raindance Communications, Inc. System and method for voice transmission over network protocols
US6480827B1 (en) * 2000-03-07 2002-11-12 Motorola, Inc. Method and apparatus for voice communication
US6272464B1 (en) * 2000-03-27 2001-08-07 Lucent Technologies Inc. Method and apparatus for assembling a prediction list of name pronunciation variations for use during speech recognition
US6662158B1 (en) * 2000-04-27 2003-12-09 Microsoft Corporation Temporal pattern recognition method and apparatus utilizing segment and frame-based models
US6629073B1 (en) 2000-04-27 2003-09-30 Microsoft Corporation Speech recognition method and apparatus utilizing multi-unit models
US6813341B1 (en) * 2000-08-31 2004-11-02 Ivoice, Inc. Voice activated/voice responsive item locator
US7292678B2 (en) * 2000-08-31 2007-11-06 Lamson Holdings Llc Voice activated, voice responsive product locator system, including product location method utilizing product bar code and aisle-situated, aisle-identifying bar code
US7136465B2 (en) * 2000-08-31 2006-11-14 Lamson Holdings Llc Voice activated, voice responsive product locator system, including product location method utilizing product bar code and product-situated, location-identifying bar code
JP4283984B2 (en) * 2000-10-12 2009-06-24 パイオニア株式会社 Speech recognition apparatus and method
US7003455B1 (en) * 2000-10-16 2006-02-21 Microsoft Corporation Method of noise reduction using correction and scaling vectors with partitioning of the acoustic space in the domain of noisy speech
US7003450B2 (en) * 2000-10-20 2006-02-21 Pts Corporation Methods and apparatus for efficient vocoder implementations
US20020120446A1 (en) * 2001-02-23 2002-08-29 Motorola, Inc. Detection of inconsistent training data in a voice recognition system
US7013273B2 (en) * 2001-03-29 2006-03-14 Matsushita Electric Industrial Co., Ltd. Speech recognition based captioning system
US6785647B2 (en) 2001-04-20 2004-08-31 William R. Hutchison Speech recognition system with network accessible speech processing resources
US6751595B2 (en) 2001-05-09 2004-06-15 Bellsouth Intellectual Property Corporation Multi-stage large vocabulary speech recognition system and method
ES2190342B1 (en) * 2001-06-25 2004-11-16 Universitat Pompeu Fabra METHOD FOR IDENTIFICATION OF AUDIO SEQUENCES.
US20030009334A1 (en) * 2001-07-03 2003-01-09 International Business Machines Corporation Speech processing board for high volume speech processing applications
US20030055645A1 (en) * 2001-09-18 2003-03-20 Meir Griniasty Apparatus with speech recognition and method therefor
US7346510B2 (en) * 2002-03-19 2008-03-18 Microsoft Corporation Method of speech recognition using variables representing dynamic aspects of speech
EP1488410B1 (en) * 2002-03-27 2010-06-02 Nokia Corporation Distortion measure determination in speech recognition
US7117148B2 (en) 2002-04-05 2006-10-03 Microsoft Corporation Method of noise reduction using correction vectors based on dynamic aspects of speech and noise normalization
US7139703B2 (en) * 2002-04-05 2006-11-21 Microsoft Corporation Method of iterative noise estimation in a recursive framework
US6879954B2 (en) * 2002-04-22 2005-04-12 Matsushita Electric Industrial Co., Ltd. Pattern matching for large vocabulary speech recognition systems
US7487091B2 (en) 2002-05-10 2009-02-03 Asahi Kasei Kabushiki Kaisha Speech recognition device for recognizing a word sequence using a switching speech model network
US7203635B2 (en) * 2002-06-27 2007-04-10 Microsoft Corporation Layered models for context awareness
US7493253B1 (en) 2002-07-12 2009-02-17 Language And Computing, Inc. Conceptual world representation natural language understanding system and method
US7047047B2 (en) 2002-09-06 2006-05-16 Microsoft Corporation Non-linear observation model for removing noise from corrupted signals
US7191130B1 (en) * 2002-09-27 2007-03-13 Nuance Communications Method and system for automatically optimizing recognition configuration parameters for speech recognition systems
US7529671B2 (en) * 2003-03-04 2009-05-05 Microsoft Corporation Block synchronous decoding
US7548858B2 (en) 2003-03-05 2009-06-16 Microsoft Corporation System and method for selective audible rendering of data to a user based on user input
US7165026B2 (en) * 2003-03-31 2007-01-16 Microsoft Corporation Method of noise estimation using incremental bayes learning
US7991984B2 (en) * 2005-02-17 2011-08-02 Samsung Electronics Co., Ltd. System and method for executing loops in a processor
US7653547B2 (en) * 2005-03-31 2010-01-26 Microsoft Corporation Method for testing a speech server
US8195462B2 (en) * 2006-02-16 2012-06-05 At&T Intellectual Property Ii, L.P. System and method for providing large vocabulary speech processing based on fixed-point arithmetic
US8831943B2 (en) * 2006-05-31 2014-09-09 Nec Corporation Language model learning system, language model learning method, and language model learning program
US9147212B2 (en) 2008-06-05 2015-09-29 Aisle411, Inc. Locating products in stores using voice search from a communication device
US9015093B1 (en) 2010-10-26 2015-04-21 Michael Lamport Commons Intelligent control with hierarchical stacked neural networks
US8775341B1 (en) 2010-10-26 2014-07-08 Michael Lamport Commons Intelligent control with hierarchical stacked neural networks
US20120109649A1 (en) * 2010-11-01 2012-05-03 General Motors Llc Speech dialect classification for automatic speech recognition
US9064191B2 (en) 2012-01-26 2015-06-23 Qualcomm Incorporated Lower modifier detection and extraction from devanagari text images to improve OCR performance
US9047540B2 (en) 2012-07-19 2015-06-02 Qualcomm Incorporated Trellis based word decoder with reverse pass
US9262699B2 (en) 2012-07-19 2016-02-16 Qualcomm Incorporated Method of handling complex variants of words through prefix-tree based decoding for Devanagiri OCR
US9014480B2 (en) 2012-07-19 2015-04-21 Qualcomm Incorporated Identifying a maximally stable extremal region (MSER) in an image by skipping comparison of pixels in the region
US9141874B2 (en) 2012-07-19 2015-09-22 Qualcomm Incorporated Feature extraction and use with a probability density function (PDF) divergence metric
US9484022B2 (en) 2014-05-23 2016-11-01 Google Inc. Training multiple neural networks with different accuracy
US9875081B2 (en) * 2015-09-21 2018-01-23 Amazon Technologies, Inc. Device selection for providing a response
US10482904B1 (en) 2017-08-15 2019-11-19 Amazon Technologies, Inc. Context driven device arbitration

Family Cites Families (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4164025A (en) * 1977-12-13 1979-08-07 Bell Telephone Laboratories, Incorporated Spelled word input directory information retrieval system with input word error corrective searching
US5052038A (en) * 1984-08-27 1991-09-24 Cognitronics Corporation Apparatus and method for obtaining information in a wide-area telephone system with digital data transmission between a local exchange and an information storage site
US4751737A (en) * 1985-11-06 1988-06-14 Motorola Inc. Template generation method in a speech recognition system
US4797910A (en) * 1986-05-07 1989-01-10 American Telephone And Telegraph Company, At&T Bell Laboratories Automated operator assistance calls with voice processing
US4959855A (en) * 1986-10-08 1990-09-25 At&T Bell Laboratories Directory assistance call processing and calling customer remote signal monitoring arrangements
US4805219A (en) * 1987-04-03 1989-02-14 Dragon Systems, Inc. Method for speech recognition
DE3819178A1 (en) * 1987-06-04 1988-12-22 Ricoh Kk Speech recognition method and device
US4979206A (en) * 1987-07-10 1990-12-18 At&T Bell Laboratories Directory assistance systems
JPH01102599A (en) * 1987-10-12 1989-04-20 Internatl Business Mach Corp <Ibm> Voice recognition
US5127055A (en) * 1988-12-30 1992-06-30 Kurzweil Applied Intelligence, Inc. Speech recognition apparatus & method having dynamic reference pattern adaptation
US5086479A (en) * 1989-06-30 1992-02-04 Hitachi, Ltd. Information processing system using neural network learning function
JP2964507B2 (en) * 1989-12-12 1999-10-18 松下電器産業株式会社 HMM device
US5097509A (en) * 1990-03-28 1992-03-17 Northern Telecom Limited Rejection method for speech recognition
US5163083A (en) * 1990-10-12 1992-11-10 At&T Bell Laboratories Automation of telephone operator assistance calls
US5181237A (en) * 1990-10-12 1993-01-19 At&T Bell Laboratories Automation of telephone operator assistance calls
US5204894A (en) * 1990-11-09 1993-04-20 Bell Atlantic Network Services, Inc. Personal electronic directory
US5274695A (en) * 1991-01-11 1993-12-28 U.S. Sprint Communications Company Limited Partnership System for verifying the identity of a caller in a telecommunications network
US5271088A (en) * 1991-05-13 1993-12-14 Itt Corporation Automated sorting of voice messages through speaker spotting
US5390278A (en) * 1991-10-08 1995-02-14 Bell Canada Phoneme based speech recognition
US5459798A (en) * 1993-03-19 1995-10-17 Intel Corporation System and method of pattern recognition employing a multiprocessing pipelined apparatus with private pattern memory
US5515475A (en) * 1993-06-24 1996-05-07 Northern Telecom Limited Speech recognition method using a two-pass search
US5621859A (en) * 1994-01-19 1997-04-15 Bbn Corporation Single tree method for grammar directed, very large vocabulary speech recognizer
US5488652A (en) * 1994-04-14 1996-01-30 Northern Telecom Limited Method and apparatus for training speech recognition algorithms for directory assistance applications
JP2964881B2 (en) * 1994-09-20 1999-10-18 日本電気株式会社 Voice recognition device
WO1997008685A2 (en) * 1995-08-28 1997-03-06 Philips Electronics N.V. Method and system for pattern recognition based on dynamically constructing a subset of reference vectors
WO1997008686A2 (en) * 1995-08-28 1997-03-06 Philips Electronics N.V. Method and system for pattern recognition based on tree organised probability densities
EP0834114A2 (en) * 1996-03-28 1998-04-08 Koninklijke Philips Electronics N.V. Method and computer system for processing a set of data elements on a sequential processor
JP2980026B2 (en) * 1996-05-30 1999-11-22 日本電気株式会社 Voice recognition device
US5930753A (en) * 1997-03-20 1999-07-27 At&T Corp Combining frequency warping and spectral shaping in HMM based speech recognition

Also Published As

Publication number Publication date
US6092045A (en) 2000-07-18
EP0903728A2 (en) 1999-03-24
EP0903728A3 (en) 2000-01-05

Legal Events

Date Code Title Description
FZDE Discontinued