WO2004075093A3 - Music feature extraction using wavelet coefficient histograms - Google Patents

Music feature extraction using wavelet coefficient histograms Download PDF

Info

Publication number
WO2004075093A3
WO2004075093A3 PCT/US2004/004343 US2004004343W WO2004075093A3 WO 2004075093 A3 WO2004075093 A3 WO 2004075093A3 US 2004004343 W US2004004343 W US 2004004343W WO 2004075093 A3 WO2004075093 A3 WO 2004075093A3
Authority
WO
WIPO (PCT)
Prior art keywords
feature extraction
wavelet coefficient
music feature
music
coefficient histograms
Prior art date
Application number
PCT/US2004/004343
Other languages
French (fr)
Other versions
WO2004075093A2 (en
Inventor
Tao Li
Qi Li
Mitsunori Ogihara
Original Assignee
Univ Rochester
Univ Delaware
Tao Li
Qi Li
Mitsunori Ogihara
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Univ Rochester, Univ Delaware, Tao Li, Qi Li, Mitsunori Ogihara filed Critical Univ Rochester
Publication of WO2004075093A2 publication Critical patent/WO2004075093A2/en
Publication of WO2004075093A3 publication Critical patent/WO2004075093A3/en

Links

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00Details of electrophonic musical instruments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/683Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00Details of electrophonic musical instruments
    • G10H1/0033Recording/reproducing or transmission of music for electrophonic musical instruments
    • G10H1/0041Recording/reproducing or transmission of music for electrophonic musical instruments in coded form
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08Feature extraction
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2240/00Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/075Musical metadata derived from musical analysis or for use in electrophonic musical instruments
    • G10H2240/081Genre classification, i.e. descriptive metadata for classification or selection of musical pieces according to style
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2240/00Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/075Musical metadata derived from musical analysis or for use in electrophonic musical instruments
    • G10H2240/085Mood, i.e. generation, detection or selection of a particular emotional content or atmosphere in a musical piece
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/131Mathematical functions for musical analysis, processing, synthesis or composition
    • G10H2250/215Transforms, i.e. mathematical transforms into domains appropriate for musical signal processing, coding or compression
    • G10H2250/235Fourier transform; Discrete Fourier Transform [DFT]; Fast Fourier Transform [FFT]
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/131Mathematical functions for musical analysis, processing, synthesis or composition
    • G10H2250/215Transforms, i.e. mathematical transforms into domains appropriate for musical signal processing, coding or compression
    • G10H2250/251Wavelet transform, i.e. transform with both frequency and temporal resolution, e.g. for compression of percussion sounds; Discrete Wavelet Transform [DWT]
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique

Abstract

A music classification technique computes histograms of Daubechies wavelet coefficients at various frequency subbands with various resolutions. The coefficients are then used as an input to a machine learning technique to identify the genre and emotional content of music.
PCT/US2004/004343 2003-02-14 2004-02-13 Music feature extraction using wavelet coefficient histograms WO2004075093A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US44731203P 2003-02-14 2003-02-14
US60/447,312 2003-02-14

Publications (2)

Publication Number Publication Date
WO2004075093A2 WO2004075093A2 (en) 2004-09-02
WO2004075093A3 true WO2004075093A3 (en) 2006-06-01

Family

ID=32908420

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2004/004343 WO2004075093A2 (en) 2003-02-14 2004-02-13 Music feature extraction using wavelet coefficient histograms

Country Status (2)

Country Link
US (1) US7091409B2 (en)
WO (1) WO2004075093A2 (en)

Families Citing this family (73)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE10254612A1 (en) * 2002-11-22 2004-06-17 Humboldt-Universität Zu Berlin Method for determining specifically relevant acoustic characteristics of sound signals for the analysis of unknown sound signals from a sound generation
TWI282970B (en) * 2003-11-28 2007-06-21 Mediatek Inc Method and apparatus for karaoke scoring
US7626110B2 (en) * 2004-06-02 2009-12-01 Stmicroelectronics Asia Pacific Pte. Ltd. Energy-based audio pattern recognition
US7563971B2 (en) * 2004-06-02 2009-07-21 Stmicroelectronics Asia Pacific Pte. Ltd. Energy-based audio pattern recognition with weighting of energy matches
DE102004047032A1 (en) * 2004-09-28 2006-04-06 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for designating different segment classes
US7297860B2 (en) * 2004-11-12 2007-11-20 Sony Corporation System and method for determining genre of audio
US20060200346A1 (en) * 2005-03-03 2006-09-07 Nortel Networks Ltd. Speech quality measurement based on classification estimation
US8013231B2 (en) * 2005-05-26 2011-09-06 Yamaha Corporation Sound signal expression mode determining apparatus method and program
US8086168B2 (en) * 2005-07-06 2011-12-27 Sandisk Il Ltd. Device and method for monitoring, rating and/or tuning to an audio content channel
US7672916B2 (en) * 2005-08-16 2010-03-02 The Trustees Of Columbia University In The City Of New York Methods, systems, and media for music classification
US20070083365A1 (en) * 2005-10-06 2007-04-12 Dts, Inc. Neural network classifier for separating audio sources from a monophonic audio signal
KR100715949B1 (en) * 2005-11-11 2007-05-08 삼성전자주식회사 Method and apparatus for classifying mood of music at high speed
KR100717387B1 (en) * 2006-01-26 2007-05-11 삼성전자주식회사 Method and apparatus for searching similar music
KR100749045B1 (en) * 2006-01-26 2007-08-13 삼성전자주식회사 Method and apparatus for searching similar music using summary of music content
US7885911B2 (en) * 2006-03-24 2011-02-08 Alcatel-Lucent Usa Inc. Fast approximate wavelet tracking on streams
US7772478B2 (en) * 2006-04-12 2010-08-10 Massachusetts Institute Of Technology Understanding music
WO2007133754A2 (en) * 2006-05-12 2007-11-22 Owl Multimedia, Inc. Method and system for music information retrieval
JP4665836B2 (en) * 2006-05-31 2011-04-06 日本ビクター株式会社 Music classification device, music classification method, and music classification program
WO2008028484A1 (en) * 2006-09-05 2008-03-13 Gn Resound A/S A hearing aid with histogram based sound environment classification
US7645929B2 (en) * 2006-09-11 2010-01-12 Hewlett-Packard Development Company, L.P. Computational music-tempo estimation
US7873634B2 (en) * 2007-03-12 2011-01-18 Hitlab Ulc. Method and a system for automatic evaluation of digital files
US7659471B2 (en) * 2007-03-28 2010-02-09 Nokia Corporation System and method for music data repetition functionality
US8073854B2 (en) * 2007-04-10 2011-12-06 The Echo Nest Corporation Determining the similarity of music using cultural and acoustic information
US7949649B2 (en) * 2007-04-10 2011-05-24 The Echo Nest Corporation Automatically acquiring acoustic and cultural information about music
JP5088030B2 (en) * 2007-07-26 2012-12-05 ヤマハ株式会社 Method, apparatus and program for evaluating similarity of performance sound
US8510252B1 (en) 2007-12-07 2013-08-13 Google, Inc. Classification of inappropriate video content using multi-scale features
US8001062B1 (en) * 2007-12-07 2011-08-16 Google Inc. Supervised learning using multi-scale features from time series events and scale space decompositions
WO2009081921A1 (en) * 2007-12-25 2009-07-02 Konami Digital Entertainment Co., Ltd. Game system and computer program
US9390167B2 (en) 2010-07-29 2016-07-12 Soundhound, Inc. System and methods for continuous audio matching
US8452586B2 (en) * 2008-12-02 2013-05-28 Soundhound, Inc. Identifying music from peaks of a reference sound fingerprint
US8660678B1 (en) * 2009-02-17 2014-02-25 Tonara Ltd. Automatic score following
US9196254B1 (en) * 2009-07-02 2015-11-24 Alon Konchitsky Method for implementing quality control for one or more components of an audio signal received from a communication device
KR101615262B1 (en) * 2009-08-12 2016-04-26 삼성전자주식회사 Method and apparatus for encoding and decoding multi-channel audio signal using semantic information
US9047371B2 (en) 2010-07-29 2015-06-02 Soundhound, Inc. System and method for matching a query against a broadcast stream
US8670577B2 (en) 2010-10-18 2014-03-11 Convey Technology, Inc. Electronically-simulated live music
US8676574B2 (en) * 2010-11-10 2014-03-18 Sony Computer Entertainment Inc. Method for tone/intonation recognition using auditory attention cues
US8478711B2 (en) 2011-02-18 2013-07-02 Larus Technologies Corporation System and method for data fusion with adaptive learning
US8756061B2 (en) 2011-04-01 2014-06-17 Sony Computer Entertainment Inc. Speech syllable/vowel/phone boundary detection using auditory attention cues
US20120259638A1 (en) * 2011-04-08 2012-10-11 Sony Computer Entertainment Inc. Apparatus and method for determining relevance of input speech
US9035163B1 (en) 2011-05-10 2015-05-19 Soundbound, Inc. System and method for targeting content based on identified audio and multimedia
US9691395B1 (en) 2011-12-31 2017-06-27 Reality Analytics, Inc. System and method for taxonomically distinguishing unconstrained signal data segments
US9558762B1 (en) * 2011-07-03 2017-01-31 Reality Analytics, Inc. System and method for distinguishing source from unconstrained acoustic signals emitted thereby in context agnostic manner
US9886945B1 (en) * 2011-07-03 2018-02-06 Reality Analytics, Inc. System and method for taxonomically distinguishing sample data captured from biota sources
WO2013038298A1 (en) * 2011-09-12 2013-03-21 Koninklijke Philips Electronics N.V. Device and method for disaggregating a periodic input signal pattern
US8965766B1 (en) * 2012-03-15 2015-02-24 Google Inc. Systems and methods for identifying music in a noisy environment
DE102012206299B4 (en) * 2012-04-17 2017-11-02 Sivantos Pte. Ltd. Method for operating a hearing device and hearing device
US9305567B2 (en) * 2012-04-23 2016-04-05 Qualcomm Incorporated Systems and methods for audio signal processing
US10957310B1 (en) 2012-07-23 2021-03-23 Soundhound, Inc. Integrated programming framework for speech and text understanding with meaning parsing
US9263060B2 (en) * 2012-08-21 2016-02-16 Marian Mason Publishing Company, Llc Artificial neural network based system for classification of the emotional content of digital music
US9020822B2 (en) 2012-10-19 2015-04-28 Sony Computer Entertainment Inc. Emotion recognition using auditory attention cues extracted from users voice
US9031293B2 (en) 2012-10-19 2015-05-12 Sony Computer Entertainment Inc. Multi-modal sensor based emotion recognition and emotional interface
US9672811B2 (en) 2012-11-29 2017-06-06 Sony Interactive Entertainment Inc. Combining auditory attention cues with phoneme posterior scores for phone/vowel/syllable boundary detection
US9183849B2 (en) 2012-12-21 2015-11-10 The Nielsen Company (Us), Llc Audio matching with semantic audio recognition and report generation
US9158760B2 (en) 2012-12-21 2015-10-13 The Nielsen Company (Us), Llc Audio decoding with supplemental semantic audio recognition and report generation
US9195649B2 (en) 2012-12-21 2015-11-24 The Nielsen Company (Us), Llc Audio processing techniques for semantic audio recognition and report generation
US20140184917A1 (en) * 2012-12-31 2014-07-03 Sling Media Pvt Ltd Automated channel switching
US20160322066A1 (en) * 2013-02-12 2016-11-03 Google Inc. Audio Data Classification
US9336770B2 (en) * 2013-08-13 2016-05-10 Mitsubishi Electric Corporation Pattern recognition apparatus for creating multiple systems and combining the multiple systems to improve recognition performance and pattern recognition method
US9507849B2 (en) 2013-11-28 2016-11-29 Soundhound, Inc. Method for combining a query and a communication command in a natural language computer system
TWI623881B (en) * 2013-12-13 2018-05-11 財團法人資訊工業策進會 Event stream processing system, method and machine-readable storage
US9292488B2 (en) 2014-02-01 2016-03-22 Soundhound, Inc. Method for embedding voice mail in a spoken utterance using a natural language processing computer system
US11295730B1 (en) 2014-02-27 2022-04-05 Soundhound, Inc. Using phonetic variants in a local context to improve natural language understanding
US9564123B1 (en) 2014-05-12 2017-02-07 Soundhound, Inc. Method and system for building an integrated user profile
US9931483B2 (en) * 2014-05-28 2018-04-03 Devilbiss Healtcare Llc Detection of periodic breathing during CPAP therapy
US20150347388A1 (en) * 2014-06-03 2015-12-03 Google Inc. Digital Content Genre Representation
US9934785B1 (en) 2016-11-30 2018-04-03 Spotify Ab Identification of taste attributes from an audio signal
CN107680037B (en) * 2017-09-12 2020-09-29 河南大学 Improved face super-resolution reconstruction method based on nearest characteristic line manifold learning
JP7143327B2 (en) 2017-10-03 2022-09-28 グーグル エルエルシー Methods, Computer Systems, Computing Systems, and Programs Implemented by Computing Devices
CN109408660B (en) * 2018-08-31 2021-08-10 安徽四创电子股份有限公司 Music automatic classification method based on audio features
US11024288B2 (en) 2018-09-04 2021-06-01 Gracenote, Inc. Methods and apparatus to segment audio and determine audio segment similarities
CN111786951B (en) * 2020-05-28 2022-08-26 东方红卫星移动通信有限公司 Traffic data feature extraction method, malicious traffic identification method and network system
CN112634841B (en) * 2020-12-02 2022-11-29 爱荔枝科技(北京)有限公司 Guitar music automatic generation method based on voice recognition
CN113488070B (en) * 2021-09-08 2021-11-16 中国科学院自动化研究所 Method and device for detecting tampered audio, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5781881A (en) * 1995-10-19 1998-07-14 Deutsche Telekom Ag Variable-subframe-length speech-coding classes derived from wavelet-transform parameters
US20030086341A1 (en) * 2001-07-20 2003-05-08 Gracenote, Inc. Automatic identification of sound recordings
US6718063B1 (en) * 1998-12-11 2004-04-06 Canon Kabushiki Kaisha Method and apparatus for computing the similarity between images

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2643986B1 (en) * 1989-03-03 1991-05-17 Thomson Csf WAVES SIGNAL ANALYSIS METHOD
US6105015A (en) * 1997-02-03 2000-08-15 The United States Of America As Represented By The Secretary Of The Navy Wavelet-based hybrid neurosystem for classifying a signal or an image represented by the signal in a data system
US20010044719A1 (en) * 1999-07-02 2001-11-22 Mitsubishi Electric Research Laboratories, Inc. Method and system for recognizing, indexing, and searching acoustic signals
US20030205124A1 (en) * 2002-05-01 2003-11-06 Foote Jonathan T. Method and system for retrieving and sequencing music by rhythmic similarity

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5781881A (en) * 1995-10-19 1998-07-14 Deutsche Telekom Ag Variable-subframe-length speech-coding classes derived from wavelet-transform parameters
US6718063B1 (en) * 1998-12-11 2004-04-06 Canon Kabushiki Kaisha Method and apparatus for computing the similarity between images
US20030086341A1 (en) * 2001-07-20 2003-05-08 Gracenote, Inc. Automatic identification of sound recordings

Also Published As

Publication number Publication date
WO2004075093A2 (en) 2004-09-02
US20040231498A1 (en) 2004-11-25
US7091409B2 (en) 2006-08-15

Similar Documents

Publication Publication Date Title
WO2004075093A3 (en) Music feature extraction using wavelet coefficient histograms
WO2001076319A3 (en) Method and apparatus for voice signal extraction
EP1672920A3 (en) Searching electronic program guide data
WO2006092632A3 (en) Detecting partial discharge in high voltage cables
WO2007035912A3 (en) Document processing
AU2002343175A1 (en) Method and device for determining and outputting the similarity between two data strings
WO2006023715A3 (en) Applying scanned information to identify content
AU2003264774A1 (en) Improved audio data fingerprint searching
EP1752969A8 (en) Signal separation device, signal separation method, signal separation program, and recording medium
WO2008008213A3 (en) Interactively crawling data records on web pages
EP1253527A3 (en) Method and system for applying input mode bias
WO2003069554A3 (en) Method and system for interactive ground-truthing of document images
HK1099946A1 (en) Method and device for speech enhancement in the presence of background noise
WO2003058374A3 (en) Content conversion method and apparatus
WO2008101130A3 (en) Music-based search engine
EP1536638A4 (en) Metadata preparing device, preparing method therefor and retrieving device
AU2002367883A1 (en) Reshaping freehand drawn lines and shapes in an electronic document
AU2002330406A1 (en) Biosensing instrument and method and identifying device having biosensing function
MY134388A (en) Method for performing stratigraphically-based seed detection in a 3-d seismic data volume
WO2006044207A3 (en) An electronic device and method for visual text interpretation
WO2005124966A3 (en) Method and apparatus for detecting impedance
AU2003211934A1 (en) Card sound device and electronic apparatus having same
WO2007027839A3 (en) Device and methods for enhanced matched filtering based on correntropy
DE60204504D1 (en) Keyword recognition in a noisy signal
EP1921599A3 (en) Character processing apparatus and method

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase