WO2011019473A1 - Content recognition and synchronization on a television or consumer electronics device - Google Patents

Content recognition and synchronization on a television or consumer electronics device

Info

Publication number
WO2011019473A1
WO2011019473A1 PCT/US2010/042044 US2010042044W WO2011019473A1 WO 2011019473 A1 WO2011019473 A1 WO 2011019473A1 US 2010042044 W US2010042044 W US 2010042044W WO 2011019473 A1 WO2011019473 A1 WO 2011019473A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
identifier
program
network
metadata
Prior art date
Application number
PCT/US2010/042044
Other languages
French (fr)
Inventor
Kenneth Olson
Original Assignee
Rovi Technologies Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Rovi Technologies Corporation filed Critical Rovi Technologies Corporation
Priority to EP22205939.6A priority Critical patent/EP4210246A1/en
Priority to JP2012524717A priority patent/JP5481559B2/en
Priority to CA2771066A priority patent/CA2771066C/en
Priority to EP10736928A priority patent/EP2465053A1/en
Publication of WO2011019473A1 publication Critical patent/WO2011019473A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7834Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04HBROADCAST COMMUNICATION
    • H04H60/00Arrangements for broadcast applications with a direct linking to broadcast information or broadcast space-time; Broadcast-related systems
    • H04H60/56Arrangements characterised by components specially adapted for monitoring, identification or recognition covered by groups H04H60/29-H04H60/54
    • H04H60/58Arrangements characterised by components specially adapted for monitoring, identification or recognition covered by groups H04H60/29-H04H60/54 of audio
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04HBROADCAST COMMUNICATION
    • H04H60/00Arrangements for broadcast applications with a direct linking to broadcast information or broadcast space-time; Broadcast-related systems
    • H04H60/61Arrangements for services using the result of monitoring, identification or recognition covered by groups H04H60/29-H04H60/54
    • H04H60/64Arrangements for services using the result of monitoring, identification or recognition covered by groups H04H60/29-H04H60/54 for providing detail information
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439Processing of audio elementary streams
    • H04N21/4394Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04HBROADCAST COMMUNICATION
    • H04H2201/00Aspects of broadcast communication
    • H04H2201/30Aspects of broadcast communication characterised by the use of a return channel, e.g. for collecting users' opinions, for returning broadcast space/time information or for requesting data
    • H04H2201/37Aspects of broadcast communication characterised by the use of a return channel, e.g. for collecting users' opinions, for returning broadcast space/time information or for requesting data via a different channel
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04HBROADCAST COMMUNICATION
    • H04H2201/00Aspects of broadcast communication
    • H04H2201/50Aspects of broadcast communication characterised by the use of watermarks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04HBROADCAST COMMUNICATION
    • H04H2201/00Aspects of broadcast communication
    • H04H2201/90Aspects of broadcast communication characterised by the use of signatures
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04HBROADCAST COMMUNICATION
    • H04H60/00Arrangements for broadcast applications with a direct linking to broadcast information or broadcast space-time; Broadcast-related systems
    • H04H60/68Systems specially adapted for using specific information, e.g. geographical or meteorological information
    • H04H60/72Systems specially adapted for using specific information, e.g. geographical or meteorological information using electronic programme guides [EPG]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04HBROADCAST COMMUNICATION
    • H04H60/00Arrangements for broadcast applications with a direct linking to broadcast information or broadcast space-time; Broadcast-related systems
    • H04H60/68Systems specially adapted for using specific information, e.g. geographical or meteorological information
    • H04H60/73Systems specially adapted for using specific information, e.g. geographical or meteorological information using meta-information

Definitions

  • Example aspects of the present invention generally relate to content
  • broadcast television or radio programs which may be displayed on-screen. Users may view, navigate, select, and discover content by time, title, channel, genre, etc.
  • consumer electronic (CE) devices to associate a song with a particular television show, movie, game or other content source, and further, to provide users with related metadata.
  • CE consumer electronic
  • One technical challenge in doing so is associating the song to the content or program.
  • Despite the technical efforts of those providing metadata about programs, in many cases such information does not exist, or is limited. It would also be useful to provide a system that builds a database that associates information such as audio information with content such as, for example,
  • the system includes a server having a network interface to transmit and receive data over a network.
  • the server receives an audio fingerprint (FP) and a program identifier (Prog_ID) from the network and associates the audio fingerprint with an audio identifier.
  • FP audio fingerprint
  • Prog_ID program identifier
  • Metadata associated with the audio identifier and the program data are transmitted onto the network.
  • in another aspect, a user device includes an input interface to receive content from at least one content source.
  • the content contains an audio portion, a video portion, and program guide data
  • the user device also includes a program identifier (Prog_ID).
  • the user device also includes a processor to generate an audio fingerprint (FP) from a subset of the audio portion and communicate the program identifier and the audio fingerprint onto a network.
  • the user device receives metadata associated with the audio identifier (Audio_ID) and the program data from the network through a network interface.
  • FIG. 1a is a system diagram of an exemplary content recognition
  • FIG. 1b is a block diagram of an example home network in which some embodiments are implemented.
  • FIG. 2 is a block diagram of an example user device in accordance with an embodiment of the invention.
  • FIG. 3 is a ladder diagram showing an example procedure for associating a program identifier (Prog_ID) with an audio identifier (Audio_ID) and returning metadata associated with an audio portion of received content.
  • FIG. 4 illustrates an exemplary record for a particular program identifier
  • FIG. 5 is a high-level block diagram of a general and/or special purpose computer system, in accordance with some embodiments.
  • Systems, methods, apparatus and computer-readable media are provided for recognizing an audio portion of received content (e.g., songs, speeches)
  • the content may also be individually and/or collectively referred to as media or
  • the content is delivered and/or
  • a user device such as, for example, a television or another type of consumer electronic (CE) device.
  • EPG Electronic program guide
  • EPG data provides a digital guide for a scheduled broadcast television typically displayed on-screen and can be used to allow a viewer to navigate, select, and discover content by time, title, channel, genre, etc. by use of their remote control, a keyboard, or other similar input
  • EPG data information can be used to schedule future events.
  • DVR digital video recorder
  • PVR personal video recorder
  • album means a collection of tracks.
  • An album is typically originally published by an established entity, such as a record label (e.g., a recording
  • Audio Fingerprint (e.g., “fingerprint”, “acoustic fingerprint”, “digital fingerprint”) is a digital measure of certain acoustic properties that is
  • An audio fingerprint typically operates as a unique identifier for a particular item, such as, for example, a CD, a DVD and/or a Blu-ray Disc.
  • identifier is defined below.
  • An audio fingerprint is an independent piece of data that is not affected by metadata. Macrovision® has databases that store over 25 million unique
  • audio fingerprints for various audio samples. Practical uses of audio fingerprints include without limitation identifying songs, identifying records, identifying melodies, identifying tunes, identifying advertisements, monitoring radio broadcasts,
  • Audio Fingerprinting is the process of generating an audio fingerprint.
  • Audio Files which is herein incorporated by reference, provides an example of an apparatus for audio fingerprinting an audio waveform.
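  • As a rough illustration of what "generating an audio fingerprint" can mean, the following minimal sketch derives a bit string from PCM samples by comparing the energy of adjacent frames. It is not the fingerprinting method of the incorporated reference; the function names `toy_fingerprint` and `hamming_distance`, the frame size, and the energy-comparison rule are all assumptions made for illustration only.

```python
from typing import List


def toy_fingerprint(samples: List[int], frame_size: int = 256) -> int:
    """Return a bit-packed toy fingerprint of a mono PCM sample sequence.

    Illustrative only: one bit per pair of adjacent frames, set when the
    frame energy rises. Production fingerprinting is far more robust.
    """
    # Split the samples into non-overlapping frames, dropping any partial tail.
    frames = [
        samples[i:i + frame_size]
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    # The energy of a frame is the sum of its squared sample values.
    energies = [sum(s * s for s in frame) for frame in frames]

    bits = 0
    for previous, current in zip(energies, energies[1:]):
        bits = (bits << 1) | (1 if current > previous else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")
```

  • In this sketch, a uniformly louder copy of the same samples yields the same bits, and the Hamming distance between two fingerprints gives a crude similarity score that a lookup could threshold.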
  • Blu-ray also known as Blu-ray Disc, means a disc format jointly
  • the format was developed to enable recording, rewriting and playback of high-definition (HD) video, as well as storing large amounts of data.
  • the format offers more than five times the storage capacity of conventional DVDs and can hold 25 GB on a single-layer disc and 800 GB on a 20-layer disc. More layers and more storage capacity may be feasible as well. This extra capacity combined with the use of advanced audio and/or video codecs offers consumers an unprecedented HD experience.
  • current disc technologies, such as CD and DVD rely on a red laser to read and write data
  • the Blu-ray format uses a blue-violet laser instead, hence the name Blu-ray.
  • the benefit of using a blue-violet laser (405 nm) is that it has a shorter wavelength than a red laser (650 nm). A shorter wavelength makes it
  • Chapter means an audio and/or video data block on a disc, such as a Blu-ray Disc, a CD or a DVD.
  • a chapter stores at least a portion of an audio and/or video recording.
  • CD Compact Disc
  • A standard CD has a diameter of 120 mm and can typically hold up to 80 minutes of audio.
  • There is also the mini-CD, with diameters ranging from 60 to 80 mm.
  • Mini-CDs are sometimes used for CD singles and typically store up to 24 minutes of audio.
  • CD technology has been adapted and expanded to include without limitation data storage CD-ROM, write-once audio and data storage CD-R, rewritable media CD-RW, Super Audio CD (SACD), Video Compact Discs (VCD), Super Video Compact Discs (SVCD), Photo CD, Picture CD, Compact Disc Interactive (CD-i), and Enhanced CD.
  • The wavelength used by standard CD lasers is 780 nm, and thus the light of a standard CD laser is in the near-infrared range.
  • Database means a collection of data organized in such a way that a
  • a computer program may quickly select desired pieces of the data.
  • a database is an electronic filing system.
  • the term “database” may be used as shorthand for “database management system”.
  • Device means software, hardware or a combination thereof.
  • a device may sometimes be referred to as an apparatus. Examples of a device include
  • DVD Digital Video Disc
  • DVD was originally developed for storing digital video and digital audio data.
  • CDs but DVDs store more than six times as much data.
  • mini- DVD with diameters ranging from 60 to 80 mm. DVD technology has been
  • DVD-ROM DVD-ROM
  • DVD-R DVD+R
  • DVD-RW DVD-RW
  • Fuzzy search e.g., "fuzzy string search", “approximate string search”
  • Fuzzy searching may also be known as approximate or inexact matching. An exact match may inadvertently occur while performing a fuzzy search.
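  • A minimal sketch of a fuzzy string search, using the `difflib` module from the Python standard library. The helper name `fuzzy_title_search`, the sample titles, and the similarity cutoff of 0.6 are assumptions for illustration; the source does not specify any particular matching algorithm.

```python
import difflib
from typing import Dict, List


def fuzzy_title_search(query: str, titles: List[str], cutoff: float = 0.6) -> List[str]:
    """Return known titles that approximately match the query, best first."""
    # Compare case-insensitively, then map matches back to the original titles.
    lowered: Dict[str, str] = {}
    for title in titles:
        lowered.setdefault(title.lower(), title)
    matches = difflib.get_close_matches(
        query.lower(), list(lowered), n=3, cutoff=cutoff
    )
    return [lowered[m] for m in matches]
```

  • A misspelled query such as "Yesterdy" can still find "Yesterday", and a correctly spelled query simply produces an exact match as the best result, which is the case the definition above describes as occurring inadvertently.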
  • Signature means an identifying means that uniquely identifies an item, such as, for example, a track, a song, an album, a CD, a DVD and/or Blu-ray Disc, among other items. Examples of a signature include without limitation the
  • an audio fingerprint, a portion of an audio fingerprint, a signature derived from an audio fingerprint, an audio signature, a video signature, a disc signature, a CD signature, a DVD signature, a Blu-ray Disc signature, a media signature, a high definition media signature, a human fingerprint, a human footprint, an animal fingerprint, an animal footprint, a
  • a signature may be any computer-readable string of characters that comports with any coding standard in any language. Examples of a coding standard include without limitation alphabet, alphanumeric, decimal, hexadecimal, binary, American Standard Code for Information Interchange
  • Signatures may not initially be computer-readable. For example, latent human fingerprints may be printed on a door knob in the physical world. A signature that is initially not computer-readable may be converted into a computer-readable signature by using any appropriate conversion technique.
  • a conversion technique for converting a latent human fingerprint into a computer-readable signature may include a ridge characteristics analysis.
  • Link means an association with an object or an element in memory.
  • a link is typically a pointer.
  • a pointer is a variable that contains the address of a location in memory. The location is the starting point of an allocated object, such
  • Metadata generally means data that describes data. More particularly, metadata may be used to describe the contents of digital recordings. Such metadata
  • Metadata may include, for example, a track name, a song name, artist information (e.g., name, birth date, discography), album information (e.g., album title, review, track listing, sound samples), relational information (e.g., similar artists and
  • advertisements e.g., links or programs (e.g., software applications), and related images.
  • Metadata may also include a program guide listing of the songs or other audio content associated with multimedia content.
  • Conventional optical discs e.g., CDs, DVDs, Blu-ray Discs
  • Metadata may be
  • a digital recording e.g., song, album, movie or video
  • Network means a connection between any two or more computers, which permits the transmission of data.
  • a network may be any combination of networks, including without limitation the Internet, a local area network, a wide area network, a wireless network and a cellular network.
  • Occurrence means a copy of a recording.
  • An occurrence is preferably an exact copy of a recording.
  • different occurrences of a same pressing are typically exact copies.
  • an occurrence is not necessarily an exact copy of a recording, and may be a substantially similar copy.
  • a recording may be an inexact copy for a number of reasons, including without limitation an
  • a recording may be the source of multiple occurrences that may be exact copies or substantially similar copies. Different occurrences may be located on different devices, including without limitation different user devices, different MP3 players, different databases, different laptops, and so on. Each occurrence of a recording may be located on any appropriate storage medium, including without limitation
  • Pressing means producing a disc in a disc press from a master.
  • the disc press preferably includes a laser beam having a wavelength of about 650 nm for DVD or about 405 nm for Blu-ray Disc.
  • Recording means media data for playback.
  • a recording is preferably a computer readable digital recording and may be, for example, an audio track, a video track, a song, a chapter, a CD recording, a DVD recording and/or a Blu-ray Disc recording, among other things.
  • Server means a software application that provides services to other
  • a server may also refer to the physical computer that has been set aside to run a specific server application.
  • For example, when Apache HTTP Server is used as the web server for a company's website, the computer running Apache is also called the web server.
  • Server applications can be divided among server computers over an extreme range, depending upon the workload.
  • Software means a computer program that is written in a programming language that may be used by one of ordinary skill in the art.
  • the programming language chosen should be compatible with the computer by which the software application is to be executed and, in particular, with the operating system of that computer. Examples of suitable programming languages include without
  • Computer readable media are discussed in more detail in a separate section below.
  • “Song” means a musical composition.
  • a song is typically recorded onto a track by a record label (e.g., recording company).
  • a song may have many different versions, for example, a radio version and an extended version.
  • System means a device or multiple coupled devices.
  • a device is defined above.
  • Track means an audio/video data block.
  • a track may be on a disc, such as, for example, a Blu-ray Disc, a CD or a DVD.
  • User means a consumer, client, and/or client device in a marketplace of products and/or services.
  • User device e.g., "client”, “client device”, “user computer” is a
  • a user device may refer to a single computer or to a
  • a user device may be the client part of a client-server architecture.
  • a user device typically relies on a server to perform some operations. Examples of a user device include without limitation a television, a CD player, a DVD player, a Blu-ray Disc player, a personal media device, a portable media player, an iPod®, a Zoom Player, a laptop computer, a palmtop computer, a smart phone, a cell phone, a mobile phone, an MP3 player, a digital audio recorder, a digital video recorder, an IBM-type personal computer (PC) having an operating system such as Microsoft Windows®, an Apple® computer having an operating system such as MAC-OS, hardware having a JAVA-OS
  • Web browser means any software program which can display text, graphics, or both, from Web pages on Web sites. Examples of a Web browser include without limitation Mozilla Firefox® and Microsoft Internet Explorer®.
  • Web page means any documents written in mark-up language including without limitation HTML (hypertext mark-up language) or VRML (virtual reality modeling language), dynamic HTML, XML (extensible mark-up language) or
  • Web server refers to a computer or other electronic device which is capable of serving at least one Web page to a Web browser.
  • An example of a Web server is a Yahoo® Web server.
  • Web site means at least one Web page, and more commonly a plurality of Web pages, virtually coupled to form a coherent group.
  • FIG. 1a is a system diagram of an exemplary audio recognition
  • system 100 includes at least one content source 102 that provides
  • Metadata database 106 that contains supplemental content associated with an audio portion of a multimedia stream (e.g., audio metadata).
  • metadata database 106 can also be a
  • a guide database 108 provides EPG data associated with a multimedia program. As shown in FIG. 1a, guide database 108 provides the EPG data to a user device 104 for content and/or media, such as a television, an audio device, a video device, and/or another type of user and/or consumer electronic (CE) device.
  • Guide database 108 also stores program metadata that may not be communicated directly to the user device 104.
  • metadata database 106 and guide database 108 are linked. In one embodiment, this link is initiated from within the user device 104.
  • a request packet from the user device 104 causes a remote server (110 illustrated in Figure 2) to associate the audio data to a program for the purpose of retrieving metadata about the program.
  • this association is a logical association and/or link. It should be understood, however, that a link between entries within the metadata database 106 and entries within the guide database 108 may be physical and still be within the scope of the invention.
  • a program identifier (Prog_ID) corresponding to the multimedia content such as, for example, a television program being tuned-in from a content source 102, is provided to the user device 104 by the guide database 108.
  • the recognition server includes or is in communication with the metadata database 106.
  • the recognition server of some embodiments is further described in relation to
  • a search of the metadata database 106 is performed to lookup an audio identifier (Audio_ID) associated with the audio portion of the content received by the user device 104 from the content source 102 based on the audio fingerprint
  • the audio identifier (Audio_ID) together with a program identifier (Prog_ID) are used to make a logical link between entries within the metadata database 106 and the guide database 108.
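  • The logical link between the two databases can be pictured as a mapping keyed by Prog_ID whose values are the Audio_IDs recognized during that program. The sketch below uses in-memory dictionaries as stand-ins for metadata database 106 and guide database 108; the identifiers, field names, and the helper functions `link` and `songs_for_program` are hypothetical and do not come from the source.

```python
from typing import Dict, List, Set

# Hypothetical in-memory stand-ins for metadata database 106 and guide database 108.
audio_metadata: Dict[str, dict] = {
    "audio-001": {"title": "Example Song", "artist": "Example Artist"},
}
program_metadata: Dict[str, dict] = {
    "prog-42": {"title": "Example Show", "channel": "7"},
}

# The logical link: for each Prog_ID, the set of Audio_IDs recognized in it.
program_audio_links: Dict[str, Set[str]] = {}


def link(prog_id: str, audio_id: str) -> None:
    """Record that audio_id was recognized during the program prog_id."""
    program_audio_links.setdefault(prog_id, set()).add(audio_id)


def songs_for_program(prog_id: str) -> List[dict]:
    """Follow the link from a guide entry to its audio metadata entries."""
    audio_ids = sorted(program_audio_links.get(prog_id, set()))
    return [audio_metadata[a] for a in audio_ids]
```

  • Because the link is held in its own mapping, neither database entry has to be modified to record the association, which is one way to read "logical" as opposed to "physical" linking.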
  • Pulse code modulation is a format by which many consumer electronic products operate and internally
  • any memory size, number of frames, sampling rates, time, and the like, used to perform audio fingerprinting are within the scope of the present invention.
  • FIG. 1b is a block diagram of an example home network in which some embodiments are implemented.
  • On the home network may be a variety of user devices, such as a network ready television 104a, a personal computer 104b, a gaming device 104c, a digital video recorder 104d, other devices 104e, and the like.
  • User devices 104a-104e may receive multimedia content from content
  • user devices 104a-104e may communicate with each other through a wired or wireless router 120 via network connections 132, such as Ethernet.
  • the router 120 connects the user devices 104a-104e to the network 124, such as the Internet,
  • content sources 102 are delivered from the network 124.
  • FIG. 2 includes a more detailed diagram of the user device 104 of some embodiments.
  • the exemplary user device 104 includes a processor 212 which is coupled through a communication infrastructure (not shown).
  • the input interface 208 receives content such as in the form of audio and video streams from the content sources 102, which communicate, for example, through an HDMI (High-Definition Multimedia Interface), Radio Frequency (RF) coaxial cable, composite video, S-Video, SCART, component video, D-Terminal, VGA, and the like, to the user device 104.
  • the content sources 102 include set-top boxes, Blu-ray Disc players, personal computers (PCs), video game consoles such as the PlayStation 3 and the Xbox 360, for example, and A/V receivers, and the like.
  • the content sources 102 provide a program identifier for the movie, show or game, which is stored in a memory 214.
  • Audio signals are communicated to the processor 212 for further processing.
  • the processor 212 performs audio fingerprinting on at least a subset of the audio portion of the received content and requests metadata from one or more remote servers. As described in more detail below with respect to FIG. 3, the metadata are preferably requested based on a generated audio fingerprint (FP) and/or the program identifier.
  • the user device 104 also includes a main memory 214.
  • main memory 214 is random access memory (RAM).
  • the user device 104 may also include a storage device 216.
  • the storage device 216 (also sometimes referred to as a secondary memory) may include, for example, a hard disk drive and/or a removable storage drive, representing a disk drive, a magnetic tape drive, an optical disk drive, etc.
  • the storage device 216 may include a computer-readable storage medium having stored thereon computer software and/or data.
  • storage device 216 may include other similar devices for allowing computer programs or other instructions to be loaded into the user device 104.
  • Such devices may include, for example, a removable storage unit and an interface. Examples of such may include a program cartridge and cartridge interface such as that found in video game devices, a removable memory chip such as an erasable programmable read only memory (EPROM), or
  • EPROM erasable programmable read only memory
  • PROM programmable read only memory
  • removable storage units and interfaces which allow software and data to be transferred from the removable storage unit to the user device 104.
  • the user device 104 includes the communications interface 210 to provide connectivity to a network 124 such as the Internet.
  • the communications interface 210 also allows software and data to be transferred between the user device 104 and external devices. Examples of the communications interface 210 may include a modem, a network interface such as an Ethernet card, a communications port, a Personal Computer Memory Card International Association (PCMCIA) slot and card, etc.
  • Software and data transferred via the communications interface 210 are in the form of signals which may be electronic, electromagnetic, optical or other signals capable of being received by the communications interface 210.
  • These signals are provided to the communications interface 210 via a
  • This channel carries signals and may be implemented by using wire or cable, fiber optics, a telephone line, a cellular link, an RF link and other
  • a remote control interface 218 decodes signals received from a remote control 204, e.g., a television remote control or other input device such as a keyboard, and communicates the decoded signals to processor 212.
  • the decoded signals are translated and processed by the processor 212.
  • the recognition servers 110 may also be in
  • the statistics database 220 and/or guide database 108 may also be in communication directly with
  • the metadata database 106 may be part of or remote from the recognition servers 110.
  • FIG. 3 is a ladder diagram showing an example procedure for associating a program identifier (Prog_ID) with an audio identifier (Audio_ID) and returning metadata associated with a song.
  • the user device 104 receives a command to initiate a lookup by, for example, a remote control 204.
  • the input interface 208 captures a sample of the audio stream from a content source 102, and feeds the audio stream, such as a PCM audio stream, to the processor 212, which performs an audio recognition process on the captured audio.
  • the processor 212 analyzes the
  • instead of audio fingerprinting the captured audio, other audio identification techniques can be used. For example, a watermark embedded into the audio stream or a tag inserted in the audio stream can be used as an identifier, e.g., the
  • the audio fingerprint (FP) and program identifier (Prog_ID) are transmitted to one or more recognition server(s) 110.
  • the recognition server 110 is also referred to more generally as a back-end server.
  • the recognition server 110 performs a lookup of an audio identifier (Audio_ID) associated with the audio portion of the content, such as, for example, a song being played, based on the audio fingerprint (FP) of the song. Metadata about the audio portion of the content are also retrieved from the metadata database 106.
  • the program identifier (Prog_ID) is transmitted to the guide database 108.
  • the guide database 108 returns program metadata including information
  • the guide database 108 returns the metadata in one or more datagrams and/or packets. For instance, the audio metadata and the program metadata are returned within the same packet or in separate packets.
  • the packet transmitted by the guide database 108 to the recognition server 110 is a return packet from an original request. Accordingly, the metadata carried in the packet is preferably appropriately matched based on identifying information provided in a field of the packet which is examined and recognized by the other servers, databases and/or devices on the network 124. This identifying field may be the program identifier (Prog_ID) or other identifier initially provided by the user device 104, and/or generated by the processor 212 or the communications interface 210, for example.
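  • Matching a return packet to its original request by an identifying field can be sketched as a table of pending requests keyed by that field. The `Packet` and `PendingRequests` names and the packet layout below are assumptions for illustration; the source says only that the field may be the Prog_ID or another identifier.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Packet:
    request_id: str   # hypothetical identifying field, e.g. the Prog_ID
    payload: dict


class PendingRequests:
    """Tracks outstanding requests so return packets can be matched to them."""

    def __init__(self) -> None:
        self._pending: Dict[str, dict] = {}

    def send(self, packet: Packet) -> None:
        # Remember what was asked for, keyed by the identifying field.
        self._pending[packet.request_id] = packet.payload

    def receive(self, packet: Packet) -> Optional[dict]:
        # Pair the return packet with its request, or return None if unmatched.
        request = self._pending.pop(packet.request_id, None)
        if request is None:
            return None
        return {"request": request, "response": packet.payload}
```

  • A return packet whose identifying field matches no outstanding request is ignored in this sketch, and each request is matched at most once.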
  • the recognition server 110 transmits onto the network 124 the audio identifier
  • the processor 212 stores metadata in memory 214 and displays the
  • the output interface 206 presents the metadata as an overlay of the video received from the content source 102, which is being displayed on the television or the user device 104.
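  • The lookup procedure described above can be summarized in a short sketch: the user device sends a fingerprint and a Prog_ID, the recognition server resolves the fingerprint to an Audio_ID, and audio and program metadata come back for display. The dictionaries standing in for the databases, the exact-match fingerprint lookup, and the function names are assumptions for illustration, not the implementation described in the source.

```python
from typing import Dict, Optional

# Hypothetical stand-ins for the databases; keys and fields are invented.
FINGERPRINT_INDEX: Dict[int, str] = {0b1011: "audio-001"}   # FP -> Audio_ID
AUDIO_METADATA: Dict[str, dict] = {"audio-001": {"song": "Example Song"}}
GUIDE_DATABASE: Dict[str, dict] = {"prog-42": {"program": "Example Show"}}


def recognition_server(fp: int, prog_id: str) -> dict:
    """Handle one lookup request from a user device."""
    # Look up the Audio_ID by fingerprint; None means the lookup was inconclusive.
    audio_id: Optional[str] = FINGERPRINT_INDEX.get(fp)
    return {
        "audio_id": audio_id,
        "audio_metadata": AUDIO_METADATA.get(audio_id, {}) if audio_id else {},
        # Program metadata is looked up independently, by Prog_ID.
        "program_metadata": GUIDE_DATABASE.get(prog_id, {}),
    }


def user_device_lookup(fp: int, prog_id: str) -> str:
    """Send FP and Prog_ID, then format the returned metadata for an overlay."""
    reply = recognition_server(fp, prog_id)
    if reply["audio_id"] is None:
        return "No match"
    song = reply["audio_metadata"].get("song", "Unknown song")
    program = reply["program_metadata"].get("program", "Unknown program")
    return f"{song} (heard in {program})"
```

  • A real recognition server would match fingerprints approximately rather than by exact dictionary key, and the requests would travel over the network 124 rather than through a direct function call.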
  • the same procedure discussed above may be performed until the audio portion of the content is recognized.
  • If an audio fingerprint of a captured audio portion of the content is precise enough to return metadata, the procedure ends.
  • the audio fingerprint may not be sufficiently robust for the recognition server 110 to match it to an audio identifier (Audio_ID).
  • the return packet from the recognition server 110 may be inconclusive, e.g., the return packet returns a null audio identifier (Audio_ID).
  • Various reasons may be the cause of this.
  • One example is that audio content was mixed with voice-over or sound effects noises in a received multimedia content stream.
  • the fingerprint algorithm may generate
  • Different fingerprints may be generated based on the length of the captured segment or from where within the audio stream the audio capturing took place.
  • the processor 212 detects a time-based offset location of the multimedia content corresponding to the audio fingerprint and transmits the location onto the network to, for example, a remote recognition server.
  • the processor 212 may initiate an additional lookup. This causes additional audio to be captured by the input interface 208. Alternatively, this additional information is extracted from memory 214 or storage 216 if the audio
  • the processor 212 performs audio recognition on the additional information.
  • the additional audio information may be added to the audio information previously captured, to make the total captured segment longer.
  • the processor 212 is programmed to adjust the total audio capture time.
  • the different audio capture times may be prestored or based on an analysis of prior lookup results. Alternatively, this analysis is performed offline by, for example, a statistics server database 220, and the new capture time may be downloaded by the processor 212 through the communications interface 210 during an update.
  • the processor 212 transmits it to the recognition server 110 along with the program identifier (Prog_ID).
  • the recognition server 110 performs a lookup based on the fingerprint (FP) for an audio identifier (Audio_ID).
  • the recognition server 110 transmits the audio identifier (Audio_ID) along with the program identifier (Prog_ID) to metadata database 106, which associates the program identifier and the audio identifier, and uses this
  • the program identifier (Prog_ID) is transmitted to the guide database 108.
  • the guide database 108 returns program metadata including information about the audio portion of received content such as, for example, one or more
  • the metadata database 106 then returns the metadata along with the audio identifier (Audio_ID) to the processor 212 through the recognition server 110. As described above, other information may be transmitted within the packets for use by either the recognition server 110 or the processor 212 to match the initial request to the metadata.
  • the capture of additional audio information may be performed without a lookup request from the remote control 204. Similarly, it can be performed with or without a request for additional information from the metadata database 106 or the recognition server 110. In other words, the additional capture procedure may be set to run until the processor 212 stops performing the additional audio capture. In this embodiment, it is not necessary for the metadata database 106 or the recognition server 110 to notify the user device 104, which advantageously reduces the amount of time between the initial lookup request and the return of metadata.
  • the processor 212 may then perform a comparison of the received several audio
  • the processor 212 may make the decision as to whether it needs to capture additional audio content from the content source 102 or whether to use audio content stored in its buffer such as, for example, the memory 214.
  • the processor 212 may control the amount of audio information to capture based on the returned audio identifier data. For example, if the first audio identifier found has one value, e.g. , corresponding to one rendition of a particular song, and the second audio identifier found by the recognition server 110 has a
  • the processor 212 may generate the fingerprint based on a longer segment, based on a completely
  • the recognition server 110 may also send back the audio identifier to the user device 104 concurrently with
  • the user device 104 sends and receives multiple audio fingerprints and audio identifiers before receiving a packet from the metadata database 106 with the metadata
  • FIG. 4 illustrates an exemplary record 400 for a particular program identifier (Prog_ID), which in one embodiment is generated by the recognition server 110.
  • Additional metadata may also be contained in this record 400. More particularly, information in this record 400 is obtained from a combination of data received from the user device 104, the metadata database 106, the guide database 108 and/or the statistics database 220. In one embodiment, this information is associated by the recognition server 110. For example, the program identifier (Prog_ID) of the show or movie received by the user device 104, metadata from the metadata database 106 and statistics from the statistics database 220 are associated and stored as records, e.g., the record 400, in the metadata database 106.
  • the record 400 includes the name of each song 402 in the show or movie, the location for each song within the show or movie 404, an interest level 404 by the user for the song, and the audio identifier
  • the interest level data is just one type of metric based on gathered information. Other example metrics include popularity, time-based
  • Additional information may be included in this record 400 or may be retrieved separately from another database based on the audio identifier (Audio_ID), the name of the song, and/or the program identifier (Prog_ID).
  • the statistics database 220 and the metadata database 106 may communicate with each other. Thus, information from the statistics database 220 may also be collected and associated by the metadata database 106 and the associated data may be transmitted by the metadata database 106 to the recognition server 110 directly. As shown in FIG. 4, the program identifier (Prog_ID) may be associated with several songs.
  • systems 100, 200, the process 300 or any part(s) or function(s) thereof) may be implemented by using hardware, software or a combination thereof and may be
  • the user device 104 may automatically initiate the lookup without a viewer's input through the remote control 204.
  • the operations may be completely implemented with machine operations.
  • FIG. 5 is a high-level block diagram of a general/special purpose computer system 500, in accordance with some embodiments.
  • the computer system 500 may be, for example, a user device, a user computer, a client computer and/or a server computer, among other things.
  • Examples of a user device include without limitation a television, a Blu-ray Disc player, a personal media device, a portable media player, an iPod®, a Zoom Player, a laptop computer, a palmtop computer, a smart phone, a cell phone, a mobile phone, an MP3 player, a digital audio recorder, a digital video recorder, a CD player, a DVD player, an IBM-type personal computer (PC) having an operating system such as Microsoft Windows®, an Apple® computer having an operating system such as MAC-OS, hardware having a JAVA-OS operating
  • the computer system 500 preferably includes without limitation a
  • the processor device 510 may include without limitation a single microprocessor, or may include a plurality of microprocessors for configuring the computer system
  • the main memory 525 stores, among other
  • the main memory 525 stores the executable code when in operation.
  • the main memory 525 may include banks of dynamic random access memory
  • the computer system 500 may further include a mass storage device 530, peripheral device(s) 540, portable storage medium device(s) 550, input control device(s) 580, a graphics subsystem 560, and/or an output display 570.
  • FIG. 5 as being coupled via the bus 505.
  • the computer system 500 is not so limited. Devices of the computer system 500 may be coupled through one or more data transport means.
  • the processor device 510 and/or the main memory 525 may be coupled via a local microprocessor bus.
  • the mass storage device 530, peripheral device(s) 540, portable storage medium device(s) 550, and/or graphics subsystem 560 may be coupled via one or more input/output (I/O) buses.
  • the mass storage device 530 is preferably a nonvolatile storage device for storing data and/or instructions for use by the processor device 510.
  • the mass storage device 530 may be implemented, for example, with a magnetic disk drive or an optical disk drive.
  • the mass storage device 530 is preferably configured for loading contents of the mass storage device 530 into the main memory 525.
  • the portable storage medium device 550 operates in conjunction with a nonvolatile portable storage medium, such as, for example, a compact disc read only memory (CD ROM), to input and output data and code to and from the
  • the software for storing an internal identifier in metadata may be stored on a portable storage medium, and may be inputted into the computer system 500 via the portable storage medium device 550.
  • the peripheral device(s) 540 may include any type of computer support device, such as, for example, an input/output (I/O) interface configured to add additional functionality to the computer system 500.
  • the peripheral device(s) 540 may include a network interface card for interfacing the computer system 500 with a network 520.
  • the input control device(s) 580 provide a portion of the user interface for a user of the computer system 500.
  • the input control device(s) 580 may include a keypad and/or a cursor control device.
  • the keypad may be configured for
  • the cursor control device may include, for example, a mouse, a trackball, a stylus, and/or cursor direction
  • the computer system 500 preferably includes the graphics subsystem 560 and the output display 570.
  • the output display 570 may include a cathode ray tube (CRT) display and/or a liquid crystal display (LCD).
  • the graphics subsystem 560 receives textual and graphical information, and processes the information for output to the output
  • Each component of the computer system 500 may represent a broad
  • Components of the computer system 500 are not limited to the specific
  • Portions of the invention may be conveniently implemented by using a conventional general purpose computer, a specialized digital computer and/or a microprocessor programmed according to the teachings of the present disclosure, as will be apparent to those skilled in the computer art. Appropriate software coding may readily be prepared by skilled programmers based on the teachings of the present disclosure.
  • Some embodiments include a computer program product.
  • the computer program product may be a storage medium/media having instructions stored
  • the storage medium may include without
  • VRAM flash memory
  • flash card magnetic card, optical card, nanosystems
  • implementations include software for controlling both the hardware of the
  • readable media further includes software for performing aspects of the invention, as described above.
  • the processes described above may include without limitation the following: receiving a recording, generating an internal identifier for the
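The retry behavior described in the items above (capture audio, fingerprint it, look it up, and capture additional audio to lengthen the segment when the returned audio identifier is null) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function `lookup_with_retry`, its callable parameters `capture`, `fingerprint` and `lookup`, the stand-in functions, and the capture durations are all hypothetical names and values introduced here.

```python
from typing import Callable, Optional, Sequence


def lookup_with_retry(
    capture: Callable[[float], bytes],
    fingerprint: Callable[[bytes], str],
    lookup: Callable[[str, str], Optional[str]],
    prog_id: str,
    capture_seconds: Sequence[float] = (3.0, 6.0, 12.0),  # assumed durations
) -> Optional[str]:
    """Return an Audio_ID, lengthening the captured segment after each miss."""
    audio = b""
    captured = 0.0
    for total in capture_seconds:
        # Capture only the additional audio and append it to what is already
        # buffered, so the total captured segment grows to `total` seconds.
        audio += capture(total - captured)
        captured = total
        audio_id = lookup(fingerprint(audio), prog_id)
        if audio_id is not None:  # a null Audio_ID means "inconclusive"
            return audio_id
    return None


# Hypothetical stand-ins for the input interface, the fingerprint algorithm
# and the recognition server, used only to exercise the loop.
def fake_capture(seconds: float) -> bytes:
    return b"x" * int(seconds * 10)


def fake_fingerprint(audio: bytes) -> str:
    return f"fp-{len(audio)}"


def fake_lookup(fp: str, prog_id: str) -> Optional[str]:
    # Pretend the server only recognizes fingerprints of at least 6 s of audio.
    return "AUDIO-42" if int(fp.split("-")[1]) >= 60 else None


result = lookup_with_retry(fake_capture, fake_fingerprint, fake_lookup, "PROG-1")
```

In this sketch the first 3-second capture is inconclusive and the second lookup, made on the lengthened 6-second segment, succeeds. Passing the schedule of capture times as a parameter mirrors the idea that capture times may be prestored or updated later.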

Abstract

An audio portion of content, such as an audio stream, is associated with a multimedia program. A server receives an audio fingerprint and a program identifier from a network and associates the audio fingerprint with an audio identifier. A request packet including the program identifier is transmitted over the network to request program guide information associated with the program identifier. The program data including the program guide information is received from the network and metadata associated with the audio identifier and the program data are transmitted onto the network. A user device initiates a request for the metadata by using an audio fingerprint and the program identifier.

Description

CONTENT RECOGNITION AND SYNCHRONIZATION ON A TELEVISION OR CONSUMER ELECTRONICS DEVICE
BACKGROUND
Field
[0001] Example aspects of the present invention generally relate to content
recognition, and more particularly to associating audio content to a multimedia program.
Related Art
[0002] The Internet has changed the way consumers listen to and purchase media content. Today, consumers can download or stream digital music and video
without much effort. Further, if a consumer cannot recognize a song they are listening to, such as in a bar, on the radio, or over an announcement system, the consumer can simply hold up their phone where the music is playing and send a snippet of the song to a music-discovery service, and in just a few seconds the name of the song, the artist who recorded it, which album it appears on, what year it was released, and album cover art are reported back to the consumer. With a few button presses, the consumer can buy the recognized song or related album.
BRIEF DESCRIPTION
[0003] With the advent of increased computing power in televisions and consumer electronic devices, new applications that deliver Internet services while watching TV programs are becoming more popular. Such applications enable TV viewers to interact with Internet applications designed to complement and enhance the
traditional TV viewing experience by providing content, information, and
community features available on the Internet.
[0004] Some broadcasters transmit program guide information for scheduled
broadcast television or radio programs, which may be displayed on-screen. Users may view, navigate, select, and discover content by time, title, channel, genre, etc.
by use of their remote control, a keyboard, or other input devices such as a phone keypad.
Attorney Docket No. 03449.000024 AMG0024
[0005] It would be useful to bring audio fingerprinting to televisions and
consumer electronic (CE) devices to associate a song with a particular television show, movie, game or other content source, and further, to provide users with related metadata. One technical challenge in doing so is associating the song to the content or program. Despite the technical efforts of those providing metadata about programs, in many cases such information does not exist, or is limited. It would also be useful to provide a system that builds a database that associates information such as audio information with content such as, for example,
individual programs, games, videos, television shows, movies, etc.
[0006] Moreover, despite the technical efforts of audience monitoring systems, many obstacles hinder successful mining, deployment and sharing of viewer
listening preferences. It would be useful to collect such information in a database by associating disparate sources of information.
[0007] The example embodiments described herein meet the above-identified needs by providing methods, systems and computer program products for
associating an audio portion of media content with a media program and a
determined audio identifier (Audio_ID). The system includes a server having a network interface to transmit and receive data over a network. The server receives an audio fingerprint (FP) and a program identifier (Prog_ID) from the network and associates the audio fingerprint with an audio identifier. A request packet
including the program identifier is transmitted over the network to request program guide information associated with the program identifier. The program data
including the program guide information is received from the network and
metadata associated with the audio identifier and the program data are transmitted onto the network.
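The server-side flow in paragraph [0007] can be sketched as below. This is an illustrative outline only, assuming in-memory dictionaries in place of the recognition index, the guide database and the metadata database; `Response`, `handle_request`, and all sample identifiers and values are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Response:
    prog_id: str              # echoed so the requester can match reply to request
    audio_id: Optional[str]   # None when the fingerprint is not recognized
    metadata: dict
    program_data: dict


def handle_request(
    fp: str,
    prog_id: str,
    fingerprint_index: Dict[str, str],
    fetch_guide: Callable[[str], dict],
    fetch_metadata: Callable[[str], dict],
) -> Response:
    """Associate a fingerprint with an Audio_ID and gather data for the reply."""
    audio_id = fingerprint_index.get(fp)       # FP -> Audio_ID lookup
    program_data = fetch_guide(prog_id)        # request keyed by Prog_ID
    metadata = fetch_metadata(audio_id) if audio_id is not None else {}
    return Response(prog_id, audio_id, metadata, program_data)


# Hypothetical sample data standing in for the remote databases.
index = {"fp-abc": "AUDIO-7"}
guide = {"PROG-1": {"title": "Example Show", "channel": "5"}}
songs = {"AUDIO-7": {"song": "Example Song", "artist": "Example Artist"}}

reply = handle_request(
    "fp-abc", "PROG-1", index,
    lambda pid: guide.get(pid, {}),
    lambda aid: songs.get(aid, {}),
)
```

Echoing the program identifier in the reply is one simple way for a requester to match a returned packet to its original request.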
[0008] In another aspect, a user device is provided. The user device includes an input interface to receive content from at least one content source. Preferably, the content contains an audio portion, a video portion, and program guide data
including a program identifier (Prog_ID). The user device also includes a
processor to generate an audio fingerprint (FP) from a subset of the audio portion and communicate the program identifier and the audio fingerprint onto a network.
In addition, the user device receives metadata associated with the audio identifier (Audio_ID) and the program data from the network through a network interface.
[0009] Further features and advantages, as well as the structure and operation, of various example embodiments of the present invention are described in detail below with reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] The features and advantages of the example embodiments presented herein will become more apparent from the detailed description set forth below when taken in conjunction with the drawings in which like reference numbers indicate identical or functionally similar elements.
[0011] FIG. 1a is a system diagram of an exemplary content recognition and
synchronization system 100 in which some embodiments are implemented.
[0012] FIG. 1b is a block diagram of an example home network in which some embodiments are implemented.
[0013] FIG. 2 is a block diagram of an example user device in accordance with an embodiment of the invention.
[0014] FIG. 3 is a ladder diagram showing an example procedure for associating a program identifier (Prog_ID) with an audio identifier (Audio_ID) and returning metadata associated with an audio portion of received content.
[0015] FIG. 4 illustrates an exemplary record for a particular program identifier (Prog_ID).
[0016] FIG. 5 is a high-level block diagram of a general and/or special purpose computer system, in accordance with some embodiments.
DETAILED DESCRIPTION
[0017] Systems, methods, apparatus and computer-readable media are provided for recognizing an audio portion of received content (e.g., songs, speeches)
associated with television shows, movies, games and other video sources. The content may also be individually and/or collectively referred to as media or
multimedia content. In some embodiments, the content is delivered and/or
streamed to a user device such as, for example, a television or another type of consumer electronic (CE) device. Some of these embodiments advantageously
link information about the audio portion of the content to program guide type information to provide associated content, programs and metadata to users.
Exemplary aspects and embodiments are now described in more detail herein in terms of an Internet-connected television, consumer electronic device, and/or another type of user device which executes program code to recognize the audio portion of specific content while the content is playing and/or is delivered. In an implementation, the content is delivered via streaming. These implementations advantageously retrieve program guide information and metadata from a remote recognition server. This is for convenience only and is not intended to limit the
application of the present description. In fact, after reading the following
description, it will be apparent to one skilled in the relevant art(s) how to
implement the following invention in alternative embodiments such as, for example, by using a local area network, by using a broadcast network to receive broadcast data while communicating requests via a back-channel, etc.
Definitions
[0018] The terms "multimedia program", "show", "program", "multimedia
content" and the like, are generally understood to include television shows,
movies, games and videos of various types.
[0019] "Electronic program guide" or "EPG" data provides a digital guide for scheduled broadcast television, typically displayed on-screen, and can be used to allow a viewer to navigate, select, and discover content by time, title, channel, genre, etc. by use of their remote control, a keyboard, or other similar input
devices. In addition, EPG data information can be used to schedule future
recording by a digital video recorder (DVR) or personal video recorder (PVR).
[0020] Some additional terms are defined below in alphabetical order for easy reference. These terms are not rigidly restricted to these definitions. A term may be further defined by its use in other sections of this description.
[0021] "Album" means a collection of tracks. An album is typically originally published by an established entity, such as a record label (e.g., a recording
company such as Warner Brothers and Universal Music).
[0022] "Audio Fingerprint" (e.g., "fingerprint", "acoustic fingerprint", "digital fingerprint") is a digital measure of certain acoustic properties that is
deterministically generated from an audio signal that can be used to identify an audio sample and/or quickly locate similar items in an audio database. An audio fingerprint typically operates as a unique identifier for a particular item, such as, for example, a CD, a DVD and/or a Blu-ray Disc. The term "identifier" is defined below. An audio fingerprint is an independent piece of data that is not affected by metadata. Macrovision® has databases that store over 25 million unique
fingerprints for various audio samples. Practical uses of audio fingerprints include without limitation identifying songs, identifying records, identifying melodies, identifying tunes, identifying advertisements, monitoring radio broadcasts,
monitoring multipoint and/or peer-to-peer networks, managing sound effects
libraries and identifying video files.
[0023] "Audio Fingerprinting" is the process of generating an audio fingerprint.
U.S. Patent No. 7,277,766, entitled "Method and System for Analyzing Digital
Audio Files", which is herein incorporated by reference, provides an example of an apparatus for audio fingerprinting an audio waveform. U.S. Patent No. 7,451,078, entitled "Methods and Apparatus for Identifying Media Objects", which is herein incorporated by reference, provides an example of an apparatus for generating an audio fingerprint of an audio recording.
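To make the idea of a deterministically generated fingerprint concrete, here is a deliberately simplified toy in Python. It is not the method of either patent cited above, and real fingerprints use far more robust acoustic features; `toy_fingerprint`, the frame size, and the sample values are assumptions made purely for illustration. The sketch emits one bit per pair of adjacent frames, set when the later frame has more energy than the earlier one.

```python
from typing import Sequence


def toy_fingerprint(samples: Sequence[float], frame_size: int = 4) -> str:
    """Return a bit string: '1' where a frame's energy exceeds the previous frame's."""
    # Energy (sum of squares) of each complete, non-overlapping frame.
    energies = [
        sum(s * s for s in samples[i:i + frame_size])
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    # Compare each frame with its predecessor.
    return "".join("1" if b > a else "0" for a, b in zip(energies, energies[1:]))


# Hypothetical 16-sample signal: quiet, loud, quiet, louder.
signal = [0.0, 0.1, 0.0, 0.1,
          0.5, 0.6, 0.5, 0.6,
          0.1, 0.0, 0.1, 0.0,
          0.9, 0.8, 0.9, 0.8]

fp = toy_fingerprint(signal)                      # "101"
louder = toy_fingerprint([2 * s for s in signal])  # same pattern at higher volume
```

Because the bits depend only on the samples, the same audio always yields the same fingerprint regardless of any attached metadata, and in this toy a uniform change in volume leaves the bits unchanged.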
[0024] "Blu-ray", also known as Blu-ray Disc, means a disc format jointly
developed by the Blu-ray Disc Association, and personal computer and media manufacturers including Apple, Dell, Hitachi, HP, JVC, LG, Mitsubishi,
Panasonic, Pioneer, Philips, Samsung, Sharp, Sony, TDK and Thomson. The format was developed to enable recording, rewriting and playback of high-definition (HD) video, as well as storing large amounts of data. The format offers more than five times the storage capacity of conventional DVDs and can hold 25 GB on a single-layer disc and 800 GB on a 20-layer disc. More layers and more storage capacity may be feasible as well. This extra capacity combined with the use of advanced audio and/or video codecs offers consumers an unprecedented HD experience. While current disc technologies, such as CD and DVD, rely on a red laser to read and write data, the Blu-ray format uses a blue-violet laser instead, hence the name Blu-ray. The benefit of using a blue-violet laser (405 nm) is that it has a shorter wavelength than a red laser (650 nm). A shorter wavelength makes it
possible to focus the laser spot with greater precision. This added precision allows data to be packed more tightly and stored in less space. Thus, it is possible to fit substantially more data on a Blu-ray Disc even though a Blu-ray Disc may have substantially similar physical dimensions as a traditional CD or DVD.
[0025] "Chapter" means an audio and/or video data block on a disc, such as a Blu-ray Disc, a CD or a DVD. A chapter stores at least a portion of an audio and/or video recording.
[0026] "Compact Disc" (CD) means a disc used to store digital data. A CD was originally developed for storing digital audio. Standard CDs have a diameter of
120 mm and can typically hold up to 80 minutes of audio. There is also the mini-CD, with diameters ranging from 60 to 80 mm. Mini-CDs are sometimes used for CD singles and typically store up to 24 minutes of audio. CD technology has been adapted and expanded to include without limitation data storage CD-ROM, write-once audio and data storage CD-R, rewritable media CD-RW, Super Audio CD
(SACD), Video Compact Discs (VCD), Super Video Compact Discs (SVCD),
Photo CD, Picture CD, Compact Disc Interactive (CD-i), and Enhanced CD. The wavelength used by standard CD lasers is 780 nm, which lies in the near-infrared just beyond visible red light.
[0027] "Database" means a collection of data organized in such a way that a
computer program may quickly select desired pieces of the data. A database is an electronic filing system. In some implementations, the term "database" may be used as shorthand for "database management system".
[0028] "Device" means software, hardware or a combination thereof. A device may sometimes be referred to as an apparatus. Examples of a device include
without limitation a software application such as Microsoft Word®, a laptop
computer, a database, a server, a display, a computer mouse, and a hard disk.
[0029] "Digital Video Disc" (DVD) means a disc used to store digital data. A
DVD was originally developed for storing digital video and digital audio data.
Most DVDs have substantially similar physical dimensions as compact discs
(CDs), but DVDs store more than six times as much data. There is also the mini-DVD, with diameters ranging from 60 to 80 mm. DVD technology has been
adapted and expanded to include DVD-ROM, DVD-R, DVD+R, DVD-RW,
DVD+RW and DVD-RAM. The wavelength used by standard DVD lasers is approximately 650 nm, and thus the light of a standard DVD laser typically has a red color.
[0030] "Fuzzy search" (e.g., "fuzzy string search", "approximate string search") means a search for text strings that approximately or substantially match a given text string pattern. Fuzzy searching may also be known as approximate or inexact matching. An exact match may inadvertently occur while performing a fuzzy search.
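As a small illustration of fuzzy string search, the sketch below uses `difflib.get_close_matches` from the Python standard library to find titles that approximately match a misspelled query. The helper `fuzzy_search`, the title list, and the 0.6 similarity cutoff are assumptions chosen for this example; the description does not specify a particular matching algorithm.

```python
from difflib import get_close_matches
from typing import List, Sequence

# Hypothetical catalog of song titles.
titles = ["Yesterday", "Hey Jude", "Let It Be", "Yellow Submarine"]


def fuzzy_search(query: str, candidates: Sequence[str], cutoff: float = 0.6) -> List[str]:
    """Return up to three candidates that approximately match `query`, best first."""
    # Compare case-insensitively, then map hits back to the original spelling.
    lowered = {c.lower(): c for c in candidates}
    hits = get_close_matches(query.lower(), list(lowered), n=3, cutoff=cutoff)
    return [lowered[h] for h in hits]


matches = fuzzy_search("Yesterdya", titles)   # transposed letters in the query
exact = fuzzy_search("yesterday", titles)     # an exact match can also occur
```

Here the misspelled query still retrieves "Yesterday", and a correctly spelled query returns the same title, which shows how an exact match can occur during a fuzzy search.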
[0031] "Signature" means an identifying means that uniquely identifies an item, such as, for example, a track, a song, an album, a CD, a DVD and/or Blu-ray Disc, among other items. Examples of a signature include without limitation the
following in a computer-readable format: an audio fingerprint, a portion of an audio fingerprint, a signature derived from an audio fingerprint, an audio signature, a video signature, a disc signature, a CD signature, a DVD signature, a Blu-ray
Disc signature, a media signature, a high definition media signature, a human
fingerprint, a human footprint, an animal fingerprint, an animal footprint, a
handwritten signature, an eye print, a biometric signature, a retinal signature, a retinal scan, a DNA signature, a DNA profile, a genetic signature and/or a genetic profile, among other signatures. A signature may be any computer-readable string of characters that comports with any coding standard in any language. Examples of a coding standard include without limitation alphabet, alphanumeric, decimal, hexadecimal, binary, American Standard Code for Information Interchange
(ASCII), Unicode and/or Universal Character Set (UCS). Certain signatures may not initially be computer-readable. For example, latent human fingerprints may be printed on a door knob in the physical world. A signature that is initially not
computer-readable may be converted into a computer-readable signature by using any appropriate conversion technique. For example, a conversion technique for converting a latent human fingerprint into a computer-readable signature may include a ridge characteristics analysis.
[0032] "Link" means an association with an object or an element in memory. A link is typically a pointer. A pointer is a variable that contains the address of a location in memory. The location is the starting point of an allocated object, such
as an object or value type, or the element of an array. The memory may be located on a database or a database system. "Linking" means associating with (e.g.,
pointing to) an object in memory.
[0033] "Metadata" generally means data that describes data. More particularly, metadata may be used to describe the contents of digital recordings. Such
metadata may include, for example, a track name, a song name, artist information (e.g., name, birth date, discography), album information (e.g., album title, review, track listing, sound samples), relational information (e.g., similar artists and
albums, genre) and/or other types of supplemental information such as
advertisements, links or programs (e.g., software applications), and related images.
Metadata may also include a program guide listing of the songs or other audio content associated with multimedia content. Conventional optical discs (e.g., CDs, DVDs, Blu-ray Discs) do not typically contain metadata. Metadata may be
associated with a digital recording (e.g., song, album, movie or video) after the digital recording has been ripped from an optical disc, converted to another digital audio format and stored on a hard drive.
[0034] "Network" means a connection between any two or more computers, which permits the transmission of data. A network may be any combination of networks, including without limitation the Internet, a local area network, a wide area network, a wireless network and a cellular network.
[0035] "Occurrence" means a copy of a recording. An occurrence is preferably an exact copy of a recording. For example, different occurrences of a same pressing are typically exact copies. However, an occurrence is not necessarily an exact copy of a recording, and may be a substantially similar copy. A recording may be an inexact copy for a number of reasons, including without limitation an
imperfection in the copying process, different pressings having different settings, different copies having different encodings, and other reasons. Accordingly, a recording may be the source of multiple occurrences that may be exact copies or substantially similar copies. Different occurrences may be located on different devices, including without limitation different user devices, different MP3 players, different databases, different laptops, and so on. Each occurrence of a recording may be located on any appropriate storage medium, including without limitation
floppy disk, mini disk, optical disc, Blu-ray Disc, DVD, CD-ROM, micro-drive, magneto-optical disk, ROM, RAM, EPROM, EEPROM, DRAM, VRAM, flash memory, flash card, magnetic card, optical card, nanosystems, molecular memory integrated circuit, RAID, remote data storage/archive/warehousing, and/or any other type of storage device. Occurrences may be compiled, such as in a database or in a listing.
[0036] "Pressing" (e.g., "disc pressing") means producing a disc in a disc press from a master. The disc press preferably includes a laser beam having a wavelength of about 650 nm for DVD or about 405 nm for Blu-ray Disc.
[0037] "Recording" means media data for playback. A recording is preferably a computer readable digital recording and may be, for example, an audio track, a video track, a song, a chapter, a CD recording, a DVD recording and/or a Blu-ray Disc recording, among other things.
[0038] "Server" means a software application that provides services to other
computer programs (and their users), in the same or other computer. A server may also refer to the physical computer that has been set aside to run a specific server application. For example, when the software Apache HTTP Server is used as the web server for a company's website, the computer running Apache is also called the web server. Server applications can be divided among server computers over an extreme range, depending upon the workload.
[0039] "Software" means a computer program that is written in a programming language that may be used by one of ordinary skill in the art. The programming language chosen should be compatible with the computer by which the software application is to be executed and, in particular, with the operating system of that computer. Examples of suitable programming languages include without
limitation Object Pascal, C, C++ and Java. Further, the functions of some
embodiments, when described as a series of steps for a method, could be
implemented as a series of software instructions for being operated by a processor, such that the embodiments could be implemented as software, hardware, or a
combination thereof. Computer readable media are discussed in more detail in a separate section below.
[0040] "Song" means a musical composition. A song is typically recorded onto a track by a record label (e.g., recording company). A song may have many different versions, for example, a radio version and an extended version.
[0041] "System" means a device or multiple coupled devices. A device is defined above.
[0042] "Track" means an audio/video data block. A track may be on a disc, such as, for example, a Blu-ray Disc, a CD or a DVD.
[0043] "User" means a consumer, client, and/or client device in a marketplace of products and/or services.
[0044] "User device" (e.g., "client", "client device", "user computer") is a
hardware system, a software operating system and/or one or more software
application programs. A user device may refer to a single computer or to a
network of interacting computers. A user device may be the client part of a client-server architecture. A user device typically relies on a server to perform some operations. Examples of a user device include without limitation a television, a
CD player, a DVD player, a Blu-ray Disc player, a personal media device, a
portable media player, an iPod®, a Zoom Player, a laptop computer, a palmtop computer, a smart phone, a cell phone, a mobile phone, an MP3 player, a digital audio recorder, a digital video recorder, an IBM-type personal computer (PC) having an operating system such as Microsoft Windows®, an Apple® computer having an operating system such as MAC-OS, hardware having a JAVA-OS
operating system, and a Sun Microsystems Workstation having a UNIX operating system.
[0045] "Web browser" means any software program which can display text, graphics, or both, from Web pages on Web sites. Examples of a Web browser include without limitation Mozilla Firefox® and Microsoft Internet Explorer®.
[0046] "Web page" means any document written in a mark-up language including without limitation HTML (hypertext mark-up language) or VRML (virtual reality modeling language), dynamic HTML, XML (extensible mark-up language) or
related computer languages thereof, as well as to any collection of such documents reachable through one specific Internet address or at one specific Web site, or any document obtainable through a particular URL (Uniform Resource Locator).
[0047] "Web server" refers to a computer or other electronic device which is capable of serving at least one Web page to a Web browser. An example of a Web server is a Yahoo® Web server.
[0048] "Web site" means at least one Web page, and more commonly a plurality of Web pages, virtually coupled to form a coherent group.
System Architecture
[0049] FIG. 1a is a system diagram of an exemplary audio recognition and synchronization system 100 in which an embodiment is implemented. As shown in FIG. 1a, system 100 includes at least one content source 102 that provides multimedia content and a metadata database 106 that contains supplemental content associated with an audio portion of a multimedia stream (e.g., audio metadata). As will be explained in more detail below, metadata database 106 can also be a repository for both program metadata and audio metadata that have been associated.
[0050] A guide database 108 provides EPG data associated with a multimedia program. As shown in FIG. 1a, guide database 108 provides the EPG data to a user device 104 for content and/or media, such as a television, an audio device, a video device, and/or another type of user and/or consumer electronic (CE) device.
Guide database 108 also stores program metadata that may not be communicated directly to the user device 104.
[0051] As shown in FIG. 1a, metadata database 106 and guide database 108 are linked. In one embodiment, this link is initiated from within the user device 104.
A request packet from the user device 104 causes a remote server (110 illustrated in Figure 2) to associate the audio data to a program for the purpose of retrieving metadata about the program. In some embodiments, this association is a logical association and/or link. It should be understood, however, that a link between entries within the metadata database 106 and entries within the guide database 108 may be physical and still be within the scope of the invention.
[0052] A program identifier (Prog_ID) corresponding to the multimedia content such as, for example, a television program being tuned-in from a content source
102, is provided to the user device 104 by the guide database 108. The user device
104 performs an algorithm on the audio content of the multimedia content to generate an audio fingerprint (FP) or extract a watermark, which in turn is communicated to a recognition server via a network 124 such as the Internet. The recognition server includes or is in communication with the metadata database 106. The recognition server of some embodiments is further described in relation to Figure 2. A search of the metadata database 106 is performed to look up an audio identifier (Audio_ID) associated with the audio portion of the content received by the user device 104 from the content source 102 based on the audio fingerprint (FP). Once identified, the audio identifier (Audio_ID) together with a program identifier (Prog_ID) are used to make a logical link between entries within the metadata database 106 and the guide database 108.
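The lookup and linking described in paragraph [0052] can be illustrated with a minimal sketch. The class name `RecognitionServer`, its fields, and the exact-match dictionary are assumptions made only for illustration; the specification does not prescribe a data structure, and a practical fingerprint match would typically be approximate rather than exact.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class RecognitionServer:
    """Hypothetical stand-in for a recognition server backed by metadata database 106."""

    # Maps an audio fingerprint (FP) to an audio identifier (Audio_ID).
    # Assumption: exact-match lookup; real fingerprint matching is usually fuzzy.
    fingerprint_index: Dict[str, str] = field(default_factory=dict)
    # Logical links between a program identifier and an audio identifier.
    links: Set[Tuple[str, str]] = field(default_factory=set)

    def lookup(self, fp: str, prog_id: str) -> Optional[str]:
        """Return the Audio_ID for fp and record a (Prog_ID, Audio_ID) link if found."""
        audio_id = self.fingerprint_index.get(fp)
        if audio_id is not None:
            self.links.add((prog_id, audio_id))
        return audio_id
```

A caller would pass the fingerprint generated on the user device together with the program identifier supplied by the guide database; a return value of `None` corresponds to an unrecognized fingerprint.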
[0053] Preferably, only a subset of the audio portion is used to generate the
fingerprint (FP). In one example, a fingerprinting procedure is executed by a
processor on encoded or compressed audio data which has been converted into a stereo pulse code modulated (PCM) audio stream. Pulse code modulation is a format by which many consumer electronic products operate and internally
compress and/or uncompress audio data. Embodiments of the invention are
advantageously performed on any type of audio data file or stream, and therefore are not limited to operations on PCM formatted audio streams. Accordingly, any memory size, number of frames, sampling rates, time, and the like, used to perform audio fingerprinting are within the scope of the present invention.
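As one possible illustration of fingerprinting only a subset of a stereo PCM stream, the sketch below prepares the audio that a fingerprinting routine might consume. The 16-bit little-endian interleaved sample layout, the mono downmix, and the function names are assumptions; the specification leaves the sample format, frame count, and sampling rate open.

```python
import struct
from typing import List


def stereo_pcm_to_mono(pcm: bytes) -> List[int]:
    """Average interleaved left/right 16-bit little-endian samples into one channel."""
    frame_count = len(pcm) // 4  # 2 channels x 2 bytes per sample
    samples = struct.unpack("<%dh" % (frame_count * 2), pcm[: frame_count * 4])
    return [(samples[i] + samples[i + 1]) // 2 for i in range(0, len(samples), 2)]


def select_subset(mono: List[int], sample_rate: int,
                  start_s: float, duration_s: float) -> List[int]:
    """Return only the samples inside the requested time window."""
    start = int(start_s * sample_rate)
    return mono[start: start + int(duration_s * sample_rate)]
```

The resulting window of samples would then be handed to whichever fingerprinting algorithm the device uses.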
[0054] FIG. 1b is a block diagram of an example home network in which some embodiments are implemented. On the home network may be a variety of user devices, such as a network ready television 104a, a personal computer 104b, a gaming device 104c, a digital video recorder 104d, other devices 104e, and the like. User devices 104a-104e may receive multimedia content from content sources 102 through multimedia signal lines 130, through an input interface such as the input interface 208 described below in connection with FIG. 2. In addition, user devices 104a-104e may communicate with each other through a wired or wireless router 120 via network connections 132, such as Ethernet. The router 120 connects the user devices 104a-104e to the network 124, such as the Internet, through a modem 122. In an alternative embodiment, content sources 102 are delivered from the network 124.
[0055] FIG. 2 includes a more detailed diagram of the user device 104 of some embodiments. As shown in FIG. 2, the exemplary user device 104 includes a processor 212 which is coupled through a communication infrastructure (not
shown) to an output component via output interface 206, a communications
interface 210, a memory 214, a storage device 216, a remote control interface 218, and an input interface 208.
[0056] The input interface 208 receives content such as in the form of audio and video streams from the content sources 102, which communicate, for example, through an HDMI (High-Definition Multimedia Interface), Radio Frequency (RF) coaxial cable, composite video, S-Video, SCART, component video, D-Terminal, VGA, and the like, to the user device 104. The content sources 102 include set-top boxes, Blu-ray Disc players, personal computers (PCs), video game consoles such as the PlayStation 3 and the Xbox 360, for example, and A/V receivers, and the like. The content sources 102 provide a program identifier for the movie, show or game, which is stored in a memory 214.
[0057] In the example shown in FIG. 2, video signals received by the input
interface 208 from such content sources 102 are coupled directly to the output interface 206. Audio signals are communicated to the processor 212 for further processing. The processor 212 performs audio fingerprinting on at least a subset of the audio portion of the received content and requests metadata from one or more remote servers. As described in more detail below with respect to FIG. 3, the metadata are preferably requested based on a generated audio fingerprint (FP) and/or the program identifier.
[0058] The user device 104 also includes a main memory 214. Preferably main memory 214 is random access memory (RAM). The user device 104 may also include a storage device 216. The storage device 216 (also sometimes
referred to as "secondary memory") may include, for example, a hard disk drive and/or a removable storage drive, representing a disk drive, a magnetic tape drive, an optical disk drive, etc. As will be appreciated, storage device 216 may
include a computer-readable storage medium having stored thereon computer software and/or data.
[0059] In alternative embodiments, storage device 216 may include other similar devices for allowing computer programs or other instructions to be loaded into the user device 104. Such devices may include, for example, a removable storage unit and an interface. Examples of such may include a program cartridge and cartridge interface such as that found in video game devices, a removable memory chip such as an erasable programmable read only memory (EPROM), or
programmable read only memory (PROM) and associated socket, and other
removable storage units and interfaces, which allow software and data to be transferred from the removable storage unit to the user device 104.
[0060] The user device 104 includes the communications interface 210 to provide connectivity to a network 124 such as the Internet. The communications interface 210 also allows software and data to be transferred between the user device 104 and external devices. Examples of the communications interface 210 may include a modem, a network interface such as an Ethernet card, a communications port, a Personal Computer Memory Card International Association (PCMCIA) slot and card, etc. Software and data transferred via the communications interface 210 are in the form of signals which may be electronic, electromagnetic, optical or other signals capable of being received by the communications interface 210.
These signals are provided to the communications interface 210 via a communications path, e.g., a channel, from, for example, one or more recognition servers 110. This channel carries signals and may be implemented by using wire or cable, fiber optics, a telephone line, a cellular link, an RF link and other communications channels.
[0061] A remote control interface 218 decodes signals received from a remote control 204, e.g., a television remote control or other input device such as a keyboard, and communicates the decoded signals to processor 212. The decoded signals, in turn, are translated and processed by the processor 212.
[0062] As shown in FIG. 2, the recognition servers 110 may also be in communication with a statistics database 220 and a guide database 108. The statistics database 220 and/or guide database 108 may also be in communication directly with the metadata database 106. In addition, the metadata database 106 may be part of or remote from the recognition servers 110.
[0063] FIG. 3 is a ladder diagram showing an example procedure for associating a program identifier (Prog_ID) with an audio identifier (Audio_ID) and returning metadata associated with a song. Referring to both FIGs. 2 and 3, initially, the user device 104 receives a command to initiate a lookup by, for example, a remote control 204. Next, the input interface 208 captures a sample of the audio stream from a content source 102, and feeds the audio stream such as a PCM audio
stream, for example, to a processor 212, which performs an audio recognition process on the captured audio. Particularly, the processor 212 analyzes the
captured audio to generate an audio fingerprint (FP).
[0064] It should be understood that different audio fingerprinting algorithms may be executed by the processor 212 to generate audio fingerprints and that the audio fingerprints may be different. Two exemplary audio fingerprinting algorithms are described in U.S. Patent 7,451,078, entitled "Methods and Apparatus for
Identifying Media Objects", filed December 30, 2004, and U.S. Patent 7,277,766, entitled "Method and System for Analyzing Digital Audio Files", filed October 24, 2000, both of which are hereby incorporated by reference herein in their entirety.
Similarly, instead of audio fingerprinting captured audio, other audio identification techniques can be used. For example a watermark embedded into the audio stream or a tag inserted in the audio stream can be used as an identifier, e.g., the
Audio_ID.
[0065] Once an audio fingerprint (FP) or other identifier has been generated by the processor 212, the audio fingerprint (FP) and program identifier (Prog_ID) are transmitted to one or more recognition server(s) 110. The recognition server 110 is also referred to more generally as a back-end server. The recognition server 110, in turn, performs a lookup of an audio identifier (Audio_ID) associated with the audio portion of the content, such as, for example, a song being played, based on the audio fingerprint (FP) of the song. Metadata about the audio portion of the content are also retrieved from the metadata database 106.
[0066] The program identifier (Prog_ID) is transmitted to the guide database 108.
In turn, the guide database 108 returns program metadata including information
about an audio portion of the received content and/or audio metadata. The guide database 108 of some embodiments returns the metadata in one or more datagrams and/or packets. For instance, the audio metadata and the program metadata are returned within the same packet or in separate packets. The packet transmitted by the guide database 108 to the recognition server 110 is a return packet from an original request. Accordingly, the metadata carried in the packet is preferably appropriately matched based on identifying information provided in a field of the packet which is examined and recognized by the other servers, databases and/or devices on the network 124. This identifying field may be the program identifier (Prog_ID) or other identifier initially provided by the user device 104, and/or generated by the processor 212 or the communications interface 210, for example.
The recognition server 110 transmits onto the network 124 the audio identifier
(Audio_ID) with the metadata to the user device 104, particularly to the processor 212 via the communications interface 210.
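The matching of a return packet to its original request, as described in paragraph [0066], might be sketched as follows. The `Packet` and `PendingRequests` names and the dictionary payloads are hypothetical; the specification states only that an identifying field, such as the program identifier, is carried in the packet.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Packet:
    request_key: str  # identifying field, e.g. the Prog_ID (assumed to be a string)
    payload: dict     # metadata carried by the packet


class PendingRequests:
    """Tracks outstanding requests so that return packets can be matched to them."""

    def __init__(self) -> None:
        self._pending: Dict[str, dict] = {}

    def send(self, request_key: str, context: dict) -> Packet:
        """Remember the request context and build the outgoing packet."""
        self._pending[request_key] = context
        return Packet(request_key, {})

    def receive(self, packet: Packet) -> Optional[dict]:
        """Merge returned metadata with the stored context, or None if unmatched."""
        context = self._pending.pop(packet.request_key, None)
        if context is None:
            return None
        return {**context, **packet.payload}
```

Keying on a single field keeps the example short; a device that issues several lookups for the same program would presumably need a finer-grained key.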
[0067] The processor 212 stores metadata in memory 214 and displays the
metadata through an output interface 206. In one embodiment, the output interface 206 presents the metadata as an overlay of the video received from the content source 102, which is being displayed on the television or the user device 104.
[0068] The same procedure discussed above may be performed until the audio portion of the content is recognized. Thus, if an audio fingerprint of a captured audio portion of the content is precise enough to return metadata, the procedure ends. In some cases, it is desirable to capture additional audio content from the content source 102. For example, the audio fingerprint may not be sufficiently robust for the recognition server 110 to match it to an audio identifier (Audio_ID). In such a case, the return packet from the recognition server 110 may be inconclusive, e.g., the return packet returns a null audio identifier (Audio_ID). This may have various causes. One example is that audio content was mixed with voice-over or sound effects in a received multimedia content stream.
[0069] To avoid, as best as possible, an inconclusive or erroneous result, additional audio content is preferably captured. This provides the recognition procedure
executed by the processor 212 with more audio information, resulting in a more robust audio fingerprint. In some cases, multiple fingerprints are associated with the audio
rendering. By capturing additional data, the fingerprint algorithm may generate
different fingerprints for the same audio portion or subset of the audio portion.
Different fingerprints may be generated based on the length of the captured segment or from where within the audio stream the audio capturing took place. In other words, the processor 212 detects a time-based offset location of the multimedia content
corresponding to the audio fingerprint and transmits the location onto the network to, for example, a remote recognition server.
[0070] As shown in FIG. 3, the processor 212 may initiate an additional lookup. This causes additional audio to be captured by the input interface 208. Alternatively, this additional information is extracted from memory 214 or storage 216 if the audio
stream has been buffered.
[0071] The processor 212 performs audio recognition on the additional information.
Particularly, the additional audio information may be added to the audio information previously captured, to make the total captured segment longer. Alternatively, a
different start and stop time within the captured audio portion, e.g., within a song, may be used to generate the audio fingerprint. In yet another embodiment, the processor 212 is programmed to adjust the total audio capture time.
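The additional-lookup behavior of paragraphs [0068] through [0071] can be sketched as a loop over capture windows, where each window is either longer than the last or starts at a different offset. The function name and the three callables are hypothetical placeholders for the input interface, the fingerprinting algorithm, and the recognition server; none of them is defined by the specification.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple

Window = Tuple[float, float]  # (start offset in seconds, duration in seconds)


def recognize_with_retries(
    capture: Callable[[float, float], Sequence[int]],
    fingerprint: Callable[[Sequence[int]], str],
    lookup: Callable[[str], Optional[str]],
    windows: Iterable[Window],
) -> Optional[str]:
    """Try each capture window in order and stop at the first recognized Audio_ID."""
    for start_s, duration_s in windows:
        audio = capture(start_s, duration_s)
        audio_id = lookup(fingerprint(audio))
        if audio_id is not None:  # a null Audio_ID means the result was inconclusive
            return audio_id
    return None
```

Passing windows such as `[(0.0, 5.0), (0.0, 10.0), (5.0, 10.0)]` would first lengthen the captured segment and then shift its start time.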
[0072] The different audio capture times may be prestored or based on an analysis of prior lookup results. Alternatively, this analysis is performed offline by, for example, a statistics server database 220, and the new capture time may be downloaded by the processor 212 through the communications interface 210 during an update.
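One way an analysis of prior lookup results could yield a new capture time is sketched below: pick the shortest capture duration whose historical success rate meets a threshold. The selection rule, the `history` layout, and the 0.9 default are assumptions for illustration only; the specification does not say how the analysis is performed.

```python
from typing import Dict, Optional, Tuple


def choose_capture_time(
    history: Dict[float, Tuple[int, int]],
    min_success_rate: float = 0.9,
) -> Optional[float]:
    """Return the shortest capture time (seconds) meeting the success threshold.

    history maps a capture duration to (successful lookups, total lookups).
    """
    for seconds in sorted(history):
        successes, total = history[seconds]
        if total > 0 and successes / total >= min_success_rate:
            return seconds
    return None
```

Such a value could be computed offline and downloaded by the device during an update, or computed on the device from its own lookup results.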
[0073] Once a new or additional fingerprint is generated, the processor 212 transmits it to the recognition server 110 along with the program identifier (Prog_ID). In turn, the recognition server 110 performs a lookup based on the fingerprint (FP) for an audio identifier (Audio_ID). The recognition server 110 transmits the audio identifier (Audio_ID) along with the program identifier (Prog_ID) to metadata database 106, which associates the program identifier and the audio identifier, and uses this
information to locate metadata within the metadata database 106 related to the audio identifier (Audio_ID) and/or the program identifier (Prog_ID).
[0074] The program identifier (Prog_ID) is transmitted to the guide database 108.
In turn, the guide database 108 returns program metadata including information about the audio portion of received content such as, for example, one or more
recognizable song(s) within a multimedia stream. The metadata database 106 then returns the metadata along with the audio identifier (Audio_ID) to the processor 212 through the recognition server 110. As described above, other information, if
necessary, may be transmitted within the packets for use by either the recognition server 110 or the processor 212 to match the initial request to the metadata.
[0075] The capture of additional audio information may be performed without a lookup request from the remote control 204. Similarly, it can be performed with or without a request for additional information from the metadata database 106 or the recognition server 110. In other words, the additional capture procedure may be set to run until the processor 212 stops performing the additional audio capture. In this embodiment, it is not necessary for the metadata database 106 or the recognition
server 110 to notify the user device 104, which advantageously reduces the amount of time between the initial lookup request and the return of metadata.
[0076] By performing the additional lookup, several audio identifiers may be returned to the processor 212. These several audio identifiers may be the same or different.
The processor 212 may then perform a comparison of the received several audio
identifiers to determine if the correct metadata has been received and delete any
duplicates. This allows the processor 212 to make the decision as to whether it needs to capture additional audio content from the content source 102 or whether to use audio content stored in its buffer such as, for example, the memory 214. In another example embodiment, the processor 212 may control the amount of audio information to capture based on the returned audio identifier data. For example, if the first audio identifier found has one value, e.g., corresponding to one rendition of a particular song, and the second audio identifier found by the recognition server 110 has a different value, e.g., for a different rendition of the same song, then the processor 212 may generate the fingerprint based on a longer segment, based on a completely different segment, on various segments, and the like.
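The comparison of several returned audio identifiers in paragraph [0076] might look like the following sketch. The rule that exactly one distinct identifier counts as a conclusive result is an assumption; the specification says only that the processor compares the identifiers, deletes duplicates, and decides whether more audio is needed.

```python
from typing import List, Optional, Tuple


def reconcile_audio_ids(audio_ids: List[Optional[str]]) -> Tuple[List[str], bool]:
    """Return (unique non-null IDs in first-seen order, whether more audio is needed)."""
    unique: List[str] = []
    for audio_id in audio_ids:
        if audio_id is not None and audio_id not in unique:
            unique.append(audio_id)
    # Assumption: zero IDs (nothing recognized) or several conflicting IDs
    # (e.g. different renditions of one song) both call for more audio.
    need_more_audio = len(unique) != 1
    return unique, need_more_audio
```

When more audio is needed, the processor could generate the next fingerprint from a longer segment or from a different segment.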
[0077] Although not shown, in an alternative embodiment, the recognition server 110 may also send back the audio identifier to the user device 104 concurrently with
sending the audio identifier (Audio_ID) to the metadata database 106. In some cases, the user device 104 sends and receives multiple audio fingerprints and audio identifiers before receiving a packet from the metadata database 106 with the metadata
information. This could be used to assist the processor 212 in making a determination whether to inhibit or allow the metadata to be presented through the output interface 206.
[0078] FIG. 4 illustrates an exemplary record 400 for a particular program identifier (Prog_ID), which in one embodiment is generated by the recognition server 110.
Additional metadata may also be contained in this record 400. More particularly, information in this record 400 is obtained from a combination of data received from the user device 104, the metadata database 106, the guide database 108 and/or the statistics database 220. In one embodiment, this information is associated by the recognition server 110. For example, the program identifier (Prog_ID) of the show or movie received by the user device 104, metadata from the metadata database 106 and statistics from the statistics database 220 are associated and stored as records, e.g., the record 400, in the metadata database 106.
[0079] In the example record 400 shown in FIG. 4, the record 400 includes the name of each song 402 in the show or movie, the location for each song within the show or movie 404, an interest level 406 by the user for the song, and the audio identifier
(Audio_ID) 408 for each song. The interest level data is just one type of metric based on gathered information. Other example metrics include popularity, time-based
distribution of user "clicks", and volume of "clicks" indicating, for example, raw
popularity, to name a few. Additional information may be included in this record 400 or may be retrieved separately from another database based on the audio identifier
(Audio_ID), the name of the song, and/or the program identifier (Prog_ID).
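A record of the kind shown in FIG. 4 could be modeled as below. The field types, the use of seconds for the song location, and the `song_at` helper are assumptions added for illustration; the specification lists only the kinds of information the record holds.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SongEntry:
    name: str            # name of the song
    location_s: float    # location within the show or movie (assumed: seconds)
    interest_level: int  # interest metric (assumed: an integer score)
    audio_id: str        # audio identifier (Audio_ID)


@dataclass
class ProgramRecord:
    prog_id: str  # program identifier (Prog_ID)
    songs: List[SongEntry] = field(default_factory=list)

    def song_at(self, offset_s: float) -> Optional[SongEntry]:
        """Return the most recently started song at the given program offset."""
        started = [s for s in self.songs if s.location_s <= offset_s]
        return max(started, key=lambda s: s.location_s) if started else None
```

A single program identifier can hold several song entries, matching the statement that a program may be associated with several songs.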
[0080] As shown in FIG. 2, the statistics database 220 and the metadata database 106 may communicate with each other. Thus, information from the statistics database 220 may also be collected and associated by the metadata database 106 and the associated data may be transmitted by the metadata database 106 to the recognition server 110 directly. As shown in FIG. 4, the program identifier (Prog_ID) may be associated with several songs.
Exemplary Computer Readable Medium Implementation
[0081] The example embodiments described above (such as, for example, the systems 100, 200, the process 300 or any part(s) or function(s) thereof) may be implemented by using hardware, software or a combination thereof and may be implemented in one or more computer systems or other processing systems.
However, the manipulations performed by these example embodiments were often referred to in terms, such as entering, which are commonly associated with mental operations performed by a human operator. No such capability of a human
operator is necessary in any of the operations described herein. For example, the user device 104 may automatically initiate the lookup without a viewer's input through the remote control 204. In other words, the operations may be completely implemented with machine operations. Useful machines for performing the
operation of the example embodiments presented herein include general purpose digital computers or similar devices.
[0082] FIG. 5 is a high-level block diagram of a general/special purpose computer system 500, in accordance with some embodiments. The computer system 500 may be, for example, a user device, a user computer, a client computer and/or a server computer, among other things.
[0083] Examples of a user device include without limitation a television, a Blu-ray Disc player, a personal media device, a portable media player, an iPod®, a Zoom Player, a laptop computer, a palmtop computer, a smart phone, a cell phone, a mobile phone, an MP3 player, a digital audio recorder, a digital video recorder, a CD player, a DVD player, an IBM-type personal computer (PC) having an operating system such as Microsoft Windows®, an Apple® computer having an operating system such as MAC-OS, hardware having a JAVA-OS operating system, and a Sun Microsystems Workstation having a UNIX operating system.
[0084] The computer system 500 preferably includes without limitation a
processor device 510, a main memory 525, and an interconnect bus 505. The processor device 510 may include without limitation a single microprocessor, or may include a plurality of microprocessors for configuring the computer system
500 as a multiprocessor system. The main memory 525 stores, among other
things, instructions and/or data for execution by the processor device 510. If the system for storing an internal identifier in metadata is partially implemented in software, the main memory 525 stores the executable code when in operation. The main memory 525 may include banks of dynamic random access memory
(DRAM), as well as cache memory.
[0085] The computer system 500 may further include a mass storage device 530, peripheral device(s) 540, portable storage medium device(s) 550, input control device(s) 580, a graphics subsystem 560, and/or an output display 570. For
explanatory purposes, all components in the computer system 500 are shown in
FIG. 5 as being coupled via the bus 505. However, the computer system 500 is not so limited. Devices of the computer system 500 may be coupled through one or more data transport means. For example, the processor device 510 and/or the main memory 525 may be coupled via a local microprocessor bus. The mass storage device 530, peripheral device(s) 540, portable storage medium device(s) 550, and/or graphics subsystem 560 may be coupled via one or more input/output (I/O) buses. The mass storage device 530 is preferably a nonvolatile storage device for storing data and/or instructions for use by the processor device 510. The mass storage device 530 may be implemented, for example, with a magnetic disk drive or an optical disk drive. In a software embodiment, the mass storage device 530 is preferably configured for loading contents of the mass storage device 530 into the main memory 525.
[0086] The portable storage medium device 550 operates in conjunction with a nonvolatile portable storage medium, such as, for example, a compact disc read only memory (CD-ROM), to input and output data and code to and from the
computer system 500. In some embodiments, the software for storing an internal identifier in metadata may be stored on a portable storage medium, and may be inputted into the computer system 500 via the portable storage medium device 550. The peripheral device(s) 540 may include any type of computer support device, such as, for example, an input/output (I/O) interface configured to add additional functionality to the computer system 500. For example, the peripheral device(s) 540 may include a network interface card for interfacing the computer system 500 with a network 520.
[0087] The input control device(s) 580 provide a portion of the user interface for a user of the computer system 500. The input control device(s) 580 may include a keypad and/or a cursor control device. The keypad may be configured for
inputting alphanumeric and/or other key information. The cursor control device may include, for example, a mouse, a trackball, a stylus, and/or cursor direction
keys. In order to display textual and graphical information, the computer system
500 preferably includes the graphics subsystem 560 and the output display 570.
The output display 570 may include a cathode ray tube (CRT) display and/or a liquid crystal display (LCD). The graphics subsystem 560 receives textual and graphical information, and processes the information for output to the output
display 570.
[0088] Each component of the computer system 500 may represent a broad
category of a computer component of a general/special purpose computer.
Components of the computer system 500 are not limited to the specific
implementations provided here.
[0089] Portions of the invention may be conveniently implemented by using a conventional general purpose computer, a specialized digital computer and/or a microprocessor programmed according to the teachings of the present disclosure, as will be apparent to those skilled in the computer art. Appropriate software coding may readily be prepared by skilled programmers based on the teachings of the present disclosure.
[0090] Some embodiments may also be implemented by the preparation of
application-specific integrated circuits or by interconnecting an appropriate
network of conventional component circuits.
[0091] Some embodiments include a computer program product. The computer program product may be a storage medium/media having instructions stored
thereon/therein which can be used to control, or cause, a computer to perform any of the processes of the invention. The storage medium may include without
limitation floppy disk, mini disk, optical disc, Blu-ray Disc, DVD, CD-ROM, micro-drive, magneto-optical disk, ROM, RAM, EPROM, EEPROM, DRAM,
VRAM, flash memory, flash card, magnetic card, optical card, nanosystems,
molecular memory integrated circuit, RAID, remote data
storage/archive/warehousing, and/or any other type of device suitable for storing instructions and/or data.
[0092] Stored on any one of the computer readable medium/media, some
implementations include software for controlling both the hardware of the
general/special computer or microprocessor, and for enabling the computer or
microprocessor to interact with a human user or other mechanism utilizing the results of the invention. Such software may include without limitation device drivers, operating systems, and user applications. Ultimately, such computer
readable media further includes software for performing aspects of the invention, as described above.
[0093] Included in the programming/software of the general/special purpose
computer or microprocessor are software modules for implementing the processes described above. The processes described above may include without limitation the following: receiving a recording, generating an internal identifier for the
recording, and adding the internal identifier to metadata associated with at least one occurrence of the recording.
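As an illustration only, the three steps listed in paragraph [0093] might be sketched in Python as follows. The truncated SHA-256 digest used as the internal identifier, the dictionary layout of the metadata, and the function names `generate_internal_identifier`, `add_identifier_to_metadata` and `process_recording` are assumptions made for this sketch; the disclosure does not prescribe any of them.

```python
import hashlib
from typing import Dict, List


def generate_internal_identifier(recording: bytes) -> str:
    """Derive an internal identifier from the bytes of a recording.

    Assumption: a truncated SHA-256 digest stands in for whatever
    identifier scheme an actual embodiment would use.
    """
    return hashlib.sha256(recording).hexdigest()[:16]


def add_identifier_to_metadata(occurrences: List[Dict], internal_id: str) -> List[Dict]:
    """Return copies of the occurrence metadata with the identifier added."""
    return [{**meta, "internal_id": internal_id} for meta in occurrences]


def process_recording(recording: bytes, occurrences: List[Dict]) -> List[Dict]:
    """Receive a recording, generate its identifier, and tag each occurrence."""
    internal_id = generate_internal_identifier(recording)
    return add_identifier_to_metadata(occurrences, internal_id)
```

In this sketch every occurrence of the same recording receives the same identifier, so that occurrences can later be grouped by it.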
[0094] While various example embodiments of the present invention have been described above, it should be understood that they have been presented by way of example, and not limitation. It will be apparent to persons skilled in the relevant art(s) that various changes in form and detail can be made therein. Thus, the
present invention should not be limited by any of the above described example embodiments, but should be defined only in accordance with the following claims and their equivalents.
[0095] In addition, it should be understood that the figures are presented for
example purposes only. The architecture of the example embodiments presented herein is sufficiently flexible and configurable, such that it may be utilized and navigated in ways other than that shown in the accompanying figures.
[0096] Further, the purpose of the Abstract is to enable the U.S. Patent and
Trademark Office and the public generally, and especially the scientists, engineers and practitioners in the art who are not familiar with patent or legal terms or
phraseology, to determine quickly from a cursory inspection the nature and essence of the technical disclosure of the application. The Abstract is not intended to be limiting as to the scope of the example embodiments presented herein in any way.
It is also to be understood that the procedures recited in the claims need not be performed in the order presented.

Claims

WHAT IS CLAIMED IS:
1. A system for associating an audio portion of received content with a
multimedia program, the system comprising:
a server including a network interface to transmit and receive data over a network, the server operable to:
receive an audio fingerprint and a program identifier from the network,
associate the audio fingerprint with an audio identifier,
transmit a request packet including the program identifier over the network, the request packet requesting program guide information associated with the program identifier,
receive program data including the program guide information from the network, and
transmit metadata associated with the audio identifier and the program data onto the network.
2. The system according to Claim 1, wherein the server is configured to
generate a record corresponding to the program identifier including at least one audio identifier associated with the multimedia program and metadata associated with each audio identifier.
3. The system according to Claim 2, wherein the metadata includes a metric associated with each audio identifier.
4. The system according to Claim 1, further comprising:
a user device including:
an input interface operable to receive the received content from at least one source, the received content containing an audio portion, a video portion and program guide data, the program guide data including the program identifier;
and
a processor operable to generate an audio fingerprint from a subset of the audio portion, communicate the program identifier and the audio fingerprint onto a network, and receive metadata associated with the audio identifier and the program data from the network through the network interface.
5. The system according to Claim 4, wherein the user device further includes a remote interface operable to receive from a remote control a command to initiate a lookup for metadata.
6. The system according to Claim 4, wherein the user device further includes:
a memory operable to store the subset of the audio portion, wherein the processor generates another audio fingerprint based on at least one of:
an additional subset of the audio portion and
combined subsets of the audio portion.
7. The system according to Claim 4, wherein the processor is further
configured to detect a time-based offset location of the received content
corresponding to the audio fingerprint and transmit the location onto the network.
8. A method for associating an audio portion of received content with a
multimedia program, the method comprising:
receiving an audio fingerprint and a program identifier from a network;
associating the audio fingerprint with an audio identifier;
transmitting a request packet including the program identifier over the network, the request packet requesting program guide information associated with the program identifier;
receiving program data including the program guide information from the network; and
transmitting metadata associated with the audio identifier and the program data onto the network.
9. The method according to Claim 8, further comprising:
generating a record corresponding to the program identifier including at least one audio identifier associated with the multimedia program and metadata associated with each audio identifier.
10. The method according to Claim 9, wherein the metadata includes a metric associated with each audio identifier.
11. The method of Claim 8, further comprising:
receiving the received content, from at least one source, the received
content containing an audio portion, a video portion and program guide data, the program guide data including the program identifier;
generating an audio fingerprint from a subset of the audio portion of the received content;
communicating the program identifier and the audio fingerprint onto a network; and
receiving the metadata associated with the audio identifier and the program data from the network through a network interface,
wherein the above steps are performed by a user device including at least one processor.
12. The method according to Claim 11, further comprising:
receiving, from a remote control, a command to initiate a lookup for the metadata.
13. The method according to Claim 11, further comprising:
storing the subset of the audio portion of the received content; and
generating another audio fingerprint based on at least one of:
an additional subset of the audio portion and
combined subsets of the audio portion of the received content.
14. The method according to Claim 11, further comprising:
detecting a time-based offset location of the received content corresponding to the audio fingerprint; and
transmitting the location onto the network.
15. A computer-readable medium having stored thereon sequences of
instructions, the sequences of instructions including instructions which when
executed by a computer system causes the computer system to perform:
receiving an audio fingerprint and a program identifier from a network;
associating the audio fingerprint with an audio identifier;
transmitting a request packet including the program identifier over the network, the request packet requesting program guide information associated with the program identifier;
receiving program data including the program guide information from the network; and
transmitting metadata associated with the audio identifier and the program data onto the network.
16. The computer-readable medium according to Claim 15, further having stored thereon a sequence of instructions which when executed by the computer system causes the computer system to perform:
generating a record corresponding to the program identifier including at least one audio identifier associated with the multimedia program and metadata associated with each audio identifier.
17. The computer-readable medium of Claim 16, wherein the metadata
includes a metric associated with each audio identifier.
18. The computer-readable medium of Claim 15, further having stored thereon a sequence of instructions which when executed by the computer system causes the computer system to perform:
receiving content, from at least one source, the received content containing an audio portion, a video portion, and program guide data, the program guide data including the program identifier;
generating an audio fingerprint from a subset of the audio portion of the received content;
communicating the program identifier and the audio fingerprint onto a network; and
receiving the metadata associated with the audio identifier and the program data from the network through a network interface,
wherein the above steps are performed by a user device including at least one processor.
19. The computer-readable medium of Claim 18, further having stored thereon a sequence of instructions which when executed by the computer system causes the computer system to perform:
storing the subset of the audio portion; and
generating another audio fingerprint based on at least one of:
an additional subset of the audio portion and
combined subsets of the audio portion of the received content.
20. The computer-readable medium of Claim 18, further having stored thereon a sequence of instructions which when executed by the computer system causes the computer system to perform:
detecting a time-based offset location of the received content corresponding to the audio fingerprint; and
transmitting the location onto the network.
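
For orientation only, the server-side steps recited in Claims 1 to 3 (and mirrored in Claims 8 to 10 and 15 to 17) might be sketched in Python as follows. Everything named here is hypothetical: the in-memory `FINGERPRINT_INDEX` and `AUDIO_METADATA` tables, the `fetch_program_guide` callable that stands in for the request packet sent over the network, and the use of a match count as the claimed metric. Exact-match lookup of the fingerprint is a simplification chosen for brevity; a practical system would likely need approximate matching.

```python
from typing import Callable, Dict, Optional

# Hypothetical lookup tables; a real server would query a fingerprint database.
FINGERPRINT_INDEX: Dict[str, str] = {"fp-1a2b": "audio-001"}
AUDIO_METADATA: Dict[str, dict] = {
    "audio-001": {"title": "Example Song", "artist": "Example Artist"},
}

# Records keyed by program identifier (Claims 2, 9 and 16).
PROGRAM_RECORDS: Dict[str, Dict[str, dict]] = {}


def handle_lookup(
    fingerprint: str,
    program_id: str,
    fetch_program_guide: Callable[[str], dict],
) -> Optional[dict]:
    """Process one (audio fingerprint, program identifier) pair.

    Returns the metadata to transmit back, or None if the fingerprint
    is not recognized.
    """
    # Associate the audio fingerprint with an audio identifier.
    audio_id = FINGERPRINT_INDEX.get(fingerprint)
    if audio_id is None:
        return None

    # Request and receive program guide information for the program identifier.
    program_data = fetch_program_guide(program_id)

    # Maintain a record for the program: each audio identifier seen in it,
    # its metadata, and a metric (here, an assumed match count).
    record = PROGRAM_RECORDS.setdefault(program_id, {})
    entry = record.setdefault(
        audio_id, {"metadata": AUDIO_METADATA[audio_id], "match_count": 0}
    )
    entry["match_count"] += 1

    # Metadata associated with the audio identifier and the program data.
    return {
        "audio_id": audio_id,
        "audio_metadata": entry["metadata"],
        "program_data": program_data,
    }
```

A caller would supply its own guide lookup, for example `handle_lookup("fp-1a2b", "prog-42", lambda pid: {"program_id": pid})`, and transmit the returned dictionary to the requesting user device.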
PCT/US2010/042044 2009-08-14 2010-07-15 Content recognition and synchronization on a television or consumer electronics device WO2011019473A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP22205939.6A EP4210246A1 (en) 2009-08-14 2010-07-15 Content recognition and synchronization on a television or consumer electronics device
JP2012524717A JP5481559B2 (en) 2009-08-14 2010-07-15 Content recognition and synchronization on television or consumer electronic devices
CA2771066A CA2771066C (en) 2009-08-14 2010-07-15 Content recognition and synchronization on a television or consumer electronics device
EP10736928A EP2465053A1 (en) 2009-08-14 2010-07-15 Content recognition and synchronization on a television or consumer electronics device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US12/541,552 US20110041154A1 (en) 2009-08-14 2009-08-14 Content Recognition and Synchronization on a Television or Consumer Electronics Device
US12/541,552 2009-08-14

Publications (1)

Publication Number Publication Date
WO2011019473A1 (en) 2011-02-17

Family

ID=42676835

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2010/042044 WO2011019473A1 (en) 2009-08-14 2010-07-15 Content recognition and synchronization on a television or consumer electronics device

Country Status (5)

Country Link
US (1) US20110041154A1 (en)
EP (2) EP2465053A1 (en)
JP (2) JP5481559B2 (en)
CA (1) CA2771066C (en)
WO (1) WO2011019473A1 (en)

Families Citing this family (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9055335B2 (en) 2009-05-29 2015-06-09 Cognitive Networks, Inc. Systems and methods for addressing a media database using distance associative hashing
US10375451B2 (en) 2009-05-29 2019-08-06 Inscape Data, Inc. Detection of common media segments
US8769584B2 (en) 2009-05-29 2014-07-01 TVI Interactive Systems, Inc. Methods for displaying contextually targeted content on a connected television
US9449090B2 (en) 2009-05-29 2016-09-20 Vizio Inscape Technologies, Llc Systems and methods for addressing a media database using distance associative hashing
US10116972B2 (en) 2009-05-29 2018-10-30 Inscape Data, Inc. Methods for identifying video segments and displaying option to view from an alternative source and/or on an alternative device
US10949458B2 (en) 2009-05-29 2021-03-16 Inscape Data, Inc. System and method for improving work load management in ACR television monitoring system
US8677400B2 (en) * 2009-09-30 2014-03-18 United Video Properties, Inc. Systems and methods for identifying audio content using an interactive media guidance application
US20110078020A1 (en) * 2009-09-30 2011-03-31 Lajoie Dan Systems and methods for identifying popular audio assets
US8161071B2 (en) 2009-09-30 2012-04-17 United Video Properties, Inc. Systems and methods for audio asset storage and management
US8428955B2 (en) * 2009-10-13 2013-04-23 Rovi Technologies Corporation Adjusting recorder timing
US20110085781A1 (en) * 2009-10-13 2011-04-14 Rovi Technologies Corporation Content recorder timing alignment
US8682145B2 (en) 2009-12-04 2014-03-25 Tivo Inc. Recording system based on multimedia content fingerprints
US9185458B2 (en) * 2010-04-02 2015-11-10 Yahoo! Inc. Signal-driven interactive television
US10192138B2 (en) 2010-05-27 2019-01-29 Inscape Data, Inc. Systems and methods for reducing data density in large datasets
US9838753B2 (en) 2013-12-23 2017-12-05 Inscape Data, Inc. Monitoring individual viewing of television events using tracking pixels and cookies
US8898723B2 (en) * 2010-08-20 2014-11-25 Sony Corporation Virtual channel declarative script binding
JP5421316B2 (en) * 2011-03-22 2014-02-19 パナソニック株式会社 Portable terminal, pairing system, and pairing method
KR101786974B1 (en) * 2011-07-29 2017-10-18 네이버 주식회사 Apparatus and method for providing social network service using sound
CA3111501C (en) * 2011-09-26 2023-09-19 Sirius Xm Radio Inc. System and method for increasing transmission bandwidth efficiency ("ebt2")
US8949872B2 (en) * 2011-12-20 2015-02-03 Yahoo! Inc. Audio fingerprint for content identification
DE102012200083A1 (en) * 2012-01-04 2013-07-04 Robert Bosch Gmbh Method and control unit for determining an identification code for an audio data packet
US9384734B1 (en) * 2012-02-24 2016-07-05 Google Inc. Real-time audio recognition using multiple recognizers
US9703932B2 (en) * 2012-04-30 2017-07-11 Excalibur Ip, Llc Continuous content identification of broadcast content
US9628829B2 (en) 2012-06-26 2017-04-18 Google Technology Holdings LLC Identifying media on a mobile device
US9118951B2 (en) 2012-06-26 2015-08-25 Arris Technology, Inc. Time-synchronizing a parallel feed of secondary content with primary media content
US8938089B1 (en) * 2012-06-26 2015-01-20 Google Inc. Detection of inactive broadcasts during live stream ingestion
US9596386B2 (en) 2012-07-24 2017-03-14 Oladas, Inc. Media synchronization
US20140074621A1 (en) * 2012-09-07 2014-03-13 Opentv, Inc. Pushing content to secondary connected devices
US9661361B2 (en) * 2012-09-19 2017-05-23 Google Inc. Systems and methods for live media content matching
US9344773B2 (en) * 2013-02-05 2016-05-17 Microsoft Technology Licensing, Llc Providing recommendations based upon environmental sensing
US9742825B2 (en) * 2013-03-13 2017-08-22 Comcast Cable Communications, Llc Systems and methods for configuring devices
US9161074B2 (en) 2013-04-30 2015-10-13 Ensequence, Inc. Methods and systems for distributing interactive content
US9955192B2 (en) 2013-12-23 2018-04-24 Inscape Data, Inc. Monitoring individual viewing of television events using tracking pixels and cookies
US20150301718A1 (en) * 2014-04-18 2015-10-22 Google Inc. Methods, systems, and media for presenting music items relating to media content
US9729912B2 (en) * 2014-09-22 2017-08-08 Sony Corporation Method, computer program, electronic device, and system
US10762533B2 (en) * 2014-09-29 2020-09-01 Bellevue Investments Gmbh & Co. Kgaa System and method for effective monetization of product marketing in software applications via audio monitoring
WO2016081636A1 (en) 2014-11-18 2016-05-26 Branch Media Labs, Inc. Seamless setup and control for home entertainment devices and content
WO2016081624A1 (en) * 2014-11-18 2016-05-26 Branch Media Labs, Inc. Automatic identification and mapping of consumer electronic devices to ports on an hdmi switch
CN107534800B (en) 2014-12-01 2020-07-03 构造数据有限责任公司 System and method for continuous media segment identification
AU2016211254B2 (en) 2015-01-30 2019-09-19 Inscape Data, Inc. Methods for identifying video segments and displaying option to view from an alternative source and/or on an alternative device
US10534777B2 (en) * 2015-03-10 2020-01-14 Cdx Nashville, Llc Systems and methods for continuously detecting and identifying songs in a continuous audio stream
WO2016168556A1 (en) 2015-04-17 2016-10-20 Vizio Inscape Technologies, Llc Systems and methods for reducing data density in large datasets
US10080062B2 (en) 2015-07-16 2018-09-18 Inscape Data, Inc. Optimizing media fingerprint retention to improve system resource utilization
BR112018000801A2 (en) 2015-07-16 2018-09-04 Inscape Data Inc system, and method
BR112018000820A2 (en) 2015-07-16 2018-09-04 Inscape Data Inc computerized method, system, and product of computer program
JP6903653B2 (en) 2015-07-16 2021-07-14 インスケイプ データ インコーポレイテッド Common media segment detection
US10038936B2 (en) 2015-11-18 2018-07-31 Caavo Inc Source device detection
WO2018148439A1 (en) 2017-02-10 2018-08-16 Caavo Inc Determining state signatures for consumer electronic devices coupled to an audio/video switch
WO2018187592A1 (en) 2017-04-06 2018-10-11 Inscape Data, Inc. Systems and methods for improving accuracy of device maps using media viewing data
CN116259292B (en) * 2023-03-23 2023-10-20 广州资云科技有限公司 Method, device, computer equipment and storage medium for identifying basic harmonic musical scale

Family Cites Families (106)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US553491A (en) * 1896-01-21 Perlet l
US490873A (en) * 1893-01-31 Folding confessional
US3663885A (en) * 1971-04-16 1972-05-16 Nasa Family of frequency to amplitude converters
US4081753A (en) * 1976-12-13 1978-03-28 Miller Arthur O Automatic programming system for television receivers
US4271532A (en) * 1979-11-13 1981-06-02 Rca Corporation Receiver with a channel swapping apparatus
DE2950432A1 (en) * 1979-12-14 1981-06-19 Edmond 8031 Gröbenzell Keiser METHOD AND DEVICE FOR CONTROLLING THE OPERATION OF A TELEVISION RECEIVER
US4381522A (en) * 1980-12-01 1983-04-26 Adams-Russell Co., Inc. Selective viewing
US4367559A (en) * 1981-02-06 1983-01-04 Rca Corporation Arrangement for both channel swapping and favorite channel features
US4425579A (en) * 1981-05-22 1984-01-10 Oak Industries Inc. Catv converter with keylock to favorite channels
US4375651A (en) * 1981-07-27 1983-03-01 Zenith Radio Corporation Selective video reception control system
US4429385A (en) * 1981-12-31 1984-01-31 American Newspaper Publishers Association Method and apparatus for digital serial scanning with hierarchical and relational access
US4495654A (en) * 1983-03-29 1985-01-22 Rca Corporation Remote controlled receiver with provisions for automatically programming a channel skip list
US4754326A (en) * 1983-10-25 1988-06-28 Keycom Electronic Publishing Method and apparatus for assisting user of information retrieval systems
US4641205A (en) * 1984-03-05 1987-02-03 Rca Corporation Television system scheduler with on-screen menu type programming prompting apparatus
US4677466A (en) * 1985-07-29 1987-06-30 A. C. Nielsen Company Broadcast program identification method and apparatus
US4843562A (en) * 1987-06-24 1989-06-27 Broadcast Data Systems Limited Partnership Broadcast information classification system and method
US5210820A (en) * 1990-05-02 1993-05-11 Broadcast Data Systems Limited Partnership Signal recognition system and method
US5210611A (en) * 1991-08-12 1993-05-11 Keen Y. Yee Automatic tuning radio/TV using filtered seek
US5404393A (en) * 1991-10-03 1995-04-04 Viscorp Method and apparatus for interactive television through use of menu windows
US5875108A (en) * 1991-12-23 1999-02-23 Hoffberg; Steven M. Ergonomic man-machine interface incorporating adaptive pattern recognition based control system
US6081750A (en) * 1991-12-23 2000-06-27 Hoffberg; Steven Mark Ergonomic man-machine interface incorporating adaptive pattern recognition based control system
US5903454A (en) * 1991-12-23 1999-05-11 Hoffberg; Linda Irene Human-factored interface corporating adaptive pattern recognition based controller apparatus
JP3328951B2 (en) * 1992-02-07 2002-09-30 ソニー株式会社 TV receiver and tuning method
US5436653A (en) * 1992-04-30 1995-07-25 The Arbitron Company Method and system for recognition of broadcast segments
US5469206A (en) * 1992-05-27 1995-11-21 Philips Electronics North America Corporation System and method for automatically correlating user preferences with electronic shopping information
US5223924A (en) * 1992-05-27 1993-06-29 North American Philips Corporation System and method for automatically correlating user preferences with a T.V. program information database
US5317403A (en) * 1992-06-26 1994-05-31 Thomson Consumer Electronics, Inc. Favorite channel selection using extended keypress
US5600364A (en) * 1992-12-09 1997-02-04 Discovery Communications, Inc. Network controller for cable television delivery systems
US6181335B1 (en) * 1992-12-09 2001-01-30 Discovery Communications, Inc. Card for a set top terminal
US5600573A (en) * 1992-12-09 1997-02-04 Discovery Communications, Inc. Operations center with video storage for a television program packaging and delivery system
US5621456A (en) * 1993-06-22 1997-04-15 Apple Computer, Inc. Methods and apparatus for audio-visual interface for the display of multiple program categories
US5594509A (en) * 1993-06-22 1997-01-14 Apple Computer, Inc. Method and apparatus for audio-visual interface for the display of multiple levels of information on a display
US5481296A (en) * 1993-08-06 1996-01-02 International Business Machines Corporation Apparatus and method for selectively viewing video information
US5410344A (en) * 1993-09-22 1995-04-25 Arrowsmith Technologies, Inc. Apparatus and method of selecting video programs based on viewers' preferences
US5862260A (en) * 1993-11-18 1999-01-19 Digimarc Corporation Methods for surveying dissemination of proprietary empirical data
KR100348915B1 (en) * 1994-05-12 2002-12-26 마이크로소프트 코포레이션 TV program selection method and system
US5635978A (en) * 1994-05-20 1997-06-03 News America Publications, Inc. Electronic television program guide channel system and method
US6029195A (en) * 1994-11-29 2000-02-22 Herz; Frederick S. M. System for customized electronic identification of desirable objects
US5617565A (en) * 1994-11-29 1997-04-01 Hitachi America, Ltd. Broadcast interactive multimedia system
JP4001942B2 (en) * 1995-02-06 2007-10-31 ソニー株式会社 Receiving apparatus and receiving method, and broadcasting system and broadcasting method
US5880768A (en) * 1995-04-06 1999-03-09 Prevue Networks, Inc. Interactive program guide systems and processes
IT1285179B1 (en) * 1995-04-24 1998-06-03 Motorola Inc PROCEDURE AND APPARATUS FOR THE CONTROL OF SENSITIVE ADDRESSING FOR COMMUNICATIONS SYSTEMS.
US5752160A (en) * 1995-05-05 1998-05-12 Dunn; Matthew W. Interactive entertainment network system and method with analog video startup loop for video-on-demand
US5907323A (en) * 1995-05-05 1999-05-25 Microsoft Corporation Interactive program summary panel
US6505160B1 (en) * 1995-07-27 2003-01-07 Digimarc Corporation Connected audio and other media objects
US5905865A (en) * 1995-10-30 1999-05-18 Web Pager, Inc. Apparatus and method of automatically accessing on-line services in response to broadcast of on-line addresses
US6216264B1 (en) * 1995-11-17 2001-04-10 Thomson Licensing S.A. Scheduler apparatus employing a gopher agent
US5867226A (en) * 1995-11-17 1999-02-02 Thomson Consumer Electronics, Inc. Scheduler employing a predictive agent for use in a television receiver
US5635989A (en) * 1996-02-13 1997-06-03 Hughes Electronics Method and apparatus for sorting and searching a television program guide
US6512796B1 (en) * 1996-03-04 2003-01-28 Douglas Sherwood Method and system for inserting and retrieving data in an audio signal
US5918223A (en) * 1996-07-22 1999-06-29 Muscle Fish Method and article of manufacture for content-based analysis, storage, retrieval, and segmentation of audio information
US20030093790A1 (en) * 2000-03-28 2003-05-15 Logan James D. Audio and video program recording, editing and playback systems using metadata
US6570991B1 (en) * 1996-12-18 2003-05-27 Interval Research Corporation Multi-feature speech/music discrimination system
US6177931B1 (en) * 1996-12-19 2001-01-23 Index Systems, Inc. Systems and methods for displaying and recording control interface with television programs, video, advertising information and program scheduling information
US6068455A (en) * 1997-03-20 2000-05-30 B/E Aerospace Long life pump system
US7167857B2 (en) * 1997-04-15 2007-01-23 Gracenote, Inc. Method and system for finding approximate matches in database
US5987525A (en) * 1997-04-15 1999-11-16 Cddb, Inc. Network delivery of interactive entertainment synchronized to playback of audio recordings
US6172674B1 (en) * 1997-08-25 2001-01-09 Liberate Technologies Smart filtering
US6201176B1 (en) * 1998-05-07 2001-03-13 Canon Kabushiki Kaisha System and method for querying a music database
US6728713B1 (en) * 1999-03-30 2004-04-27 Tivo, Inc. Distributed database management system
US7302574B2 (en) * 1999-05-19 2007-11-27 Digimarc Corporation Content identifiers triggering corresponding responses through collaborative processing
EP1197075A1 (en) * 1999-06-28 2002-04-17 United Video Properties, Inc. Interactive television program guide system and method with niche hubs
US7013301B2 (en) * 2003-09-23 2006-03-14 Predixis Corporation Audio fingerprinting system and method
US20020056118A1 (en) * 1999-08-27 2002-05-09 Hunter Charles Eric Video and music distribution system
US7174293B2 (en) * 1999-09-21 2007-02-06 Iceberg Industries Llc Audio identification system and method
US6571144B1 (en) * 1999-10-20 2003-05-27 Intel Corporation System for providing a digital watermark in an audio signal
US6366907B1 (en) * 1999-12-15 2002-04-02 Napster, Inc. Real-time search engine
US6675174B1 (en) * 2000-02-02 2004-01-06 International Business Machines Corp. System and method for measuring similarity between a set of known temporal media segments and a one or more temporal media streams
US6539395B1 (en) * 2000-03-22 2003-03-25 Mood Logic, Inc. Method for creating a database for comparing music
US20020040475A1 (en) * 2000-03-23 2002-04-04 Adrian Yap DVR system
US20020069252A1 (en) * 2000-07-10 2002-06-06 Songpro.Com, Inc. Personal multimedia device and methods of use thereof
US6574594B2 (en) * 2000-11-03 2003-06-03 International Business Machines Corporation System for monitoring broadcast audio content
US20020069418A1 (en) * 2000-12-06 2002-06-06 Ashwin Philips Network-enabled audio/video player
DE10109648C2 (en) * 2001-02-28 2003-01-30 Fraunhofer Ges Forschung Method and device for characterizing a signal and method and device for generating an indexed signal
JP2002328931A (en) * 2001-04-27 2002-11-15 Matsushita Electric Ind Co Ltd System and method for processing information
US7328153B2 (en) * 2001-07-20 2008-02-05 Gracenote, Inc. Automatic identification of sound recordings
US8972481B2 (en) * 2001-07-20 2015-03-03 Audible Magic, Inc. Playlist generation method and apparatus
US7877438B2 (en) * 2001-07-20 2011-01-25 Audible Magic Corporation Method and apparatus for identifying new media content
SE0103473D0 (en) * 2001-10-18 2001-10-18 Siemens Elema Ab Switching System
MXPA04004645A (en) * 2001-11-16 2004-08-12 Koninkl Philips Electronics Nv Fingerprint database updating method, client and server.
US7035867B2 (en) * 2001-11-28 2006-04-25 Aerocast.Com, Inc. Determining redundancies in content object directories
DE10200653B4 (en) * 2002-01-10 2004-05-27 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Scalable encoder, encoding method, decoder and decoding method for a scaled data stream
US7707221B1 (en) * 2002-04-03 2010-04-27 Yahoo! Inc. Associating and linking compact disc metadata
ES2312772T3 (en) * 2002-04-25 2009-03-01 Landmark Digital Services Llc SOLID EQUIVALENCE AND INVENTORY OF AUDIO PATTERN.
US7110338B2 (en) * 2002-08-06 2006-09-19 Matsushita Electric Industrial Co., Ltd. Apparatus and method for fingerprinting digital media
US20040034441A1 (en) * 2002-08-16 2004-02-19 Malcolm Eaton System and method for creating an index of audio tracks
US8181205B2 (en) * 2002-09-24 2012-05-15 Russ Samuel H PVR channel and PVR IPG information
US7788696B2 (en) * 2003-10-15 2010-08-31 Microsoft Corporation Inferring information about media stream objects
EP2408126A1 (en) * 2004-02-19 2012-01-18 Landmark Digital Services LLC Method and apparatus for identification of broadcast source
WO2006023770A2 (en) * 2004-08-18 2006-03-02 Nielsen Media Research, Inc. Methods and apparatus for generating signatures
US7574451B2 (en) * 2004-11-02 2009-08-11 Microsoft Corporation System and method for speeding up database lookups for multiple synchronized data streams
US7647128B2 (en) * 2005-04-22 2010-01-12 Microsoft Corporation Methods, computer-readable media, and data structures for building an authoritative database of digital audio identifier elements and identifying media items
TW200729004A (en) * 2006-01-27 2007-08-01 Moda Co Ltd System and method for searching multimedia content in content network
US7668869B2 (en) * 2006-04-03 2010-02-23 Digitalsmiths Corporation Media access system
US20080036917A1 (en) * 2006-04-07 2008-02-14 Mark Pascarella Methods and systems for generating and delivering navigatable composite videos
US20070300271A1 (en) * 2006-06-23 2007-12-27 Geoffrey Benjamin Allen Dynamic triggering of media signal capture
CN101115124B (en) * 2006-07-26 2012-04-18 日电(中国)有限公司 Method and apparatus for identifying media program based on audio watermark
US20080066099A1 (en) * 2006-09-11 2008-03-13 Apple Computer, Inc. Media systems with integrated content searching
JP4224095B2 (en) * 2006-09-28 2009-02-12 株式会社東芝 Information processing apparatus, information processing program, and information processing system
EP2070231B1 (en) * 2006-10-03 2013-07-03 Shazam Entertainment, Ltd. Method for high throughput of identification of distributed broadcast content
US20080114794A1 (en) * 2006-11-10 2008-05-15 Guideworks Llc Systems and methods for using playlists
JP2009147775A (en) * 2007-12-17 2009-07-02 Panasonic Corp Program reproduction method, apparatus, program, and medium
EP2321964B1 (en) * 2008-07-25 2018-12-12 Google LLC Method and apparatus for detecting near-duplicate videos using perceptual video signatures
US20110078020A1 (en) * 2009-09-30 2011-03-31 Lajoie Dan Systems and methods for identifying popular audio assets
US8161071B2 (en) * 2009-09-30 2012-04-17 United Video Properties, Inc. Systems and methods for audio asset storage and management
US8677400B2 (en) * 2009-09-30 2014-03-18 United Video Properties, Inc. Systems and methods for identifying audio content using an interactive media guidance application

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7277766B1 (en) 2000-10-24 2007-10-02 Moodlogic, Inc. Method and system for analyzing digital audio files
US20030028796A1 (en) * 2001-07-31 2003-02-06 Gracenote, Inc. Multiple step identification of recordings
US7451078B2 (en) 2004-12-30 2008-11-11 All Media Guide, Llc Methods and apparatus for identifying media objects
WO2009036435A1 (en) * 2007-09-14 2009-03-19 Auditude.Com, Inc. Restoring program information for clips of broadcast programs shared online

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HAITSMA J ET AL: "An efficient database search strategy for audio fingerprinting", PROCEEDINGS OF THE 2003 IEEE RADAR CONFERENCE. HUNTSVILLE, AL, MAY 5 - 8, 2003; [IEEE RADAR CONFERENCE], NEW YORK, NY : IEEE, US, 9 December 2002 (2002-12-09), pages 178 - 181, XP010642541, ISBN: 978-0-7803-7920-6 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015530640A (en) * 2012-07-20 2015-10-15 ヴィジブル ワールド インコーポレイテッド System, method, and computer readable medium for determining program promotion outcomes
US10521816B2 (en) 2012-07-20 2019-12-31 Visible World, Llc Systems, methods and computer-readable media for determining outcomes for program promotions
US10949875B2 (en) 2012-07-20 2021-03-16 Visible World, Llc Systems, methods and computer-readable media for determining outcomes for program promotions

Also Published As

Publication number Publication date
JP2013501999A (en) 2013-01-17
CA2771066A1 (en) 2011-02-17
US20110041154A1 (en) 2011-02-17
EP4210246A1 (en) 2023-07-12
EP2465053A1 (en) 2012-06-20
CA2771066C (en) 2023-03-14
JP5481559B2 (en) 2014-04-23
JP2014078997A (en) 2014-05-01

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 10736928; Country of ref document: EP; Kind code of ref document: A1
WWE WIPO information: entry into national phase. Ref document number: 2012524717; Country of ref document: JP. Ref document number: 2771066; Country of ref document: CA
NENP Non-entry into the national phase. Ref country code: DE
WWE WIPO information: entry into national phase. Ref document number: 2010736928; Country of ref document: EP