WO2006121681A1 - Selective sound source listening in conjunction with computer interactive processing - Google Patents

Selective sound source listening in conjunction with computer interactive processing

Info

Publication number
WO2006121681A1
Authority
WO
WIPO (PCT)
Prior art keywords
sound
computer program
recited
image
interactivity
Prior art date
Application number
PCT/US2006/016670
Other languages
French (fr)
Inventor
Richard L. Marks
Xiadong Mao
Original Assignee
Sony Computer Entertainment Inc.
Priority date
Filing date
Publication date
Application filed by Sony Computer Entertainment Inc. filed Critical Sony Computer Entertainment Inc.
Priority to EP06758867A priority Critical patent/EP1877149A1/en
Priority to CN2006800064384A priority patent/CN101132839B/en
Priority to JP2008510106A priority patent/JP5339900B2/en
Publication of WO2006121681A1 publication Critical patent/WO2006121681A1/en


Classifications

    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/20 Input arrangements for video game devices
    • A63F13/21 Input arrangements for video game devices characterised by their sensors, purposes or types
    • A63F13/213 Input arrangements for video game devices characterised by their sensors, purposes or types comprising photodetecting means, e.g. cameras, photodiodes or infrared cells
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/50 Controlling the output signals based on the game progress
    • A63F13/54 Controlling the output signals based on the game progress involving acoustic signals, e.g. for simulating revolutions per minute [RPM] dependent engine sounds in a driving game or reverberation against a virtual wall
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R29/00 Monitoring arrangements; Testing arrangements
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/10 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by input arrangements for converting player-generated signals into game device control signals
    • A63F2300/1081 Input via voice recognition
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/10 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by input arrangements for converting player-generated signals into game device control signals
    • A63F2300/1087 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by input arrangements for converting player-generated signals into game device control signals comprising photodetecting means, e.g. a camera
    • A63F2300/1093 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by input arrangements for converting player-generated signals into game device control signals comprising photodetecting means, e.g. a camera using visible light
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/60 Methods for processing data by generating or executing the game program
    • A63F2300/6063 Methods for processing data by generating or executing the game program for sound processing
    • A63F2300/6072 Methods for processing data by generating or executing the game program for sound processing of an input signal, e.g. pitch and rhythm extraction, voice recognition

Definitions

  • Example gaming platforms may be the Sony Playstation or Sony Playstation2 (PS2), each of which is sold in the form of a game console.
  • the game console is designed to connect to a monitor (usually a television) and enable user interaction through handheld controllers.
  • the game console is designed with specialized processing hardware, including a CPU, a graphics synthesizer for processing intensive graphics operations, a vector unit for performing geometry transformations, and other glue hardware, firmware, and software.
  • the game console is further designed with an optical disc tray for receiving game compact discs for local play through the game console. Online gaming is also possible, where a user can interactively play against or with other users over the Internet.
  • As game complexity continues to intrigue players, game and hardware manufacturers have continued to innovate to enable additional interactivity. In reality, however, the way in which users interact with a game has not changed dramatically over the years.
  • the present invention fills these needs by providing an apparatus and method that facilitates interactivity with a computer program.
  • the computer program is a game program, but without limitation, the apparatus and method can find applicability in any computer environment that may take in sound input to trigger control, input, or enable communication. More specifically, if sound is used to trigger control or input, the embodiments of the present invention will enable filtered input of particular sound sources, and the filtered input is configured to omit or focus away from sound sources that are not of interest. In the video game environment, depending on the sound source selected, the video game can respond with specific responses after processing the sound source of interest, without the distortion or noise of other sounds that may not be of interest.
  • an apparatus for capturing image and sound during interactivity with a computer program includes an image capture unit that is configured to capture one or more image frames. Also provided is a sound capture unit. The sound capture unit is configured to identify one or more sound sources. The sound capture unit generates data capable of being analyzed to determine a zone of focus at which to process sound to the substantial exclusion of sounds outside of the zone of focus. In this manner, sound that is captured and processed for the zone of focus is used for interactivity with the computer program.
  • a method for selective sound source listening during interactivity with a computer program includes receiving input from one or more sound sources at two or more sound source capture microphones. Then, the method includes determining delay paths from each of the sound sources and identifying a direction for each of the received inputs of each of the one or more sound sources. The method then includes filtering out sound sources that are not in an identified direction of a zone of focus. The zone of focus is configured to supply the sound source for the interactivity with the computer program.
  • a game system is provided. The game system includes an image-sound capture device that is configured to interface with a computing system that enables execution of an interactive computer game.
  • the image-capture device includes video capture hardware that is capable of being positioned to capture video from a zone of focus.
  • An array of microphones is provided for capturing sound from one or more sound sources. Each sound source is identified and associated with a direction relative to the image-sound capture device.
  • the zone of focus associated with the video capture hardware is configured to be used to identify one of the sound sources at the direction that is in the proximity of the zone of focus.
  • the interactive sound identification and tracking is applicable to interfacing with any computer program of any computing device.
  • the content of the sound source can be further processed to trigger, drive, direct, or control features or objects rendered by a computer program.
  • Figure 1 shows a game environment in which a video game program may be executed for interactivity with one or more users, in accordance with one embodiment of the present invention.
  • Figure 2 illustrates a three-dimensional diagram of an example image-sound capture device, in accordance with one embodiment of the present invention.
  • Figures 3A and 3B illustrate the processing of sound paths at different microphones that are designed to receive the input, and logic for outputting the selected sound source, in accordance with one embodiment of the present invention.
  • Figure 4 illustrates an example computing system interfacing with an image-sound capture device for processing input sound sources, in accordance with one embodiment of the present invention.
  • Figure 5 illustrates an example where multiple microphones are used to increase the precision of the direction identification of particular sound sources, in accordance with one embodiment of the present invention.
  • Figure 6 illustrates an example in which sound is identified at a particular spatial volume using microphones in different planes, in accordance with one embodiment of the present invention.
  • Figures 7 and 8 illustrate exemplary method operations that may be processed in the identification of sound sources and exclusion of non-focus sound sources, in accordance with one embodiment of the present invention.
  • An invention is disclosed for methods and apparatus for facilitating the identification of specific sound sources and filtering out unwanted sound sources when sound is used as an interactive tool with a computer program.
  • FIG. 1 shows a game environment 100 in which a video game program may be executed for interactivity with one or more users, in accordance with one embodiment of the present invention.
  • player 102 is shown in front of a monitor 108 that includes a display 110.
  • the monitor 108 is interconnected with a computing system 104.
  • the computing system can be a standard computer system, a game console or a portable computer system.
  • the game console can be one manufactured by Sony Computer Entertainment Inc., Microsoft, or any other manufacturer.
  • Computing system 104 is shown interconnected with an image-sound capture device 106.
  • the image-sound capture device 106 includes a sound capture unit 106a and an image capture unit 106b.
  • the player 102 is shown interactively communicating with a game figure 112 on the display 110.
  • the video game being executed is one in which input is at least partially provided by the player 102 by way of the image capture unit 106b, and the sound capture unit 106a.
  • the player 102 may move his hand so as to select interactive icons 114 on the display 110.
  • a translucent image of the player 102' is projected on the display 110 once captured by the image capture unit 106b.
  • the player 102 knows where to move his hand in order to cause selection of icons or interfacing with the game figure 112.
  • the interactive icon 114 is an icon that would allow the player to select "swing" so that the game figure 112 will swing the object being handled.
  • the player 102 may provide voice commands that can be captured by the sound capture unit 106a and then processed by the computing system 104 to provide interactivity with the video game being executed.
  • the sound source 116a is a voice command to "jump!".
  • the sound source 116a will then be captured by the sound capture unit 106a, and processed by the computing system 104 to then cause the game figure 112 to jump.
  • Voice recognition may be used to enable the identification of the voice commands.
  • the player 102 may be in communication with remote users connected to the internet or network, but who are also directly or partially involved in the interactivity of the game.
  • the sound capture unit 106a is configured to include at least two microphones which will enable the computing system 104 to select sound coming from particular directions. By enabling the computing system 104 to filter out directions which are not central to the game play (or the focus), distracting sounds in the game environment 100 will not interfere with or confuse the game execution when specific commands are being provided by the player 102.
  • the game player 102 may be tapping his feet and causing a tap noise which is a non-language sound 117. Such sound may be captured by the sound capture unit 106a, but then filtered out, as sound coming from the feet of player 102 is not in the zone of focus for the video game.
  • the zone of focus is preferably identified by the active image area that is the focus point of the image capture unit 106b. In an alternative manner, the zone of focus can be manually selected from a choice of zones presented to the user after an initialization stage.
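The patent does not specify how the active image area translates into a zone of focus. As an illustrative sketch only (the linear pinhole approximation and the function name are assumptions, not the patent's method), a horizontal pixel coordinate in the captured frame could be mapped to a bearing angle like this:

```python
def pixel_to_bearing(x_pixel, image_width, horizontal_fov_rad):
    """Map a horizontal pixel coordinate in the captured frame to a
    bearing angle relative to the camera axis, so the active image
    area of the image capture unit can define an angular zone of
    focus. Linear approximation; a real system would use the lens
    model of the image capture unit."""
    # 0.0 at the left edge of the frame, 1.0 at the right edge
    normalized = x_pixel / image_width
    # The center of the frame maps to 0 rad (straight ahead)
    return (normalized - 0.5) * horizontal_fov_rad
```

With a 640-pixel-wide frame, the center column maps to a bearing of 0, and the edges map to plus or minus half the field of view.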
  • a game observer 103 may be providing a sound source 116b which could be distracting to the processing by the computing system during the interactive game play.
  • the game observer 103 is not in the active image area of the image capture unit 106b and thus, sounds coming from the direction of game observer 103 will be filtered out so that the computing system 104 will not erroneously confuse commands from the sound source 116b with the sound sources coming from the player 102, as sound source 116a.
  • the image-sound capture device 106 includes an image capture unit 106b, and the sound capture unit 106a.
  • the image-sound capture device 106 is preferably capable of digitally capturing image frames and then transferring those image frames to the computing system 104 for further processing.
  • An example of the image capture unit 106b is a web camera, which is commonly used when video images are desired to be captured and then transferred digitally to a computing device for subsequent storage or communication over a network, such as the internet.
  • Other types of image capture devices may also work, whether analog or digital, so long as the image data is digitally processed to enable the identification and filtering.
  • the digital processing to enable the filtering is done in software, after the input data is received.
  • the sound capture unit 106a is shown including a pair of microphones (MIC1 and MIC2).
  • the microphones are standard microphones, which can be integrated into the housing that makes up the image-sound capture device 106.
  • Figure 3A illustrates the sound capture unit 106a when confronted with sound sources 116 from sound A and sound B.
  • sound A will project its audible sound and will be detected by MIC1 and MIC2 along sound paths 201a and 201b.
  • Sound B will be projected toward MIC1 and MIC2 over sound paths 202a and 202b.
  • the sound paths for sound A will be of different lengths, thus providing for a relative delay when compared to sound paths 202a and 202b.
  • the sound coming from each of sound A and sound B will then be processed using a standard triangulation algorithm so that direction selection can occur in box 216, shown in Figure 3B.
  • the sound coming from MIC1 and MIC2 will each be buffered in buffers 1 and 2 (210a, 210b), and passed through delay lines (212a, 212b).
  • the buffering and delay process will be controlled by software, although hardware can be custom designed to handle the operations as well.
  • direction selection 216 will trigger identification and selection of one of the sound sources 116.
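The buffering and delay-line stage of Figure 3B is described only functionally. One common way to realize the relative-delay measurement, sketched here under the assumption of digitized microphone signals (this is a stand-in for the patent's unspecified implementation, not the patented logic itself), is to locate the peak of the cross-correlation between the two channels:

```python
import numpy as np

def estimate_tdoa(mic1, mic2, sample_rate):
    """Estimate the time difference of arrival (TDOA) between two
    microphone signals by locating the cross-correlation peak.
    A positive result means the sound reached MIC2 first.  This
    plays the role of the buffers (210a, 210b) and delay lines
    (212a, 212b) that feed direction selection 216."""
    corr = np.correlate(mic1, mic2, mode="full")
    lag = np.argmax(corr) - (len(mic2) - 1)  # lag in samples
    return lag / sample_rate                 # delay in seconds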
  • FIG. 4 illustrates a computing system 250 that may be used in conjunction with the image-sound capture device 106, in accordance with one embodiment of the present invention.
  • the computing system 250 includes a processor 252, and memory 256.
  • a bus 254 will interconnect the processor and the memory 256 with the image-sound capture device 106.
  • the memory 256 will include at least part of the interactive program 258, and also include selective sound source listening logic or code 260 for processing the received sound source data. Based on where the zone of focus is identified to be by the image capture unit 106b, sound sources outside of the zone of focus will be selectively filtered by the selective sound source listening logic 260 being executed (e.g., by the processor and stored at least partially in the memory 256).
  • the computing system is shown in its most simplistic form, but emphasis is placed on the fact that any hardware configuration can be used, so long as the hardware can process the instructions to effect the processing of the incoming sound sources and thus enable the selective listening.
  • the computing system 250 is also shown interconnected with the display 110 by way of the bus.
  • the zone of focus is identified by the image capture unit being focused toward the sound source B. Sound coming from other sound sources, such as sound source A will be substantially filtered out by the selective sound source listening logic 260 when the sound is captured by the sound capture unit 106a and transferred to the computing system 250.
  • a player can be participating in an internet or networked video game competition with another user where each user's primary audible experience will be by way of speakers.
  • the speakers may be part of the computing system or may be part of the monitor 108.
  • the local speakers are what is generating sound source A as shown in Figure 4.
  • the selective sound source listening logic 260 will filter out the sound of sound source A so that the competing user will not be provided with feedback of his or her own sound or voice. By supplying this filtering, it is possible to have interactive communication over a network while interfacing with a video game, while advantageously avoiding destructive feedback during the process.
  • Figure 5 illustrates an example where the image-sound capture device 106 includes at least four microphones (MIC1 through MIC4).
  • the sound capture unit 106a is therefore capable of triangulation with better granularity to identify the location of sound sources 116 (A and B). That is, by providing an additional microphone, it is possible to more accurately define the location of the sound sources and thus, eliminate and filter out sound sources that are not of interest or can be destructive to game play or interactivity with a computing system.
  • sound source 116 (B) is the sound source of interest as identified by the video capture unit 106b.
  • Figure 6 illustrates how sound source B is localized to a spatial volume.
  • the spatial volume at which sound source B is located will define the volume of focus 274.
  • the image-sound capture device 106 will preferably include at least four microphones. At least one of the microphones will be in a different plane than three of the microphones. By maintaining one of the microphones in plane 271 and the remainder of the four in plane 270 of the image-sound capture device 106, it is possible to define a spatial volume.
  • noise coming from other people in the vicinity (shown as 276a and 276b) will be filtered out as they do not lie within the spatial volume defined in the volume focus 274. Additionally, noise that may be created just outside of the spatial volume, as shown by speaker 276c, will also be filtered out as it falls outside of the spatial volume.
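The representation of the volume of focus 274 is left open in the text. Purely as an illustration (the axis-aligned box model and the dictionary source format are assumptions for this sketch), the filtering of speakers 276a through 276c could be expressed as a containment test on each localized 3-D position:

```python
def in_volume_of_focus(point, lower, upper):
    """Return True when an estimated 3-D source position lies inside
    the volume of focus, modeled here (for illustration only) as an
    axis-aligned box given by opposite corners `lower` and `upper`."""
    return all(lo <= p <= hi for p, lo, hi in zip(point, lower, upper))

def filter_spatial(sources, lower, upper):
    """Discard sources, such as the nearby speakers 276a-276c, whose
    positions fall outside the volume of focus."""
    return [s for s in sources
            if in_volume_of_focus(s["pos"], lower, upper)]
```

A source just outside any face of the box, like speaker 276c just outside the spatial volume, fails the containment test and is dropped.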
  • FIG. 7 illustrates a flowchart diagram in accordance with one embodiment of the present invention.
  • the method begins at operation 302 where input is received from one or more sound sources at two or more sound capture microphones.
  • the two or more sound capture microphones are integrated into the image-sound capture device 106.
  • the two or more sound capture microphones can be part of a second module/housing that interfaces with the image capture unit 106b.
  • the sound capture unit 106a can include any number of sound capture microphones, and sound capture microphones can be placed in specific locations designed to capture sound from a user that may be interfacing with a computing system.
  • the method moves to operation 304 where a delay path for each of the sound sources is determined.
  • Example delay paths are defined by the sound paths 201 and 202 of Figure 3A.
  • the delay paths define the time it takes for sound waves to travel from the sound sources to the specific microphones that are situated to capture the sound. Based on the time it takes sound to travel from the particular sound sources 116, the system can determine each delay and approximate the location from which the sound is emanating using a standard triangulation algorithm.
  • a direction for each of the received inputs of the one or more sound sources is identified. That is, the direction from which the sound is originating from the sound sources 116 is identified relative to the location of the image-sound capture device, including the sound capture unit 106a. Based on the identified directions, sound sources that are not in an identified direction of a zone (or volume) of focus are filtered out in operation 308. By filtering out the sound sources that are not originating from directions that are in the vicinity of the zone of focus, it is possible to use the sound source not filtered out for interactivity with a computer program, as shown in operation 310.
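The filtering of operation 308 and the hand-off to operation 310 reduce to a simple per-source direction test. In this sketch the per-source dictionary format is a hypothetical representation chosen for illustration:

```python
def filter_to_zone(sources, zone_center_rad, zone_half_width_rad):
    """Keep only sound sources whose identified direction falls
    within the zone of focus (operation 308); the surviving sources
    supply the input used for interactivity with the computer
    program (operation 310)."""
    return [s for s in sources
            if abs(s["bearing"] - zone_center_rad) <= zone_half_width_rad]
```

With the zone of focus centered on the player, a bystander's bearing well off-axis is excluded before any command recognition runs.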
  • the interactive program can be a video game in which the user can interactively communicate with features of the video game, or players that may be opposing the primary player of the video game.
  • the opposing player can either be local or located at a remote location and be in communication with the primary user over a network, such as the internet.
  • the video game can also be played between a number of users in a group designed to interactively challenge each other's skills in a particular contest associated with the video game.
  • Figure 8 illustrates a flowchart diagram in which image-sound capture device operations 320 are illustrated separate from the software executed operations that are performed on the received input in operations 340.
  • the method proceeds to operation 304 where in software, the delay path for each of the sound sources is determined. Based on the delay paths, a direction for each of the received inputs is identified for each of the one or more sound sources in operation 306, as mentioned above.
  • the method moves to operation 312 where the identified direction that is in proximity of video capture is determined. For instance, video capture will be targeted at an active image area as shown in Figure 1. Thus, the proximity of video capture would be within this active image area (or volume), and any direction associated with a sound source that is within, or in proximity to, this active image area will be determined. Based on this determination, the method proceeds to operation 314 where directions (or volumes) that are not in proximity of video capture are filtered out. Accordingly, distractions, noises and other extraneous input that could interfere with video game play of the primary player will be filtered out in the processing that is performed by the software executed during game play.
  • the primary user can interact with the video game, interact with other users of the video game that are actively using the video game, or communicate with other users over the network that may be logged into or associated with transactions for the same video game that is of interest.
  • Such video game communication, interactivity and control will thus be uninterrupted by extraneous noises and/or observers that are not intended to be interactively communicating or participating in a particular game or interactive program.
  • the embodiments described herein may also apply to online gaming applications. That is, the embodiments described above may occur at a server that sends a video signal to multiple users over a distributed network, such as the Internet, to enable players at remote noisy locations to communicate with each other. It should be further appreciated that the embodiments described herein may be implemented through either a hardware or a software implementation. That is, the functional descriptions discussed above may be synthesized to define a microchip having logic configured to perform the functional tasks for each of the modules associated with the noise cancellation scheme. Also, the selective filtering of sound sources can have other applications, such as telephones.
  • a primary person (i.e., the caller)
  • a third party (i.e., the callee)
  • the phone being targeted toward the primary user (by the direction of the receiver, for example) can make the sound coming from the primary user's mouth the zone of focus, and thus enable the selection for listening to only the primary user. This selective listening will therefore enable the substantial filtering out of voices or noises that are not associated with the primary person, and thus, the receiving party will be able to receive a clearer communication from the primary person using the phone.
  • Additional technologies may also include other electronic equipment that can benefit from taking in sound as an input for control or communication. For instance, a user can control settings in an automobile by voice commands, while preventing other passengers from disrupting the commands.
  • Other applications may include computer control of applications, such as browsing applications, document preparation, or communications. By enabling this filtering, it is possible to more effectively issue voice or sound commands without interruption by surrounding sounds. As such, any electronic apparatus that takes in sound as an input for control or communication can benefit from such selective filtering.
  • the invention may employ various computer-implemented operations involving data stored in computer systems. These operations include operations requiring physical manipulation of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. Further, the manipulations performed are often referred to in terms, such as producing, identifying, determining, or comparing.
  • the above described invention may be practiced with other computer system configurations including hand-held devices, microprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers and the like. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network.
  • the invention can also be embodied as computer readable code on a computer readable medium.
  • the computer readable medium is any data storage device that can store data which can be thereafter read by a computer system, including an electromagnetic wave carrier. Examples of the computer readable medium include hard drives, network attached storage (NAS), read-only memory, random-access memory, CD-ROMs, CD-Rs, CD-RWs, magnetic tapes, and other optical and non-optical data storage devices.
  • the computer readable medium can also be distributed over a network coupled computer system so that the computer readable code is stored and executed in a distributed fashion.

Abstract

A method and apparatus for capturing image and sound during interactivity with a computer program is provided. The apparatus includes an image capture unit that is configured to capture one or more image frames. Also provided is a sound capture unit. The sound capture unit is configured to identify one or more sound sources. The sound capture unit generates data capable of being analyzed to determine a zone of focus at which to process sound to the substantial exclusion of sounds outside of the zone of focus. In this manner, sound that is captured and processed for the zone of focus is used for interactivity with the computer program.

Description

SELECTIVE SOUND SOURCE LISTENING IN CONJUNCTION WITH COMPUTER INTERACTIVE PROCESSING by Inventors
Richard L. Marks Xiadong Mao
BACKGROUND Description of the Related Art
[0001] The video game industry has seen many changes over the years. As computing power has expanded, developers of video games have likewise created game software that takes advantage of these increases in computing power. To this end, video game developers have been coding games that incorporate sophisticated operations and mathematics to produce a very realistic game experience.
[0002] Example gaming platforms may be the Sony Playstation or Sony Playstation2 (PS2), each of which is sold in the form of a game console. As is well known, the game console is designed to connect to a monitor (usually a television) and enable user interaction through handheld controllers. The game console is designed with specialized processing hardware, including a CPU, a graphics synthesizer for processing intensive graphics operations, a vector unit for performing geometry transformations, and other glue hardware, firmware, and software. The game console is further designed with an optical disc tray for receiving game compact discs for local play through the game console. Online gaming is also possible, where a user can interactively play against or with other users over the Internet.
[0003] As game complexity continues to intrigue players, game and hardware manufacturers have continued to innovate to enable additional interactivity. In reality, however, the way in which users interact with a game has not changed dramatically over the years.
[0004] In view of the foregoing, there is a need for methods and systems that enable more advanced user interactivity with game play.
SUMMARY OF THE INVENTION
[0005] Broadly speaking, the present invention fills these needs by providing an apparatus and method that facilitates interactivity with a computer program. In one embodiment, the computer program is a game program, but without limitation, the apparatus and method can find applicability in any computer environment that may take in sound input to trigger control, input, or enable communication. More specifically, if sound is used to trigger control or input, the embodiments of the present invention will enable filtered input of particular sound sources, and the filtered input is configured to omit or focus away from sound sources that are not of interest. In the video game environment, depending on the sound source selected, the video game can respond with specific responses after processing the sound source of interest, without the distortion or noise of other sounds that may not be of interest. Commonly, a game playing environment will be exposed to many background noises, such as music, other people, and the movement of objects. Once the sounds that are not of interest are substantially filtered out, the computer program can better respond to the sound of interest. The response can be in any form, such as a command, an initiation of action, a selection, a change in game status or state, the unlocking of features, etc.
[0006] In one embodiment, an apparatus for capturing image and sound during interactivity with a computer program is provided. The apparatus includes an image capture unit that is configured to capture one or more image frames. Also provided is a sound capture unit. The sound capture unit is configured to identify one or more sound sources. The sound capture unit generates data capable of being analyzed to determine a zone of focus at which to process sound to the substantial exclusion of sounds outside of the zone of focus.
In this manner, sound that is captured and processed for the zone of focus is used for interactivity with the computer program.
[0007] In another embodiment, a method for selective sound source listening during interactivity with a computer program is disclosed. The method includes receiving input from one or more sound sources at two or more sound source capture microphones. Then, the method includes determining delay paths from each of the sound sources and identifying a direction for each of the received inputs of each of the one or more sound sources. The method then includes filtering out sound sources that are not in an identified direction of a zone of focus. The zone of focus is configured to supply the sound source for the interactivity with the computer program.
[0008] In yet another embodiment, a game system is provided. The game system includes an image-sound capture device that is configured to interface with a computing system that enables execution of an interactive computer game. The image-capture device includes video capture hardware that is capable of being positioned to capture video from a zone of focus. An array of microphones is provided for capturing sound from one or more sound sources. Each sound source is identified and associated with a direction relative to the image-sound capture device. The zone of focus associated with the video capture hardware is configured to be used to identify one of the sound sources at the direction that is in the proximity of the zone of focus.
[0009] In general, the interactive sound identification and tracking is applicable to interfacing with any computer program of any computing device. Once the sound source is identified, the content of the sound source can be further processed to trigger, drive, direct, or control features or objects rendered by a computer program.
[0010] Other aspects and advantages of the invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] The invention, together with further advantages thereof, may best be understood by reference to the following description taken in conjunction with the accompanying drawings.
[0012] Figure 1 shows a game environment in which a video game program may be executed for interactivity with one or more users, in accordance with one embodiment of the present invention.
[0013] Figure 2 illustrates a three-dimensional diagram of an example image-sound capture device, in accordance with one embodiment of the present invention.
[0014] Figures 3A and 3B illustrate the processing of sound paths at different microphones that are designed to receive the input, and logic for outputting the selected sound source, in accordance with one embodiment of the present invention.
[0015] Figure 4 illustrates an example computing system interfacing with an image-sound capture device for processing input sound sources, in accordance with one embodiment of the present invention.
[0016] Figure 5 illustrates an example where multiple microphones are used to increase the precision of the direction identification of particular sound sources, in accordance with one embodiment of the present invention.
[0017] Figure 6 illustrates an example in which sound is identified at a particular spatial volume using microphones in different planes, in accordance with one embodiment of the present invention.
[0018] Figures 7 and 8 illustrate exemplary method operations that may be processed in the identification of sound sources and exclusion of non-focus sound sources, in accordance with one embodiment of the present invention.
DETAILED DESCRIPTION
[0019] An invention is disclosed for methods and apparatus for facilitating the identification of specific sound sources and filtering out unwanted sound sources when sound is used as an interactive tool with a computer program.
[0020] In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without some or all of these specific details. In other instances, well known process steps have not been described in detail in order not to obscure the present invention.
[0021] Figure 1 shows a game environment 100 in which a video game program may be executed for interactivity with one or more users, in accordance with one embodiment of the present invention. As illustrated, player 102 is shown in front of a monitor 108 that includes a display 110. The monitor 108 is interconnected with a computing system 104. The computing system can be a standard computer system, a game console, or a portable computer system. In a specific example, but not limited to any brand, the game console can be one manufactured by Sony Computer Entertainment Inc., Microsoft, or any other manufacturer.
[0022] Computing system 104 is shown interconnected with an image-sound capture device 106. The image-sound capture device 106 includes a sound capture unit 106a and an image capture unit 106b. The player 102 is shown interactively communicating with a game figure 112 on the display 110. The video game being executed is one in which input is at least partially provided by the player 102 by way of the image capture unit 106b and the sound capture unit 106a. As illustrated, the player 102 may move his hand so as to select interactive icons 114 on the display 110. A translucent image of the player 102' is projected on the display 110 once captured by the image capture unit 106b. Thus, the player 102 knows where to move his hand in order to cause selection of icons or interfacing with the game figure 112. Techniques for capturing these movements and interactions can vary, but exemplary techniques are described in United Kingdom Applications GB 0304024.3 (PCT/GB2004/000693) and GB 0304022.7 (PCT/GB2004/000703), each filed on February 21, 2003, and each of which is hereby incorporated by reference.
[0023] In the example shown, the interactive icon 114 is an icon that would allow the player to select "swing" so that the game figure 112 will swing the object being handled. In addition, the player 102 may provide voice commands that can be captured by the sound capture unit 106a and then processed by the computing system 104 to provide interactivity with the video game being executed. As shown, the sound source 116a is a voice command to "jump!". The sound source 116a will then be captured by the sound capture unit 106a and processed by the computing system 104 to cause the game figure 112 to jump. Voice recognition may be used to enable the identification of the voice commands. Alternatively, the player 102 may be in communication with remote users connected to the internet or a network, who are also directly or partially involved in the interactivity of the game.
[0024] In accordance with one embodiment of the present invention, the sound capture unit 106a is configured to include at least two microphones, which will enable the computing system 104 to select sound coming from particular directions. By enabling the computing system 104 to filter out directions which are not central to the game play (or the focus), distracting sounds in the game environment 100 will not interfere with or confuse the game execution when specific commands are being provided by the player 102. For example, the game player 102 may be tapping his feet and causing a tap noise, which is a non-language sound 117. Such sound may be captured by the sound capture unit 106a, but then filtered out, as sound coming from the feet of the player 102 is not in the zone of focus for the video game.
[0025] As will be described below, the zone of focus is preferably identified by the active image area that is the focus point of the image capture unit 106b. Alternatively, the zone of focus can be manually selected from a choice of zones presented to the user after an initialization stage. Continuing with the example of Figure 1, a game observer 103 may be providing a sound source 116b that could be distracting to the processing by the computing system during the interactive game play. However, the game observer 103 is not in the active image area of the image capture unit 106b, and thus sounds coming from the direction of the game observer 103 will be filtered out so that the computing system 104 will not erroneously confuse commands from the sound source 116b with the sound source coming from the player 102, sound source 116a.
[0026] The image-sound capture device 106 includes an image capture unit 106b and a sound capture unit 106a. The image-sound capture device 106 is preferably capable of digitally capturing image frames and then transferring those image frames to the computing system 104 for further processing. An example of the image capture unit 106b is a web camera, which is commonly used when video images are desired to be captured and then transferred digitally to a computing device for subsequent storage or communication over a network, such as the internet. Other types of image capture devices may also work, whether analog or digital, so long as the image data is digitally processed to enable the identification and filtering. In one preferred embodiment, the digital processing to enable the filtering is done in software, after the input data is received. The sound capture unit 106a is shown including a pair of microphones (MIC1 and MIC2). The microphones are standard microphones, which can be integrated into the housing that makes up the image-sound capture device 106.
[0027] Figure 3A illustrates the sound capture unit 106a when confronted with sound sources 116 from sound A and sound B. As shown, sound A will project its audible sound and will be detected by MIC1 and MIC2 along sound paths 201a and 201b. Sound B will be projected toward MIC1 and MIC2 over sound paths 202a and 202b. As illustrated, the sound paths for sound A will be of different lengths, thus providing for a relative delay when compared to sound paths 202a and 202b. The sound coming from each of sound A and sound B will then be processed using a standard triangulation algorithm so that direction selection can occur in box 216, shown in Figure 3B. The sound coming from MIC1 and MIC2 will each be buffered in buffers 1 and 2 (210a, 210b), and passed through delay lines (212a, 212b). In one embodiment, the buffering and delay process will be controlled by software, although hardware can be custom designed to handle the operations as well. Based on the triangulation, direction selection 216 will trigger identification and selection of one of the sound sources 116.
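Although the disclosure describes this processing as a combination of buffers, delay lines, and triangulation logic, the underlying delay measurement can be sketched briefly. The following is a hypothetical illustration, not part of the original disclosure; the function name `estimate_tdoa` is invented. A standard way to measure the relative delay between two microphone channels is to locate the peak of their cross-correlation:

```python
import numpy as np

def estimate_tdoa(mic1, mic2, sample_rate):
    """Estimate the time difference of arrival (seconds) between two
    microphone signals by locating the peak of their cross-correlation.
    A positive result means the sound reached mic1 later than mic2."""
    corr = np.correlate(mic1, mic2, mode="full")
    lag = np.argmax(corr) - (len(mic2) - 1)  # peak offset in samples
    return lag / sample_rate
```

With the measured delay in hand, a triangulation (or, for a single microphone pair, a far-field bearing computation) can supply the direction used by direction selection 216.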
[0028] The sound coming from each of MIC1 and MIC2 will be summed in box 214 before being output as the output of the selected source. In this manner, sound coming from directions other than the direction in the active image area will be filtered out so that such sound sources do not distract processing by the computing system 104, or distract communication with other users that may be interactively playing a video game over a network, or the internet.
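The buffer, delay-line, and summation pipeline of Figures 3A and 3B corresponds to what is commonly called a delay-and-sum beamformer. A minimal sketch follows (hypothetical names, integer-sample delays only; not part of the original disclosure):

```python
import numpy as np

def delay_and_sum(mic_signals, delays_samples):
    """Advance each channel by its per-microphone delay (in samples) so the
    selected source aligns across channels, then sum the aligned channels.
    Sound from the selected direction adds coherently; off-axis sound
    remains misaligned and tends to cancel or stay weak."""
    n = min(len(sig) - d for sig, d in zip(mic_signals, delays_samples))
    aligned = [sig[d:d + n] for sig, d in zip(mic_signals, delays_samples)]
    return np.sum(aligned, axis=0)
```

The delays would be chosen from the direction identified for the selected source, so that the summation in box 214 reinforces only the sound source of interest.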
[0029] Figure 4 illustrates a computing system 250 that may be used in conjunction with the image-sound capture device 106, in accordance with one embodiment of the present invention. The computing system 250 includes a processor 252, and memory 256. A bus 254 will interconnect the processor and the memory 256 with the image-sound capture device 106. The memory 256 will include at least part of the interactive program 258, and also include selective sound source listening logic or code 260 for processing the received sound source data. Based on where the zone of focus is identified to be by the image capture unit 106b, sound sources outside of the zone of focus will be selectively filtered by the selective sound source listening logic 260 being executed (e.g., by the processor and stored at least partially in the memory 256). The computing system is shown in its simplest form, but emphasis is placed on the fact that any hardware configuration can be used, so long as the hardware can process the instructions to effect the processing of the incoming sound sources and thus enable the selective listening.
[0030] The computing system 250 is also shown interconnected with the display 110 by way of the bus. In this example, the zone of focus is identified by the image capture unit being focused toward the sound source B. Sound coming from other sound sources, such as sound source A, will be substantially filtered out by the selective sound source listening logic 260 when the sound is captured by the sound capture unit 106a and transferred to the computing system 250.
[0031] In one specific example, a player can be participating in an internet or networked video game competition with another user, where each user's primary audible experience will be by way of speakers. The speakers may be part of the computing system or may be part of the monitor 108. Suppose, therefore, that the local speakers are what is generating sound source A as shown in Figure 4. In order not to feed back the sound coming out of the local speakers for sound source A to the competing user, the selective sound source listening logic 260 will filter out the sound of sound source A so that the competing user will not be provided with feedback of his or her own sound or voice. By supplying this filtering, it is possible to have interactive communication over a network while interfacing with a video game, while advantageously avoiding destructive feedback during the process.
[0032] Figure 5 illustrates an example where the image-sound capture device 106 includes at least four microphones (MIC1 through MIC4). The sound capture unit 106a is therefore capable of triangulation with better granularity to identify the location of sound sources 116 (A and B). That is, by providing an additional microphone, it is possible to more accurately define the location of the sound sources and thus eliminate and filter out sound sources that are not of interest or can be destructive to game play or interactivity with a computing system. As illustrated in Figure 5, sound source 116 (B) is the sound source of interest as identified by the image capture unit 106b. Continuing with the example of Figure 5, Figure 6 identifies how sound source B is identified to a spatial volume.
[0033] The spatial volume at which sound source B is located will define the volume of focus 274. By identifying a volume of focus, it is possible to eliminate or filter out noises that are not within a specific volume (i.e., which are not just in a direction). To facilitate the selection of a volume of focus 274, the image-sound capture device 106 will preferably include at least four microphones, at least one of which is in a different plane than the other three. By maintaining one of the microphones in plane 271 and the remaining three in plane 270 of the image-sound capture device 106, it is possible to define a spatial volume.
[0034] Consequently, noise coming from other people in the vicinity (shown as 276a and 276b) will be filtered out, as they do not lie within the spatial volume defined by the volume of focus 274. Additionally, noise that may be created just outside of the spatial volume, as shown by speaker 276c, will also be filtered out, as it falls outside of the spatial volume.
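Once a three-dimensional source position has been estimated (for example by multilateration over the four microphone delays, which is assumed here rather than shown), the volume-of-focus test itself reduces to a membership check. A hypothetical sketch, with the volume of focus 274 modeled as an axis-aligned box:

```python
def in_volume_of_focus(source_pos, volume_min, volume_max):
    """True if an estimated 3-D source position (x, y, z) lies inside the
    axis-aligned box taken as the volume of focus; sources outside the box,
    like speakers 276a-276c, would be filtered out."""
    return all(lo <= p <= hi
               for p, lo, hi in zip(source_pos, volume_min, volume_max))
```

A real system could of course use a frustum or other shape matched to the camera's field of view rather than a box; the box is only the simplest volume to test against.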
[0035] Figure 7 illustrates a flowchart diagram in accordance with one embodiment of the present invention. The method begins at operation 302 where input is received from one or more sound sources at two or more sound capture microphones. In one example, the two or more sound capture microphones are integrated into the image-sound capture device 106. Alternatively, the two or more sound capture microphones can be part of a second module/housing that interfaces with the image capture unit 106b. As yet another alternative, the sound capture unit 106a can include any number of sound capture microphones, and sound capture microphones can be placed in specific locations designed to capture sound from a user that may be interfacing with a computing system.
[0036] The method moves to operation 304 where a delay path for each of the sound sources is determined. Example delay paths are defined by the sound paths 201 and 202 of Figure 3A. As is well known, the delay paths define the time it takes for sound waves to travel from the sound sources to the specific microphones that are situated to capture the sound. Based on the delay it takes sound to travel from the particular sound sources 116, the microphones can determine what the delay is and approximate the location from which the sound is emanating using a standard triangulation algorithm.
[0037] The method then continues to operation 306 where a direction for each of the received inputs of the one or more sound sources is identified. That is, the direction from which the sound of each sound source 116 originates is identified relative to the location of the image-sound capture device, including the sound capture unit 106a. Based on the identified directions, sound sources that are not in an identified direction of a zone (or volume) of focus are filtered out in operation 308. By filtering out the sound sources that are not originating from directions in the vicinity of the zone of focus, it is possible to use the sound source that is not filtered out for interactivity with a computer program, as shown in operation 310.
[0038] For instance, the interactive program can be a video game in which the user can interactively communicate with features of the video game, or with players that may be opposing the primary player of the video game. The opposing player can either be local or located at a remote location and be in communication with the primary user over a network, such as the internet. In addition, the video game can also be played between a number of users in a group designed to interactively challenge each other's skills in a particular contest associated with the video game.
[0039] Figure 8 illustrates a flowchart diagram in which image-sound capture device operations 320 are illustrated separately from the software-executed operations that are performed on the received input in operations 340. Thus, once the input from the one or more sound sources at the two or more sound capture microphones is received in operation 302, the method proceeds to operation 304 where, in software, the delay path for each of the sound sources is determined. Based on the delay paths, a direction for each of the received inputs is identified for each of the one or more sound sources in operation 306, as mentioned above.
[0040] At this point, the method moves to operation 312 where the identified direction that is in proximity of video capture is determined. For instance, video capture will be targeted at an active image area as shown in Figure 1. Thus, the proximity of video capture would be within this active image area (or volume), and any direction associated with a sound source that is within, or in proximity to, this active image area will be determined. Based on this determination, the method proceeds to operation 314 where directions (or volumes) that are not in proximity of video capture are filtered out. Accordingly, distractions, noises, and other extraneous input that could interfere in the video game play of the primary player will be filtered out in the processing that is performed by the software executed during game play.
[0041] Consequently, the primary user can interact with the video game, interact with other users of the video game that are actively using the video game, or communicate with other users over the network that may be logged into or associated with transactions for the same video game that is of interest. Such video game communication, interactivity, and control will thus be uninterrupted by extraneous noises and/or observers that are not intended to be interactively communicating or participating in a particular game or interactive program.
[0042] It should be appreciated that the embodiments described herein may also apply to online gaming applications. That is, the embodiments described above may occur at a server that sends a video signal to multiple users over a distributed network, such as the Internet, to enable players at remote noisy locations to communicate with each other. It should be further appreciated that the embodiments described herein may be implemented through either a hardware or a software implementation. That is, the functional descriptions discussed above may be synthesized to define a microchip having logic configured to perform the functional tasks for each of the modules associated with the noise cancellation scheme.
[0043] Also, the selective filtering of sound sources can have other applications, such as telephones. In phone use environments, there is usually a primary person (i.e., the caller) desiring to have a conversation with a third party (i.e., the callee). During that communication, however, there may be other people in the vicinity who are either talking or making noise. The phone, being targeted toward the primary user (by the direction of the receiver, for example), can make the sound coming from the primary user's mouth the zone of focus, and thus enable the selection for listening to only the primary user. This selective listening will therefore enable the substantial filtering out of voices or noises that are not associated with the primary person, and thus the receiving party will be able to receive a clearer communication from the primary person using the phone.
[0044] Additional technologies may also include other electronic equipment that can benefit from taking in sound as an input for control or communication. For instance, a user can control settings in an automobile by voice commands, while preventing other passengers from disrupting the commands. Other applications may include computer control of applications, such as browsing applications, document preparation, or communications. By enabling this filtering, it is possible to more effectively issue voice or sound commands without interruption by surrounding sounds. As such, any electronic apparatus that takes in sound for control or communication can benefit from the selective filtering described herein.
[0045] Further, the embodiments of the present invention have a wide array of applications, and the scope of the claims should be read to include any such application that can benefit from such embodiments.
[0046] For instance, in a similar application, it may be possible to filter out sound sources using sound analysis. If sound analysis is used, it is possible to use as few as one microphone. The sound captured by the single microphone can be digitally analyzed (in software or hardware) to determine which voice or sound is of interest. In some environments, such as gaming, it may be possible for the primary user to record his or her voice once to train the system to identify the particular voice. In this manner, exclusion of other voices or sounds will be facilitated. Consequently, it would not be necessary to identify a direction, as filtering could be done based on sound tones and/or frequencies.
[0047] All of the advantages mentioned above with respect to sound filtering, when direction and volume are taken into account, are equally applicable.
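The single-microphone, frequency-based alternative of paragraph [0046] could be approximated as follows. This is a loose sketch with invented names, not part of the original disclosure; a real system would use more robust features than a raw spectrum. The user's one-time training recording yields a spectral profile, and captured sound is accepted only when it correlates with that profile:

```python
import numpy as np

def spectral_profile(signal):
    """Unit-norm magnitude spectrum, used as a crude 'voice print'."""
    mag = np.abs(np.fft.rfft(signal))
    return mag / (np.linalg.norm(mag) + 1e-12)

def matches_trained_voice(signal, trained_profile, threshold=0.8):
    """Accept the captured sound only when its spectrum correlates strongly
    with the profile recorded during the one-time training step."""
    return float(spectral_profile(signal) @ trained_profile) >= threshold
```

Sounds whose tonal content differs from the trained profile (other voices, foot taps, music) would score a low correlation and be excluded without any direction estimate.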
[0048] With the above embodiments in mind, it should be understood that the invention may employ various computer-implemented operations involving data stored in computer systems. These operations include operations requiring physical manipulation of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. Further, the manipulations performed are often referred to in terms such as producing, identifying, determining, or comparing.
[0049] The above described invention may be practiced with other computer system configurations including hand-held devices, microprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and the like. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network.
[0050] The invention can also be embodied as computer readable code on a computer readable medium. The computer readable medium is any data storage device that can store data which can be thereafter read by a computer system, including an electromagnetic wave carrier. Examples of the computer readable medium include hard drives, network attached storage (NAS), read-only memory, random-access memory, CD-ROMs, CD-Rs, CD-RWs, magnetic tapes, and other optical and non-optical data storage devices. The computer readable medium can also be distributed over a network coupled computer system so that the computer readable code is stored and executed in a distributed fashion.
[0051] Although the foregoing invention has been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes and modifications may be practiced within the scope of the appended claims. Accordingly, the present embodiments are to be considered as illustrative and not restrictive, and the invention is not to be limited to the details given herein, but may be modified within the scope and equivalents of the appended claims.
What is claimed is:


1. An apparatus for capturing image and sound during interactivity with a computer program, comprising: an image capture unit configured to capture one or more image frames; and a sound capture unit, the sound capture unit being configured to identify one or more sound sources, the sound capture unit generating data capable of being analyzed to determine a zone of focus at which to process sound to the substantial exclusion of sounds outside of the zone of focus, wherein sound that is captured and processed for the zone of focus is used for interactivity with the computer program.
2. An apparatus for capturing image and sound during interactivity with a computer program as recited in claim 1, wherein the sound capture unit includes an array of microphones, the array of microphones being configured for receiving sound from the one or more sound sources, the sounds of the one or more sound sources defining sound paths to each of the microphones.
3. An apparatus for capturing image and sound during interactivity with a computer program as recited in claim 2, wherein the sound paths include particular delays that enable calculation of direction of each of the one or more sound sources relative to the apparatus for capturing image and sound.
4. An apparatus for capturing image and sound during interactivity with a computer program as recited in claim 1, further comprising: a computing system for interfacing with the apparatus for capturing image and sound, the computing system including, a processor, and memory, the memory being configured to store at least part of the computer program and selective sound source listening code, the selective sound source listening code enabling the identification of which of the one or more sound sources is associated with the zone of focus.
5. An apparatus for capturing image and sound during interactivity with a computer program as recited in claim 1, wherein the sound capture unit includes at least four microphones, and one of the four microphones is not in a same plane as the others.
6. An apparatus for capturing image and sound during interactivity with a computer program as recited in claim 5, wherein the four microphones define a spatial volume.
7. An apparatus for capturing image and sound during interactivity with a computer program as recited in claim 6, wherein the spatial volume is defined as a volume of focus for listening during interactivity with the computer program.
8. An apparatus for capturing image and sound during interactivity with a computer program as recited in claim 7, wherein the computer program is a game program.
9. An apparatus for capturing image and sound during interactivity with a computer program as recited in claim 1, wherein the computer program is a game program.
10. An apparatus for capturing image and sound during interactivity with a computer program as recited in claim 9, wherein the image capture unit is a camera and the sound capture unit is defined by an array of two or more microphones.
11. A method for selective sound source listening during interactivity with a computer program, comprising: receiving input from one or more sound sources at two or more sound source capture microphones; determining delay paths from each of the sound sources; identifying a direction for each of the received inputs of each of the one or more sound sources; and filtering out sound sources that are not in an identified direction of a zone of focus, the zone of focus supplying the sound source for the interactivity with the computer program.
12. A method for selective sound source listening during interactivity with a computer program as recited in claim 11, wherein filtering receives processed input data after analysis by an image capture unit, the image capture unit being directionally positioned to receive image input for the computer program.
13. A method for selective sound source listening during interactivity with a computer program as recited in claim 11, wherein the computer program is a game, and the game receives interactive input from both image data and sound data, the sound data being from the sound source of the zone of focus.
14. A method for selective sound source listening during interactivity with a computer program as recited in claim 11, wherein the two or more sound capture microphones include at least four microphones, and at least one of the four microphones is on a different plane than the others.
15. A method for selective sound source listening during interactivity with a computer program as recited in claim 14, wherein identifying the direction for each of the received inputs of each of the one or more sound sources includes processing a triangulation algorithm, the triangulation algorithm defining the direction that is relative to a location at which input is received from the one or more sound sources at the two or more sound source capture microphones.
16. A method for selective sound source listening during interactivity with a computer program as recited in claim 15, further comprising: buffering the received input from the one or more sound sources associated with the two or more sound source capture microphones; and delay processing the received buffered inputs, the filtering further including selecting one of the sound sources, the selected sound source output being a summation of sound from each of the sound source capture microphones.
17. A game system, comprising: an image-sound capture device, the image-sound capture device being configured to interface with a computing system that enables execution of an interactive computer game, the image-capture device including, video capture hardware capable of being positioned to capture video from a zone of focus, and an array of microphones for capturing sound from one or more sound sources, each sound source being identified and associated with a direction relative to the image-sound capture device, wherein the zone of focus associated with the video capture hardware is configured to be used to identify one of the sound sources at the direction that is in the proximity of the zone of focus.
18. A game system as recited in claim 17, wherein the video capture hardware receives video data to enable interactivity with features of the computer game.
19. A game system as recited in claim 17, wherein the sound source in the proximity of the zone of focus is enabled for interactivity with the computer game or voice communication with other game users.
20. A game system as recited in claim 19, wherein sound sources outside of the zone of focus are filtered out of interactivity with the computer game.
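Claims 17 through 20 tie source selection to the camera's zone of focus: sources whose direction falls inside the zone are enabled for interactivity, and the rest are filtered out. A hypothetical sketch of that gating step follows; the angular representation, names, and thresholds are assumptions for illustration, not the patent's own definitions.

```python
def filter_by_zone_of_focus(source_directions, zone_center_deg, zone_width_deg):
    """Keep only the sound sources whose direction lies inside the
    camera's zone of focus; sources outside it are excluded from game
    interactivity, as in claim 20.

    source_directions maps a source id to its bearing in degrees
    relative to the image-sound capture device.
    """
    half_width = zone_width_deg / 2.0
    return {source: bearing
            for source, bearing in source_directions.items()
            if abs(bearing - zone_center_deg) <= half_width}

# The player in front of the camera passes; off-axis sources do not.
sources = {"player": 4.0, "television": 55.0, "doorway": -70.0}
in_focus = filter_by_zone_of_focus(sources, zone_center_deg=0.0,
                                   zone_width_deg=30.0)
```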
21. An apparatus for capturing sound during interactivity with a computer program, comprising: a sound capture unit for capturing sound from one or more sound sources; a processor and memory for receiving and processing the sound, the processor being configured to execute instructions to identify one of the sound sources as associated with a zone of focus, the sound from the identified sound source being processed to enable interactive input with the computer program.
22. An apparatus for capturing sound during interactivity with a computer program as recited in claim 21, wherein the instructions to identify one of the sound sources use triangulation to identify a direction of each of the sound sources.
23. An apparatus for capturing sound during interactivity with a computer program as recited in claim 21, wherein the instructions to identify one of the sound sources use sound frequencies to identify each of the sound sources.
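Claim 23 identifies sources by their sound frequencies. One simple reading is that sources occupying distinct frequency bands can be told apart by their dominant spectral peak; the sketch below assumes that reading, and the function and signal names are illustrative.

```python
import numpy as np

def dominant_frequency(signal, sample_rate):
    """Return the dominant frequency (Hz) of a captured signal, usable
    as a crude signature when sound sources occupy distinct bands."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(freqs[np.argmax(spectrum)])

# A 440 Hz tone sampled for one second at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440.0 * t)
```

A real apparatus would track such signatures over time and combine them with direction estimates; this shows only the per-frame frequency measurement.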
24. An apparatus for capturing sound during interactivity with a computer program as recited in claim 21, wherein the interactive input is one of communication with a program or communication with a third party.
25. An apparatus for capturing sound during interactivity with a computer program as recited in claim 21, wherein the interactive input interfaces with features of a computer game.
26. An apparatus for capturing sound during interactivity with a computer program as recited in claim 21, wherein the interactive input interfaces with an electronic apparatus.
PCT/US2006/016670 2005-05-05 2006-04-28 Selective sound source listening in conjunction with computer interactive processing WO2006121681A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP06758867A EP1877149A1 (en) 2005-05-05 2006-04-28 Selective sound source listening in conjunction with computer interactive processing
CN2006800064384A CN101132839B (en) 2005-05-05 2006-04-28 Selective sound source listening in conjunction with computer interactive processing
JP2008510106A JP5339900B2 (en) 2005-05-05 2006-04-28 Selective sound source listening by computer interactive processing

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US67841305P 2005-05-05 2005-05-05
US60/678,413 2005-05-05

Publications (1)

Publication Number Publication Date
WO2006121681A1 true WO2006121681A1 (en) 2006-11-16

Family

ID=36721197

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2006/016670 WO2006121681A1 (en) 2005-05-05 2006-04-28 Selective sound source listening in conjunction with computer interactive processing

Country Status (6)

Country Link
EP (1) EP1877149A1 (en)
JP (1) JP5339900B2 (en)
KR (1) KR100985694B1 (en)
CN (1) CN101132839B (en)
TW (1) TWI308080B (en)
WO (1) WO2006121681A1 (en)

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009021124A2 (en) * 2007-08-07 2009-02-12 Dna Digital Media Group System and method for a motion sensing amusement device
US7783061B2 (en) 2003-08-27 2010-08-24 Sony Computer Entertainment Inc. Methods and apparatus for the targeted sound detection
US7803050B2 (en) 2002-07-27 2010-09-28 Sony Computer Entertainment Inc. Tracking device with sound emitter for use in obtaining information for controlling game program execution
US7809145B2 (en) 2006-05-04 2010-10-05 Sony Computer Entertainment Inc. Ultra small microphone array
US8073157B2 (en) 2003-08-27 2011-12-06 Sony Computer Entertainment Inc. Methods and apparatus for targeted sound detection and characterization
US8139793B2 (en) 2003-08-27 2012-03-20 Sony Computer Entertainment Inc. Methods and apparatus for capturing audio signals based on a visual image
WO2014143940A1 (en) * 2013-03-15 2014-09-18 The Nielsen Company (Us), Llc Methods and apparatus to detect spillover in an audience monitoring system
US8947347B2 (en) 2003-08-27 2015-02-03 Sony Computer Entertainment Inc. Controlling actions in a video game unit
US9094710B2 (en) 2004-09-27 2015-07-28 The Nielsen Company (Us), Llc Methods and apparatus for using location information to manage spillover in an audience monitoring system
US9118960B2 (en) 2013-03-08 2015-08-25 The Nielsen Company (Us), Llc Methods and systems for reducing spillover by detecting signal distortion
US9174119B2 (en) 2002-07-27 2015-11-03 Sony Computer Entertainment America, LLC Controller for providing inputs to control execution of a program when inputs are combined
US9191704B2 (en) 2013-03-14 2015-11-17 The Nielsen Company (Us), Llc Methods and systems for reducing crediting errors due to spillover using audio codes and/or signatures
US9217789B2 (en) 2010-03-09 2015-12-22 The Nielsen Company (Us), Llc Methods, systems, and apparatus to calculate distance from audio sources
US9219928B2 (en) 2013-06-25 2015-12-22 The Nielsen Company (Us), Llc Methods and apparatus to characterize households with media meter data
US9220980B2 (en) 2011-12-19 2015-12-29 Empire Technology Development Llc Pause and resume schemes for gesture-based game
US9258607B2 (en) 2010-12-14 2016-02-09 The Nielsen Company (Us), Llc Methods and apparatus to determine locations of audience members
US9264748B2 (en) 2013-03-01 2016-02-16 The Nielsen Company (Us), Llc Methods and systems for reducing spillover by measuring a crest factor
US9426525B2 (en) 2013-12-31 2016-08-23 The Nielsen Company (Us), Llc. Methods and apparatus to count people in an audience
US9563265B2 (en) 2012-01-12 2017-02-07 Qualcomm Incorporated Augmented reality with sound and geometric analysis
US9680583B2 (en) 2015-03-30 2017-06-13 The Nielsen Company (Us), Llc Methods and apparatus to report reference media data to multiple data collection facilities
US9848222B2 (en) 2015-07-15 2017-12-19 The Nielsen Company (Us), Llc Methods and apparatus to detect spillover
US9924224B2 (en) 2015-04-03 2018-03-20 The Nielsen Company (Us), Llc Methods and apparatus to determine a state of a media presentation device

Families Citing this family (15)

Publication number Priority date Publication date Assignee Title
US8323106B2 (en) 2008-05-30 2012-12-04 Sony Computer Entertainment America Llc Determination of controller three-dimensional location using image analysis and ultrasonic communication
EP1880866A1 (en) 2006-07-19 2008-01-23 Sicpa Holding S.A. Oriented image coating on transparent substrate
TWI404967B (en) * 2007-10-19 2013-08-11 Chi Mei Comm Systems Inc System and method for locating sound sources
US8953029B2 (en) * 2009-05-08 2015-02-10 Sony Computer Entertainment America Llc Portable device interaction via motion sensitive controller
CN101819758B (en) * 2009-12-22 2013-01-16 中兴通讯股份有限公司 System of controlling screen display by voice and implementation method
EP2517478B1 (en) 2009-12-24 2017-11-01 Nokia Technologies Oy An apparatus
US9361730B2 (en) * 2012-07-26 2016-06-07 Qualcomm Incorporated Interactions of tangible and augmented reality objects
CN104422922A (en) * 2013-08-19 2015-03-18 中兴通讯股份有限公司 Method and device for realizing sound source localization by utilizing mobile terminal
US10163455B2 (en) 2013-12-03 2018-12-25 Lenovo (Singapore) Pte. Ltd. Detecting pause in audible input to device
WO2017184149A1 (en) 2016-04-21 2017-10-26 Hewlett-Packard Development Company, L.P. Electronic device microphone listening modes
CN106067301B (en) * 2016-05-26 2019-06-25 浪潮金融信息技术有限公司 A method of echo noise reduction is carried out using multidimensional technology
CN109307856A (en) * 2017-07-27 2019-02-05 深圳市冠旭电子股份有限公司 A kind of sterically defined exchange method of robot and device
CN107886965B (en) * 2017-11-28 2021-04-20 游密科技(深圳)有限公司 Echo cancellation method for game background sound
CN109168075B (en) * 2018-10-30 2021-11-30 重庆辉烨物联科技有限公司 Video information transmission method, system and server
CN110602424A (en) * 2019-08-28 2019-12-20 维沃移动通信有限公司 Video processing method and electronic equipment

Citations (3)

Publication number Priority date Publication date Assignee Title
US5993314A (en) * 1997-02-10 1999-11-30 Stadium Games, Ltd. Method and apparatus for interactive audience participation by audio command
US20020048376A1 (en) 2000-08-24 2002-04-25 Masakazu Ukita Signal processing apparatus and signal processing method
US20040046736A1 (en) 1997-08-22 2004-03-11 Pryor Timothy R. Novel man machine interfaces and applications

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
JPH07218614A (en) * 1994-01-31 1995-08-18 Suzuki Motor Corp Method and apparatus for calculating position of sound source
JPH11331827A (en) * 1998-05-12 1999-11-30 Fujitsu Ltd Television camera
JP2000163178A (en) * 1998-11-26 2000-06-16 Hitachi Ltd Interaction device with virtual character and storage medium storing program generating video of virtual character
IL134979A (en) * 2000-03-09 2004-02-19 Be4 Ltd System and method for optimization of three-dimensional audio
JP4868671B2 (en) * 2001-09-27 2012-02-01 中部電力株式会社 Sound source exploration system
US7613310B2 (en) * 2003-08-27 2009-11-03 Sony Computer Entertainment Inc. Audio input system

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
US5993314A (en) * 1997-02-10 1999-11-30 Stadium Games, Ltd. Method and apparatus for interactive audience participation by audio command
US20040046736A1 (en) 1997-08-22 2004-03-11 Pryor Timothy R. Novel man machine interfaces and applications
US20020048376A1 (en) 2000-08-24 2002-04-25 Masakazu Ukita Signal processing apparatus and signal processing method

Cited By (43)

Publication number Priority date Publication date Assignee Title
US9174119B2 (en) 2002-07-27 2015-11-03 Sony Computer Entertainment America, LLC Controller for providing inputs to control execution of a program when inputs are combined
US7803050B2 (en) 2002-07-27 2010-09-28 Sony Computer Entertainment Inc. Tracking device with sound emitter for use in obtaining information for controlling game program execution
US8947347B2 (en) 2003-08-27 2015-02-03 Sony Computer Entertainment Inc. Controlling actions in a video game unit
US7783061B2 (en) 2003-08-27 2010-08-24 Sony Computer Entertainment Inc. Methods and apparatus for the targeted sound detection
US8073157B2 (en) 2003-08-27 2011-12-06 Sony Computer Entertainment Inc. Methods and apparatus for targeted sound detection and characterization
US8139793B2 (en) 2003-08-27 2012-03-20 Sony Computer Entertainment Inc. Methods and apparatus for capturing audio signals based on a visual image
US9094710B2 (en) 2004-09-27 2015-07-28 The Nielsen Company (Us), Llc Methods and apparatus for using location information to manage spillover in an audience monitoring system
US9794619B2 (en) 2004-09-27 2017-10-17 The Nielsen Company (Us), Llc Methods and apparatus for using location information to manage spillover in an audience monitoring system
US7809145B2 (en) 2006-05-04 2010-10-05 Sony Computer Entertainment Inc. Ultra small microphone array
WO2009021124A3 (en) * 2007-08-07 2009-07-02 Dna Digital Media Group System and method for a motion sensing amusement device
WO2009021124A2 (en) * 2007-08-07 2009-02-12 Dna Digital Media Group System and method for a motion sensing amusement device
US9250316B2 (en) 2010-03-09 2016-02-02 The Nielsen Company (Us), Llc Methods, systems, and apparatus to synchronize actions of audio source monitors
US9217789B2 (en) 2010-03-09 2015-12-22 The Nielsen Company (Us), Llc Methods, systems, and apparatus to calculate distance from audio sources
US9258607B2 (en) 2010-12-14 2016-02-09 The Nielsen Company (Us), Llc Methods and apparatus to determine locations of audience members
US9220980B2 (en) 2011-12-19 2015-12-29 Empire Technology Development Llc Pause and resume schemes for gesture-based game
US9563265B2 (en) 2012-01-12 2017-02-07 Qualcomm Incorporated Augmented reality with sound and geometric analysis
US9264748B2 (en) 2013-03-01 2016-02-16 The Nielsen Company (Us), Llc Methods and systems for reducing spillover by measuring a crest factor
US9332306B2 (en) 2013-03-08 2016-05-03 The Nielsen Company (Us), Llc Methods and systems for reducing spillover by detecting signal distortion
US9118960B2 (en) 2013-03-08 2015-08-25 The Nielsen Company (Us), Llc Methods and systems for reducing spillover by detecting signal distortion
US9191704B2 (en) 2013-03-14 2015-11-17 The Nielsen Company (Us), Llc Methods and systems for reducing crediting errors due to spillover using audio codes and/or signatures
US9380339B2 (en) 2013-03-14 2016-06-28 The Nielsen Company (Us), Llc Methods and systems for reducing crediting errors due to spillover using audio codes and/or signatures
US9912990B2 (en) 2013-03-15 2018-03-06 The Nielsen Company (Us), Llc Methods and apparatus to detect spillover in an audience monitoring system
US9197930B2 (en) 2013-03-15 2015-11-24 The Nielsen Company (Us), Llc Methods and apparatus to detect spillover in an audience monitoring system
US10219034B2 (en) 2013-03-15 2019-02-26 The Nielsen Company (Us), Llc Methods and apparatus to detect spillover in an audience monitoring system
US9503783B2 (en) 2013-03-15 2016-11-22 The Nielsen Company (Us), Llc Methods and apparatus to detect spillover in an audience monitoring system
US10057639B2 (en) 2013-03-15 2018-08-21 The Nielsen Company (Us), Llc Methods and apparatus to detect spillover in an audience monitoring system
WO2014143940A1 (en) * 2013-03-15 2014-09-18 The Nielsen Company (Us), Llc Methods and apparatus to detect spillover in an audience monitoring system
US9219928B2 (en) 2013-06-25 2015-12-22 The Nielsen Company (Us), Llc Methods and apparatus to characterize households with media meter data
US11711576B2 (en) 2013-12-31 2023-07-25 The Nielsen Company (Us), Llc Methods and apparatus to count people in an audience
US9918126B2 (en) 2013-12-31 2018-03-13 The Nielsen Company (Us), Llc Methods and apparatus to count people in an audience
US10560741B2 (en) 2013-12-31 2020-02-11 The Nielsen Company (Us), Llc Methods and apparatus to count people in an audience
US9426525B2 (en) 2013-12-31 2016-08-23 The Nielsen Company (Us), Llc. Methods and apparatus to count people in an audience
US11197060B2 (en) 2013-12-31 2021-12-07 The Nielsen Company (Us), Llc Methods and apparatus to count people in an audience
US9680583B2 (en) 2015-03-30 2017-06-13 The Nielsen Company (Us), Llc Methods and apparatus to report reference media data to multiple data collection facilities
US9924224B2 (en) 2015-04-03 2018-03-20 The Nielsen Company (Us), Llc Methods and apparatus to determine a state of a media presentation device
US10735809B2 (en) 2015-04-03 2020-08-04 The Nielsen Company (Us), Llc Methods and apparatus to determine a state of a media presentation device
US11363335B2 (en) 2015-04-03 2022-06-14 The Nielsen Company (Us), Llc Methods and apparatus to determine a state of a media presentation device
US11678013B2 (en) 2015-04-03 2023-06-13 The Nielsen Company (Us), Llc Methods and apparatus to determine a state of a media presentation device
US10694234B2 (en) 2015-07-15 2020-06-23 The Nielsen Company (Us), Llc Methods and apparatus to detect spillover
US11184656B2 (en) 2015-07-15 2021-11-23 The Nielsen Company (Us), Llc Methods and apparatus to detect spillover
US10264301B2 (en) 2015-07-15 2019-04-16 The Nielsen Company (Us), Llc Methods and apparatus to detect spillover
US9848222B2 (en) 2015-07-15 2017-12-19 The Nielsen Company (Us), Llc Methods and apparatus to detect spillover
US11716495B2 (en) 2015-07-15 2023-08-01 The Nielsen Company (Us), Llc Methods and apparatus to detect spillover

Also Published As

Publication number Publication date
JP5339900B2 (en) 2013-11-13
CN101132839B (en) 2011-09-07
KR20080009153A (en) 2008-01-24
KR100985694B1 (en) 2010-10-05
TWI308080B (en) 2009-04-01
CN101132839A (en) 2008-02-27
EP1877149A1 (en) 2008-01-16
TW200708328A (en) 2007-03-01
JP2008539874A (en) 2008-11-20

Similar Documents

Publication Publication Date Title
US8723984B2 (en) Selective sound source listening in conjunction with computer interactive processing
EP1877149A1 (en) Selective sound source listening in conjunction with computer interactive processing
US8947347B2 (en) Controlling actions in a video game unit
EP2352149B1 (en) Selective sound source listening in conjunction with computer interactive processing
US10911882B2 (en) Methods and systems for generating spatialized audio
KR101576294B1 (en) Apparatus and method to perform processing a sound in a virtual reality system
JP4921550B2 (en) How to give emotional features to computer-generated avatars during gameplay
US7113610B1 (en) Virtual sound source positioning
WO2009104564A1 (en) Conversation server in virtual space, method for conversation and computer program
US20060015560A1 (en) Multi-sensory emoticons in a communication system
US20220023756A1 (en) Method for game service and computing device for executing the method
EP1499096A1 (en) Network game method, network game terminal, and server
JP2012050791A (en) Character display device, character display method, and program
CN113856199A (en) Game data processing method and device and game control system
US20230218998A1 (en) 3D Spatialisation of Voice Chat
JP2022125665A (en) Audio reproduction program and audio reproduction device
CN115497491A (en) Audio cancellation system and method
CN115487491A (en) Audio cancellation system and method
JP2006005618A (en) Sound source environment restoration system, automatic sound source environment restoring device, and method and program for sound source environment restoration

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 200680006438.4

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application
ENP Entry into the national phase

Ref document number: 2008510106

Country of ref document: JP

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 2006758867

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

NENP Non-entry into the national phase

Ref country code: RU

WWE Wipo information: entry into national phase

Ref document number: 1020077028369

Country of ref document: KR