WO2014144579A1 - System and method for updating an adaptive speech recognition model - Google Patents
- Publication number
- WO2014144579A1 (PCT/US2014/029050)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- implementations
- user
- speech recognition
- recognition model
- sound
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Definitions
- the disclosed implementations relate generally to digital assistants. More specifically, they relate to a method and system for obtaining training data to update an adaptive speech recognition model for use when interacting with a digital assistant.
- voice-based digital assistants such as Apple's SIRI
- voice-based digital assistant systems utilize either speaker-independent models or speaker-dependent models in order to generate speech-to-text (STT) input to the digital assistant.
- STT speech-to-text
- the speaker-dependent model increases accuracy in generating the STT input, and therefore, enables the digital assistant to provide better results to the user.
- speaker-dependent models require significant training data in order to function with increased accuracy. Reciting many lines of predefined text in order to train a speaker-dependent model has several drawbacks. Many users would prefer not to expend the time and effort needed to provide training data for a model. In addition, a user's speech when reading is markedly different from the user's speech in ordinary conversation; therefore, a speech model trained with data obtained from a user reading is less accurate than one trained with data obtained from the user's ordinary conversation. Finally, a user's speech changes with time and environment.
- the implementations described below provide systems and methods for obtaining training data to update an adaptive speech recognition model for use when interacting with a digital assistant.
- the systems and methods obtain training data to update a speaker-dependent speech recognition model using a user's ordinary conversations.
- Interactions with a voice-based digital assistant or other speech-based services, such as a speech-to-text transcription service
- a speaker-dependent speech recognition model e.g., an adaptive speech recognition model
- the accuracy of speaker-dependent speech recognition models depends on the volume and quality of training data.
- an adaptive speech recognition model is updated by deriving training data from tapping an outbound audio channel of a mobile communication device to obtain a call audio signal.
- Some implementations provide a method for obtaining training data to update an adaptive speech recognition model. The method is performed at a first mobile communication device.
- the method includes determining that a first user of a first mobile communication device is engaged in a call over a communications network and providing an adaptive speech recognition model.
- the method further includes tapping into an outbound audio channel of the first mobile communication device to obtain a call audio signal corresponding to audio input from one or more microphones of the first mobile communication device, and updating the adaptive speech recognition model with training data derived from the call audio signal.
- Some implementations provide a method for obtaining training data to update an adaptive speech recognition model.
- the method is performed at a server system distinct from a first mobile communication device.
- the method includes determining that a first user of the first mobile communication device is engaged in a call over a communications network and providing an adaptive speech recognition model.
- the method further includes tapping into an outbound audio channel of the first mobile communication device to obtain a call audio signal corresponding to audio input from one or more microphones of the first mobile communication device, and updating the adaptive speech recognition model with training data derived from the call audio signal.
- the first mobile communication device is a mobile telephone. In some implementations, the first mobile communication device is a laptop computer. In some implementations, the first mobile communication device is a tablet computer. In some implementations, the call is a mobile telephone call. In some implementations, the call is a multimedia communication. In some implementations, the call is a VoIP communication. In some implementations, the call comprises an interaction between the first user of the first mobile communication device and a second mobile device. In some implementations, the call comprises a conversation between the first user of the first mobile communication device and a user of a second device.
- providing an adaptive speech recognition model comprises providing a speaker-independent model. In some implementations, providing an adaptive speech recognition model comprises providing a speaker-dependent model associated with a user of the first mobile communication device.
- the method further includes, prior to tapping into the outbound audio channel, converting the call audio signal from an analog audio signal to a digital audio signal. In some implementations, the method further comprises, prior to tapping into the outbound audio channel, determining that the first mobile communication device is in an adaptive-speaker-training mode.
- tapping into the outbound audio channel comprises tapping into the baseband unit.
- tapping includes tapping into the digital signal processor (DSP).
- tapping includes tapping into the application processor.
- the method further includes, prior to updating the adaptive speech recognition model, determining that the call has ended.
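The flow described above (tap the outbound channel while a call is in progress and the device is in adaptive-speaker-training mode, then update the model only after the call has ended) can be sketched as follows. This is a minimal illustration, not the patented implementation: the class `CallTapSession`, its method names, and the `update_model` hook are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CallTapSession:
    """Hypothetical helper that buffers outbound call audio until the call ends."""

    update_model: Callable[[List[bytes]], None]  # assumed model-update hook
    training_mode: bool = True                   # adaptive-speaker-training mode flag
    in_call: bool = False
    frames: List[bytes] = field(default_factory=list)

    def call_started(self) -> None:
        # A new call begins: start from an empty buffer.
        self.in_call = True
        self.frames.clear()

    def on_outbound_frame(self, frame: bytes) -> None:
        # Copy microphone-side audio only during a call and only in training mode.
        if self.in_call and self.training_mode:
            self.frames.append(frame)

    def call_ended(self) -> None:
        # Defer the model update until the call has ended.
        self.in_call = False
        if self.frames:
            self.update_model(list(self.frames))
            self.frames.clear()
```

Deferring the update to `call_ended` keeps any expensive model work off the live audio path; whether a real device would do so this way is an assumption of the sketch.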
- training data comprises one or more speaker-dependent sound units.
- updating the adaptive speech recognition model comprises replacing the adaptive speech recognition model with a new adaptive speech recognition model generated from the training data.
- updating the adaptive speech recognition model comprises generating a speaker-dependent model from the data, comparing the speaker-dependent model to the adaptive speech recognition model, and updating the adaptive speech recognition model based on the comparison.
- the method further includes modifying the adaptive speech recognition model with training data derived from audio user interaction with a digital assistant.
- the method further comprises, after said updating, receiving invocation of the digital assistant, receiving speech input from a second user, generating speech-to-text output corresponding to the speech input, and providing the speech-to-text output to the digital assistant.
- generating speech-to-text output corresponding to the speech input comprises comparing the speech input with the adaptive speech recognition model; in accordance with a determination that the second user is the same as the first user, performing automatic speech recognition using the adaptive speech recognition model to generate speech-to-text output; and in accordance with a determination that the second user is distinct from the first user, performing automatic speech recognition using a speaker-independent model to generate speech-to-text output.
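The model selection described above can be sketched as a small routing function. The names `speech_to_text`, `is_first_user`, and the `Recognizer` type are hypothetical; how speaker identity is actually determined is not specified here and is passed in as an assumed callable.

```python
from typing import Callable

# Hypothetical recognizer type: raw audio in, transcript out.
Recognizer = Callable[[bytes], str]


def speech_to_text(
    speech: bytes,
    is_first_user: Callable[[bytes], bool],
    adaptive_model: Recognizer,
    speaker_independent_model: Recognizer,
) -> str:
    """Use the adaptive model only when the speaker matches the first user."""
    if is_first_user(speech):
        return adaptive_model(speech)
    # Any other speaker falls back to the speaker-independent model.
    return speaker_independent_model(speech)
```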
- the method further includes storing the adaptive speech recognition model in memory.
- the memory is a component of the first mobile communication device. In some implementations, the memory is a component of a server distinct from the first mobile communication device.
- tapping into the outbound audio channel comprises utilizing a voice activity detector to determine when there is active speech on the outbound audio channel; in accordance with a determination that there is active speech on the outbound audio channel, tapping into the outbound audio channel; and in accordance with a
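One way to picture voice-activity gating is with a simple energy threshold. The sketch below is only a stand-in: the text does not say which voice activity detector is used, the threshold value is arbitrary, and the function names are hypothetical. It also assumes that frames without active speech are simply skipped.

```python
from typing import Iterable, List, Sequence


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame of samples (0.0 for an empty frame)."""
    return sum(s * s for s in frame) / len(frame) if frame else 0.0


def tap_active_speech(
    frames: Iterable[Sequence[float]], energy_threshold: float = 0.01
) -> List[Sequence[float]]:
    """Keep only frames that the energy-based detector marks as active speech."""
    return [f for f in frames if frame_energy(f) >= energy_threshold]
```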
- updating the adaptive speech recognition model comprises comparing the call audio signal with the adaptive speech recognition model, generating a confidence score based on the comparison; in accordance with a determination that the confidence score is at or above a predetermined threshold, updating the adaptive speech recognition model with the training data derived from the call audio signal; and in accordance with a determination that the confidence score is below the predetermined threshold, forgoing to update the adaptive speech recognition model.
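The confidence-gated update above can be sketched as follows. The scoring and update functions are passed in as assumed callables, the model is represented as a plain list of floats purely for illustration, and the default threshold of 0.8 is an arbitrary placeholder.

```python
from typing import Callable, List

Features = List[float]  # hypothetical stand-in for model parameters or audio features


def maybe_update(
    model: Features,
    call_features: Features,
    score: Callable[[Features, Features], float],
    update: Callable[[Features, Features], Features],
    threshold: float = 0.8,
) -> Features:
    """Update the model only if the confidence score is at or above the threshold."""
    confidence = score(model, call_features)
    if confidence >= threshold:
        return update(model, call_features)
    # Below the threshold: forgo the update and keep the existing model.
    return model
```

Gating on a confidence score is a guard against adapting the model to audio that does not appear to come from the expected speaker.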
- the training data derived from the call audio signal includes one or more speaker-dependent sound units
- updating the adaptive speech recognition model comprises comparing at least a subset of the one or more speaker-dependent sound units to the adaptive speech recognition model, generating one or more adaptive speech vectors based on the comparison, and modifying the adaptive speech recognition model based on at least a subset of the one or more adaptive speech vectors.
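The text does not define how adaptive speech vectors are computed. As a rough sketch under stated assumptions, suppose the model stores one mean feature vector per sound unit, an adaptive vector is the difference between an observed unit and the stored mean, and the model is nudged toward the observation by a small rate. All names and the update rule below are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def adaptive_vectors(
    model: Dict[str, Vector], units: List[Tuple[str, Vector]]
) -> List[Tuple[str, Vector]]:
    """Compare observed sound units with the model; return per-unit differences."""
    deltas = []
    for label, observed in units:
        if label in model:  # units unknown to the model are skipped
            deltas.append((label, [o - m for o, m in zip(observed, model[label])]))
    return deltas


def apply_vectors(
    model: Dict[str, Vector], vectors: List[Tuple[str, Vector]], rate: float = 0.1
) -> Dict[str, Vector]:
    """Return a copy of the model shifted toward the observations by `rate`."""
    updated = {label: list(mean) for label, mean in model.items()}
    for label, delta in vectors:
        updated[label] = [m + rate * d for m, d in zip(updated[label], delta)]
    return updated
```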
- a system includes one or more processors, memory, and one or more programs stored in the memory.
- the one or more programs comprising instructions to determine that a first user of a first mobile communication device is engaged in a call over a communications network and provide an adaptive speech recognition model.
- the one or more programs further comprising instructions to tap into an outbound audio channel of the first mobile communication device to obtain a call audio signal corresponding to audio input from one or more microphones of the first mobile communication device and update the adaptive speech recognition model with training data derived from the call audio signal.
- the system comprises the first mobile communication device having the one or more microphones. In accordance with some implementations, the system comprises a server system distinct from the first mobile communication device.
- a computer-readable storage medium e.g., a non-transitory computer readable storage medium
- the computer-readable storage medium storing one or more programs for execution by one or more processors of an electronic device, the one or more programs including instructions for performing any of the methods described herein.
- an electronic device e.g., a portable electronic device
- a processing unit configured to perform any of the methods described herein.
- an information processing apparatus for use in an electronic device, the information processing apparatus comprising means for performing any of the methods described herein.
- Figure 1 is a block diagram illustrating an environment in which a digital assistant operates in accordance with some implementations.
- Figure 2 is a block diagram illustrating a digital assistant client system in accordance with some implementations.
- Figure 3A is a block diagram illustrating a standalone digital assistant system or a digital assistant server system in accordance with some implementations.
- Figure 3B is a block diagram illustrating functions of the digital assistant shown in Figure 3A in accordance with some implementations.
- Figure 3C is a network diagram illustrating a portion of an ontology in accordance with some implementations.
- Figure 4 is a block diagram illustrating components of an audio system, in accordance with some implementations.
- Figures 5A-5B are flow charts illustrating methods for updating an adaptive speech recognition model, in accordance with some implementations.
- Figure 6 is a functional block diagram of an electronic device in accordance with some embodiments.
- Figure 1 is a block diagram of an operating environment 100 of a digital assistant according to some implementations.
- the terms "digital assistant," "virtual assistant," "intelligent automated assistant," "voice-based digital assistant," and "automated digital assistant" refer to any information processing system that interprets natural language input in spoken and/or textual form to deduce user intent (e.g., identify a task type that corresponds to the natural language input), and performs actions based on the deduced user intent (e.g., perform a task corresponding to the identified task type).
- the system can perform one or more of the following: identifying a task flow with steps and parameters designed to accomplish the deduced user intent (e.g., identifying a task type), inputting specific requirements from the deduced user intent into the task flow, executing the task flow by invoking programs, methods, services, APIs, or the like (e.g., sending a request to a service provider); and generating output responses to the user in an audible (e.g., speech) and/or visual form.
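The four actions listed above (identify a task flow, fill in its parameters, execute it, generate a response) can be pictured as a small dispatch routine. This is only an illustrative sketch; `TaskFlow`, `handle_intent`, and the example flow are hypothetical names, not part of the described system.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class TaskFlow:
    """Hypothetical task flow: a name plus a callable that executes its steps."""

    name: str
    run: Callable[[Dict[str, str]], str]


def handle_intent(
    intent: str, params: Dict[str, str], flows: Dict[str, TaskFlow]
) -> str:
    flow = flows[intent]       # 1. identify the task flow for the deduced intent
    result = flow.run(params)  # 2-3. pass in the requirements and execute the flow
    return f"{flow.name}: {result}"  # 4. generate a textual response for the user
```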
- a digital assistant system is capable of accepting a user request at least partially in the form of a natural language command, request, statement, narrative, and/or inquiry.
- the user request seeks either an informational answer or performance of a task by the digital assistant system.
- a satisfactory response to the user request is generally either provision of the requested informational answer, performance of the requested task, or a combination of the two.
- a user may ask the digital assistant system a question, such as "Where am I right now?" Based on the user's current location, the digital assistant may answer, "You are in Central Park near the west gate." The user may also request the performance of a task, for example, by stating "Please invite my friends to my girlfriend's birthday party next week." In response, the digital assistant may acknowledge the request by generating a voice output, "Yes, right away," and then send a suitable calendar invite from the user's email address to each of the user's friends listed in the user's electronic address book or contact list.
- the digital assistant can also provide responses in other visual or audio forms (e.g., as text, alerts, music, videos, animations, etc.).
- a digital assistant system is implemented according to a client-server model.
- the digital assistant system includes a client-side portion (e.g., 102a and 102b) (hereafter "digital assistant (DA) client 102") executed on a user device (e.g., 104a and 104b), and a server-side portion 106 (hereafter "digital assistant (DA) server 106") executed on a server system 108.
- the DA client 102 communicates with the DA server 106 through one or more networks 110.
- the DA client 102 provides client-side functionalities such as user-facing input and output processing and communications with the DA server 106.
- the DA server 106 provides server-side functionalities for any number of DA clients 102 each residing on a respective user device 104 (also called a client device or electronic device).
- the DA server 106 includes a client-facing I/O interface 112, one or more processing modules 114, data and models 116, an I/O interface to external services 118, a photo and tag database 130, and a photo-tag module 132.
- the client-facing I/O interface facilitates the client-facing input and output processing for the digital assistant server 106.
- the one or more processing modules 114 utilize the data and models 116 to determine the user's intent based on natural language input and perform task execution based on the deduced user intent.
- Photo and tag database 130 stores fingerprints of digital photographs, and, optionally digital photographs themselves, as well as tags associated with the digital photographs.
- Photo-tag module 132 creates tags, stores tags in association with photographs and/or fingerprints, automatically tags photographs, and links tags to locations within photographs.
- the DA server 106 communicates with external services 120 (e.g., navigation service(s) 122-1, messaging service(s) 122-2, information service(s) 122-3, calendar service 122-4, telephony service 122-5, photo service(s) 122-6, etc.) through the network(s) 110 for task completion or information acquisition.
- the I/O interface to the external services 118 facilitates such communications.
- Examples of the user device 104 include, but are not limited to, a handheld computer, a personal digital assistant (PDA), a tablet computer, a laptop computer, a desktop computer, a cellular telephone, a smartphone, an enhanced general packet radio service (EGPRS) mobile phone, a media player, a navigation device, a game console, a television, a remote control, or a combination of any two or more of these data processing devices or any other suitable data processing devices. More details on the user device 104 are provided in reference to an exemplary user device 104 shown in Figure 2.
- PDA personal digital assistant
- EGPRS enhanced general packet radio service
- Examples of the communication network(s) 110 include local area networks
- the communication network(s) 110 may be implemented using any known network protocol, including various wired or wireless protocols, such as Ethernet, Universal Serial Bus (USB), FIREWIRE, Global System for Mobile Communications (GSM), Enhanced Data GSM Environment (EDGE), code division multiple access (CDMA), time division multiple access (TDMA), Bluetooth, Wi-Fi, voice over Internet Protocol (VoIP), Wi-MAX, or any other suitable communication protocol.
- USB Universal Serial Bus
- GSM Global System for Mobile Communications
- EDGE Enhanced Data GSM Environment
- CDMA code division multiple access
- TDMA time division multiple access
- VoIP voice over Internet Protocol
- the server system 108 can be implemented on at least one data processing apparatus and/or a distributed network of computers.
- the server system 108 also employs various virtual devices and/or services of third party service providers (e.g., third-party cloud service providers) to provide the underlying computing resources and/or infrastructure resources of the server system 108.
- a digital assistant system refers only to the server-side portion (e.g., the DA server 106).
- the functions of a digital assistant can be implemented as a standalone application installed on a user device.
- the divisions of functionalities between the client and server portions of the digital assistant can vary in different implementations.
- the DA client 102 is a thin client that provides only user-facing input and output processing functions, and delegates all other functionalities of the digital assistant to the DA server 106. In some other implementations, the DA client 102 is configured to perform or assist one or more functions of the DA server 106.
- Figure 2 is a block diagram of a user device 104 in accordance with some implementations.
- the user device 104 includes a memory interface 202, one or more processors 204, and a peripherals interface 206.
- the various components in the user device 104 are coupled by one or more communication buses or signal lines.
- the user device 104 includes various sensors, subsystems, and peripheral devices that are coupled to the peripherals interface 206. The sensors, subsystems, and peripheral devices gather information and/or facilitate various functionalities of the user device 104.
- a motion sensor 210 (e.g., an accelerometer), a light sensor 212, a GPS receiver 213, a temperature sensor, and a proximity sensor 214 are coupled to the peripherals interface 206 to facilitate orientation, light, and proximity sensing functions.
- other sensors 216 such as a biometric sensor, barometer, and the like, are connected to the peripherals interface 206, to facilitate related functionalities.
- the user device 104 includes a camera subsystem 220 coupled to the peripherals interface 206.
- an optical sensor 222 of the camera subsystem 220 facilitates camera functions, such as taking photographs and recording video clips.
- the user device 104 includes one or more wired and/or wireless communication subsystems 224 that provide communication functions.
- the communication subsystems 224 typically include various communication ports, radio frequency receivers and transmitters, and/or optical (e.g., infrared) receivers and transmitters.
- the user device 104 includes an audio subsystem 226 coupled to one or more speakers 228 and one or more microphones 230 to facilitate voice-enabled functions, such as voice recognition, voice replication, digital recording, and telephony functions.
- the audio subsystem 226 is coupled to a voice trigger system 400.
- the voice trigger system 400 and/or the audio subsystem 226 includes low-power audio circuitry and/or programs (i.e., including hardware and/or software) for receiving and/or analyzing sound inputs, including, for example, one or more analog-to-digital converters, digital signal processors (DSPs), sound detectors, memory buffers, codecs, and the like.
- DSPs digital signal processors
- the low-power audio circuitry (alone or in addition to other components of the user device 104) provides voice (or sound) trigger functionality for one or more aspects of the user device 104, such as a voice-based digital assistant or other speech-based service.
- the low-power audio circuitry provides voice trigger functionality even when other components of the user device 104 are shut down and/or in a standby mode, such as the processor(s) 204, I/O subsystem 240, memory 250, and the like.
- the voice trigger system 400 and the audio subsystem 226 are described in further detail with respect to Figure 4.
- an I/O subsystem 240 is also coupled to the peripherals interface 206.
- the user device 104 includes a touch screen 246, and the I/O subsystem 240 includes a touch screen controller 242 coupled to the touch screen 246.
- the touch screen 246 and the touch screen controller 242 are typically configured to, for example, detect contact and movement or break thereof using any of a plurality of touch sensitivity technologies, such as capacitive, resistive, infrared, surface acoustic wave technologies, proximity sensor arrays, and the like.
- the user device 104 includes a display that does not include a touch-sensitive surface.
- the user device 104 includes a separate touch-sensitive surface.
- the user device 104 includes other input controller(s) 244.
- the other input controller(s) 244 are typically coupled to other input/control devices 248, such as one or more buttons, rocker switches, thumb-wheel, infrared port, USB port, and/or a pointer device such as a stylus.
- the memory interface 202 is coupled to memory 250. In some embodiments, memory 250 includes a non-transitory computer readable medium, such as high-speed random access memory and/or non-volatile memory (e.g., one or more magnetic disk storage devices, one or more flash memory devices, one or more optical storage devices, and/or other non-volatile solid-state memory devices).
- memory 250 stores an operating system 252, a communications module 254, a graphical user interface module 256, a sensor processing module 258, a phone module 260, and applications 262, or a subset or superset thereof.
- the operating system 252 includes instructions for handling basic system services and for performing hardware dependent tasks.
- the communications module 254 facilitates communicating with one or more additional devices, one or more computers and/or one or more servers.
- the graphical user interface module 256 facilitates graphic user interface processing.
- the sensor processing module 258 facilitates sensor-related processing and functions (e.g., processing voice input received with the one or more microphones 230).
- the phone module 260 facilitates phone-related processes and functions.
- the application module 262 facilitates various functionalities of user applications, such as electronic-messaging, web browsing, media processing, navigation, imaging and/or other processes and functions.
- the user device 104 stores in memory 250 one or more software applications 270-1 and 270-2 each associated with at least one of the external service providers.
- memory 250 also stores client-side digital assistant instructions (e.g., in a digital assistant client module 264) and various user data 266 (e.g., user-specific vocabulary data, preference data, and/or other data such as the user's electronic address book or contact list, to-do lists, shopping lists, etc.) to provide the client-side functionalities of the digital assistant.
- the digital assistant client module 264 is capable of accepting voice input, text input, touch input, and/or gestural input through various user interfaces (e.g., the I/O subsystem 240) of the user device 104.
- the digital assistant client module 264 is also capable of providing output in audio, visual, and/or tactile forms.
- output can be provided as voice, sound, alerts, text messages, menus, graphics, videos, animations, vibrations, and/or combinations of two or more of the above.
- the digital assistant client module 264 communicates with the digital assistant server (e.g., the digital assistant server 106, Figure 1) using the communication subsystems 224.
- the digital assistant client module 264 utilizes various sensors, subsystems and peripheral devices to gather additional information from the surrounding environment of the user device 104 to establish a context associated with a user input.
- the digital assistant client module 264 provides the context information or a subset thereof with the user input to the digital assistant server (e.g., the digital assistant server 106, Figure 1) to help deduce the user's intent.
- the context information that can accompany the user input includes sensor information, e.g., lighting, ambient noise, ambient temperature, images or videos of the surrounding environment, etc.
- the context information also includes the physical state of the device, e.g., device orientation, device location, device temperature, power level, speed, acceleration, motion patterns, cellular signals strength, etc.
- information related to the software state of the user device 104 (e.g., running processes, installed programs, past and present network activities, background services, error logs, resources usage, etc.) is also provided to the digital assistant server (e.g., the digital assistant server 106, Figure 1) as context information associated with a user input.
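The three kinds of context information described above (sensor information, physical device state, and software state) could travel with the user input as a single structured payload. The sketch below shows one possible shape; the class name, field names, and dictionary layout are assumptions made for illustration and do not reflect an actual wire format.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class RequestContext:
    """Hypothetical container for context that accompanies a user input."""

    sensor: Dict[str, Any] = field(default_factory=dict)          # e.g., ambient noise
    device_state: Dict[str, Any] = field(default_factory=dict)    # e.g., orientation
    software_state: Dict[str, Any] = field(default_factory=dict)  # e.g., running processes


def build_request(user_input: str, context: RequestContext) -> Dict[str, Any]:
    """Bundle the user input with its context for delivery to the server."""
    return {"input": user_input, "context": asdict(context)}
```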
- the DA client module 264 selectively provides information (e.g., at least a portion of the user data 266) stored on the user device 104 in response to requests from the digital assistant server.
- the digital assistant client module 264 also elicits additional input from the user via a natural language dialogue or other user interfaces upon request by the digital assistant server 106 (Figure 1).
- the digital assistant client module 264 passes the additional input to the digital assistant server 106 to help the digital assistant server 106 in intent deduction and/or fulfillment of the user's intent expressed in the user request.
- memory 250 may include additional instructions or fewer instructions. Furthermore, various functions of the user device 104 may be
- Figure 3A is a block diagram of an exemplary digital assistant system 300
- the digital assistant system 300 is implemented on a standalone computer system. In some implementations, the digital assistant system 300 is distributed across multiple computers. In some implementations, some of the modules and functions of the digital assistant are divided into a server portion and a client portion, where the client portion resides on a user device (e.g., the user device 104) and communicates with the server portion (e.g., the server system 108) through one or more networks, e.g., as shown in Figure 1. In some implementations, the digital assistant system 300 is an embodiment of the server system 108 (and/or the digital assistant server 106) shown in Figure 1.
- the digital assistant system 300 is implemented in a user device (e.g., the user device 104, Figure 1), thereby eliminating the need for a client-server system. It should be noted that the digital assistant system 300 is only one example of a digital assistant system, and that the digital assistant system 300 may have more or fewer components than shown, may combine two or more components, or may have a different configuration or arrangement of the components.
- the various components shown in Figure 3A may be implemented in hardware, software, firmware, including one or more signal processing and/or application specific integrated circuits, or a combination thereof.
- the digital assistant system 300 includes memory 302, one or more processors 304, an input/output (I/O) interface 306, and a network communications interface 308. These components communicate with one another over one or more communication buses or signal lines 310.
- memory 302 includes a non-transitory computer readable medium, such as high-speed random access memory and/or a non-volatile computer readable storage medium (e.g., one or more magnetic disk storage devices, one or more flash memory devices, one or more optical storage devices, and/or other non-volatile solid-state memory devices).
- the I/O interface 306 couples input/output devices 316 of the digital assistant system 300, such as displays, keyboards, touch screens, and microphones, to the user interface module 322.
- when the digital assistant is implemented on a standalone user device, the digital assistant system 300 includes any of the components and I/O and communication interfaces described with respect to the user device 104 in Figure 2 (e.g., one or more microphones 230).
- the digital assistant system 300 represents the server portion of a digital assistant implementation, and interacts with the user through a client-side portion residing on a user device (e.g., the user device 104 shown in Figure 2).
- the network communications interface 308 includes wired communication port(s) 312 and/or wireless transmission and reception circuitry 314.
- the wired communication port(s) receive and send communication signals via one or more wired interfaces, e.g., Ethernet, Universal Serial Bus (USB), FIREWIRE, etc.
- the wireless circuitry 314 typically receives and sends RF signals and/or optical signals from/to communications networks and other communications devices.
- the wireless communications may use any of a plurality of communications standards, protocols and technologies, such as GSM, EDGE, CDMA, TDMA, Bluetooth, Wi-Fi, VoIP, Wi-MAX, or any other suitable communication protocol.
- the network communications interface 308 enables communication between the digital assistant system 300 and networks, such as the Internet, an intranet and/or a wireless network, such as a cellular telephone network, a wireless local area network (LAN) and/or a metropolitan area network (MAN), and other devices.
- the non-transitory computer readable storage medium of memory 302 stores programs, modules, instructions, and data structures including all or a subset of: an operating system 318, a communications module 320, a user interface module 322, one or more applications 324, and a digital assistant module 326.
- the one or more processors 304 execute these programs, modules, and instructions, and read/write from/to the data structures.
- the operating system 318 (e.g., Darwin, RTXC, LINUX, UNIX, OS X, iOS, WINDOWS, or an embedded operating system such as VxWorks) includes various software components and/or drivers for controlling and managing general system tasks (e.g., memory management, storage device control, power management, etc.) and facilitates
- the communications module 320 facilitates communications between the digital assistant system 300 and other devices over the network communications interface 308.
- the communication module 320 may communicate with the
- the communications module 320 also includes various software components for handling data received by the wireless circuitry 314 and/or wired communications port 312.
- the user interface module 322 receives commands and/or inputs from a user via the I/O interface 306 (e.g., from a keyboard, touch screen, and/or microphone), and provides user interface objects on a display.
- the applications 324 include programs and/or modules that are configured to be executed by the one or more processors 304.
- the applications 324 may include user applications, such as games, a calendar application, a navigation application, or an email application.
- the applications 324 may include resource management applications, diagnostic applications, or scheduling
- Memory 302 also stores the digital assistant module (or the server portion of a digital assistant) 326.
- the digital assistant module 326 includes the following sub-modules, or a subset or superset thereof: an input/output processing module 328, a speech-to-text (STT) processing module 330, a natural language processing module 332, a dialogue flow processing module 334, a task flow processing module 336, a service processing module 338, and a photo module 132.
- Each of these processing modules has access to one or more of the following data and models of the digital assistant 326, or a subset or superset thereof: ontology 360, vocabulary index 344, user data 348, categorization module 349, disambiguation module 350, task flow models 354, service models 356, photo tagging module 358, search module 360, and local tag/photo storage 362.
- using the processing modules (e.g., the input/output processing module 328, the STT processing module 330, the natural language processing module 332, the dialogue flow processing module 334, the task flow processing module 336, and/or the service processing module 338), data, and models implemented in the digital assistant module 326, the digital assistant system 300 performs at least some of the following: identifying a user's intent expressed in a natural language input received from the user;
- the digital assistant also takes appropriate actions when a satisfactory response was not or could not be provided to the user for various reasons.
- the digital assistant system 300 identifies, from a natural language input, a user's intent to tag a digital photograph, and processes the natural language input so as to tag the digital photograph with appropriate information. In some implementations, the digital assistant system 300 performs other tasks related to photographs as well, such as searching for digital photographs using natural language input, auto-tagging photographs, and the like.
- the I/O processing module 328 interacts with the user through the I/O devices 316 in Figure 3A or with a user device (e.g., a user device 104 in Figure 1) through the network communications interface 308 in Figure 3A to obtain user input (e.g., a speech input) and to provide responses to the user input.
- the I/O processing module 328 optionally obtains context information associated with the user input from the user device, along with or shortly after the receipt of the user input.
- the context information includes user-specific data, vocabulary, and/or preferences relevant to the user input.
- the context information also includes software and hardware states of the device (e.g., the user device 104 in Figure 1) at the time the user request is received, and/or information related to the surrounding environment of the user at the time that the user request was received.
- the I/O processing module 328 also sends follow-up questions to, and receives answers from, the user regarding the user request.
- the I/O processing module 328 forwards the speech input to the speech-to-text (STT) processing module 330 for speech-to-text conversions.
- the speech-to-text processing module 330 receives speech input (e.g., a user utterance captured in a voice recording) through the I/O processing module 328.
- the speech-to-text processing module 330 uses various acoustic and language models to recognize the speech input as a sequence of sound units (e.g., phonemes), and ultimately, a sequence of words or tokens written in one or more languages.
- the speech-to-text processing module 330 is implemented using any suitable speech recognition techniques (e.g., an adaptive speech recognition model), acoustic models, and language models, such as Hidden Markov Models, Dynamic Time Warping
- the speech-to-text processing can be performed at least partially by a third party service or on the user's device.
- once the speech-to-text processing module 330 obtains the result of the speech-to-text processing (e.g., a sequence of words or tokens), it passes the result to the natural language processing module 332 for intent deduction.
- the natural language processing module 332 (“natural language processor") of the digital assistant 326 takes the sequence of words or tokens ("token sequence") generated by the speech-to-text processing module 330, and attempts to associate the token sequence with one or more "actionable intents" recognized by the digital assistant.
- an "actionable intent" represents a task that can be performed by the digital assistant 326 and/or the digital assistant system 300 (Figure 3A), and has an associated task flow implemented in the task flow models 354.
- the associated task flow is a series of programmed actions and steps that the digital assistant system 300 takes in order to perform the task.
- the scope of a digital assistant system's capabilities is dependent on the number and variety of task flows that have been implemented and stored in the task flow models 354, or in other words, on the number and variety of "actionable intents" that the digital assistant system 300 recognizes.
- the effectiveness of the digital assistant system 300 is also dependent on the digital assistant system's ability to deduce the correct "actionable intent(s)" from the user request expressed in natural language.
- in addition to the sequence of words or tokens obtained from the speech-to-text processing module 330, the natural language processor 332 also receives context information associated with the user request (e.g., from the I/O processing module 328).
- the natural language processor 332 optionally uses the context information to clarify, supplement, and/or further define the information contained in the token sequence received from the speech-to-text processing module 330.
- the context information includes, for example, user preferences, hardware and/or software states of the user device, sensor information collected before, during, or shortly after the user request, prior interactions (e.g., dialogue) between the digital assistant and the user, and the like.
- the natural language processing is based on an ontology 360.
- the ontology 360 is a hierarchical structure containing a plurality of nodes, each node representing either an "actionable intent” or a “property” relevant to one or more of the “actionable intents” or other "properties.”
- an “actionable intent” represents a task that the digital assistant system 300 is capable of performing (e.g., a task that is "actionable” or can be acted on).
- a "property” represents a parameter associated with an actionable intent or a sub-aspect of another property.
- a linkage between an actionable intent node and a property node in the ontology 360 defines how a parameter represented by the property node pertains to the task represented by the actionable intent node.
- the ontology 360 is made up of actionable intent nodes and property nodes.
- each actionable intent node is linked to one or more property nodes either directly or through one or more intermediate property nodes.
- each property node is linked to one or more actionable intent nodes either directly or through one or more intermediate property nodes.
- the ontology 360 shown in Figure 3C includes a "restaurant reservation” node, which is an actionable intent node.
- Property nodes "restaurant,” “date/time” (for the reservation), and "party size” are each directly linked to the "restaurant reservation” node (i.e., the actionable intent node).
- property nodes “cuisine,” “price range,” “phone number,” and “location” are sub-nodes of the property node “restaurant,” and are each linked to the "restaurant reservation” node (i.e., the actionable intent node) through the intermediate property node "restaurant.”
- the ontology 360 shown in Figure 3C also includes a "set reminder” node, which is another actionable intent node.
- Property nodes “date/time” (for setting the reminder) and “subject” (for the reminder) are each linked to the "set reminder” node.
- the property node “date/time” is linked to both the "restaurant reservation” node and the "set reminder” node in the ontology 360.
- An actionable intent node, along with its linked concept nodes, may be described as a "domain."
- each domain is associated with a respective actionable intent, and refers to the group of nodes (and the relationships therebetween) associated with the particular actionable intent.
- the ontology 360 shown in Figure 3C includes an example of a restaurant reservation domain 362 and an example of a reminder domain 364 within the ontology 360.
- the restaurant reservation domain includes the actionable intent node "restaurant reservation,” property nodes
- the reminder domain 364 includes the actionable intent node “set reminder,” and property nodes “subject” and "date/time.”
- the ontology 360 is made up of many domains. Each domain may share one or more property nodes with one or more other domains.
- the "date/time" property node may be associated with many other domains (e.g., a scheduling domain, a travel reservation domain, a movie ticket domain, etc.), in addition to the restaurant reservation domain 362 and the reminder domain 364.
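The node-and-domain structure described above can be sketched as a small graph. This is only an illustrative sketch: the `Node` class and the `domain_of` helper are hypothetical names that do not appear in the source, and the node names are taken from the Figure 3C examples in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One ontology node; `kind` is either "intent" or "property" (hypothetical model)."""
    name: str
    kind: str
    children: list["Node"] = field(default_factory=list)


def domain_of(intent: Node) -> set[str]:
    """Collect the intent node plus every property node reachable from it."""
    names, stack = set(), [intent]
    while stack:
        node = stack.pop()
        if node.name not in names:
            names.add(node.name)
            stack.extend(node.children)
    return names


# "date/time" is a single node linked to both actionable intent nodes.
date_time = Node("date/time", "property")
restaurant = Node("restaurant", "property", [
    Node(n, "property") for n in ("cuisine", "price range", "phone number", "location")
])
restaurant_reservation = Node(
    "restaurant reservation", "intent",
    [restaurant, date_time, Node("party size", "property")],
)
set_reminder = Node("set reminder", "intent", [date_time, Node("subject", "property")])

# Property nodes shared by the two domains.
shared = domain_of(restaurant_reservation) & domain_of(set_reminder)
```

Here `shared` contains only the "date/time" node, mirroring the statement that a domain may share property nodes with other domains.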
- Figure 3C illustrates two exemplary domains within the ontology 360
- the ontology 360 may include other domains (or actionable intents), such as "initiate a phone call,” “find directions,” “schedule a meeting,” “send a message,” “provide an answer to a question,” “tag a photo,” and so on.
- a "send a message” domain is associated with a "send a message” actionable intent node, and may further include property nodes such as "recipient(s),” “message type,” and “message body.”
- the property node "recipient” may be further defined, for example, by the sub-property nodes such as "recipient name” and "message address.”
- the ontology 360 includes all the domains (and hence actionable intents) that the digital assistant is capable of understanding and acting upon.
- the ontology 360 may be modified, such as by adding or removing domains or nodes, or by modifying relationships between the nodes within the ontology 360.
- nodes associated with multiple related actionable intents may be clustered under a "super domain" in the ontology 360.
- a "travel" super-domain may include a cluster of property nodes and actionable intent nodes related to travels.
- the actionable intent nodes related to travels may include "airline reservation,” "hotel reservation,” “car rental,” “get directions,” “find points of interest,” and so on.
- the actionable intent nodes under the same super domain may have many property nodes in common.
- the actionable intent nodes for "airline reservation,” “hotel reservation,” “car rental,” “get directions,” and “find points of interest” may share one or more of the property nodes “start location,” “destination,” “departure date/time,” “arrival date/time,” and “party size.”
- each node in the ontology 360 is associated with a set of words and/or phrases that are relevant to the property or actionable intent represented by the node.
- the respective set of words and/or phrases associated with each node is the so-called "vocabulary" associated with the node.
- the respective set of words and/or phrases associated with each node can be stored in the vocabulary index 344 (Figure 3B) in association with the property or actionable intent represented by the node.
- the vocabulary associated with the node for the property of restaurant may include words such as "food,” “drinks,” “cuisine,” “hungry,” “eat,” “pizza,” “fast food,” “meal,” and so on.
- the vocabulary associated with the node for the actionable intent of "initiate a phone call” may include words and phrases such as “call,” “phone,” “dial,” “ring,” “call this number,” “make a call to,” and so on.
- the vocabulary index 344 optionally includes words and phrases in different languages.
- the natural language processor 332 (Figure 3B) receives the token sequence (e.g., a text string) from the speech-to-text processing module 330, and determines what nodes are implicated by the words in the token sequence. In some implementations, if a word or phrase in the token sequence is found to be associated with one or more nodes in the ontology 360 (via the vocabulary index 344), the word or phrase will "trigger" or "activate" those nodes. When multiple nodes are "triggered," based on the quantity and/or relative importance of the activated nodes, the natural language processor 332 will select one of the actionable intents as the task (or task type) that the user intended the digital assistant to perform. In some implementations, the domain that has the most triggered nodes is selected.
- the domain having the highest confidence value (e.g., based on the relative importance of its various triggered nodes) is selected.
- the domain is selected based on a combination of the number and the importance of the triggered nodes.
- additional factors are considered in selecting the node as well, such as whether the digital assistant system 300 has previously correctly interpreted a similar request from a user.
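The selection step described above, combining the number and the importance of triggered nodes, can be sketched as follows. The contents of `VOCABULARY_INDEX`, the importance weights, and the `select_domain` function are all assumptions made for illustration; the source does not specify how importance is scored.

```python
from collections import defaultdict

# Hypothetical vocabulary index: word -> list of (domain, node, importance).
VOCABULARY_INDEX = {
    "reservation": [("restaurant reservation", "restaurant reservation", 2.0)],
    "sushi": [("restaurant reservation", "cuisine", 1.0)],
    "dinner": [("restaurant reservation", "restaurant", 1.0)],
    "remind": [("set reminder", "set reminder", 2.0)],
    "at": [
        ("restaurant reservation", "date/time", 0.5),
        ("set reminder", "date/time", 0.5),
    ],
}


def select_domain(tokens):
    """Return the domain whose triggered nodes have the highest summed importance."""
    triggered = defaultdict(dict)  # domain -> {node: importance}
    for token in tokens:
        for domain, node, weight in VOCABULARY_INDEX.get(token.lower(), []):
            # A node counts once, however many words trigger it.
            triggered[domain][node] = max(weight, triggered[domain].get(node, 0.0))
    if not triggered:
        return None
    return max(triggered, key=lambda d: sum(triggered[d].values()))


choice = select_domain("Make me a dinner reservation at a sushi place".split())
```

Summing the weights makes both the count and the importance of triggered nodes matter; setting every weight to 1.0 would reduce this to the "most triggered nodes" rule.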
- the digital assistant system 300 also stores names of specific entities in the vocabulary index 344, so that when one of these names is detected in the user request, the natural language processor 332 will be able to recognize that the name refers to a specific instance of a property or sub-property in the ontology.
- the names of specific entities are names of businesses, restaurants, people, movies, and the like.
- the digital assistant system 300 can search and identify specific entity names from other data sources, such as the user's address book or contact list, a movies database, a musicians database, and/or a restaurant database.
- when the natural language processor 332 identifies that a word in the token sequence is a name of a specific entity (such as a name in the user's address book or contact list), that word is given additional significance in selecting the actionable intent within the ontology for the user request.
- User data 348 includes user-specific information, such as user-specific vocabulary, user preferences, user address, user's default and secondary languages, user's contact list, and other short-term or long-term information for each user.
- the natural language processor 332 can use the user-specific information to supplement the information contained in the user input to further define the user intent. For example, for a user request "invite my friends to my birthday party," the natural language processor 332 is able to access user data 348 to determine who the "friends" are and when and where the "birthday party" would be held, rather than requiring the user to provide such information explicitly in his/her request.
- natural language processor 332 includes categorization module 349.
- the categorization module 349 determines whether each of the one or more terms in a text string (e.g., corresponding to a speech input associated with a digital photograph) is one of an entity, an activity, or a location, as discussed in greater detail below.
- the categorization module 349 classifies each term of the one or more terms as one of an entity, an activity, or a location.
- once the natural language processor 332 identifies an actionable intent (or domain) based on the user request, the natural language processor 332 generates a structured query to represent the identified actionable intent.
- the structured query includes parameters for one or more nodes within the domain for the actionable intent, and at least some of the parameters are populated with the specific information and requirements specified in the user request. For example, the user may say "Make me a dinner reservation at a sushi place at 7.” In this case, the natural language processor 332 may be able to correctly identify the actionable intent to be "restaurant reservation" based on the user input.
- a structured query for a "restaurant reservation" domain may include parameters such as {Cuisine}, {Time}, {Date}, {Party Size}, and the like.
- the natural language processor 332 populates some parameters of the structured query with received context information. For example, if the user requested a sushi restaurant "near me," the natural language processor 332 may populate a {location} parameter in the structured query with GPS coordinates from the user device 104.
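A partially populated structured query for the sushi example might look like the sketch below. The `StructuredQuery` class, the `populate_from_context` helper, the reading of "at 7" as "7pm", and the coordinate values are all assumptions for illustration, not details given in the source.

```python
from dataclasses import dataclass, field


@dataclass
class StructuredQuery:
    """Hypothetical container for a domain and its parameters."""
    domain: str
    parameters: dict = field(default_factory=dict)

    def missing(self, required):
        """Return the required parameter names that are still unset."""
        return [p for p in required if self.parameters.get(p) is None]


def populate_from_context(query, context):
    """Copy context values (e.g., device GPS coordinates) into unset parameters."""
    for name, value in context.items():
        query.parameters.setdefault(name, value)
    return query


REQUIRED = ("Cuisine", "Time", "Date", "Party Size")

# "Make me a dinner reservation at a sushi place at 7."
query = StructuredQuery("restaurant reservation", {"Cuisine": "Sushi", "Time": "7pm"})

# Placeholder coordinates standing in for GPS data from the user device.
populate_from_context(query, {"Location": (37.0, -122.0)})

still_needed = query.missing(REQUIRED)
```

The utterance supplies only the cuisine and the time, so the date and party size remain unset for later steps to complete.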
- the natural language processor 332 passes the structured query (including any completed parameters) to the task flow processing module 336 ("task flow processor").
- the task flow processor 336 is configured to perform one or more of: receiving the structured query from the natural language processor 332, completing the structured query, and performing the actions required to "complete" the user's ultimate request.
- the various procedures necessary to complete these tasks are provided in task flow models 354.
- the task flow models 354 include procedures for obtaining additional information from the user, and task flows for performing actions associated with the actionable intent.
- the task flow processor 336 may need to initiate additional dialogue with the user in order to obtain additional information, and/or disambiguate potentially ambiguous utterances.
- the task flow processor 336 invokes the dialogue processing module 334 ("dialogue processor") to engage in a dialogue with the user.
- the dialogue processing module 334 determines how (and/or when) to ask the user for the additional information, and receives and processes the user responses.
- the questions are provided to and answers are received from the users through the I/O processing module 328.
- the dialogue processing module 334 presents dialogue output to the user via audio and/or visual output, and receives input from the user via spoken or physical (e.g., touch gesture) responses.
- when the task flow processor 336 invokes the dialogue processor 334 to determine the "party size" and "date" information for the structured query associated with the domain "restaurant reservation," the dialogue processor 334 generates questions such as "For how many people?" and "On which day?" to pass to the user.
- the dialogue processing module 334 populates the structured query with the missing information, or passes the information to the task flow processor 336 to complete the missing information from the structured query.
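The question-and-answer loop described above can be sketched as a function that asks for each missing parameter in turn. The names `PROMPTS`, `complete_query`, and `scripted_user` are hypothetical, and the scripted answers are made-up values standing in for a real user.

```python
# Questions taken from the text; the mapping itself is an assumption.
PROMPTS = {"Party Size": "For how many people?", "Date": "On which day?"}


def complete_query(parameters, required, ask):
    """Ask for each missing parameter and return a completed copy."""
    completed = dict(parameters)
    for name in required:
        if completed.get(name) is None:
            completed[name] = ask(PROMPTS.get(name, f"What is the {name}?"))
    return completed


# A scripted stand-in for the user, so the sketch runs without real I/O.
answers = {"For how many people?": "5", "On which day?": "Friday"}
asked = []


def scripted_user(question):
    asked.append(question)
    return answers[question]


completed = complete_query(
    {"Cuisine": "Sushi", "Time": "7pm"},
    ("Cuisine", "Time", "Date", "Party Size"),
    scripted_user,
)
```

Passing `ask` as a callable keeps the dialogue logic independent of whether the questions are spoken or displayed.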
- the task flow processor 336 may receive a structured query that has one or more ambiguous properties. For example, a structured query for the "send a message" domain may indicate that the intended recipient is "Bob," and the user may have multiple contacts named "Bob." The task flow processor 336 will request that the dialogue processor 334 disambiguate this property of the structured query. In turn, the dialogue processor 334 may ask the user "Which Bob?", and display (or read) a list of contacts named "Bob" from which the user may choose.
- dialogue processor 334 includes disambiguation module 350.
- disambiguation module 350 disambiguates one or more ambiguous terms (e.g., one or more ambiguous terms in a text string corresponding to a speech input associated with a digital photograph). In some implementations, disambiguation module 350 identifies that a first term of the one or more terms has multiple candidate meanings, prompts a user for additional information about the first term, receives the additional information from the user in response to the prompt and identifies the entity, activity, or location associated with the first term in accordance with the additional information.
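The prompt-and-resolve behavior described above can be sketched with a contact-list lookup. The `disambiguate` function, the contact names, and the `choose` callback are hypothetical; the source does not specify how candidates are matched.

```python
def disambiguate(term, candidates, choose):
    """Return the single candidate matching `term`, prompting only if ambiguous."""
    matches = [c for c in candidates if term.lower() in c.lower().split()]
    if len(matches) <= 1:
        return matches[0] if matches else None
    # More than one candidate meaning: ask the user for additional information.
    return choose(f"Which {term}?", matches)


contacts = ["Bob Smith", "Bob Jones", "Alice Lee"]

# The callback stands in for the user picking the second listed contact.
picked = disambiguate("Bob", contacts, lambda prompt, options: options[1])
```

An unambiguous term such as "Alice" resolves without invoking the callback, so the user is prompted only when there are multiple candidates.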
- disambiguation module 350 disambiguates pronouns. In such implementations, disambiguation module 350 identifies one of the one or more terms as a pronoun and determines a noun to which the pronoun refers. In some implementations, disambiguation module 350 determines a noun to which the pronoun refers by using a contact list associated with a user of the electronic device. Alternatively, or in addition, disambiguation module 350 determines a noun to which the pronoun refers as a name of an entity, an activity, or a location identified in a previous speech input associated with a previously tagged digital photograph. Alternatively, or in addition, disambiguation module 350 determines a noun to which the pronoun refers as a name of a person identified based on a previous speech input associated with a previously tagged digital photograph.
- disambiguation module 350 accesses information obtained from one or more sensors (e.g., proximity sensor 214, light sensor 212, GPS receiver 213, temperature sensor 215, and motion sensor 210) of a handheld electronic device (e.g., user device 104) for determining a meaning of one or more of the terms.
- disambiguation module 350 identifies two terms each associated with one of an entity, an activity, or a location. For example, a first of the two terms refers to a person, and a second of the two terms refers to a location. In some implementations, disambiguation module 350 identifies three terms each associated with one of an entity, an activity, or a location.
- the task flow processor 336 proceeds to perform the ultimate task associated with the actionable intent. Accordingly, the task flow processor 336 executes the steps and instructions in the task flow model according to the specific parameters contained in the structured query. For example, the task flow model for the actionable intent of "restaurant reservation" may include steps and instructions for contacting a restaurant and actually requesting a reservation for a particular party size at a particular time.
- the task flow processor 336 may perform the steps of: (1) logging onto a server of the ABC cafe or a restaurant reservation system that is configured to accept reservations for multiple restaurants, such as the ABC cafe, (2) entering the date, time, and party size information in a form on the website, (3) submitting the form, and (4) making a calendar entry for the reservation in the user's calendar.
- the task flow processor 336 executes steps and instructions associated with tagging or searching for digital photographs in response to a voice input, e.g., in conjunction with photo module 132.
- the task flow processor 336 employs the assistance of a service processing module 338 ("service processor") to complete a task requested in the user input or to provide an informational answer requested in the user input.
- the service processor 338 can act on behalf of the task flow processor 336 to make a phone call, set a calendar entry, invoke a map search, invoke or interact with other user applications installed on the user device, and invoke or interact with third party services (e.g., a restaurant reservation portal, a social networking website or service, a banking portal, etc.).
- the protocols and application programming interfaces (API) required by each service can be specified by a respective service model among the service models 356.
- the service processor 338 accesses the appropriate service model for a service and generates requests for the service in accordance with the protocols and APIs required by the service according to the service model.
- the restaurant can submit a service model specifying the necessary parameters for making a reservation and the APIs for communicating the values of the necessary parameters to the online reservation service.
- the service processor 338 can establish a network connection with the online reservation service using the web address stored in the service models 356, and send the necessary parameters of the reservation (e.g., time, date, party size) to the online reservation interface in a format according to the API of the online reservation service.
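A service model of the kind described above can be sketched as a record holding a web address and the required parameters, from which a request is formatted. The `SERVICE_MODELS` layout, the example URL, the parameter names, and the query-string format are all assumptions; a real service would define its own API, and this sketch builds the request without opening a network connection.

```python
from urllib.parse import urlencode

# Hypothetical service model: web address plus the parameters the service requires.
SERVICE_MODELS = {
    "online reservation": {
        "url": "https://reservations.example.com/book",
        "required": ("time", "date", "party_size"),
    }
}


def build_request(service, values):
    """Format a request URL for `service` according to its service model."""
    model = SERVICE_MODELS[service]
    missing = [p for p in model["required"] if p not in values]
    if missing:
        raise ValueError(f"missing parameters: {missing}")
    query = urlencode({p: values[p] for p in model["required"]})
    return f"{model['url']}?{query}"


request_url = build_request(
    "online reservation",
    {"time": "19:00", "date": "2012-03-12", "party_size": 5},
)
```

Keeping the address and parameter list in data rather than code means a new service can be supported by adding a model instead of changing the service processor.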
- the natural language processor 332, dialogue processor 334, and task flow processor 336 are used collectively and iteratively to deduce and define the user's intent, obtain information to further clarify and refine the user intent, and finally generate a response (e.g., provide an output to the user, or complete a task) to fulfill the user's intent.
- the digital assistant 326 formulates a confirmation response, and sends the response back to the user through the I/O processing module 328. If the user request seeks an informational answer, the confirmation response presents the requested information to the user. In some implementations, the digital assistant also requests the user to indicate whether the user is satisfied with the response produced by the digital assistant 326.
- Figure 4 is a block diagram illustrating components of an audio subsystem 226 and a voice trigger system 400, in accordance with some implementations.
- the voice trigger system 400 is not limited to voice, and
- the audio subsystem 226 and the voice trigger system 400 are composed of various components, modules, and/or software programs within the electronic device 104.
- the audio subsystem 226 includes a baseband subsystem 412, an application processor 418, a codec 410, and a buffer 414. In some implementations, more or fewer of these modules are used.
- the baseband subsystem 412, application processor 418, codec 410, and buffer 414 may be referred to as modules, and may include hardware (e.g., circuitry, memory, processors, etc.), software (e.g., programs, software-on-a-chip, firmware, etc.), and/or any combinations thereof for performing the functionality described herein.
- the codec 410 includes an analog to digital converter (ADC) and a digital to analog converter (DAC).
- the audio subsystem 226 is coupled to one or more microphones 230 ( Figure 2) and one or more speakers 228 ( Figure 2).
- the baseband subsystem 412, application processor 418, codec 410, and buffer 414 are connected using an Integrated Interchip Sound (I2S) interface.
- the baseband subsystem 412, application processor 418, codec 410, and buffer 414 are connected using a high-speed interchip (HSIC) interface.
- the audio subsystem 226 is coupled to an external audio system 416 that includes at least one microphone 418 and at least one speaker 420.
- the audio subsystem 226 provides sound inputs to the voice trigger system 400 (as well as other components or modules, such as a phone and/or communication(s) subsystem of a phone) for processing and/or analysis.
- the baseband subsystem is not a component of audio subsystem 226.
- the baseband subsystem is a component of communications subsystem 220.
- the baseband unit (e.g., baseband subsystem 412 in Figure 4) has a per-device privacy key and tapping into the baseband unit to obtain a call audio signal does not introduce any
- the adaptive speech recognition model is also encrypted to preserve privacy.
- the outbound audio channel of the first mobile communication device is encrypted and only authorized systems can tap into the outbound audio channel. Thus, unauthorized persons cannot tap into the outbound audio channel in a similar/analogous manner.
- the adaptive speech recognition model does not include data which could be used to reconstruct a user's call audio. In other words, in these implementations, obtaining training data from a user's calls does not compromise the user's privacy during the calls. Thus, in some implementations, the data yielded from phone conversations need not be saved or transmitted to a server, thereby avoiding privacy issues.
- baseband subsystem 412 includes an audio digital signal processor (DSP) 416.
- the audio digital signal processor 416 is included within the application processor 418.
- the audio digital signal processor (DSP) 416 is included within the codec 410.
- the audio digital signal processor (DSP) 416 is a standalone module within the audio subsystem 226.
- application processor 418 includes an embedded recognition engine. In some of these implementations, the embedded recognition engine is used to align sound units for updating the adaptive speech recognition model. In some implementations, application processor 418 corresponds to one or more processor(s) 204.
- the voice trigger system 400 includes a noise detector 402, a sound-type detector 404, a trigger sound detector 406, a speech-based service 408, and an audio subsystem 226, each coupled to an audio bus 401. In some implementations, more or fewer of these modules are used.
- the sound detectors 402, 404, and 406 may be referred to as modules, and may include hardware (e.g., circuitry, memory, processors, etc.), software (e.g., programs, software-on-a-chip, firmware, etc.), and/or any combinations thereof for performing the functionality described herein.
- the sound detectors are communicatively, programmatically, physically, and/or operationally coupled to one another (e.g., via a communications bus), as illustrated in Figure 4 by the broken lines.
- Figure 4 shows each sound detector coupled only to adjacent sound detectors. It will be understood that each sound detector can be coupled to any of the other sound detectors as well.
- the speech-based service 408 is a voice-based digital assistant, and corresponds to one or more components or functionalities of the digital assistant system described above with reference to Figures 1-3C.
- the speech-based service is a speech-to-text service, a dictation service, or the like.
- the noise detector 402 monitors an audio channel to determine whether a sound input from the audio subsystem 226 satisfies a predetermined condition, such as an amplitude threshold.
- the audio channel corresponds to a stream of audio information received by one or more sound pickup devices, such as the one or more microphones 230 ( Figure 2).
- the audio channel refers to the audio information regardless of its state of processing or the particular hardware that is processing and/or transmitting the audio information.
- the audio channel may refer to analog electrical impulses (and/or the circuits on which they are propagated) from the microphone 230, as well as a digitally encoded audio stream resulting from processing of the analog electrical impulses (e.g., by the audio subsystem 226 and/or any other audio processing system of the electronic device 104).
- the predetermined condition is whether the sound input is above a certain volume for a predetermined amount of time.
- the noise detector uses time-domain analysis of the sound input, which requires relatively little computational and battery resources as compared to other types of analysis (e.g., as performed by the sound-type detector 404, the trigger sound detector 406, and/or the speech-based service 408). In some implementations, other types of signal processing and/or audio analysis are used, including, for example, frequency-domain analysis.
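The amplitude-for-a-duration condition can be sketched in the time domain as follows. This is an illustrative sketch only: the use of per-frame RMS as the volume measure, the frame layout, and the threshold values are assumptions, not details taken from the description.

```python
import math


def frame_rms(frame):
    """Root-mean-square level of one frame of samples (a time-domain measure)."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def satisfies_condition(frames, rms_threshold, min_frames):
    """True if the level stays at or above the threshold for min_frames consecutive frames."""
    run = 0
    for frame in frames:
        run = run + 1 if frame_rms(frame) >= rms_threshold else 0
        if run >= min_frames:
            return True
    return False


quiet = [0.01, -0.01] * 4
loud = [0.5, -0.5] * 4
sustained = satisfies_condition([quiet, loud, loud, loud], 0.1, 3)
interrupted = satisfies_condition([loud, quiet, loud, loud], 0.1, 3)
```

No transform is needed here, which is the reason such a check is cheap compared with the frequency-domain analysis used further upstream.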
- if the noise detector 402 determines that the sound input satisfies the predetermined condition, it initiates an upstream sound detector, such as the sound-type detector 404 (e.g., by providing a control signal to initiate one or more processing routines, and/or by providing power to the upstream sound detector).
- the upstream sound detector is initiated in response to other conditions being satisfied. For example, in some implementations, the upstream sound detector is initiated in response to determining that the device is not being stored in an enclosed space (e.g., based on a light detector detecting a threshold level of light).
- the sound-type detector 404 monitors the audio channel to determine whether a sound input corresponds to a certain type of sound, such as sound that is characteristic of a human voice, whistle, clap, etc.
- the type of sound that the sound-type detector 404 is configured to recognize will correspond to the particular trigger sound(s) that the voice trigger is configured to recognize.
- the trigger sound is a spoken word or phrase
- the sound-type detector 404 includes a "voice activity detector" (VAD).
- the sound-type detector 404 uses frequency-domain analysis of the sound input.
- the sound-type detector 404 generates a spectrogram of a received sound input (e.g., using a Fourier transform), and analyzes the spectral components of the sound input to determine whether the sound input is likely to correspond to a particular type or category of sounds (e.g., human speech).
- in implementations where the trigger sound is a spoken word or phrase, if the audio channel is picking up ambient sound (e.g., traffic noise) but not human speech, the VAD will not initiate the trigger sound detector 406.
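The frequency-domain check performed by the sound-type detector can be illustrated with a single spectrogram column. The sketch below computes the spectrum of one frame with a naive discrete Fourier transform and measures how much of the energy falls in a speech band. The 300 to 3,000 Hz band, the 0.6 ratio, and the function names are assumptions chosen for illustration. A practical detector would use an FFT and a more discriminating model.

```python
import cmath
import math


def band_energy_ratio(frame, sample_rate, lo_hz, hi_hz):
    """Fraction of the frame's spectral energy that lies between lo_hz and hi_hz."""
    n = len(frame)
    total = 0.0
    in_band = 0.0
    for k in range(1, n // 2):  # positive-frequency bins, skipping DC
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power = abs(coeff) ** 2
        total += power
        if lo_hz <= k * sample_rate / n <= hi_hz:
            in_band += power
    return in_band / total if total else 0.0


def looks_like_speech(frame, sample_rate, min_ratio=0.6):
    """Assumed rule: most of the energy must sit in a 300-3000 Hz band."""
    return band_energy_ratio(frame, sample_rate, 300.0, 3000.0) >= min_ratio


def tone(freq_hz, sample_rate, count):
    return [math.sin(2.0 * math.pi * freq_hz * t / sample_rate) for t in range(count)]


in_band_tone = looks_like_speech(tone(1000.0, 8000, 80), 8000)
low_rumble = looks_like_speech(tone(100.0, 8000, 80), 8000)
```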
- the sound-type detector 404 remains active for as long as predetermined conditions of any downstream sound detector (e.g., the noise detector 402) are satisfied.
- the sound-type detector 404 remains active as long as the sound input includes sound above a predetermined amplitude threshold (as determined by the noise detector 402), and is deactivated when the sound drops below the predetermined threshold.
- once initiated, the sound-type detector 404 remains active until a condition is met, such as the expiration of a timer (e.g., for 1, 2, 5, or 10 seconds, or any other appropriate duration), the expiration of a certain number of on/off cycles of the sound-type detector 404, or the occurrence of an event (e.g., the amplitude of the sound falls below a second threshold, as determined by the noise detector 402 and/or the sound-type detector 404).
- if the sound-type detector 404 determines that the sound input corresponds to a predetermined type of sound, it initiates an upstream sound detector (e.g., by providing a control signal to initiate one or more processing routines, and/or by providing power to the upstream sound detector), such as the trigger sound detector 406.
- the trigger sound detector 406 is configured to determine whether a sound input includes at least part of certain predetermined content (e.g., at least part of the trigger word, phrase, or sound). In some implementations, the trigger sound detector 406 compares a representation of the sound input (an "input representation") to one or more reference representations of the trigger word. If the input representation matches at least one of the one or more reference representations with an acceptable confidence, the trigger sound detector 406 initiates the speech-based service 408 (e.g., by providing a control signal to initiate one or more processing routines, and/or by providing power to the upstream sound detector).
- the input representation and the one or more reference representations are spectrograms (or mathematical representations thereof), which represent how the spectral density of a signal varies with time.
- the representations are other types of audio signatures or voiceprints.
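Matching an input representation against reference representations "with an acceptable confidence" can be sketched as a similarity test. The description does not name a metric, so the sketch assumes flattened spectrogram vectors, cosine similarity, and a 0.9 threshold. All three are illustrative choices.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def matches_trigger(input_rep, reference_reps, min_confidence=0.9):
    """True if the input matches at least one reference with enough confidence."""
    return any(cosine_similarity(input_rep, ref) >= min_confidence for ref in reference_reps)


same_shape = matches_trigger([1.0, 2.0, 3.0], [[0.0, 1.0, 0.0], [2.0, 4.0, 6.0]])
different = matches_trigger([1.0, 0.0, 0.0], [[0.0, 1.0, 0.0]])
```

Because cosine similarity ignores overall scale, a louder or quieter utterance of the same sound still matches in this sketch.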
- initiating the speech-based service 408 includes bringing one or more circuits, programs, and/or processors out of a standby mode, and invoking the sound-based service. The sound-based service is then ready to provide more comprehensive speech recognition, speech-to-text processing, and/or natural language processing.
- the voice-trigger system 400 includes voice authentication functionality, so that it can determine if a sound input corresponds to a voice of a particular person, such as an owner/user of the device.
- for example, the sound-type detector 404 uses a voiceprinting technique to determine that the sound input was uttered by an authorized user. Voice authentication and
- voice authentication is included in any of the sound detectors described herein (e.g., the noise detector 402, the sound-type detector 404, the trigger sound detector 406, and/or the speech-based service 408).
- voice authentication is implemented as a separate module from the sound detectors listed above (e.g., as voice authentication module 426, Figure 4), and may be operationally positioned after the noise detector 402, after the sound-type detector 404, after the trigger sound detector 406, or at any other appropriate position.
- the trigger sound detector 406 remains active for as long as conditions of any downstream sound detector(s) (e.g., the noise detector 402 and/or the sound-type detector 404) are satisfied.
- the trigger sound detector 406 remains active as long as the sound input includes sound above a predetermined threshold (as detected by the noise detector 402).
- it remains active as long as the sound input includes sound of a certain type (as detected by the sound-type detector 404).
- it remains active as long as both of the foregoing conditions are met.
- once initiated, the trigger sound detector 406 remains active until a condition is met, such as the expiration of a timer (e.g., for 1, 2, 5, or 10 seconds, or any other appropriate duration), the expiration of a certain number of on/off cycles of the trigger sound detector 406, or the occurrence of an event (e.g., the amplitude of the sound falls below a second threshold).
- when one sound detector initiates another detector, both sound detectors remain active. However, the sound detectors may be active or inactive at various times, and it is not necessary that all of the downstream (e.g., the lower power and/or sophistication) sound detectors be active (or that their respective conditions are met) in order for upstream sound detectors to be active. For example, in some implementations, after the noise detector 402 and the sound-type detector 404 determine that their respective conditions are met, and the trigger sound detector 406 is initiated, one or both of the noise detector 402 and the sound-type detector 404 are deactivated and/or enter a standby mode while the trigger sound detector 406 operates.
- both the noise detector 402 and the sound-type detector 404 (or one or the other) stay active while the trigger sound detector 406 operates.
- different combinations of the sound detectors are active at different times, and whether one is active or inactive may depend on the state of other sound detectors, or may be independent of the state of other sound detectors.
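The staged arrangement, in which each detector initiates the next more capable one only when its own condition is met, can be sketched as a short-circuiting cascade. The stage names and the dictionary-based sound input are hypothetical stand-ins for the detectors 402, 404, and 406. This variant never runs an upstream stage after a downstream one fails, which is only one of the combinations described.

```python
def run_cascade(sound_input, stages):
    """Run detectors from cheapest to most expensive, stopping at the first failure.

    Returns (names of the stages that actually ran, whether every stage passed).
    """
    ran = []
    for name, check in stages:
        ran.append(name)
        if not check(sound_input):
            return ran, False
    return ran, True


# Hypothetical stand-ins for the noise, sound-type, and trigger sound detectors.
STAGES = [
    ("noise", lambda s: s["loud"]),
    ("sound_type", lambda s: s["speech"]),
    ("trigger", lambda s: s["phrase"]),
]

traffic = run_cascade({"loud": True, "speech": False, "phrase": False}, STAGES)
wake = run_cascade({"loud": True, "speech": True, "phrase": True}, STAGES)
```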
- although Figure 4 describes three separate sound detectors, each configured to detect different aspects of a sound input, more or fewer sound detectors are used in various implementations of the voice trigger.
- the trigger sound detector 406 is used in conjunction with either the noise detector 402 or the sound-type detector 404. In some implementations, all of the detectors 402-406 are used. In some implementations, additional sound detectors are included as well.
- different combinations of sound detectors may be used at different times.
- the particular combination of sound detectors and how they interact may depend on one or more conditions, such as the context or operating state of a device.
- the trigger sound detector 406 is active, while the noise detector 402 and the sound-type detector 404 remain inactive.
- if the device is in a pocket or backpack, all sound detectors are inactive.
- the noise detector 402 operates according to a duty cycle so that it performs effectively continuous noise detection, even though the noise detector is off for at least part of the time.
- the noise detector 402 is on for 10 milliseconds and off for 90 milliseconds.
- the noise detector 402 is on for 20 milliseconds and off for 500 milliseconds. Other on and off durations are also possible.
- the noise detector 402 detects a noise during its
- the noise detector 402 will remain on in order to further process and/or analyze the sound input.
- the noise detector 402 may be configured to initiate an upstream sound detector if it detects sound above a predetermined amplitude for a
- if the noise detector 402 detects sound above a predetermined amplitude during its 10 millisecond "on" interval, it will not immediately enter the "off" interval. Instead, the noise detector 402 remains active and continues to process the sound input to determine whether it exceeds the threshold for the full predetermined duration (e.g., 100 milliseconds).
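The duty-cycle behavior, including staying on once sound is heard, can be simulated with one boolean per millisecond. The sketch below uses the 10 ms on, 90 ms off, and 100 ms duration figures from the text. The function name and the per-millisecond `loud` input are assumptions made for illustration.

```python
def simulate_duty_cycle(loud, on_ms=10, off_ms=90, required_ms=100):
    """Return the millisecond index at which an upstream detector would be initiated.

    loud holds one bool per millisecond (True means above the amplitude threshold).
    Returns None if the condition is never satisfied.
    """
    t = 0
    n = len(loud)
    while t < n:
        window_end = min(t + on_ms, n)
        # Look for sound during the "on" interval.
        start = next((i for i in range(t, window_end) if loud[i]), None)
        if start is None:
            t = window_end + off_ms  # nothing heard: sleep through the "off" interval
            continue
        # Sound heard: stay on instead of entering the "off" interval.
        i = start
        while i < n and loud[i]:
            if i - start + 1 >= required_ms:
                return i
            i += 1
        t = i + off_ms  # the sound stopped too early: resume the duty cycle
    return None


long_sound = simulate_duty_cycle([False] * 100 + [True] * 150)
short_sound = simulate_duty_cycle([False] * 100 + [True] * 50 + [False] * 100)
missed_sound = simulate_duty_cycle([False] * 20 + [True] * 60 + [False] * 120)
```

The last case shows the trade-off of a duty cycle: a 60 ms sound that falls entirely inside an "off" interval is never observed.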
- the sound-type detector 404 operates according to a duty cycle. In some implementations, the sound-type detector 404 is on for 20 milliseconds and off for 100 milliseconds. Other on and off durations are also possible. In some implementations, the sound-type detector 404 is able to determine whether a sound input corresponds to a predetermined type of sound within the "on" interval of its duty cycle. Thus, the sound-type detector 404 will initiate the trigger sound detector 406 (or any other upstream sound detector) if the sound-type detector 404 determines, during its "on" interval, that the sound is of a certain type.
- if the sound-type detector 404 detects, during the "on" interval, sound that may correspond to the predetermined type, the detector will not immediately enter the "off" interval. Instead, the sound-type detector 404 remains active and continues to process the sound input and determine whether it corresponds to the predetermined type of sound. In some implementations, if the sound detector determines that the predetermined type of sound has been detected, it initiates the trigger sound detector 406 to further process the sound input and determine if the trigger sound has been detected.
- the trigger sound detector 406 operates according to a duty cycle. In some implementations, the trigger sound detector 406 is on for 50 milliseconds and off for 50 milliseconds. Other on and off durations are also possible. If the trigger sound detector 406 detects, during its "on" interval, that there is sound that may correspond to a trigger sound, the detector will not immediately enter the "off" interval. Instead, the trigger sound detector 406 remains active and continues to process the sound input and determine whether it includes the trigger sound. In some implementations, if such a sound is detected, the trigger sound detector 406 remains active to process the audio for a predetermined duration, such as 1, 2, 5, or 10 seconds, or any other appropriate duration.
- the duration is selected based on the length of the particular trigger word or sound that it is configured to detect. For example, if the trigger phrase is "Hey, SIRI," the trigger sound detector is operated for about 2 seconds to determine whether the sound input includes that phrase.
- some of the sound detectors are operated according to a duty cycle, while others operate continuously when active.
- only the first sound detector is operated according to a duty cycle (e.g., the noise detector 402 in Figure 4), and upstream sound detectors are operated continuously once they are initiated.
- the noise detector 402 and the sound-type detector 404 are operated according to a duty cycle, while the trigger sound detector 406 is operated continuously. Whether a particular sound detector is operated continuously or according to a duty cycle depends on one or more conditions, such as the context or operating state of a device.
- the noise detector 402 (or any of the sound detectors) operates according to a duty cycle if the device is in a pocket or backpack (e.g., as determined by sensor and/or microphone signals), but operates continuously when it is determined that the device is likely not being stored.
- whether a particular sound detector is operated continuously or according to a duty cycle depends on the battery charge level of the device. For example, the noise detector 402 operates continuously when the battery charge is above 50%, and operates according to a duty cycle when the battery charge is below 50%.
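The battery-dependent choice between continuous and duty-cycled operation reduces to a simple rule. In this sketch the 50% figure comes from the text, while the function name, the mode labels, and the handling of a charge of exactly 50% (treated as duty-cycled) are assumptions.

```python
def noise_detector_mode(battery_fraction, threshold=0.5):
    """Pick an operating mode from the battery charge, expressed as 0.0 to 1.0."""
    if not 0.0 <= battery_fraction <= 1.0:
        raise ValueError("battery_fraction must be between 0.0 and 1.0")
    # Assumption: exactly at the threshold counts as the power-saving case.
    return "continuous" if battery_fraction > threshold else "duty_cycle"


well_charged = noise_detector_mode(0.8)
running_low = noise_detector_mode(0.3)
```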
- the voice trigger includes noise, echo, and/or sound cancellation functionality (referred to collectively as noise cancellation).
- noise cancellation is performed by the audio subsystem 226 (e.g., by the audio DSP 416).
- Noise cancellation reduces or removes unwanted noise or sounds from the sound input prior to it being processed by the sound detectors.
- the unwanted noise is background noise from the user's environment, such as a fan or the clicking from a keyboard.
- the unwanted noise is any sound above, below, or at predetermined amplitudes or frequencies. For example, in some implementations, sound above the typical human vocal range (e.g., 3,000 Hz) is filtered out or removed from the signal.
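Filtering out sound above a cutoff frequency can be illustrated with a one-pole low-pass filter. This is a deliberately simple stand-in: it attenuates content above the cutoff gradually rather than removing it, and the description does not say which filter design is used. The 3,000 Hz cutoff follows the example in the text. The function names and the 48 kHz sample rate are assumptions.

```python
import math


def low_pass(samples, sample_rate, cutoff_hz=3000.0):
    """One-pole low-pass filter: attenuates content above cutoff_hz."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out = []
    previous = 0.0
    for x in samples:
        previous = previous + alpha * (x - previous)
        out.append(previous)
    return out


def tone(freq_hz, sample_rate, count):
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate) for n in range(count)]


RATE = 48000
low = low_pass(tone(500.0, RATE, 4800), RATE)
high = low_pass(tone(12000.0, RATE, 4800), RATE)
low_peak = max(abs(v) for v in low[2400:])    # 500 Hz passes almost unchanged
high_peak = max(abs(v) for v in high[2400:])  # 12 kHz is strongly attenuated
```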
- multiple microphones (e.g., the microphones 230) are used to help determine what components of received sound should be reduced and/or removed.
- the audio subsystem 226 uses beam forming techniques to identify sounds or portions of sound inputs that appear to originate from a single point in space (e.g., a user's mouth). The audio subsystem 226 then focuses on this sound by removing from the sound input sounds that are received equally by all microphones (e.g., ambient sound that does not appear to originate from any particular direction).
- the DSP 416 is configured to cancel or remove from the sound input sounds that are being output by the device on which the digital assistant is operating. For example, if the audio subsystem 226 is outputting music, radio, a podcast, a voice output, or any other audio content (e.g., via the speaker 228), the DSP 416 removes any of the outputted sound that was picked up by a microphone and included in the sound input. Thus, the sound input is free of the outputted audio (or at least contains less of the outputted audio). Accordingly, the sound input that is provided to the sound detectors will be cleaner, and the triggers more accurate. Aspects of noise cancellation are described in more detail in U.S. Patent No. 7,272,224, assigned to the assignee of the instant application, which is hereby incorporated by reference in its entirety.
- different sound detectors require that the sound input be filtered and/or preprocessed in different ways. For example, in some
- the noise detector 402 is configured to analyze time-domain audio signal between 60 and 20,000 Hz, and the sound-type detector is configured to perform
- the audio DSP 416 (and/or other audio DSPs of the device 104) preprocesses received audio according to the respective needs of the sound detectors.
- the sound detectors are configured to filter and/or preprocess the audio from the audio subsystem 226 according to their specific needs.
- the audio DSP 416 may still perform noise cancellation prior to providing the sound input to the sound detectors.
- the context of the electronic device is used to help determine whether and how to operate the voice trigger. For example, it may be unlikely that users will invoke a speech-based service, such as a voice-based digital assistant, when the device is stored in their pocket, purse, or backpack. Also, it may be unlikely that users will invoke a speech-based service when they are at a loud rock concert. For some users, it is unlikely that they will invoke a speech-based service at certain times of the day (e.g., late at night). On the other hand, there are also contexts in which it is more likely that a user will invoke a speech-based service using a voice trigger.
- the device uses information from any one or more of the following components or information sources to determine the context of a device: GPS receivers, light sensors, microphones, proximity sensors, orientation sensors, inertial sensors, cameras, communications circuitry and/or antennas, charging and/or power circuitry, switch positions, temperature sensors, compasses, accelerometers, calendars, user preferences, etc.
- the context of the device can then be used to adjust how and whether the voice trigger operates.
- the voice trigger will be deactivated (or operated in a different mode) as long as that context is maintained.
- the voice trigger is deactivated when the phone is in a predetermined orientation (e.g., lying face-down on a surface), during predetermined time periods (e.g., between 10:00 PM and 8:00 AM), when the phone is in a "silent" or a "do not disturb" mode (e.g., based on a switch position, mode setting, or user preference), when the device is in a substantially enclosed space (e.g., a pocket, bag, purse, drawer, or glove box), when the device is near other devices that have a voice trigger and/or speech-based services (e.g., based on proximity sensors, acoustic/wireless/infrared communications), and the like.
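A few of these deactivation conditions can be combined as in the sketch below. The 10:00 PM to 8:00 AM window follows the example in the text. The function names and boolean inputs are assumptions, and the window test has to handle a period that wraps past midnight.

```python
from datetime import time


def in_quiet_hours(now, start=time(22, 0), end=time(8, 0)):
    """True if now falls in [start, end), including windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def voice_trigger_enabled(now, face_down, silent_mode, enclosed):
    """Assumed policy: any single deactivating context turns the voice trigger off."""
    return not (face_down or silent_mode or enclosed or in_quiet_hours(now))


midday = voice_trigger_enabled(time(12, 0), False, False, False)
late_night = voice_trigger_enabled(time(23, 30), False, False, False)
```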
- the voice trigger system 400 is operated in a low-power mode (e.g., by operating the noise detector 402 according to a duty cycle with a 10 millisecond "on" interval and a 5 second "off" interval).
- a voice trigger uses a different sound detector or combination of sound detectors when it is in a low-power mode than when it is in a normal mode.
- the voice trigger may be capable of numerous different modes or operating states, each of which may use a different amount of power, and different implementations will use them according to their specific designs.
- when the device is in some other contexts, the voice trigger will be activated (or operated in a different mode) so long as that context is maintained.
- the voice trigger remains active while the device is plugged into a power source, when the phone is in a predetermined orientation (e.g., lying face-up on a surface), during predetermined time periods (e.g., between 8:00 AM and 10:00 PM), when the device is travelling and/or in a car (e.g., based on GPS signals, BLUETOOTH connection or docking with a vehicle, etc.), and the like.
- whether the voice trigger system 400 is active (e.g., listening) can depend on the physical orientation of a device.
- the voice trigger is active when the device is placed "face-up" on a surface (e.g., with the display and/or touchscreen surface visible), and/or is inactive when it is "face-down." This provides a user with an easy way to activate and/or deactivate the voice trigger without requiring manipulation of settings menus, switches, or buttons.
- the device detects whether it is face-up or face-down on a surface using light sensors (e.g., based on the difference in incident light on a front and a back face of the device 104), proximity sensors, magnetic sensors, accelerometers, gyroscopes, tilt sensors, cameras, and the like.
- the particular trigger sound, word, or phrase that the voice trigger is listening for depends on the orientation and/or position of the device.
- the voice trigger listens for a first trigger word, phrase, or sound when the device is in one orientation (e.g., lying face-up on a surface), and a different trigger word, phrase, or sound when the device is in another orientation (e.g., lying face-down).
- the trigger phrase for a face-down orientation is longer and/or more complex than for a face-up orientation.
- a user can place a device face-down when they are around other people or in a noisy environment so that the voice trigger can still be operational while also reducing false accepts, which may be more frequent for shorter or simpler trigger words.
- a face-up trigger phrase may be "Hey, SIRI"
- a face-down trigger phrase may be "Hey, SIRI, this is Andrew, please wake up."
- the longer trigger phrase also provides a larger voice sample for the sound detectors and/or voice authenticators to process and/or analyze, thus increasing the accuracy of the voice trigger and decreasing false accepts.
- the device 104 detects whether it is in a vehicle
- a voice trigger is particularly beneficial for invoking a speech-based service when the user is in a vehicle, as it helps reduce the physical interactions that are necessary to operate the device and/or the speech based service.
- one benefit of a voice-based digital assistant is that it can be used to perform tasks where looking at and touching a device would be impractical or unsafe.
- the voice trigger may be used when the device is in a vehicle so that the user does not have to touch the device in order to invoke the digital assistant.
- the device determines that it is in a vehicle by detecting that it has been connected to and/or paired with a vehicle, such as through
- the device determines that it is in a vehicle by determining the device's location and/or speed (e.g., using GPS receivers, accelerometers, and/or gyroscopes). If it is determined that the device is likely in a vehicle, because it is travelling above 20 miles per hour and is determined to be travelling along a road, for example, then the voice trigger remains active and/or in a high-power or more sensitive state.
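The speed-and-road heuristic can be sketched as follows. The 20 miles per hour figure comes from the text. The function name, the metres-per-second input, and the `on_road` flag (assumed to come from matching the GPS position against map data) are illustrative assumptions.

```python
MPS_TO_MPH = 2.2369362920544  # 1 metre per second expressed in miles per hour


def likely_in_vehicle(speed_mps, on_road, min_mph=20.0):
    """True if the device moves faster than min_mph while located on a road."""
    return on_road and speed_mps * MPS_TO_MPH > min_mph


driving = likely_in_vehicle(10.0, True)    # about 22.4 mph on a road
walking = likely_in_vehicle(1.5, True)     # about 3.4 mph
off_road = likely_in_vehicle(10.0, False)  # fast, but not along a road
```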
- the device detects whether the device is stored (e.g., in a pocket, purse, bag, a drawer, or the like) by determining whether it is in a substantially enclosed space.
- the device uses light sensors (e.g., dedicated ambient light sensors and/or cameras) to determine that it is stored. For example, in some implementations, the device is likely being stored if light sensors detect little or no light.
- the time of day and/or location of the device are also considered. For example, if the light sensors detect low light levels when high light levels would be expected (e.g., during the day), the device may be in storage and the voice trigger system 400 not needed. Thus, the voice trigger system 400 will be placed in a low-power or standby state.
- the difference in light detected by sensors located on opposite faces of a device can be used to determine its position, and hence whether or not it is stored.
- users are likely to attempt to activate a voice trigger when the device is resting on a table or surface rather than when it is being stored in a pocket or bag.
- if a device is lying face-down (or face-up) on a surface such as a table or desk, one surface of the device will be occluded so that little or no light reaches that surface, while the other surface will be exposed to ambient light.
- if light sensors on the front and back face of a device detect significantly different light levels, the device determines that it is not being stored.
- the device determines that it is being stored in a substantially enclosed space. Also, if the light sensors both detect a low light level during the daytime (or when the device would expect the phone to be in a bright environment), the device determines with a greater confidence that it is being stored.
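The front/back light comparison can be sketched as a small classifier. The lux thresholds, the labels, and the rule that two low readings mean "stored" are assumptions made for illustration. The text states only that significantly different readings indicate the device is not stored and that low readings suggest storage.

```python
def classify_placement(front_lux, back_lux, low_lux=5.0, min_difference=50.0):
    """Guess how the device is placed from light readings on its two faces."""
    if abs(front_lux - back_lux) >= min_difference:
        # One face is occluded by a surface and the other sees ambient light.
        return "face_down" if front_lux < back_lux else "face_up"
    if front_lux <= low_lux and back_lux <= low_lux:
        return "stored"  # both faces dark, e.g. a pocket, bag, or drawer
    return "unknown"


on_table_face_down = classify_placement(1.0, 300.0)
on_table_face_up = classify_placement(300.0, 1.0)
in_bag = classify_placement(1.0, 2.0)
held_in_hand = classify_placement(200.0, 210.0)
```

A fuller version would also weigh the time of day, since two dark readings at night are weaker evidence of storage than the same readings at noon.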
- the device emits one or more sounds (e.g., tones, clicks, pings, etc.) from a speaker or transducer (e.g., speaker 228), and monitors one or more microphones or transducers (e.g., microphone 230) to detect echoes of the emitted sound(s).
- the device emits inaudible signals, such as sound outside of the human hearing range.
- the device determines characteristics of the surrounding environment. For example, a relatively large environment (e.g., a room or a vehicle) will reflect the sound differently than a relatively small, enclosed environment (e.g., a pocket, purse, bag, drawer, or the like).
- the voice trigger system 400 operates differently if it is near other devices (such as other devices that have voice triggers and/or speech-based services) than if it is not near other devices. This may be useful, for example, to shut down or decrease the sensitivity of the voice trigger system 400 when many devices are close together so that if one person utters a trigger word, other surrounding devices are not triggered as well.
- a device determines proximity to other devices using RFID, near-field communications, infrared/acoustic signals, or the like.
- Voice triggers are particularly useful when a device is being operated in a hands-free mode, such as when the user is driving.
- users often use external audio systems, such as wired or wireless headsets, watches with speakers and/or
- wireless headsets and vehicle audio systems may connect to an electronic device using BLUETOOTH communications, or any other appropriate wireless communication.
- it may be inefficient for a voice trigger to monitor audio received via a wireless audio accessory because of the power required to maintain an open audio channel with the wireless accessory.
- a wireless headset may hold enough charge in its battery to provide a few hours of continuous talk-time, and it is therefore preferable to reserve the battery for when the headset is needed for actual communication, instead of using it to simply monitor ambient audio and wait for a possible trigger sound.
- the voice trigger system 400 monitors audio from the microphone 230 on the device even when the device is coupled to an external microphone (wired or wireless). Then, when the voice trigger detects the trigger word, the device initializes an active audio link with the external microphone in order to receive subsequent sound inputs (such as a command to a voice-based digital assistant) via the external microphone rather than the on-device microphone 230.
- an active communication link can be maintained between an external audio system 416 (which may be communicatively coupled to the device 104 via wires or wirelessly) and the device so that the voice trigger system 400 can listen for a trigger sound via the external audio system 416 instead of (or in addition to) the on-device microphone 230.
- characteristics of the motion of the electronic device and/or the external audio system 416 e.g., as determined by accelerometers, gyroscopes, etc. on the respective devices
- the voice trigger system 400 should monitor ambient sound using the on-device microphone 230 or an external microphone 418.
- the difference between the motion of the device and the external audio system 416 provides information about whether the external audio system 416 is actually in use. For example, if both the device and a wireless headset are moving (or not moving) substantially identically, it may be determined that the headset is not in use or is not being worn. This may occur, for example, because both devices are near to each other and idle (e.g., sitting on a table or stored in a pocket, bag, purse, drawer, etc.). Accordingly, under these conditions, the voice trigger system 400 monitors the on-device microphone, because it is unlikely that the headset is actually being used. If there is a difference in motion between the wireless headset and the device, however, it is determined that the headset is being worn by a user.
- the voice trigger system 400 maintains an active communication link and monitors the microphone 418 of the headset instead of (or in addition to) the on-device microphone 230. And because this technique focuses on the difference in the motion of the device and the headset, motion that is common to both devices can be canceled out.
- the device e.g., a cellular phone
- the relative motion of the headset as compared to the device can be determined in order to determine whether the headset is likely in use (or, whether the headset is not being worn). While the above discussion refers to wireless headsets, similar techniques are applied to wired headsets as well.
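The motion-comparison idea above can be sketched as follows. This is an assumption-laden illustration: the accelerometer samples are taken to be time-aligned `(x, y, z)` tuples, and the `WORN_THRESHOLD` value and function names are hypothetical.

```python
import math

# Hypothetical threshold (same units as the accelerometer samples).
WORN_THRESHOLD = 0.5


def motion_difference(device_accel, headset_accel):
    """Mean Euclidean distance between paired (x, y, z) samples.

    Motion common to both devices (e.g., a moving vehicle) cancels out,
    because only the per-sample difference is measured.
    """
    if not device_accel or len(device_accel) != len(headset_accel):
        raise ValueError("sample lists must be non-empty and equal length")
    total = sum(math.dist(d, h) for d, h in zip(device_accel, headset_accel))
    return total / len(device_accel)


def select_microphone(device_accel, headset_accel):
    """Pick which microphone the voice trigger should monitor."""
    if motion_difference(device_accel, headset_accel) > WORN_THRESHOLD:
        return "external"   # headset moves independently: likely worn
    return "on_device"      # both move (or rest) together: likely idle
```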
- the voice trigger system 400 is able to adapt its voice and/or sound recognition profiles for a particular user or group of users (e.g., by using an adaptive speech recognition model).
- sound detectors (e.g., the sound-type detector 404 and/or the trigger sound detector 406) may be configured to compare a representation of a sound input (e.g., the sound or utterance provided by a user) to one or more reference
- the device adjusts the reference representation to which the input representation is compared.
- the reference representation is adjusted (or created) as part of a voice enrollment or "training" procedure, where a user outputs the trigger sound several times so that the device can adjust (or create) the reference representation. The device can then create a reference representation using that person's actual voice.
- the device uses trigger sounds that are received under normal use conditions to adjust the reference representation. For example, after a successful voice triggering event (e.g., where the sound input was found to satisfy all of the triggering criteria) the device will use information from the sound input to adjust and/or tune the reference representation. In some implementations, only sound inputs that were determined to satisfy all or some of the triggering criteria with a certain confidence level are used to adjust the reference representation. Thus, when the voice trigger is less confident that a sound input corresponds to or includes a trigger sound, that voice input may be ignored for the purposes of adjusting the reference representation. On the other hand, in some implementations, sound inputs that satisfied the voice trigger system 400 to a lower confidence are used to adjust the reference representation.
- the device 104 iteratively adjusts the reference representation (using these or other techniques) as more and more sound inputs are received so that slight changes in a user's voice over time can be accommodated. For example, in some implementations, the device 104 (and/or associated devices or services) adjusts the reference representation after each successful triggering event. In some implementations, the device 104 analyzes the sound input associated with each successful triggering event and determines if the reference representations should be adjusted based on that input (e.g., if certain conditions are met), and only adjusts the reference representation if it is appropriate to do so. In some implementations, the device 104 maintains a moving average of the reference representation over time.
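One way to picture the moving average mentioned above is an exponential moving average over feature vectors. This is only a sketch: the source does not say which kind of moving average is used, and the vector representation and the `alpha` smoothing factor are assumptions.

```python
def update_reference(reference, trigger_features, alpha=0.1):
    """Blend a new successful trigger's features into the reference.

    `alpha` (hypothetical) controls how quickly the reference follows
    gradual changes in the user's voice; 0 ignores new inputs entirely.
    """
    if len(reference) != len(trigger_features):
        raise ValueError("feature vectors must have the same length")
    return [(1 - alpha) * r + alpha * t
            for r, t in zip(reference, trigger_features)]
```

Calling this after each successful triggering event keeps the reference close to recent utterances while damping any single outlier.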
- the voice trigger system 400 detects sounds that do not satisfy one or more of the triggering criteria (e.g., as determined by one or more of the sound detectors), but that may actually be attempts by an authorized user to do so.
- voice trigger system 400 may be configured to respond to a trigger phrase such as "Hey, SIRI", but if a user's voice has changed (e.g., due to sickness, age, accent/inflection changes, etc.), the voice trigger system 400 may not recognize the user's attempt to activate the device.
- if the voice trigger system 400 does not respond to the user's first attempt to activate the voice trigger, the user is likely to repeat the trigger phrase.
- the device detects that these repeated sound inputs are similar to one another, and/or that they are similar to the trigger phrase (though not similar enough to cause the voice trigger system 400 to activate the speech-based service). If such conditions are met, the device determines that the sound inputs correspond to valid attempts to activate the voice trigger system 400.
- the voice trigger system 400 uses those received sound inputs to adjust one or more aspects of the voice trigger system 400 so that similar utterances by the user will be accepted as valid triggers in the future.
- these sound inputs are used to adapt the voice trigger system 400 only if certain conditions or combinations of conditions are met. For example, in some
- the sound inputs are used to adapt the voice trigger system 400 when a predetermined number of sound inputs are received in succession (e.g., 2, 3, 4, 5, or any other appropriate number), when the sound inputs are sufficiently similar to the reference representation, when the sound inputs are sufficiently similar to each other, when the sound inputs are close together (e.g., when they are received within a predetermined time period and/or at or near a predetermined interval), and/or any combination of these or other conditions.
- the voice trigger system 400 may detect one or more sound inputs that do not satisfy one or more of the triggering criteria, followed by a manual initiation of the speech-based service (e.g., by pressing a button or icon).
- the voice trigger system 400 determines that, because speech-based service was initiated shortly after the sound inputs were received, the sound inputs actually corresponded to failed voice triggering attempts. Accordingly, the voice trigger system 400 uses those received sound inputs to adjust one or more aspects of the voice trigger system 400 so that utterances by the user will be accepted as valid triggers in the future, as described above.
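The conditions for treating repeated failed inputs as valid trigger attempts can be combined as in the sketch below. The `Attempt` type, the use of cosine similarity, and every threshold are assumptions made for illustration; the source leaves the similarity measure and the specific values open.

```python
import math
from dataclasses import dataclass
from itertools import combinations
from typing import Sequence


@dataclass
class Attempt:
    timestamp: float            # seconds
    features: Sequence[float]   # representation of the rejected sound input


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def should_adapt(attempts, reference, min_count=3, window_s=10.0,
                 min_ref_sim=0.6, min_mutual_sim=0.8):
    """True when rejected inputs look like repeated genuine attempts."""
    if len(attempts) < min_count:
        return False                      # not enough inputs in succession
    times = [a.timestamp for a in attempts]
    if max(times) - min(times) > window_s:
        return False                      # not close together in time
    if any(cosine(a.features, reference) < min_ref_sim for a in attempts):
        return False                      # too unlike the trigger phrase
    return all(cosine(a.features, b.features) >= min_mutual_sim
               for a, b in combinations(attempts, 2))
```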
- the device adjusts how sound inputs are filtered and/or what filters are applied to sound inputs, such as to focus on and/or eliminate certain frequencies or ranges of frequencies of a sound input.
- the device adjusts an algorithm that is used to compare the input representation with the reference representation. For example, in some implementations, one or more terms of a mathematical function used to determine the difference between an input representation and a reference representation are changed, added, or removed, or a different mathematical function is substituted.
- adaptation techniques such as those described above require more resources than the voice trigger system 400 is able to or is configured to provide.
- the sound detectors may not have, or have access to, the amount or the types of processors, data, or memory that are necessary to perform the iterative adaptation of a reference representation and/or a sound detection algorithm (or any other appropriate aspect of the voice trigger system 400).
- one or more of the above described adaptation techniques are performed by a more powerful processor, such as an application processor (e.g., the processor(s) 204), or by a different device (e.g., the server system 108).
- the voice trigger system 400 is designed to operate even when the application processor is in a standby mode.
- the sound inputs which are to be used to adapt the voice trigger system 400 are received when the application processor is not active and cannot process the sound input.
- the sound input is stored by the device so that it can be further processed and/or analyzed after it is received.
- the sound input is stored in the memory buffer 414 of the audio subsystem 226.
- the sound input is stored in system memory (e.g., memory 250, Figure 2) using direct memory access (DMA) techniques (including, for example, using a DMA engine so that data can be copied or moved without requiring the application processor to be initiated).
- the stored sound input is then provided to or accessed by the application processor (or the server system 108, or another appropriate device) once it is initiated so that the application processor can execute one or more of the adaptation techniques described above.
- Figures 5A and 5B are flow diagrams representing methods for obtaining training data to update an adaptive speech recognition model, according to certain
- the methods are, optionally, governed by instructions that are stored in a computer memory or non-transitory computer readable storage medium (e.g., memory 250 of client device 104, memory 302 associated with the digital assistant system 300) and that are executed by one or more processors of one or more computer systems of a digital assistant system, including, but not limited to, the server system 108, and/or the user device 104a.
- the computer readable storage medium may include a magnetic or optical disk storage device, solid state storage devices such as Flash memory, or other non-volatile memory device or devices.
- the computer readable instructions stored on the computer readable storage medium may include one or more of: source code, assembly language code, object code, or other instruction format that is interpreted by one or more processors.
- some operations in each method may be combined and/or the order of some operations may be changed from the order shown in the figures.
- operations shown in separate figures and/or discussed in association with separate methods may be combined to form other methods, and operations shown in the same figure and/or discussed in association with the same method may be separated into different methods.
- one or more operations in the methods are performed by modules of the digital assistant system 300 and/or an electronic device (e.g., the user device 104), including, for example, the natural language processing module 332, the dialogue flow processing module 334, the audio subsystem 226, the noise detector 402, the sound-type detector 404, the trigger sound detector 406, the speech-based service 408, and/or any sub modules thereof.
- Figures 5A-5B illustrate a method 500 of obtaining training data to update an adaptive speech recognition model, according to some implementations.
- the method 500 is performed at a system including one or more processors and memory storing instructions for execution by the one or more processors (e.g., server system 108 in Figure 1).
- the system determines (502) that a first user of a first mobile communication device (e.g., a mobile telephone) is engaged in a call over a communications network (e.g., communications network(s) 110 in Figure 1).
- a device e.g., device 104a in Figure 1
- receives a user request e.g., via audio subsystem 226 in Figure 2
- a device receives a call request through a communications network (e.g., via communication subsystem(s) 224 in Figure 2).
- the call is a mobile telephone call (e.g., telephony service 122-5 in Figure 1). In some implementations, the call is a multimedia communication. In some implementations, the call is a VoIP communication. In some implementations, the call comprises an interaction between the first user of the first mobile communication device and a second mobile device. In some implementations, the call comprises a conversation between the first user of the first mobile communication device and a user of a second device. As an example, a call may comprise a conversation between a user of device 104a and a user of device 104b in Figure 1.
- the system provides (504) an adaptive speech recognition model.
- the system provides a speaker-independent model (e.g., a canonical model).
- the system determines that a speaker-dependent model has not been stored for a corresponding user and, in accordance with that
- the system determines that the first mobile communication device is not associated with any stored speaker-dependent models and, in accordance with that determination, provides a speaker-independent model.
- the system provides a speaker-dependent model associated with a user of the first mobile communication device. For example, the system determines that a stored speaker-dependent model corresponds to the user and, in accordance with that determination, provides the stored speaker-dependent model.
- the system taps (506) into an outbound audio channel of the first mobile communication device to obtain a call audio signal corresponding to audio input from one or more microphones of the first mobile communication device.
- tapping into the outbound audio channel includes tapping into a baseband unit (e.g., baseband subsystem 412 in Figure 4) of the first mobile communication device.
- tapping into the outbound audio channel includes tapping into an audio DSP (e.g., audio DSP 416 in Figure 4) of the first mobile communication device.
- tapping into the outbound audio channel includes tapping into an application processor (e.g., application processor 418 in Figure 4) of the first mobile communication device.
- the system, prior to tapping into the outbound audio channel, converts the audio signal from an analog audio signal to a digital audio signal. For example, in some implementations, the system converts the audio signal in a codec (e.g., codec 410 in Figure 4) prior to tapping into the outbound audio channel in an application processor (e.g., application processor 418 in Figure 4). In some implementations, prior to tapping into the outbound audio channel, the system determines that the mobile communication device is in an adaptive-speaker-training mode. For example, in some implementations, prior to tapping into the outbound audio channel, the system sends the user of a mobile communication device a request to enter into a speaker-training mode, the user accepts the request, and in accordance with the user's acceptance, the device enters a speaker-training mode. In this example, if the user does not want the system to tap into the outbound audio channel (e.g., the user has a problem affecting the user's voice and/or the user is not the primary user of the mobile device), then the user can reject the request to enter into speaker-training mode.
- the tapped audio is rendered through the application processor where an embedded recognition engine is used to recognize sound units for updating model statistics.
- tapping into the outbound audio channel comprises utilizing (508) a voice activity detector (VAD) to determine when there is active speech on the outbound audio channel; in accordance with a determination that there is active speech on the outbound audio channel, tapping into the outbound audio channel; and in accordance with a determination that there is not active speech on the outbound audio channel, forgoing tapping into the outbound audio channel.
- the system utilizes a VAD included within a voice trigger system (e.g., voice trigger system 400 in Figure 4).
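The VAD-gated tapping step (508) can be illustrated with a simple energy-based detector. This is a stand-in, not the detector described in the source: real voice activity detectors are typically more elaborate, and the RMS criterion and `energy_threshold` value here are assumptions.

```python
import math


def rms(frame):
    """Root-mean-square amplitude of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def tap_active_speech(frames, energy_threshold=500.0):
    """Keep only frames judged to contain active speech.

    Frames below the (hypothetical) energy threshold are skipped, which
    corresponds to forgoing the tap when no speech is present.
    """
    return [frame for frame in frames if rms(frame) >= energy_threshold]
```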
- the system updates (510) the adaptive speech recognition model with training data (e.g., one or more speaker-dependent sound units) derived from the call audio signal.
- the system determines that the call has ended.
- updating the adaptive speech recognition model comprises replacing the adaptive speech recognition model with a new adaptive speech recognition model generated from the training data. For example, in some of these implementations, the system discards the provided adaptive speech recognition model and stores a new adaptive speech recognition model generated from the training data obtained during the call.
- updating the adaptive speech recognition model comprises generating a speaker-dependent model from the data, comparing the
- updating the adaptive speech recognition model based on the comparison includes directly adapting the model parameters. In some other implementations, updating the adaptive speech recognition model based on the comparison includes applying linear transforms (e.g., an MLLR transform) for a set of Gaussians to the model parameters. In some implementations, updating the adaptive speech recognition model based on the comparison includes aligning a user's speech to phonetic sound units. These alignments are subsequently used to update each sound unit in the adaptive speech recognition model. In some implementations, the system updates the adaptive speech recognition model with training data only if the amount of training data reaches a predetermined threshold (e.g., batch adaptation). In some of these implementations, the system updates the adaptive speech recognition model with training data derived from the call audio signal and discards any prior data.
- the system utilizes the adaptive speech recognition model to authenticate a user on a device (e.g., the first mobile communication device).
- voice-trigger system 400 includes voice authentication module 426 and voice authentication module 426 utilizes the adaptive speech recognition model to authenticate the user.
- the adaptive speech recognition model comprises a Gaussian mixture model that models the unique characteristics of the associated user. A statistical likelihood measure is used to render an authentication decision based on how well the user's voice matches the adaptive speech recognition model versus a second model calculated from large amounts of anti-user data.
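The likelihood comparison described above resembles a log-likelihood ratio test between a user model and a background ("anti-user") model. The sketch below uses one-dimensional Gaussian mixtures for brevity; the scalar features, the `(weight, mean, variance)` component layout, and the zero threshold are all assumptions rather than details from the source.

```python
import math


def gmm_log_likelihood(x, components):
    """Log-likelihood of scalar x under a 1-D Gaussian mixture.

    `components` is a list of (weight, mean, variance) tuples.
    """
    density = sum(
        w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
        for w, m, v in components
    )
    return math.log(density)


def authenticate(features, user_gmm, background_gmm, threshold=0.0):
    """Accept when the average log-likelihood ratio favors the user model."""
    llr = sum(
        gmm_log_likelihood(x, user_gmm) - gmm_log_likelihood(x, background_gmm)
        for x in features
    ) / len(features)
    return llr >= threshold
```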
- updating the adaptive speech recognition model comprises comparing (512) the call audio signal with the adaptive speech recognition model, generating a confidence score based on the comparison; in accordance with a determination that the confidence score is at or above a predetermined threshold, updating the adaptive speech recognition model with the training data derived from the call audio signal; and, in accordance with a determination that the confidence score is below the predetermined threshold, forgoing the updating of the adaptive speech recognition model.
- a confidence score below the predetermined threshold may indicate that a user is not associated with the provided adaptive speech recognition model or that the user is sick or that conditions have temporarily altered the user's voice.
- forgoing the updating of the adaptive speech recognition model prevents a potential decrease in the model's accuracy which may result from updating the model with the training data.
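The confidence-gated update (512) can be sketched as follows. The model is reduced to a single mean vector, and the distance-based `confidence_score`, the `threshold`, and the `rate` are hypothetical choices made only to show the control flow of updating versus forgoing the update.

```python
import math


def confidence_score(model_mean, call_features):
    """Map the distance between model and call features into (0, 1]."""
    return 1.0 / (1.0 + math.dist(model_mean, call_features))


def maybe_update(model_mean, call_features, threshold=0.5, rate=0.2):
    """Return (model, score); the model changes only at or above threshold."""
    score = confidence_score(model_mean, call_features)
    if score < threshold:
        # Low confidence: a different speaker or a temporarily altered
        # voice, so forgo the update to protect model accuracy.
        return list(model_mean), score
    updated = [m + rate * (c - m) for m, c in zip(model_mean, call_features)]
    return updated, score
```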
- the training data derived from the call audio signal includes (514) one or more speaker-dependent sound units (e.g., phonemes).
- updating the adaptive speech recognition model comprises comparing at least a subset of the one or more speaker-dependent sound units to the adaptive speech recognition model, generating one or more adaptive speech vectors based on the comparison, and modifying the adaptive speech recognition model based on at least a subset of the one or more adaptive speech vectors.
- the system stores (516) the adaptive speech recognition model in memory.
- the memory is a component of the first mobile device (e.g., memory 250 in Figure 2).
- the memory is a component of a server (e.g., memory 302 in Figure 3A).
- the system modifies (518) the adaptive speech recognition model with training data derived from audio user interaction with a digital assistant.
- the adaptive speech recognition model is updated with training data obtained both from calls and from audio user interactions with a digital assistant. For example, the adaptive speech recognition model is updated when a user makes a call and is then updated again when the user audibly interacts with a digital assistant.
- the system receives (520) invocation of the digital assistant, receives speech input from a second user, generates speech-to-text output corresponding to the speech input, and provides the speech-to-text output to the digital assistant.
- a user e.g., a user of device 104a in Figure 1 invokes a digital assistant (e.g., digital assistant 102a in Figure 1) and the system uses the updated adaptive speech recognition model to generate the speech-to-text output (e.g., in STT processing module 330 in Figure 3B).
- generating speech-to-text output corresponding to the speech input comprises comparing (522) the speech input with the adaptive speech recognition model; in accordance with a determination that the second user is the same as the first user, performing automatic speech recognition using the adaptive speech recognition model to generate speech-to-text output; and in accordance with a determination that the second user is distinct from the first user, performing automatic speech recognition using a speaker-independent model to generate speech-to-text output.
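The model selection in step 522 amounts to routing the speech input to one of two recognizers. In this sketch the recognizers and the speaker check are passed in as plain callables; all three parameter names are hypothetical placeholders, not components named in the source.

```python
def speech_to_text(speech, matches_first_user, adaptive_asr, independent_asr):
    """Route speech to the adaptive or the speaker-independent recognizer.

    `matches_first_user` decides whether the speaker is the user the
    adaptive model was trained on; each recognizer maps speech to text.
    """
    recognizer = adaptive_asr if matches_first_user(speech) else independent_asr
    return recognizer(speech)
```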
- Figure 6 shows a functional block diagram of a system 600 configured in accordance with the principles of the invention as described above.
- the functional blocks of the device may be implemented by hardware, software, or a combination of hardware and software to carry out the principles of the invention. It is understood by persons of skill in the art that the functional blocks described in Figure 6 may be combined or separated into sub-blocks to implement the principles of the invention as described above. Therefore, the description herein may support any possible combination or separation or further definition of the functional blocks described herein.
- system 600 includes sound receiving unit 602 configured to receive sound input.
- System 600 also includes processing unit 604 coupled to sound receiving unit 602.
- voice activity detecting unit 606 is coupled to sound receiving unit 602 and processing unit 604.
- processing unit 604 includes determining unit 608, model providing unit 610, tapping unit 612, model updating unit 614, model modifying unit 616, speech-to-text (STT) generating unit 618, STT providing unit 620, speech comparing unit 622, storing unit 624, score generating unit 626, and speech vector generating unit 628.
- model updating unit 614 is the same as model modifying unit 616.
- STT generating unit 618 corresponds to STT processing module 330.
- STT providing unit 620 corresponds to STT processing module 330.
- storing unit 624 corresponds to memory interface 202.
- storing unit 624 corresponds to memory 302.
- model providing unit 610 corresponds to digital assistant client module 264.
- model providing unit 610 corresponds to I/O processing module 328.
- Processing unit 604 is configured to: determine (e.g., with determining unit
- a first user of a first mobile communication device is engaged in a call over a communications network and providing an adaptive speech recognition model; tap (e.g., with tapping unit 612) into an outbound audio channel of the first mobile communication device to obtain a call audio signal corresponding to audio input from one or more microphones of the first mobile communication device; and update (e.g., with model updating unit 614) the adaptive speech recognition model with training data derived from the call audio signal.
- processing unit 604 is part of said first mobile communication device having said one or more microphones. In some implementations, processing unit 604 is part of a server system distinct from said first mobile communication device.
- processing unit 604 is further configured to modify
- the adaptive speech recognition model with training data derived from audio user interaction with a digital assistant.
- processing unit 604 is further configured to, after said updating, receive invocation of the digital assistant, receive speech input from a second user, generate (e.g., with STT generating unit 618) speech-to-text output corresponding to the speech input, and provide (e.g., with STT providing unit 620) the speech-to-text output to the digital assistant.
- processing unit 604 is further configured to store
- the adaptive speech recognition model in memory.
- tapping into the outbound audio channel comprises: utilizing a voice activity detector to determine when there is active speech on the outbound audio channel; in accordance with a determination that there is active speech on the outbound audio channel, tapping into the outbound audio channel; and in accordance with a determination that there is not active speech on the outbound audio channel, forgoing tapping into the outbound audio channel.
- updating comprises: comparing (e.g., with speech comparing unit 622) the call audio signal with the adaptive speech recognition model; generating (e.g., with score generating unit 626) a confidence score based on the comparison; in accordance with a determination that the confidence score is at or above a predetermined threshold, updating the adaptive speech recognition model with the training data derived from the call audio signal; and in accordance with a determination that the confidence score is below the predetermined threshold, forgoing updating the adaptive speech recognition model.
- the training data derived from the call audio signal includes one or more speaker-dependent sound units; and updating (e.g., with model updating unit 614) the adaptive speech recognition model comprises: comparing at least a subset of the one or more speaker-dependent sound units to the adaptive speech recognition model;
- the terms "first," "second," etc.
- these elements should not be limited by these terms. These terms are only used to distinguish one element from another.
- a first sound detector could be termed a second sound detector, and, similarly, a second sound detector could be termed a first sound detector, without changing the meaning of the description, so long as all occurrences of the "first sound detector" are renamed consistently and all occurrences of the "second sound detector" are renamed consistently.
- the first sound detector and the second sound detector are both sound detectors, but they are not the same sound detector.
Abstract
A method for updating an adaptive speech recognition model is provided. In some implementations, the method is performed at a communications device including one or more processors and memory storing instructions for execution by the one or more processors. The method includes determining that a first user of a first mobile communication device is engaged in a call over a communications network and providing an adaptive speech recognition model. The method also includes tapping into an outbound audio channel of the first mobile communication device to obtain a call audio signal corresponding to audio input from one or more microphones of the first mobile communication device and updating the adaptive speech recognition model with training data derived from the call audio signal.
Description
SYSTEM AND METHOD FOR UPDATING AN ADAPTIVE SPEECH RECOGNITION MODEL
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This Application claims the benefit of U.S. Provisional Application No. 61/799,479, filed on March 15, 2013, entitled SYSTEM AND METHOD FOR UPDATING AN ADAPTIVE SPEECH RECOGNITION MODEL, which is hereby incorporated by reference in its entirety for all purposes.
TECHNICAL FIELD
[0002] The disclosed implementations relate generally to digital assistants. More specifically, the disclosed implementations relate to a method and system for obtaining training data to update an adaptive speech recognition model for use when interacting with a digital assistant.
BACKGROUND
[0003] Recently, voice-based digital assistants, such as Apple's SIRI, have been introduced into the marketplace to handle various tasks such as web searching and navigation. Currently, voice-based digital assistant systems utilize either speaker-independent models or speaker-dependent models in order to generate speech-to-text (STT) input to the digital assistant. The speaker-dependent model increases accuracy in generating the STT input, and therefore, enables the digital assistant to provide better results to the user.
However, speaker-dependent models require significant training data in order to function with increased accuracy. Reciting many lines of predefined text in order to train a speaker-dependent model has several drawbacks. Many users would prefer not to expend the time and effort in providing training data for a model. In addition, a user's speech is markedly different when reading as opposed to ordinary conversation; therefore, the accuracy of a speech model trained with data obtained from a user reading is worse than that of one trained with data obtained from the user's ordinary conversation. Finally, a user's speech changes with time and environment.
SUMMARY
[0004] The implementations described below provide systems and methods for obtaining training data to update an adaptive speech recognition model for use when interacting with a digital assistant. The systems and methods obtain training data to update a speaker-dependent speech recognition model using a user's ordinary conversations.
[0005] Interactions with a voice-based digital assistant (or other speech-based services, such as a speech-to-text transcription service) often utilize a speaker-dependent speech recognition model (e.g., an adaptive speech recognition model). However, the accuracy of speaker-dependent speech recognition models depends on the volume and quality of training data. As described herein, an adaptive speech recognition model is updated by deriving training data from tapping an outbound audio channel of a mobile communication device to obtain a call audio signal.
[0006] Some implementations provide a method for obtaining training data to update an adaptive speech recognition model. The method is performed at a first mobile communication device including one or more processors and memory storing instructions for execution by the one or more processors. The method includes determining that a first user of a first mobile communication device is engaged in a call over a communications network and providing an adaptive speech recognition model. The method further includes tapping into an outbound audio channel of the first mobile communication device to obtain a call audio signal corresponding to audio input from one or more microphones of the first mobile communication device, and updating the adaptive speech recognition model with training data derived from the call audio signal.
[0007] Some implementations provide a method for obtaining training data to update an adaptive speech recognition model. In these implementations, the method is performed at a server system distinct from a first mobile communication device. The method includes determining that a first user of the first mobile communication device is engaged in a call over a communications network and providing an adaptive speech recognition model. The method further includes tapping into an outbound audio channel of the first mobile communication device to obtain a call audio signal corresponding to audio input from one or more microphones of the first mobile communication device, and updating the adaptive speech recognition model with training data derived from the call audio signal.
[0008] In some implementations, the first mobile communication device is a mobile telephone. In some implementations, the first mobile communication device is a laptop computer. In some implementations, the first mobile communication device is a tablet computer.
[0009] In some implementations, the call is a mobile telephone call. In some implementations, the call is a multimedia communication. In some implementations, the call is a VoIP communication. In some implementations, the call comprises an interaction between the first user of the first mobile communication device and a second mobile device. In some implementations, the call comprises a conversation between the first user of the first mobile communication device and a user of a second device.
[0010] In some implementations, providing an adaptive speech recognition model comprises providing a speaker-independent model. In some implementations, providing an adaptive speech recognition model comprises providing a speaker-dependent model associated with a user of the first mobile communication device.
[0011] In some implementations, the method further includes, prior to tapping into the outbound audio channel, converting the call audio signal from an analog audio signal to a digital audio signal. In some implementations, the method further comprises, prior to tapping into the outbound audio channel, determining that the first mobile communication device is in an adaptive-speaker-training mode.
[0012] In some implementations, tapping into the outbound audio channel comprises tapping into the baseband unit. In some implementations, tapping includes tapping into the digital signal processor (DSP). In some implementations, tapping includes tapping into the application processor.
[0013] In some implementations, the method further includes, prior to updating the adaptive speech recognition model, determining that the call has ended.
[0014] In some implementations, training data comprises one or more speaker-dependent sound units. In some implementations, updating the adaptive speech recognition model comprises replacing the adaptive speech recognition model with a new adaptive speech recognition model generated from the training data. In some implementations, updating the adaptive speech recognition model comprises generating a speaker-dependent model from the data, comparing the speaker-dependent model to the adaptive speech recognition model, and updating the adaptive speech recognition model based on the comparison.
[0015] In some implementations, the method further includes modifying the adaptive speech recognition model with training data derived from audio user interaction with a digital assistant. In some implementations, the method further comprises, after said updating, receiving invocation of the digital assistant, receiving speech input from a second user, generating speech-to-text output corresponding to the speech input, and providing the speech-to-text output to the digital assistant. In some implementations, generating speech-to-text output corresponding to the speech input comprises comparing the speech input with the adaptive speech recognition model; in accordance with a determination that the second user is the same as the first user, performing automatic speech recognition using the adaptive speech recognition model to generate speech-to-text output; and in accordance with a determination that the second user is distinct from the first user, performing automatic speech recognition using a speaker-independent model to generate speech-to-text output.
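The model selection described in paragraph [0015] can be illustrated with a minimal sketch. The names `Recognizer`, `select_recognizer`, `speaker_match_score`, and `match_threshold` are hypothetical and do not appear in the disclosure; the sketch assumes that comparing the speech input with the adaptive model yields a numeric match score, which is only one possible way to determine whether the second user is the first user.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Recognizer:
    """Hypothetical wrapper around a speech recognition model."""
    name: str
    transcribe: Callable[[bytes], str]


def select_recognizer(
    speaker_match_score: float,
    adaptive: Recognizer,
    speaker_independent: Recognizer,
    match_threshold: float = 0.7,  # assumed value, not from the disclosure
) -> Recognizer:
    """Pick the adaptive model only when the speaker appears to be the first user."""
    if speaker_match_score >= match_threshold:
        return adaptive
    return speaker_independent


adaptive = Recognizer("adaptive", lambda audio: "adaptive transcript")
generic = Recognizer("speaker-independent", lambda audio: "generic transcript")

chosen = select_recognizer(0.9, adaptive, generic)
text = chosen.transcribe(b"\x00\x01")  # speech-to-text output for the assistant
```

Falling back to the speaker-independent model keeps recognition usable when someone other than the first user invokes the digital assistant.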
[0016] In some implementations, the method further includes storing the adaptive speech recognition model in memory. In some implementations, the memory is a component of the first mobile communication device. In some implementations, the memory is a component of a server distinct from the first mobile communication device.
[0017] In some implementations, tapping into the outbound audio channel comprises utilizing a voice activity detector to determine when there is active speech on the outbound audio channel; in accordance with a determination that there is active speech on the outbound audio channel, tapping into the outbound audio channel; and in accordance with a determination that there is not active speech on the outbound audio channel, forgoing tapping into the outbound audio channel.
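As an illustration of the gating in paragraph [0017], the following sketch taps only those frames on which speech is judged to be active. It assumes a simple energy-based voice activity detector; the disclosure does not specify how the voice activity detector works, and the names `frame_rms`, `tap_active_frames`, and `energy_threshold` are hypothetical.

```python
import math
from typing import Iterable, List, Sequence


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one audio frame (samples in [-1.0, 1.0])."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def tap_active_frames(
    frames: Iterable[Sequence[float]],
    energy_threshold: float = 0.02,  # assumed value; tuned per device in practice
) -> List[List[float]]:
    """Collect frames with active speech and forgo tapping the rest."""
    tapped: List[List[float]] = []
    for frame in frames:
        if frame_rms(frame) >= energy_threshold:
            tapped.append(list(frame))
    return tapped


silence = [0.0] * 160
speech = [0.1, -0.1] * 80
call_audio = tap_active_frames([silence, speech, silence])
```

Skipping inactive frames avoids collecting silence or background noise as training data.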
[0018] In some implementations, updating the adaptive speech recognition model comprises comparing the call audio signal with the adaptive speech recognition model; generating a confidence score based on the comparison; in accordance with a determination that the confidence score is at or above a predetermined threshold, updating the adaptive speech recognition model with the training data derived from the call audio signal; and in accordance with a determination that the confidence score is below the predetermined threshold, forgoing updating the adaptive speech recognition model.
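The threshold test in paragraph [0018] can be sketched as follows. The function name `update_if_confident` and the `apply_update` callback are hypothetical; how the confidence score is computed and how the model is actually updated are left open here, as they are in this paragraph.

```python
from typing import Callable, List, Tuple

Model = List[float]


def update_if_confident(
    model: Model,
    training_data: List[float],
    confidence_score: float,
    threshold: float,
    apply_update: Callable[[Model, List[float]], Model],
) -> Tuple[Model, bool]:
    """Update the model only when the score is at or above the threshold."""
    if confidence_score >= threshold:
        return apply_update(model, training_data), True
    # Below the threshold: forgo the update and keep the existing model.
    return model, False


def append_data(model: Model, data: List[float]) -> Model:
    """Placeholder update that merely appends data, for illustration only."""
    return model + data


new_model, updated = update_if_confident([1.0], [2.0], 0.8, 0.5, append_data)
same_model, skipped = update_if_confident([1.0], [2.0], 0.3, 0.5, append_data)
```

A low score may indicate that the call audio does not belong to the first user or is too noisy, in which case leaving the model unchanged is the safer choice.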
[0019] In some implementations, the training data derived from the call audio signal includes one or more speaker-dependent sound units, and updating the adaptive speech recognition model comprises comparing at least a subset of the one or more speaker-dependent sound units to the adaptive speech recognition model, generating one or more adaptive speech vectors based on the comparison, and modifying the adaptive speech recognition model based on at least a subset of the one or more adaptive speech vectors.
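One possible reading of paragraph [0019] is sketched below. It assumes, purely for illustration, that the model stores a mean feature vector per sound unit, that an adaptive speech vector is the difference between an observed unit and the stored mean, and that the model is modified by moving each mean a fraction of the way along that difference. None of these choices, nor the names `adaptive_vectors`, `apply_vectors`, and `rate`, come from the disclosure.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def adaptive_vectors(
    model: Dict[str, Vector],
    units: List[Tuple[str, Vector]],
) -> List[Tuple[str, Vector]]:
    """Compare observed sound units to the model; return per-unit differences."""
    vectors: List[Tuple[str, Vector]] = []
    for label, features in units:
        if label not in model:
            continue  # unit unknown to the model, so nothing to compare against
        mean = model[label]
        vectors.append((label, [f - m for f, m in zip(features, mean)]))
    return vectors


def apply_vectors(
    model: Dict[str, Vector],
    vectors: List[Tuple[str, Vector]],
    rate: float = 0.1,  # assumed adaptation rate
) -> Dict[str, Vector]:
    """Return a modified copy of the model, shifted toward the observed units."""
    updated = {label: list(mean) for label, mean in model.items()}
    for label, delta in vectors:
        updated[label] = [m + rate * d for m, d in zip(updated[label], delta)]
    return updated


model = {"ah": [1.0, 2.0]}
units = [("ah", [2.0, 4.0]), ("zz", [0.0, 0.0])]
vectors = adaptive_vectors(model, units)
adapted = apply_vectors(model, vectors, rate=0.5)
```

Returning a copy rather than modifying the model in place makes it straightforward to discard an update, for example when a later confidence check fails.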
[0020] In accordance with some implementations, a system includes one or more processors, memory, and one or more programs stored in the memory. The one or more programs comprise instructions to determine that a first user of a first mobile communication device is engaged in a call over a communications network and provide an adaptive speech recognition model. The one or more programs further comprise instructions to tap into an outbound audio channel of the first mobile communication device to obtain a call audio signal corresponding to audio input from one or more microphones of the first mobile communication device and update the adaptive speech recognition model with training data derived from the call audio signal.
[0021] In accordance with some implementations, the system comprises the first mobile communication device having the one or more microphones. In accordance with some implementations, the system comprises a server system distinct from the first mobile communication device.
[0022] In accordance with some implementations, a computer-readable storage medium (e.g., a non-transitory computer readable storage medium) is provided, the computer-readable storage medium storing one or more programs for execution by one or more processors of an electronic device, the one or more programs including instructions for performing any of the methods described herein.
[0023] In accordance with some implementations, an electronic device (e.g., a portable electronic device) is provided that comprises means for performing any of the methods described herein.
[0024] In accordance with some implementations, an electronic device (e.g., a portable electronic device) is provided that comprises a processing unit configured to perform any of the methods described herein.
[0025] In accordance with some implementations, an electronic device (e.g., a portable electronic device) is provided that comprises one or more processors and memory storing one or more programs for execution by the one or more processors, the one or more programs including instructions for performing any of the methods described herein.
[0026] In accordance with some implementations, an information processing apparatus for use in an electronic device is provided, the information processing apparatus comprising means for performing any of the methods described herein.
BRIEF DESCRIPTION OF THE DRAWINGS
[0027] Figure 1 is a block diagram illustrating an environment in which a digital assistant operates in accordance with some implementations.
[0028] Figure 2 is a block diagram illustrating a digital assistant client system in accordance with some implementations.
[0029] Figure 3A is a block diagram illustrating a standalone digital assistant system or a digital assistant server system in accordance with some implementations.
[0030] Figure 3B is a block diagram illustrating functions of the digital assistant shown in Figure 3A in accordance with some implementations.
[0031] Figure 3C is a network diagram illustrating a portion of an ontology in accordance with some implementations.
[0032] Figure 4 is a block diagram illustrating components of an audio system, in accordance with some implementations.
[0033] Figures 5A-5B are flow charts illustrating methods for updating an adaptive speech recognition model, in accordance with some implementations.
[0034] Figure 6 is a functional block diagram of an electronic device in accordance with some embodiments.
[0035] Like reference numerals refer to corresponding parts throughout the drawings.
DESCRIPTION OF IMPLEMENTATIONS
[0036] Figure 1 is a block diagram of an operating environment 100 of a digital assistant according to some implementations. The terms "digital assistant," "virtual assistant," "intelligent automated assistant," "voice-based digital assistant," or "automatic digital assistant," refer to any information processing system that interprets natural language input in spoken and/or textual form to deduce user intent (e.g., identify a task type that corresponds to the natural language input), and performs actions based on the deduced user intent (e.g., perform a task corresponding to the identified task type). For example, to act on a deduced user intent, the system can perform one or more of the following: identifying a task flow with steps and parameters designed to accomplish the deduced user intent (e.g., identifying a task type), inputting specific requirements from the deduced user intent into the task flow, executing the task flow by invoking programs, methods, services, APIs, or the like (e.g., sending a request to a service provider); and generating output responses to the user in an audible (e.g., speech) and/or visual form.
[0037] Specifically, once initiated, a digital assistant system is capable of accepting a user request at least partially in the form of a natural language command, request, statement, narrative, and/or inquiry. Typically, the user request seeks either an informational answer or performance of a task by the digital assistant system. A satisfactory response to the user request is generally either provision of the requested informational answer, performance of the requested task, or a combination of the two. For example, a user may ask the digital assistant system a question, such as "Where am I right now?" Based on the user's current location, the digital assistant may answer, "you are in Central Park near the west gate." The user may also request the performance of a task, for example, by stating "Please invite my friends to my girlfriend's birthday party next week." In response, the digital assistant may acknowledge the request by generating a voice output, "Yes, right away," and then send a suitable calendar invite from the user's email address to each of the user's friends listed in the user's electronic address book or contact list. There are numerous other ways of interacting with a digital assistant to request information or performance of various tasks. In addition to providing verbal responses and taking programmed actions, the digital assistant can also provide responses in other visual or audio forms (e.g., as text, alerts, music, videos, animations, etc.).
[0038] As shown in Figure 1, in some implementations, a digital assistant system is implemented according to a client-server model. The digital assistant system includes a client-side portion (e.g., 102a and 102b) (hereafter "digital assistant (DA) client 102") executed on a user device (e.g., 104a and 104b), and a server-side portion 106 (hereafter "digital assistant (DA) server 106") executed on a server system 108. The DA client 102 communicates with the DA server 106 through one or more networks 110. The DA client 102 provides client-side functionalities such as user-facing input and output processing and communications with the DA server 106. The DA server 106 provides server-side functionalities for any number of DA clients 102 each residing on a respective user device 104 (also called a client device or electronic device).
[0039] In some implementations, the DA server 106 includes a client-facing I/O interface 112, one or more processing modules 114, data and models 116, an I/O interface to external services 118, a photo and tag database 130, and a photo-tag module 132. The client-facing I/O interface facilitates the client-facing input and output processing for the digital assistant server 106. The one or more processing modules 114 utilize the data and models 116 to determine the user's intent based on natural language input and perform task execution based on the deduced user intent. Photo and tag database 130 stores fingerprints of digital photographs, and, optionally digital photographs themselves, as well as tags associated with the digital photographs. Photo-tag module 132 creates tags, stores tags in association with photographs and/or fingerprints, automatically tags photographs, and links tags to locations within photographs.
[0040] In some implementations, the DA server 106 communicates with external services 120 (e.g., navigation service(s) 122-1, messaging service(s) 122-2, information service(s) 122-3, calendar service 122-4, telephony service 122-5, photo service(s) 122-6, etc.) through the network(s) 110 for task completion or information acquisition. The I/O interface to the external services 118 facilitates such communications.
[0041] Examples of the user device 104 include, but are not limited to, a handheld computer, a personal digital assistant (PDA), a tablet computer, a laptop computer, a desktop computer, a cellular telephone, a smartphone, an enhanced general packet radio service (EGPRS) mobile phone, a media player, a navigation device, a game console, a television, a remote control, or a combination of any two or more of these data processing devices or any other suitable data processing devices. More details on the user device 104 are provided in reference to an exemplary user device 104 shown in Figure 2.
[0042] Examples of the communication network(s) 110 include local area networks (LAN) and wide area networks (WAN), e.g., the Internet. The communication network(s) 110 may be implemented using any known network protocol, including various wired or wireless protocols, such as Ethernet, Universal Serial Bus (USB), FIREWIRE, Global System for Mobile Communications (GSM), Enhanced Data GSM Environment (EDGE), code division multiple access (CDMA), time division multiple access (TDMA), Bluetooth, Wi-Fi, voice over Internet Protocol (VoIP), Wi-MAX, or any other suitable communication protocol.
[0043] The server system 108 can be implemented on at least one data processing apparatus and/or a distributed network of computers. In some implementations, the server system 108 also employs various virtual devices and/or services of third party service providers (e.g., third-party cloud service providers) to provide the underlying computing resources and/or infrastructure resources of the server system 108.
[0044] Although the digital assistant system shown in Figure 1 includes both a client-side portion (e.g., the DA client 102) and a server-side portion (e.g., the DA server 106), in some implementations, a digital assistant system refers only to the server-side portion (e.g., the DA server 106). In some implementations, the functions of a digital assistant can be implemented as a standalone application installed on a user device. In addition, the divisions of functionalities between the client and server portions of the digital assistant can vary in different implementations. For example, in some implementations, the DA client 102 is a thin-client that provides only user-facing input and output processing functions, and delegates all other functionalities of the digital assistant to the DA server 106. In some other implementations, the DA client 102 is configured to perform or assist one or more functions of the DA server 106.
[0045] Figure 2 is a block diagram of a user device 104 in accordance with some implementations. The user device 104 includes a memory interface 202, one or more processors 204, and a peripherals interface 206. The various components in the user device 104 are coupled by one or more communication buses or signal lines. The user device 104 includes various sensors, subsystems, and peripheral devices that are coupled to the peripherals interface 206. The sensors, subsystems, and peripheral devices gather information and/or facilitate various functionalities of the user device 104.
[0046] For example, in some implementations, a motion sensor 210 (e.g., an accelerometer), a light sensor 212, a GPS receiver 213, a temperature sensor, and a proximity sensor 214 are coupled to the peripherals interface 206 to facilitate orientation, light, and proximity sensing functions. In some implementations, other sensors 216, such as a biometric sensor, barometer, and the like, are connected to the peripherals interface 206, to facilitate related functionalities.
[0047] In some implementations, the user device 104 includes a camera subsystem 220 coupled to the peripherals interface 206. In some implementations, an optical sensor 222 of the camera subsystem 220 facilitates camera functions, such as taking photographs and recording video clips. In some implementations, the user device 104 includes one or more wired and/or wireless communication subsystems 224 that provide communication functions. The communication subsystems 224 typically include various communication ports, radio frequency receivers and transmitters, and/or optical (e.g., infrared) receivers and transmitters. In some implementations, the user device 104 includes an audio subsystem 226 coupled to one or more speakers 228 and one or more microphones 230 to facilitate voice-enabled functions, such as voice recognition, voice replication, digital recording, and telephony functions. In some implementations, the audio subsystem 226 is coupled to a voice trigger system 400. In some implementations, the voice trigger system 400 and/or the audio subsystem 226 includes low-power audio circuitry and/or programs (i.e., including hardware and/or software) for receiving and/or analyzing sound inputs, including, for example, one or more analog-to-digital converters, digital signal processors (DSPs), sound detectors, memory buffers, codecs, and the like. In some implementations, the low-power audio circuitry (alone or in addition to other components of the user device 104) provides voice (or sound) trigger functionality for one or more aspects of the user device 104, such as a voice-based digital assistant or other speech-based service. In some implementations, the low-power audio circuitry provides voice trigger functionality even when other components of the user device 104 are shut down and/or in a standby mode, such as the processor(s) 204, I/O subsystem 240, memory 250, and the like. The voice trigger system 400 and the audio subsystem 226 are described in further detail with respect to Figure 4.
[0048] In some implementations, an I/O subsystem 240 is also coupled to the peripherals interface 206. In some implementations, the user device 104 includes a touch screen 246, and the I/O subsystem 240 includes a touch screen controller 242 coupled to the touch screen 246. When the user device 104 includes the touch screen 246 and the touch screen controller 242, the touch screen 246 and the touch screen controller 242 are typically configured to, for example, detect contact and movement or break thereof using any of a plurality of touch sensitivity technologies, such as capacitive, resistive, infrared, surface acoustic wave technologies, proximity sensor arrays, and the like. In some implementations, the user device 104 includes a display that does not include a touch-sensitive surface. In some implementations, the user device 104 includes a separate touch-sensitive surface. In some implementations, the user device 104 includes other input controller(s) 244. When the user device 104 includes the other input controller(s) 244, the other input controller(s) 244 are typically coupled to other input/control devices 248, such as one or more buttons, rocker switches, thumb-wheel, infrared port, USB port, and/or a pointer device such as a stylus.
[0049] The memory interface 202 is coupled to memory 250. In some implementations, memory 250 includes a non-transitory computer readable medium, such as high-speed random access memory and/or non-volatile memory (e.g., one or more magnetic disk storage devices, one or more flash memory devices, one or more optical storage devices, and/or other non-volatile solid-state memory devices).
[0050] In some implementations, memory 250 stores an operating system 252, a communications module 254, a graphical user interface module 256, a sensor processing module 258, a phone module 260, and applications 262, or a subset or superset thereof. The operating system 252 includes instructions for handling basic system services and for performing hardware dependent tasks. The communications module 254 facilitates communicating with one or more additional devices, one or more computers and/or one or more servers. The graphical user interface module 256 facilitates graphic user interface processing. The sensor processing module 258 facilitates sensor-related processing and functions (e.g., processing voice input received with the one or more microphones 230). The phone module 260 facilitates phone-related processes and functions. The application module 262 facilitates various functionalities of user applications, such as electronic-messaging, web browsing, media processing, navigation, imaging and/or other processes and functions. In some implementations, the user device 104 stores in memory 250 one or more software applications 270-1 and 270-2 each associated with at least one of the external service providers.
[0051] As described above, in some implementations, memory 250 also stores client-side digital assistant instructions (e.g., in a digital assistant client module 264) and various user data 266 (e.g., user-specific vocabulary data, preference data, and/or other data such as the user's electronic address book or contact list, to-do lists, shopping lists, etc.) to provide the client-side functionalities of the digital assistant.
[0052] In various implementations, the digital assistant client module 264 is capable of accepting voice input, text input, touch input, and/or gestural input through various user interfaces (e.g., the I/O subsystem 240) of the user device 104. The digital assistant client module 264 is also capable of providing output in audio, visual, and/or tactile forms. For example, output can be provided as voice, sound, alerts, text messages, menus, graphics, videos, animations, vibrations, and/or combinations of two or more of the above. During operation, the digital assistant client module 264 communicates with the digital assistant server (e.g., the digital assistant server 106, Figure 1) using the communication subsystems 224.
[0053] In some implementations, the digital assistant client module 264 utilizes various sensors, subsystems and peripheral devices to gather additional information from the surrounding environment of the user device 104 to establish a context associated with a user input. In some implementations, the digital assistant client module 264 provides the context information or a subset thereof with the user input to the digital assistant server (e.g., the digital assistant server 106, Figure 1) to help deduce the user's intent.
[0054] In some implementations, the context information that can accompany the user input includes sensor information, e.g., lighting, ambient noise, ambient temperature, images or videos of the surrounding environment, etc. In some implementations, the context information also includes the physical state of the device, e.g., device orientation, device location, device temperature, power level, speed, acceleration, motion patterns, cellular signal strength, etc. In some implementations, information related to the software state of the user device 104, e.g., running processes, installed programs, past and present network activities, background services, error logs, resources usage, etc., is also provided to the digital assistant server (e.g., the digital assistant server 106, Figure 1) as context information associated with a user input.
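The context information of paragraphs [0053] and [0054] can be pictured as a structured payload that accompanies the user input. The sketch below is illustrative only: the class `ContextInfo`, the function `build_request`, and every field name and value are hypothetical, and the disclosure does not define a wire format.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ContextInfo:
    """Hypothetical grouping of the three kinds of context described above."""
    sensor: Dict[str, Any] = field(default_factory=dict)          # e.g., ambient noise
    physical_state: Dict[str, Any] = field(default_factory=dict)  # e.g., orientation
    software_state: Dict[str, Any] = field(default_factory=dict)  # e.g., running processes


def build_request(
    user_input: str,
    context: ContextInfo,
    include: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Attach all context, or only the named subset, to the user input."""
    payload = asdict(context)
    if include is not None:
        payload = {key: value for key, value in payload.items() if key in include}
    return {"user_input": user_input, "context": payload}


ctx = ContextInfo(
    sensor={"ambient_noise_db": 42.0},
    physical_state={"orientation": "portrait"},
    software_state={"running_processes": ["mail"]},
)
request = build_request("Where am I right now?", ctx, include=["sensor"])
```

The `include` argument reflects the statement that the client may provide the context information or only a subset of it.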
[0055] In some implementations, the DA client module 264 selectively provides information (e.g., at least a portion of the user data 266) stored on the user device 104 in response to requests from the digital assistant server. In some implementations, the digital assistant client module 264 also elicits additional input from the user via a natural language dialogue or other user interfaces upon request by the digital assistant server 106 (Figure 1). The digital assistant client module 264 passes the additional input to the digital assistant server 106 to help the digital assistant server 106 in intent deduction and/or fulfillment of the user's intent expressed in the user request.
[0056] In some implementations, memory 250 may include additional instructions or fewer instructions. Furthermore, various functions of the user device 104 may be implemented in hardware and/or in firmware, including in one or more signal processing and/or application specific integrated circuits, and the user device 104, thus, need not include all modules and applications illustrated in Figure 2.
[0057] Figure 3A is a block diagram of an exemplary digital assistant system 300 (also referred to as the digital assistant) in accordance with some implementations. In some implementations, the digital assistant system 300 is implemented on a standalone computer system. In some implementations, the digital assistant system 300 is distributed across multiple computers. In some implementations, some of the modules and functions of the digital assistant are divided into a server portion and a client portion, where the client portion resides on a user device (e.g., the user device 104) and communicates with the server portion (e.g., the server system 108) through one or more networks, e.g., as shown in Figure 1. In some implementations, the digital assistant system 300 is an embodiment of the server system 108 (and/or the digital assistant server 106) shown in Figure 1. In some implementations, the digital assistant system 300 is implemented in a user device (e.g., the user device 104, Figure 1), thereby eliminating the need for a client-server system. It should be noted that the digital assistant system 300 is only one example of a digital assistant system, and that the digital assistant system 300 may have more or fewer components than shown, may combine two or more components, or may have a different configuration or arrangement of the components. The various components shown in Figure 3A may be implemented in hardware, software, firmware, including one or more signal processing and/or application specific integrated circuits, or a combination thereof.
[0058] The digital assistant system 300 includes memory 302, one or more processors 304, an input/output (I/O) interface 306, and a network communications interface 308. These components communicate with one another over one or more communication buses or signal lines 310.
[0059] In some implementations, memory 302 includes a non-transitory computer readable medium, such as high-speed random access memory and/or a non-volatile computer readable storage medium (e.g., one or more magnetic disk storage devices, one or more flash memory devices, one or more optical storage devices, and/or other non-volatile solid-state memory devices).
[0060] The I/O interface 306 couples input/output devices 316 of the digital assistant system 300, such as displays, keyboards, touch screens, and microphones, to the user interface module 322. The I/O interface 306, in conjunction with the user interface module 322, receives user inputs (e.g., voice input, keyboard inputs, touch inputs, etc.) and processes them accordingly. In some implementations, when the digital assistant is implemented on a standalone user device, the digital assistant system 300 includes any of the components and I/O and communication interfaces described with respect to the user device 104 in Figure 2 (e.g., one or more microphones 230). In some implementations, the digital assistant system 300 represents the server portion of a digital assistant implementation, and interacts with the user through a client-side portion residing on a user device (e.g., the user device 104 shown in Figure 2).
[0061] In some implementations, the network communications interface 308 includes wired communication port(s) 312 and/or wireless transmission and reception circuitry 314. The wired communication port(s) receive and send communication signals via one or more wired interfaces, e.g., Ethernet, Universal Serial Bus (USB), FIREWIRE, etc. The wireless circuitry 314 typically receives and sends RF signals and/or optical signals from/to communications networks and other communications devices. The wireless communications may use any of a plurality of communications standards, protocols and technologies, such as GSM, EDGE, CDMA, TDMA, Bluetooth, Wi-Fi, VoIP, Wi-MAX, or any other suitable communication protocol. The network communications interface 308 enables communication between the digital assistant system 300 and networks, such as the Internet, an intranet and/or a wireless network, such as a cellular telephone network, a wireless local area network (LAN) and/or a metropolitan area network (MAN), and other devices.
[0062] In some implementations, the non-transitory computer readable storage medium of memory 302 stores programs, modules, instructions, and data structures including all or a subset of: an operating system 318, a communications module 320, a user interface module 322, one or more applications 324, and a digital assistant module 326. The one or more processors 304 execute these programs, modules, and instructions, and read/write from/to the data structures.
[0063] The operating system 318 (e.g., Darwin, RTXC, LINUX, UNIX, OS X, iOS, WINDOWS, or an embedded operating system such as VxWorks) includes various software components and/or drivers for controlling and managing general system tasks (e.g., memory management, storage device control, power management, etc.) and facilitates communications between various hardware, firmware, and software components.
[0064] The communications module 320 facilitates communications between the digital assistant system 300 and other devices over the network communications interface 308. For example, the communications module 320 may communicate with the communications module 254 of the device 104 shown in Figure 2. The communications module 320 also includes various software components for handling data received by the wireless circuitry 314 and/or wired communications port 312.
[0065] In some implementations, the user interface module 322 receives commands and/or inputs from a user via the I/O interface 306 (e.g., from a keyboard, touch screen, and/or microphone), and provides user interface objects on a display.
[0066] The applications 324 include programs and/or modules that are configured to be executed by the one or more processors 304. For example, if the digital assistant system is implemented on a standalone user device, the applications 324 may include user applications, such as games, a calendar application, a navigation application, or an email application. If the digital assistant system 300 is implemented on a server farm, the applications 324 may include resource management applications, diagnostic applications, or scheduling applications, for example.
[0067] Memory 302 also stores the digital assistant module (or the server portion of a digital assistant) 326. In some implementations, the digital assistant module 326 includes the following sub-modules, or a subset or superset thereof: an input/output processing module 328, a speech-to-text (STT) processing module 330, a natural language processing module 332, a dialogue flow processing module 334, a task flow processing module 336, a service processing module 338, and a photo module 132. Each of these processing modules has access to one or more of the following data and models of the digital assistant 326, or a subset or superset thereof: ontology 360, vocabulary index 344, user data 348, categorization module 349, disambiguation module 350, task flow models 354, service models 356, photo tagging module 358, search module 360, and local tag/photo storage 362.
[0068] In some implementations, using the processing modules (e.g., the input/output processing module 328, the STT processing module 330, the natural language processing module 332, the dialogue flow processing module 334, the task flow processing module 336, and/or the service processing module 338), data, and models implemented in the digital assistant module 326, the digital assistant system 300 performs at least some of the following: identifying a user's intent expressed in a natural language input received from the user; actively eliciting and obtaining information needed to fully deduce the user's intent (e.g., by disambiguating words, names, intentions, etc.); determining the task flow for fulfilling the deduced intent; and executing the task flow to fulfill the deduced intent. In some implementations, the digital assistant also takes appropriate actions when a satisfactory response was not or could not be provided to the user for various reasons.
[0069] In some implementations, as discussed below, the digital assistant system 300 identifies, from a natural language input, a user's intent to tag a digital photograph, and processes the natural language input so as to tag the digital photograph with appropriate information. In some implementations, the digital assistant system 300 performs other tasks related to photographs as well, such as searching for digital photographs using natural language input, auto-tagging photographs, and the like.
[0070] As shown in Figure 3B, in some implementations, the I/O processing module 328 interacts with the user through the I/O devices 316 in Figure 3A or with a user device (e.g., a user device 104 in Figure 1) through the network communications interface 308 in Figure 3A to obtain user input (e.g., a speech input) and to provide responses to the user input. The I/O processing module 328 optionally obtains context information associated with the user input from the user device, along with or shortly after the receipt of the user input. The context information includes user-specific data, vocabulary, and/or preferences relevant to the user input. In some implementations, the context information also includes software and hardware states of the device (e.g., the user device 104 in Figure 1) at the time the user request is received, and/or information related to the surrounding environment of the user at the time that the user request was received. In some implementations, the I/O processing module 328 also sends follow-up questions to, and receives answers from, the user regarding the user request. In some implementations, when a user request is received by the I/O processing module 328 and the user request contains a speech input, the I/O processing module 328 forwards the speech input to the speech-to-text (STT) processing module 330 for speech-to-text conversions.
[0071] In some implementations, the speech-to-text processing module 330 receives speech input (e.g., a user utterance captured in a voice recording) through the I/O processing module 328. In some implementations, the speech-to-text processing module 330 uses various acoustic and language models to recognize the speech input as a sequence of sound units (e.g., phonemes), and ultimately, a sequence of words or tokens written in one or more languages. The speech-to-text processing module 330 is implemented using any suitable speech recognition techniques (e.g., an adaptive speech recognition model), acoustic models, and language models, such as Hidden Markov Models, Dynamic Time Warping (DTW)-based speech recognition, maximum a posteriori (MAP) model adaptation (e.g., structural MAP approach), maximum likelihood linear regression (MLLR) model adaptation (e.g., constrained MLLR, mean-only MLLR), and other statistical and/or analytical techniques. In some implementations, the speech-to-text processing can be performed at least partially by a third party service or on the user's device. Once the speech-to-text processing module 330 obtains the result of the speech-to-text processing (e.g., a sequence of words or tokens), it passes the result to the natural language processing module 332 for intent deduction.
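By way of illustration only, the MAP model adaptation named above can be sketched for the simplest case, the mean of a single Gaussian component. The sketch uses the commonly cited MAP mean update, in which the adapted mean is a weighted combination of the prior (speaker-independent) mean and the occupancy-weighted adaptation frames. The function name map_adapt_mean, the prior weight tau, and the list-based feature representation are assumptions made for this example and are not taken from the disclosure.

```python
def map_adapt_mean(prior_mean, frames, occupancies, tau=10.0):
    """Return a MAP-adapted mean for one Gaussian component.

    prior_mean  -- mean of the unadapted model (list of floats)
    frames      -- adaptation feature vectors (list of lists of floats)
    occupancies -- posterior probability that each frame belongs to this component
    tau         -- hypothetical prior weight; larger values trust the prior more
    """
    total = sum(occupancies)
    dim = len(prior_mean)
    # Occupancy-weighted sum of the adaptation frames, per dimension.
    weighted = [
        sum(g * x[d] for g, x in zip(occupancies, frames)) for d in range(dim)
    ]
    # Interpolate between the prior mean and the adaptation data.
    return [(tau * prior_mean[d] + weighted[d]) / (tau + total) for d in range(dim)]
```

With little adaptation data the result stays close to the prior mean, and with more data it moves toward the mean of the user's own speech, which is the behavior generally expected of MAP adaptation.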
[0072] The natural language processing module 332 ("natural language processor") of the digital assistant 326 takes the sequence of words or tokens ("token sequence") generated by the speech-to-text processing module 330, and attempts to associate the token sequence with one or more "actionable intents" recognized by the digital assistant. As used herein, an "actionable intent" represents a task that can be performed by the digital assistant 326 and/or the digital assistant system 300 (Figure 3A), and has an associated task flow implemented in the task flow models 354. The associated task flow is a series of programmed actions and steps that the digital assistant system 300 takes in order to perform the task. The scope of a digital assistant system's capabilities is dependent on the number and variety of task flows that have been implemented and stored in the task flow models 354, or in other words, on the number and variety of "actionable intents" that the digital assistant system 300 recognizes. The effectiveness of the digital assistant system 300, however, is also dependent on the digital assistant system's ability to deduce the correct "actionable intent(s)" from the user request expressed in natural language.
[0073] In some implementations, in addition to the sequence of words or tokens obtained from the speech-to-text processing module 330, the natural language processor 332 also receives context information associated with the user request (e.g., from the I/O processing module 328). The natural language processor 332 optionally uses the context information to clarify, supplement, and/or further define the information contained in the token sequence received from the speech-to-text processing module 330. The context information includes, for example, user preferences, hardware and/or software states of the user device, sensor information collected before, during, or shortly after the user request, prior interactions (e.g., dialogue) between the digital assistant and the user, and the like.
[0074] In some implementations, the natural language processing is based on an ontology 360. The ontology 360 is a hierarchical structure containing a plurality of nodes, each node representing either an "actionable intent" or a "property" relevant to one or more of the "actionable intents" or other "properties." As noted above, an "actionable intent" represents a task that the digital assistant system 300 is capable of performing (e.g., a task that is "actionable" or can be acted on). A "property" represents a parameter associated with an actionable intent or a sub-aspect of another property. A linkage between an actionable intent node and a property node in the ontology 360 defines how a parameter represented by the property node pertains to the task represented by the actionable intent node.
[0075] In some implementations, the ontology 360 is made up of actionable intent nodes and property nodes. Within the ontology 360, each actionable intent node is linked to one or more property nodes either directly or through one or more intermediate property nodes. Similarly, each property node is linked to one or more actionable intent nodes either directly or through one or more intermediate property nodes. For example, the ontology 360 shown in Figure 3C includes a "restaurant reservation" node, which is an actionable intent node. Property nodes "restaurant," "date/time" (for the reservation), and "party size" are each directly linked to the "restaurant reservation" node (i.e., the actionable intent node). In addition, property nodes "cuisine," "price range," "phone number," and "location" are sub-nodes of the property node "restaurant," and are each linked to the "restaurant reservation" node (i.e., the actionable intent node) through the intermediate property node "restaurant." For another example, the ontology 360 shown in Figure 3C also includes a "set reminder" node, which is another actionable intent node. Property nodes "date/time" (for setting the reminder) and "subject" (for the reminder) are each linked to the "set reminder"
node. Since the property "date/time" is relevant to both the task of making a restaurant reservation and the task of setting a reminder, the property node "date/time" is linked to both the "restaurant reservation" node and the "set reminder" node in the ontology 360.
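By way of illustration only, the portion of the ontology 360 described above could be represented as a small graph in which a shared property node is linked to more than one actionable intent node. The names Node, build_example_ontology, and domain_of are hypothetical and do not appear in the disclosure; the sketch assumes that each link is stored as a parent-to-child reference.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str  # "intent" or "property"
    children: list[Node] = field(default_factory=list)


def build_example_ontology() -> dict[str, Node]:
    """Build the two intent nodes described for Figure 3C."""
    date_time = Node("date/time", "property")  # shared by both intents
    restaurant = Node("restaurant", "property", [
        Node("cuisine", "property"),
        Node("price range", "property"),
        Node("phone number", "property"),
        Node("location", "property"),
    ])
    reservation = Node("restaurant reservation", "intent",
                       [restaurant, date_time, Node("party size", "property")])
    reminder = Node("set reminder", "intent",
                    [date_time, Node("subject", "property")])
    return {n.name: n for n in (reservation, reminder)}


def domain_of(intent: Node) -> set[str]:
    """Names of every node reachable from an intent node, including itself."""
    names = {intent.name}
    stack = list(intent.children)
    while stack:
        node = stack.pop()
        if node.name not in names:
            names.add(node.name)
            stack.extend(node.children)
    return names
```

In this sketch the same date_time object is referenced from both intent nodes, so the two node groups returned by domain_of overlap only in the "date/time" property.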
[0076] An actionable intent node, along with its linked concept nodes, may be described as a "domain." In the present discussion, each domain is associated with a respective actionable intent, and refers to the group of nodes (and the relationships therebetween) associated with the particular actionable intent. For example, the ontology 360 shown in Figure 3C includes an example of a restaurant reservation domain 362 and an example of a reminder domain 364 within the ontology 360. The restaurant reservation domain includes the actionable intent node "restaurant reservation," property nodes
"restaurant," "date/time," and "party size," and sub-property nodes "cuisine," "price range," "phone number," and "location." The reminder domain 364 includes the actionable intent node "set reminder," and property nodes "subject" and "date/time." In some implementations, the ontology 360 is made up of many domains. Each domain may share one or more property nodes with one or more other domains. For example, the "date/time" property node may be associated with many other domains (e.g., a scheduling domain, a travel reservation domain, a movie ticket domain, etc.), in addition to the restaurant reservation domain 362 and the reminder domain 364.
[0077] While Figure 3C illustrates two exemplary domains within the ontology 360, the ontology 360 may include other domains (or actionable intents), such as "initiate a phone call," "find directions," "schedule a meeting," "send a message," "provide an answer to a question," "tag a photo," and so on. For example, a "send a message" domain is associated with a "send a message" actionable intent node, and may further include property nodes such as "recipient(s)," "message type," and "message body." The property node "recipient" may be further defined, for example, by the sub-property nodes such as "recipient name" and "message address."
[0078] In some implementations, the ontology 360 includes all the domains (and hence actionable intents) that the digital assistant is capable of understanding and acting upon. In some implementations, the ontology 360 may be modified, such as by adding or removing domains or nodes, or by modifying relationship between the nodes within the ontology 360.
[0079] In some implementations, nodes associated with multiple related actionable intents may be clustered under a "super domain" in the ontology 360. For example, a "travel" super-domain may include a cluster of property nodes and actionable intent nodes related to travels. The actionable intent nodes related to travels may include "airline reservation," "hotel reservation," "car rental," "get directions," "find points of interest," and so on. The actionable intent nodes under the same super domain (e.g., the "travels" super domain) may have many property nodes in common. For example, the actionable intent nodes for "airline reservation," "hotel reservation," "car rental," "get directions," and "find points of interest" may share one or more of the property nodes "start location," "destination," "departure date/time," "arrival date/time," and "party size."
[0080] In some implementations, each node in the ontology 360 is associated with a set of words and/or phrases that are relevant to the property or actionable intent represented by the node. The respective set of words and/or phrases associated with each node is the so-called "vocabulary" associated with the node. The respective set of words and/or phrases associated with each node can be stored in the vocabulary index 344 (Figure 3B) in association with the property or actionable intent represented by the node. For example, returning to Figure 3B, the vocabulary associated with the node for the property of "restaurant" may include words such as "food," "drinks," "cuisine," "hungry," "eat," "pizza," "fast food," "meal," and so on. For another example, the vocabulary associated with the node for the actionable intent of "initiate a phone call" may include words and phrases such as "call," "phone," "dial," "ring," "call this number," "make a call to," and so on. The vocabulary index 344 optionally includes words and phrases in different languages.
[0081] In some implementations, the natural language processor 332 shown in Figure 3B receives the token sequence (e.g., a text string) from the speech-to-text processing module 330, and determines what nodes are implicated by the words in the token sequence. In some implementations, if a word or phrase in the token sequence is found to be associated with one or more nodes in the ontology 360 (via the vocabulary index 344), the word or phrase will "trigger" or "activate" those nodes. When multiple nodes are "triggered," based on the quantity and/or relative importance of the activated nodes, the natural language processor 332 will select one of the actionable intents as the task (or task type) that the user intended the digital assistant to perform. In some implementations, the domain that has the most "triggered" nodes is selected. In some implementations, the domain having the highest confidence value (e.g., based on the relative importance of its various triggered nodes) is selected. In some implementations, the domain is selected based on a combination of the number and the importance of the triggered nodes. In some implementations, additional factors are considered in selecting the node as well, such as whether the digital assistant system 300 has previously correctly interpreted a similar request from a user.
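By way of illustration only, the triggering of nodes and the selection of a domain could be sketched as follows. The table VOCAB_INDEX, its weights, and the function select_domain are hypothetical; the sketch assumes that each vocabulary entry maps a word to (domain, node, importance) tuples and that the confidence value of a domain is the sum of the importance of its distinct triggered nodes, with the number of triggered nodes used to break ties.

```python
from collections import defaultdict

# Hypothetical vocabulary index: word -> list of (domain, node, importance).
VOCAB_INDEX = {
    "reservation": [("restaurant reservation", "restaurant reservation", 2.0)],
    "sushi": [("restaurant reservation", "cuisine", 1.0)],
    "dinner": [("restaurant reservation", "restaurant", 1.0)],
    "remind": [("set reminder", "set reminder", 2.0)],
    "at": [("restaurant reservation", "date/time", 0.5),
           ("set reminder", "date/time", 0.5)],
}


def select_domain(tokens):
    """Return the domain with the highest confidence, or None if nothing triggers."""
    scores = defaultdict(float)
    triggered = defaultdict(set)
    for token in tokens:
        for domain, node, importance in VOCAB_INDEX.get(token.lower(), []):
            # Count each node once per domain, however often it is triggered.
            if node not in triggered[domain]:
                triggered[domain].add(node)
                scores[domain] += importance
    if not scores:
        return None
    # Highest summed importance wins; ties go to the domain with more nodes.
    return max(scores, key=lambda d: (scores[d], len(triggered[d])))
```

For the token sequence of "Make me a dinner reservation at a sushi place at 7", four restaurant reservation nodes are triggered against one reminder node, so the restaurant reservation domain is selected under these assumed weights.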
[0082] In some implementations, the digital assistant system 300 also stores names of specific entities in the vocabulary index 344, so that when one of these names is detected in the user request, the natural language processor 332 will be able to recognize that the name refers to a specific instance of a property or sub-property in the ontology. In some
implementations, the names of specific entities are names of businesses, restaurants, people, movies, and the like. In some implementations, the digital assistant system 300 can search and identify specific entity names from other data sources, such as the user's address book or contact list, a movies database, a musicians database, and/or a restaurant database. In some implementations, when the natural language processor 332 identifies that a word in the token sequence is a name of a specific entity (such as a name in the user's address book or contact list), that word is given additional significance in selecting the actionable intent within the ontology for the user request.
[0083] For example, when the words "Mr. Santo" are recognized from the user request, and the last name "Santo" is found in the vocabulary index 344 as one of the contacts in the user's contact list, then it is likely that the user request corresponds to a "send a message" or "initiate a phone call" domain. For another example, when the words "ABC Cafe" are found in the user request, and the term "ABC Cafe" is found in the vocabulary index 344 as the name of a particular restaurant in the user's city, then it is likely that the user request corresponds to a "restaurant reservation" domain.
[0084] User data 348 includes user-specific information, such as user-specific vocabulary, user preferences, user address, user's default and secondary languages, user's contact list, and other short-term or long-term information for each user. The natural language processor 332 can use the user-specific information to supplement the information contained in the user input to further define the user intent. For example, for a user request "invite my friends to my birthday party," the natural language processor 332 is able to access user data 348 to determine who the "friends" are and when and where the "birthday party"
would be held, rather than requiring the user to provide such information explicitly in his/her request.
[0085] In some implementations, natural language processor 332 includes categorization module 349. In some implementations, the categorization module 349 determines whether each of the one or more terms in a text string (e.g., corresponding to a speech input associated with a digital photograph) is one of an entity, an activity, or a location, as discussed in greater detail below. In some implementations, the categorization module 349 classifies each term of the one or more terms as one of an entity, an activity, or a location.
[0086] Once the natural language processor 332 identifies an actionable intent (or domain) based on the user request, the natural language processor 332 generates a structured query to represent the identified actionable intent. In some implementations, the structured query includes parameters for one or more nodes within the domain for the actionable intent, and at least some of the parameters are populated with the specific information and requirements specified in the user request. For example, the user may say "Make me a dinner reservation at a sushi place at 7." In this case, the natural language processor 332 may be able to correctly identify the actionable intent to be "restaurant reservation" based on the user input. According to the ontology, a structured query for a "restaurant reservation" domain may include parameters such as {Cuisine}, {Time}, {Date}, {Party Size}, and the like. Based on the information contained in the user's utterance, the natural language processor 332 may generate a partial structured query for the restaurant reservation domain, where the partial structured query includes the parameters {Cuisine = "Sushi"} and {Time = "7pm"}.
However, in this example, the user's utterance contains insufficient information to complete the structured query associated with the domain. Therefore, other necessary parameters such as {Party Size} and {Date} are not specified in the structured query based on the information currently available. In some implementations, the natural language processor 332 populates some parameters of the structured query with received context information. For example, if the user requested a sushi restaurant "near me," the natural language processor 332 may populate a {location} parameter in the structured query with GPS coordinates from the user device 104.
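By way of illustration only, the generation of a partial structured query from the utterance in the example above could be sketched as follows. The names REQUIRED, KNOWN_CUISINES, build_partial_query, and missing_parameters are hypothetical, and the keyword and pattern matching shown is a deliberately simple stand-in for the natural language processing described in the disclosure. The sketch also assumes that a bare hour in a reservation request refers to the evening.

```python
import re

# Hypothetical parameter names for the "restaurant reservation" domain.
REQUIRED = ("cuisine", "time", "date", "party_size")
KNOWN_CUISINES = {"sushi", "pizza", "thai"}  # illustrative vocabulary only


def build_partial_query(utterance):
    """Populate only the parameters that the utterance actually specifies."""
    text = utterance.lower()
    query = {"intent": "restaurant reservation"}
    for word in re.findall(r"[a-z0-9]+", text):
        if word in KNOWN_CUISINES:
            query["cuisine"] = word
    match = re.search(r"\bat (\d{1,2})\b", text)
    if match:
        # Assumption: a bare hour means an evening reservation.
        query["time"] = f"{match.group(1)}pm"
    return query


def missing_parameters(query):
    """List the required parameters that are still unspecified."""
    return [p for p in REQUIRED if p not in query]
```

For the utterance "Make me a dinner reservation at a sushi place at 7", the sketch fills the cuisine and time parameters and reports the date and party size as missing, which matches the partial structured query described above.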
[0087] In some implementations, the natural language processor 332 passes the structured query (including any completed parameters) to the task flow processing module 336 ("task flow processor"). The task flow processor 336 is configured to perform one or more of: receiving the structured query from the natural language processor 332, completing the structured query, and performing the actions required to "complete" the user's ultimate request. In some implementations, the various procedures necessary to complete these tasks are provided in task flow models 354. In some implementations, the task flow models 354 include procedures for obtaining additional information from the user, and task flows for performing actions associated with the actionable intent.
[0088] As described above, in order to complete a structured query, the task flow processor 336 may need to initiate additional dialogue with the user in order to obtain additional information, and/or disambiguate potentially ambiguous utterances. When such interactions are necessary, the task flow processor 336 invokes the dialogue processing module 334 ("dialogue processor") to engage in a dialogue with the user. In some
implementations, the dialogue processing module 334 determines how (and/or when) to ask the user for the additional information, and receives and processes the user responses. In some implementations, the questions are provided to and answers are received from the users through the I/O processing module 328. For example, the dialogue processing module 334 presents dialogue output to the user via audio and/or visual output, and receives input from the user via spoken or physical (e.g., touch gesture) responses. Continuing with the example above, when the task flow processor 336 invokes the dialogue processor 334 to determine the "party size" and "date" information for the structured query associated with the domain "restaurant reservation," the dialogue processor 334 generates questions such as "For how many people?" and "On which day?" to pass to the user. Once answers are received from the user, the dialogue processing module 334 populates the structured query with the missing information, or passes the information to the task flow processor 336 to complete the missing information from the structured query.
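By way of illustration only, the completion of a structured query through follow-up questions could be sketched as follows. The names PROMPTS and complete_query are hypothetical, and the ask callable stands in for whatever audio or visual channel the I/O processing module 328 provides; none of these names is taken from the disclosure.

```python
# Hypothetical mapping from a missing parameter to the question to put to the user.
PROMPTS = {
    "party_size": "For how many people?",
    "date": "On which day?",
}


def complete_query(query, required, ask):
    """Return a copy of query with every required parameter filled in.

    ask is a callable that takes a question string and returns the user's answer.
    """
    completed = dict(query)
    for param in required:
        if param not in completed:
            question = PROMPTS.get(param, f"Please provide {param}.")
            completed[param] = ask(question)
    return completed
```

Because the dialogue channel is passed in as a callable, the same loop can be driven by spoken input, touch input, or a scripted test, and parameters already present in the query are never asked for again.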
[0089] In some cases, the task flow processor 336 may receive a structured query that has one or more ambiguous properties. For example, a structured query for the "send a message" domain may indicate that the intended recipient is "Bob," and the user may have multiple contacts named "Bob." The task flow processor 336 will request that the dialogue processor 334 disambiguate this property of the structured query. In turn, the dialogue processor 334 may ask the user "Which Bob?", and display (or read) a list of contacts named "Bob" from which the user may choose.
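By way of illustration only, the "Which Bob?" exchange could be sketched as follows. The function resolve_recipient and the choose callable are hypothetical; the sketch assumes that contacts are stored as full-name strings and that matching is done on the first name only.

```python
def resolve_recipient(name, contacts, choose):
    """Resolve a first name to one contact, asking the user only when ambiguous.

    choose is a callable taking (question, candidates) and returning one candidate.
    """
    matches = [c for c in contacts if c.split()[0].lower() == name.lower()]
    if not matches:
        return None          # no such contact
    if len(matches) == 1:
        return matches[0]    # unambiguous, so no dialogue is needed
    return choose(f"Which {name}?", matches)
```

The user is prompted only when more than one contact matches, so an unambiguous request proceeds without additional dialogue.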
[0090] In some implementations, dialogue processor 334 includes disambiguation module 350. In some implementations, disambiguation module 350 disambiguates one or more ambiguous terms (e.g., one or more ambiguous terms in a text string corresponding to a speech input associated with a digital photograph). In some implementations, disambiguation module 350 identifies that a first term of the one or more terms has multiple candidate meanings, prompts a user for additional information about the first term, receives the additional information from the user in response to the prompt, and identifies the entity, activity, or location associated with the first term in accordance with the additional information.
[0091] In some implementations, disambiguation module 350 disambiguates pronouns. In such implementations, disambiguation module 350 identifies one of the one or more terms as a pronoun and determines a noun to which the pronoun refers. In some implementations, disambiguation module 350 determines a noun to which the pronoun refers by using a contact list associated with a user of the electronic device. Alternatively, or in addition, disambiguation module 350 determines a noun to which the pronoun refers as a name of an entity, an activity, or a location identified in a previous speech input associated with a previously tagged digital photograph. Alternatively, or in addition, disambiguation module 350 determines a noun to which the pronoun refers as a name of a person identified based on a previous speech input associated with a previously tagged digital photograph.
[0092] In some implementations, disambiguation module 350 accesses information obtained from one or more sensors (e.g., proximity sensor 214, light sensor 212, GPS receiver 213, temperature sensor 215, and motion sensor 210) of a handheld electronic device (e.g., user device 104) for determining a meaning of one or more of the terms. In some implementations, disambiguation module 350 identifies two terms each associated with one of an entity, an activity, or a location. For example, a first of the two terms refers to a person, and a second of the two terms refers to a location. In some implementations, disambiguation module 350 identifies three terms each associated with one of an entity, an activity, or a location.
[0093] Once the task flow processor 336 has completed the structured query for an actionable intent, the task flow processor 336 proceeds to perform the ultimate task associated with the actionable intent. Accordingly, the task flow processor 336 executes the steps and instructions in the task flow model according to the specific parameters contained in the structured query. For example, the task flow model for the actionable intent of "restaurant reservation" may include steps and instructions for contacting a restaurant and actually requesting a reservation for a particular party size at a particular time. For example, using a structured query such as: {restaurant reservation, restaurant = ABC Cafe, date = 3/12/2012, time = 7pm, party size = 5}, the task flow processor 336 may perform the steps of: (1) logging onto a server of the ABC Cafe or a restaurant reservation system that is configured to accept reservations for multiple restaurants, such as the ABC Cafe, (2) entering the date, time, and party size information in a form on the website, (3) submitting the form, and (4) making a calendar entry for the reservation in the user's calendar. In another example, described in greater detail below, the task flow processor 336 executes steps and instructions associated with tagging or searching for digital photographs in response to a voice input, e.g., in conjunction with photo module 132.
[0094] In some implementations, the task flow processor 336 employs the assistance of a service processing module 338 ("service processor") to complete a task requested in the user input or to provide an informational answer requested in the user input. For example, the service processor 338 can act on behalf of the task flow processor 336 to make a phone call, set a calendar entry, invoke a map search, invoke or interact with other user applications installed on the user device, and invoke or interact with third party services (e.g., a restaurant reservation portal, a social networking website or service, a banking portal, etc.). In some implementations, the protocols and application programming interfaces (API) required by each service can be specified by a respective service model among the service models 356. The service processor 338 accesses the appropriate service model for a service and generates requests for the service in accordance with the protocols and APIs required by the service according to the service model.
[0095] For example, if a restaurant has enabled an online reservation service, the restaurant can submit a service model specifying the necessary parameters for making a reservation and the APIs for communicating the values of the necessary parameters to the online reservation service. When requested by the task flow processor 336, the service processor 338 can establish a network connection with the online reservation service using the web address stored in the service models 356, and send the necessary parameters of the reservation (e.g., time, date, party size) to the online reservation interface in a format according to the API of the online reservation service.
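By way of illustration only, the use of a service model to generate a request could be sketched as follows. The dictionary SERVICE_MODEL, its field names, the example.com web address, and the function build_service_request are all hypothetical; the disclosure does not specify a service model format. The sketch only builds the request description and does not open a network connection.

```python
import json

# Hypothetical service model submitted by a restaurant's online reservation service.
SERVICE_MODEL = {
    "service": "ABC Cafe reservations",
    "endpoint": "https://reservations.example.com/api/book",  # placeholder address
    "required": ["date", "time", "party_size"],
}


def build_service_request(model, query):
    """Build a request carrying exactly the parameters the service model requires."""
    missing = [p for p in model["required"] if p not in query]
    if missing:
        raise ValueError(f"missing parameters: {missing}")
    # Parameters the service does not ask for (e.g., cuisine) are left out.
    payload = {p: query[p] for p in model["required"]}
    return {
        "url": model["endpoint"],
        "method": "POST",
        "body": json.dumps(payload, sort_keys=True),
    }
```

Keeping the required parameters and the web address in the service model, rather than in the code, means that a new service can be supported in this sketch by supplying a new model.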
[0096] In some implementations, the natural language processor 332, dialogue processor 334, and task flow processor 336 are used collectively and iteratively to deduce and define the user's intent, obtain information to further clarify and refine the user intent, and finally generate a response (e.g., provide an output to the user, or complete a task) to fulfill the user's intent.
[0097] In some implementations, after all of the tasks needed to fulfill the user's request have been performed, the digital assistant 326 formulates a confirmation response, and sends the response back to the user through the I/O processing module 328. If the user request seeks an informational answer, the confirmation response presents the requested information to the user. In some implementations, the digital assistant also requests the user to indicate whether the user is satisfied with the response produced by the digital assistant 326.
[0098] Attention is now directed to Figure 4, which is a block diagram illustrating components of an audio subsystem 226 and a voice trigger system 400, in accordance with some implementations. (The voice trigger system 400 is not limited to voice, and implementations described herein apply equally to non-voice sounds.) The audio subsystem 226 and the voice trigger system 400 are composed of various components, modules, and/or software programs within the electronic device 104.
[0099] In some implementations, the audio subsystem 226 includes a baseband subsystem 412, an application processor 418, a codec 410, and a buffer 414. In some implementations, more or fewer of these modules are used. The baseband subsystem 412, application processor 418, codec 410, and buffer 414 may be referred to as modules, and may include hardware (e.g., circuitry, memory, processors, etc.), software (e.g., programs, software-on-a-chip, firmware, etc.), and/or any combinations thereof for performing the functionality described herein. In some implementations, the codec 410 includes an analog to digital converter (ADC) and a digital to analog converter (DAC). In some implementations, the audio subsystem 226 is coupled to one or more microphones 230 (Figure 2) and one or more speakers 228 (Figure 2). In some implementations, the baseband subsystem 412, application processor 418, codec 410, and buffer 414 are connected using an Integrated Interchip Sound (I²S) interface. In some implementations, the baseband subsystem 412, application processor 418, codec 410, and buffer 414 are connected using a high-speed interchip (HSIC) interface. In some implementations, the audio subsystem 226 is coupled to an external audio system 416 that includes at least one microphone 418 and at least one speaker 420. The audio subsystem 226 provides sound inputs to the voice trigger system 400 (as well as other components or modules, such as a phone and/or communication(s) subsystem of a phone) for processing and/or analysis. In some implementations, the baseband subsystem is not a component of audio subsystem 226. In some implementations, the baseband subsystem is a component of communications subsystem 220.
[0100] Privacy is a concern for many users. Therefore, in some implementations, the baseband unit (e.g., baseband subsystem 412 in Figure 4) has a per-device privacy key and tapping into the baseband unit to obtain a call audio signal does not introduce any vulnerability into the system. In some implementations, the adaptive speech recognition model is also encrypted to preserve privacy. In some implementations, the outbound audio channel of the first mobile communication device is encrypted and only authorized systems can tap into the outbound audio channel. Thus, unauthorized persons cannot tap into the outbound audio channel in a similar/analogous manner. Also, in some implementations, the adaptive speech recognition model does not include data which could be used to reconstruct a user's call audio. In other words, in these implementations, obtaining training data from a user's calls does not compromise the user's privacy during the calls. Thus, in some implementations, the data yielded from phone conversations need not be saved or transmitted to a server, thereby avoiding privacy issues.
[0101] In some implementations, baseband subsystem 412 includes an audio digital signal processor (DSP) 416. In some implementations, the audio digital signal processor 416 is included within the application processor 418. In some implementations, the audio digital signal processor (DSP) 416 is included within the codec 410. In some implementations, the audio digital signal processor (DSP) 416 is a standalone module within the audio subsystem 226. In some implementations, application processor 418 includes an embedded recognition engine. In some of these implementations, the embedded recognition engine is used to align sound units for updating the adaptive speech recognition model. In some implementations, application processor 418 corresponds to one or more processor(s) 204.
[0102] In some implementations, the voice trigger system 400 includes a noise detector 402, a sound-type detector 404, a trigger sound detector 406, a speech-based service 408, and an audio subsystem 226, each coupled to an audio bus 401. In some implementations, more or fewer of these modules are used. The sound detectors 402, 404, and 406 may be referred to as modules, and may include hardware (e.g., circuitry, memory, processors, etc.), software (e.g., programs, software-on-a-chip, firmware, etc.), and/or any combinations thereof for performing the functionality described herein. In some implementations, the sound detectors are communicatively, programmatically, physically, and/or operationally coupled to one another (e.g., via a communications bus), as illustrated in Figure 4 by the broken lines. (For ease of illustration, Figure 4 shows each sound detector coupled only to adjacent sound detectors. It will be understood that each sound detector can be coupled to any of the other sound detectors as well.)
[0103] In some implementations, the speech-based service 408 is a voice-based digital assistant, and corresponds to one or more components or functionalities of the digital assistant system described above with reference to Figures 1-3C. In some implementations, the speech-based service is a speech-to-text service, a dictation service, or the like.
[0104] In some implementations, the noise detector 402 monitors an audio channel to determine whether a sound input from the audio subsystem 226 satisfies a predetermined condition, such as an amplitude threshold. The audio channel corresponds to a stream of audio information received by one or more sound pickup devices, such as the one or more microphones 230 (Figure 2). The audio channel refers to the audio information regardless of its state of processing or the particular hardware that is processing and/or transmitting the audio information. For example, the audio channel may refer to analog electrical impulses (and/or the circuits on which they are propagated) from the microphone 230, as well as a digitally encoded audio stream resulting from processing of the analog electrical impulses (e.g., by the audio subsystem 226 and/or any other audio processing system of the electronic device 104).
[0105] In some implementations, the predetermined condition is whether the sound input is above a certain volume for a predetermined amount of time. In some implementations, the noise detector uses time-domain analysis of the sound input, which requires relatively little computational and battery resources as compared to other types of analysis (e.g., as performed by the sound-type detector 404, the trigger word detector 406, and/or the speech-based service 408). In some implementations, other types of signal processing and/or audio analysis are used, including, for example, frequency-domain analysis. If the noise detector 402 determines that the sound input satisfies the predetermined condition, it initiates an upstream sound detector, such as the sound-type detector 404 (e.g., by providing a control signal to initiate one or more processing routines, and/or by providing power to the upstream sound detector). In some implementations, the upstream sound detector is initiated in response to other conditions being satisfied. For example, in some implementations, the upstream sound detector is initiated in response to determining that the device is not being stored in an enclosed space (e.g., based on a light detector detecting a threshold level of light).
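A time-domain check of the kind described for the noise detector (sound above a certain volume for a predetermined amount of time) might be sketched as follows. The frame-based peak measurement, the function names, and all parameter values are assumptions made for illustration; the text does not specify how the amplitude is measured.

```python
def frame_peaks(samples, frame_len):
    """Peak absolute amplitude of each full frame of frame_len samples."""
    return [
        max(abs(s) for s in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def noise_condition_met(samples, frame_len, amplitude_threshold, min_loud_frames):
    """True if at least min_loud_frames consecutive frames peak above the threshold."""
    run = 0
    for peak in frame_peaks(samples, frame_len):
        run = run + 1 if peak > amplitude_threshold else 0
        if run >= min_loud_frames:
            return True
    return False
```

Measuring a peak per frame rather than testing each sample avoids resetting the count every time an oscillating signal crosses zero, while still needing only comparisons and a counter.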
[0106] The sound-type detector 404 monitors the audio channel to determine whether a sound input corresponds to a certain type of sound, such as sound that is characteristic of a human voice, whistle, clap, etc. The type of sound that the sound-type detector 404 is configured to recognize will correspond to the particular trigger sound(s) that the voice trigger is configured to recognize. In implementations where the trigger sound is a spoken word or phrase, the sound-type detector 404 includes a "voice activity detector" (VAD). In some implementations, the sound-type detector 404 uses frequency-domain analysis of the sound input. For example, the sound-type detector 404 generates a spectrogram of a received sound input (e.g., using a Fourier transform), and analyzes the spectral components of the sound input to determine whether the sound input is likely to correspond to a particular type or category of sounds (e.g., human speech). Thus, in implementations where the trigger sound is a spoken word or phrase, if the audio channel is picking up ambient sound (e.g., traffic noise) but not human speech, the VAD will not initiate the trigger sound detector 406.
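One simple frequency-domain test of the kind described above can be sketched as follows: take the spectrum of a single frame (a spectrogram stacks such spectra over time) and ask what fraction of its energy lies in a voice-like band. The 60 to 3,000 Hz band follows the range given elsewhere in this description for the sound-type detector; the function names, the naive transform, and the 0.8 ratio are illustrative assumptions, and a practical voice activity detector would use considerably more than this one feature.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power for bins 0..N/2 (adequate for a short illustrative frame)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))) ** 2
        for k in range(n // 2 + 1)
    ]


def band_energy_ratio(frame, sample_rate_hz, low_hz, high_hz):
    """Fraction of spectral energy that falls inside [low_hz, high_hz]."""
    spectrum = power_spectrum(frame)
    bin_hz = sample_rate_hz / len(frame)
    total = sum(spectrum)
    if total == 0:
        return 0.0
    in_band = sum(p for k, p in enumerate(spectrum)
                  if low_hz <= k * bin_hz <= high_hz)
    return in_band / total


def looks_like_voice(frame, sample_rate_hz, min_ratio=0.8):
    """Assumed heuristic: most of the energy lies in the 60-3,000 Hz band."""
    return band_energy_ratio(frame, sample_rate_hz, 60.0, 3000.0) >= min_ratio
```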
[0107] In some implementations, the sound-type detector 404 remains active for as long as predetermined conditions of any downstream sound detector (e.g., the noise detector 402) are satisfied. For example, in some implementations, the sound-type detector 404 remains active as long as the sound input includes sound above a predetermined amplitude threshold (as determined by the noise detector 402), and is deactivated when the sound drops below the predetermined threshold. In some implementations, once initiated, the sound-type detector 404 remains active until a condition is met, such as the expiration of a timer (e.g., for 1, 2, 5, or 10 seconds, or any other appropriate duration), the expiration of a certain number of on/off cycles of the sound-type detector 404, or the occurrence of an event (e.g., the amplitude of the sound falls below a second threshold, as determined by the noise detector 402 and/or the sound-type detector 404).
[0108] As mentioned above, if the sound-type detector 404 determines that the sound input corresponds to a predetermined type of sound, it initiates an upstream sound detector (e.g., by providing a control signal to initiate one or more processing routines, and/or by providing power to the upstream sound detector), such as the trigger sound detector 406.
[0109] The trigger sound detector 406 is configured to determine whether a sound input includes at least part of certain predetermined content (e.g., at least part of the trigger word, phrase, or sound). In some implementations, the trigger sound detector 406 compares a representation of the sound input (an "input representation") to one or more reference representations of the trigger word. If the input representation matches at least one of the one or more reference representations with an acceptable confidence, the trigger sound detector 406 initiates the speech-based service 408 (e.g., by providing a control signal to initiate one or more processing routines, and/or by providing power to the upstream sound detector). In some implementations, the input representation and the one or more reference representations are spectrograms (or mathematical representations thereof), which represent how the spectral density of a signal varies with time. In some implementations, the representations are other types of audio signatures or voiceprints. In some implementations, initiating the speech-based service 408 includes bringing one or more circuits, programs, and/or processors out of a standby mode, and invoking the sound-based service. The sound-based service is then ready to provide more comprehensive speech recognition, speech-to-text processing, and/or natural language processing.
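The comparison of an input representation against one or more reference representations can be sketched as follows. Cosine similarity over flattened feature vectors and the 0.9 confidence value are stand-ins chosen for illustration; the description does not say which similarity measure or threshold the trigger sound detector uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def matches_trigger(input_repr, reference_reprs, min_confidence=0.9):
    """True if the input matches at least one reference with enough confidence."""
    return any(
        cosine_similarity(input_repr, ref) >= min_confidence
        for ref in reference_reprs
    )
```

Accepting a match against any one of several references allows, under this assumption, the same trigger to be stored in more than one recorded variant.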
[0110] In some implementations, the voice-trigger system 400 includes voice authentication functionality, so that it can determine if a sound input corresponds to a voice of a particular person, such as an owner/user of the device. For example, in some implementations, the sound-type detector 404 uses a voiceprinting technique to determine that the sound input was uttered by an authorized user. Voice authentication and voiceprinting are described in more detail in U.S. Patent Application No. 13/053,144, assigned to the assignee of the instant application, which is hereby incorporated by reference in its entirety. In some implementations, voice authentication is included in any of the sound detectors described herein (e.g., the noise detector 402, the sound-type detector 404, the trigger sound detector 406, and/or the speech-based service 408). In some implementations, voice authentication is implemented as a separate module from the sound detectors listed above (e.g., as voice authentication module 426, Figure 4), and may be operationally positioned after the noise detector 402, after the sound-type detector 404, after the trigger sound detector 406, or at any other appropriate position.
[0111] In some implementations, the trigger sound detector 406 remains active for as long as conditions of any downstream sound detector(s) (e.g., the noise detector 402 and/or the sound-type detector 404) are satisfied. For example, in some implementations, the trigger sound detector 406 remains active as long as the sound input includes sound above a predetermined threshold (as detected by the noise detector 402). In some implementations, it remains active as long as the sound input includes sound of a certain type (as detected by the sound-type detector 404). In some implementations, it remains active as long as both of the foregoing conditions are met.
[0112] In some implementations, once initiated, the trigger sound detector 406 remains active until a condition is met, such as the expiration of a timer (e.g., for 1, 2, 5, or 10 seconds, or any other appropriate duration), the expiration of a certain number of on/off cycles of the trigger sound detector 406, or the occurrence of an event (e.g., the amplitude of the sound falls below a second threshold).
[0113] In some implementations, when one sound detector initiates another detector, both sound detectors remain active. However, the sound detectors may be active or inactive at various times, and it is not necessary that all of the downstream (e.g., the lower power and/or sophistication) sound detectors be active (or that their respective conditions are met) in order for upstream sound detectors to be active. For example, in some implementations, after the noise detector 402 and the sound-type detector 404 determine that their respective conditions are met, and the trigger sound detector 406 is initiated, one or both of the noise detector 402 and the sound-type detector 404 are deactivated and/or enter a standby mode while the trigger sound detector 406 operates. In other implementations, both the noise detector 402 and the sound-type detector 404 (or one or the other) stay active while the trigger sound detector 406 operates. In various implementations, different combinations of the sound detectors are active at different times, and whether one is active or inactive may depend on the state of other sound detectors, or may be independent of the state of other sound detectors.
[0114] While Figure 4 describes three separate sound detectors, each configured to detect different aspects of a sound input, more or fewer sound detectors are used in various implementations of the voice trigger. For example, in some implementations, only the trigger sound detector 406 is used. In some implementations, the trigger sound detector 406 is used in conjunction with either the noise detector 402 or the sound-type detector 404. In some implementations, all of the detectors 402-406 are used. In some implementations, additional sound detectors are included as well.
[0115] Moreover, different combinations of sound detectors may be used at different times. For example, the particular combination of sound detectors and how they interact may depend on one or more conditions, such as the context or operating state of a device. As a specific example, if a device is plugged in (and thus not relying exclusively on battery power), the trigger sound detector 406 is active, while the noise detector 402 and the sound-type detector 404 remain inactive. In another example, if the device is in a pocket or backpack, all sound detectors are inactive.
[0116] By cascading sound detectors as described above, where the detectors that require more power are invoked only when necessary by detectors that require lower power, power efficient voice triggering functionality can be provided. As described above, additional power efficiency is achieved by operating one or more of the sound detectors according to a duty cycle. For example, in some implementations, the noise detector 402 operates according to a duty cycle so that it performs effectively continuous noise detection, even though the noise detector is off for at least part of the time. In some implementations, the noise detector 402 is on for 10 milliseconds and off for 90 milliseconds. In some implementations, the noise detector 402 is on for 20 milliseconds and off for 500 milliseconds. Other on and off durations are also possible.
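The cascade and the duty-cycle arithmetic described above can be sketched as follows. The function names and the stand-in predicates are hypothetical; each predicate merely marks the place where a real detector would run.

```python
def run_cascade(sound_input, stages):
    """Run detectors from cheapest to costliest; stop at the first that rejects.

    stages is a list of (name, predicate) pairs. Returns the names of the
    stages that were actually invoked, and whether all of them accepted.
    """
    invoked = []
    for name, accepts in stages:
        invoked.append(name)
        if not accepts(sound_input):
            return invoked, False
    return invoked, True


def duty_fraction(on_ms, off_ms):
    """Fraction of time a duty-cycled detector is on."""
    return on_ms / (on_ms + off_ms)


# Stand-in predicates over a dict describing the sound input (illustrative).
stages = [
    ("noise", lambda s: s["amplitude"] > 0.1),
    ("sound_type", lambda s: s["is_voice"]),
    ("trigger", lambda s: s["has_trigger"]),
]
```

With the intervals given in the text, a detector that is on for 10 milliseconds and off for 90 milliseconds is on 10% of the time, and one that is on for 20 milliseconds and off for 500 milliseconds is on for roughly 3.8% of the time.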
[0117] In some implementations, if the noise detector 402 detects a noise during its "on" interval, the noise detector 402 will remain on in order to further process and/or analyze the sound input. For example, the noise detector 402 may be configured to initiate an upstream sound detector if it detects sound above a predetermined amplitude for a predetermined amount of time (e.g., 100 milliseconds). Thus, if the noise detector 402 detects sound above a predetermined amplitude during its 10 millisecond "on" interval, it will not immediately enter the "off" interval. Instead, the noise detector 402 remains active and continues to process the sound input to determine whether it exceeds the threshold for the full predetermined duration (e.g., 100 milliseconds).
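The behavior described above, in which a duty-cycled detector stays on once it hears something during its "on" interval, can be simulated with a short sketch. Each list element stands for 10 milliseconds of audio and is `True` when that frame is above the amplitude threshold. The function name, the frame granularity, and the choice to resume the cycle at the "off" interval after a quiet frame are assumptions for illustration.

```python
def first_trigger_frame(loud_frames, on_frames=1, off_frames=9, required_frames=10):
    """Return the index of the frame at which the detector would initiate the
    next detector, or None. Each frame stands for 10 ms of audio."""
    period = on_frames + off_frames
    phase = 0      # position within the current on/off period
    loud_run = 0   # consecutive loud frames observed while awake
    for index, loud in enumerate(loud_frames):
        awake = loud_run > 0 or phase < on_frames
        if awake and loud:
            loud_run += 1
            if loud_run >= required_frames:
                return index
            continue  # stay on: do not advance toward the "off" interval
        # Quiet frame, or detector asleep: drop any run and advance the cycle.
        loud_run = 0
        phase = (phase + 1) % period
    return None
```

With the defaults (10 milliseconds on, 90 milliseconds off, 100 milliseconds required), sound that begins while the detector is off is picked up at the next "on" interval, so the duty cycle trades a bounded detection delay for lower power.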
[0118] In some implementations, the sound-type detector 404 operates according to a duty cycle. In some implementations, the sound-type detector 404 is on for 20 milliseconds and off for 100 milliseconds. Other on and off durations are also possible. In some implementations, the sound-type detector 404 is able to determine whether a sound input corresponds to a predetermined type of sound within the "on" interval of its duty cycle. Thus, the sound-type detector 404 will initiate the trigger sound detector 406 (or any other upstream sound detector) if the sound-type detector 404 determines, during its "on" interval, that the sound is of a certain type. Alternatively, in some implementations, if the sound-type detector 404 detects, during the "on" interval, sound that may correspond to the predetermined type, the detector will not immediately enter the "off" interval. Instead, the sound-type detector 404 remains active and continues to process the sound input and determine whether it corresponds to the predetermined type of sound. In some implementations, if the sound detector determines that the predetermined type of sound has been detected, it initiates the trigger sound detector 406 to further process the sound input and determine if the trigger sound has been detected.
[0119] Similar to the noise detector 402 and the sound-type detector 404, in some implementations, the trigger sound detector 406 operates according to a duty cycle. In some implementations, the trigger sound detector 406 is on for 50 milliseconds and off for 50 milliseconds. Other on and off durations are also possible. If the trigger sound detector 406 detects, during its "on" interval, that there is sound that may correspond to a trigger sound, the detector will not immediately enter the "off" interval. Instead, the trigger sound detector 406 remains active and continues to process the sound input and determine whether it includes the trigger sound. In some implementations, if such a sound is detected, the trigger sound detector 406 remains active to process the audio for a predetermined duration, such as 1, 2, 5, or 10 seconds, or any other appropriate duration. In some implementations, the duration is selected based on the length of the particular trigger word or sound that it is configured to detect. For example, if the trigger phrase is "Hey, SIRI," the trigger word detector is operated for about 2 seconds to determine whether the sound input includes that phrase.
[0120] In some implementations, some of the sound detectors are operated according to a duty cycle, while others operate continuously when active. For example, in some implementations, only the first sound detector is operated according to a duty cycle (e.g., the noise detector 402 in Figure 4), and upstream sound detectors are operated continuously once they are initiated. In some other implementations, the noise detector 402 and the sound-type detector 404 are operated according to a duty cycle, while the trigger sound detector 406 is operated continuously. Whether a particular sound detector is operated continuously or according to a duty cycle depends on one or more conditions, such as the context or operating state of a device. In some implementations, if a device is plugged in and not relying exclusively on battery power, all of the sound detectors operate continuously once they are initiated. In other implementations, the noise detector 402 (or any of the sound detectors) operates according to a duty cycle if the device is in a pocket or backpack (e.g., as determined by sensor and/or microphone signals), but operates continuously when it is determined that the device is likely not being stored. In some implementations, whether a particular sound detector is operated continuously or according to a duty cycle depends on the battery charge level of the device. For example, the noise detector 402 operates continuously when the battery charge is above 50%, and operates according to a duty cycle when the battery charge is below 50%.
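The power-related conditions in this paragraph can be combined into a small policy function, sketched below. The function name and return values are hypothetical, and the text does not say what happens at exactly 50% charge, so the sketch makes an explicit assumption there.

```python
def noise_detector_mode(battery_percent, plugged_in=False):
    """Return "continuous" or "duty_cycle" for the noise detector."""
    if plugged_in:
        # Not relying exclusively on battery power.
        return "continuous"
    # Assumption: the text leaves exactly 50% unspecified; treat it as low.
    return "continuous" if battery_percent > 50 else "duty_cycle"
```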
[0121] In some implementations, the voice trigger includes noise, echo, and/or sound cancellation functionality (referred to collectively as noise cancellation). In some implementations, noise cancellation is performed by the audio subsystem 226 (e.g., by the audio DSP 416). Noise cancellation reduces or removes unwanted noise or sounds from the sound input prior to it being processed by the sound detectors. In some cases, the unwanted noise is background noise from the user's environment, such as a fan or the clicking from a keyboard. In some implementations, the unwanted noise is any sound above, below, or at predetermined amplitudes or frequencies. For example, in some implementations, sound above the typical human vocal range (e.g., 3,000 Hz) is filtered out or removed from the signal. In some implementations, multiple microphones (e.g., the microphones 230) are used to help determine what components of received sound should be reduced and/or removed. For example, in some implementations, the audio subsystem 226 uses beam forming techniques to identify sounds or portions of sound inputs that appear to originate from a single point in space (e.g., a user's mouth). The audio subsystem 226 then focuses on this sound by removing from the sound input sounds that are received equally by all microphones (e.g., ambient sound that does not appear to originate from any particular direction).
[0122] In some implementations, the DSP 416 is configured to cancel or remove from the sound input sounds that are being output by the device on which the digital assistant is operating. For example, if the audio subsystem 226 is outputting music, radio, a podcast, a voice output, or any other audio content (e.g., via the speaker 228), the DSP 416 removes any of the outputted sound that was picked up by a microphone and included in the sound input. Thus, the sound input is free of the outputted audio (or at least contains less of the outputted audio). Accordingly, the sound input that is provided to the sound detectors will be cleaner, and the triggers more accurate. Aspects of noise cancellation are described in more detail in U.S. Patent No. 7,272,224, assigned to the assignee of the instant application, which is hereby incorporated by reference in its entirety.
[0123] In some implementations, different sound detectors require that the sound input be filtered and/or preprocessed in different ways. For example, in some implementations, the noise detector 402 is configured to analyze a time-domain audio signal between 60 and 20,000 Hz, and the sound-type detector is configured to perform frequency-domain analysis of audio between 60 and 3,000 Hz. Thus, in some implementations, the audio DSP 416 (and/or other audio DSPs of the device 104) preprocesses received audio according to the respective needs of the sound detectors. In some implementations, on the other hand, the sound detectors are configured to filter and/or preprocess the audio from the audio subsystem 226 according to their specific needs. In such cases, the audio DSP 416 may still perform noise cancellation prior to providing the sound input to the sound detectors.
[0124] In some implementations, the context of the electronic device is used to help determine whether and how to operate the voice trigger. For example, it may be unlikely that users will invoke a speech-based service, such as a voice-based digital assistant, when the device is stored in their pocket, purse, or backpack. Also, it may be unlikely that users will invoke a speech-based service when they are at a loud rock concert. For some users, it is unlikely that they will invoke a speech-based service at certain times of the day (e.g., late at night). On the other hand, there are also contexts in which it is more likely that a user will invoke a speech-based service using a voice trigger. For example, some users will be more likely to use a voice trigger when they are driving, when they are alone, when they are at work, or the like. Various techniques are used to determine the context of a device. In various implementations, the device uses information from any one or more of the following components or information sources to determine the context of a device: GPS receivers, light sensors, microphones, proximity sensors, orientation sensors, inertial sensors, cameras, communications circuitry and/or antennas, charging and/or power circuitry, switch positions, temperature sensors, compasses, accelerometers, calendars, user preferences, etc.
[0125] The context of the device can then be used to adjust how and whether the voice trigger operates. For example, in certain contexts, the voice trigger will be deactivated (or operated in a different mode) as long as that context is maintained. For example, in some implementations, the voice trigger is deactivated when the phone is in a predetermined orientation (e.g., lying face-down on a surface), during predetermined time periods (e.g., between 10:00 PM and 8:00 AM), when the phone is in a "silent" or a "do not disturb" mode (e.g., based on a switch position, mode setting, or user preference), when the device is in a substantially enclosed space (e.g., a pocket, bag, purse, drawer, or glove box), when the device is near other devices that have a voice trigger and/or speech-based services (e.g., based on proximity sensors, acoustic/wireless/infrared communications), and the like. In some implementations, instead of being deactivated, the voice trigger system 400 is operated in a low-power mode (e.g., by operating the noise detector 402 according to a duty cycle with a 10 millisecond "on" interval and a 5 second "off" interval). In some implementations, an audio channel is monitored less frequently when the voice trigger system 400 is operated in a low-power mode. In some implementations, a voice trigger uses a different sound detector or combination of sound detectors when it is in a low-power mode than when it is in a normal mode. (The voice trigger may be capable of numerous different modes or operating states, each of which may use a different amount of power, and different implementations will use them according to their specific designs.)
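The deactivating contexts listed above can be gathered into a single decision function, sketched below. The `DeviceContext` fields, the function name, and the three state labels are hypothetical; the quiet hours use the 10:00 PM to 8:00 AM example from the text, and the choice between deactivating and entering a low-power mode is left to a flag because the text describes both as alternatives.

```python
from dataclasses import dataclass


@dataclass
class DeviceContext:
    """Hypothetical snapshot of some of the signals named in the text."""
    face_down: bool = False
    hour: int = 12               # local hour, 0-23
    do_not_disturb: bool = False
    enclosed: bool = False


def voice_trigger_state(ctx, use_low_power=False):
    """Return "active", "low_power", or "inactive" for the voice trigger."""
    quiet_hours = ctx.hour >= 22 or ctx.hour < 8   # 10:00 PM to 8:00 AM
    suppress = (ctx.face_down or ctx.do_not_disturb
                or ctx.enclosed or quiet_hours)
    if suppress:
        return "low_power" if use_low_power else "inactive"
    # Assumption: with no suppressing context, the trigger is active.
    return "active"
```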
[0126] On the other hand, when the device is in some other contexts, the voice trigger will be activated (or operated in a different mode) so long as that context is maintained. For example, in some implementations, the voice trigger remains active while it is plugged into a power source, when the phone is in a predetermined orientation (e.g., lying face-up on a surface), during predetermined time periods (e.g., between 8:00AM and 10:00 PM), when the device is travelling and/or in a car (e.g., based on GPS signals, BLUETOOTH connection or docking with a vehicle, etc.), and the like. Aspects of determining when a device is in a vehicle are described in more detail in U.S. Provisional Patent Application No. 61/657,744, assigned to the assignee of the instant application, which is hereby incorporated by reference in its entirety. Several specific examples of how to determine certain contexts are provided below. In various embodiments, different techniques and/or information sources are used to detect these and other contexts.
[0127] As noted above, whether or not the voice trigger system 400 is active (e.g., listening) can depend on the physical orientation of a device. In some implementations, the voice trigger is active when the device is placed "face-up" on a surface (e.g., with the display and/or touchscreen surface visible), and/or is inactive when it is "face-down." This provides a user with an easy way to activate and/or deactivate the voice trigger without requiring manipulation of settings menus, switches, or buttons. In some implementations, the device detects whether it is face-up or face-down on a surface using light sensors (e.g., based on the difference in incident light on a front and a back face of the device 104), proximity sensors, magnetic sensors, accelerometers, gyroscopes, tilt sensors, cameras, and the like.
[0128] In some implementations, other operating modes, settings, parameters, or preferences are affected by the orientation and/or position of the device. In some implementations, the particular trigger sound, word, or phrase that the voice trigger is listening for depends on the orientation and/or position of the device. For example, in some implementations, the voice trigger listens for a first trigger word, phrase, or sound when the device is in one orientation (e.g., lying face-up on a surface), and a different trigger word, phrase, or sound when the device is in another orientation (e.g., lying face-down). In some implementations, the trigger phrase for a face-down orientation is longer and/or more complex than for a face-up orientation. Thus, a user can place a device face-down when they are around other people or in a noisy environment so that the voice trigger can still be operational while also reducing false accepts, which may be more frequent for shorter or simpler trigger words. As a specific example, a face-up trigger phrase may be "Hey, SIRI," while a face-down trigger phrase may be "Hey, SIRI, this is Andrew, please wake up." The longer trigger phrase also provides a larger voice sample for the sound detectors and/or voice authenticators to process and/or analyze, thus increasing the accuracy of the voice trigger and decreasing false accepts.
[0129] In some implementations, the device 104 detects whether it is in a vehicle (e.g., a car). A voice trigger is particularly beneficial for invoking a speech-based service when the user is in a vehicle, as it helps reduce the physical interactions that are necessary to operate the device and/or the speech-based service. Indeed, one of the benefits of a voice-based digital assistant is that it can be used to perform tasks where looking at and touching a device would be impractical or unsafe. Thus, the voice trigger may be used when the device is in a vehicle so that the user does not have to touch the device in order to invoke the digital assistant. In some implementations, the device determines that it is in a vehicle by detecting that it has been connected to and/or paired with a vehicle, such as through BLUETOOTH communications (or other wireless communications) or through a docking connector or cable. In some implementations, the device determines that it is in a vehicle by determining the device's location and/or speed (e.g., using GPS receivers, accelerometers, and/or gyroscopes). If it is determined that the device is likely in a vehicle, because it is travelling above 20 miles per hour and is determined to be travelling along a road, for example, then the voice trigger remains active and/or in a high-power or more sensitive state.
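The vehicle heuristic described above can be sketched as follows. This is only an illustration under stated assumptions: the `DeviceContext` fields, the function names, and the idea that an `on_road` flag is already available (e.g., from map matching) are hypothetical and are not taken from the described implementations. The 20 miles per hour figure is the example given in the text.

```python
from dataclasses import dataclass

# Example speed figure from the text; a real system may use a different value.
SPEED_THRESHOLD_MPH = 20.0


@dataclass
class DeviceContext:
    """Hypothetical snapshot of the signals mentioned in the text."""
    paired_with_vehicle: bool  # e.g., wireless pairing or a docking connector
    speed_mph: float           # e.g., derived from GPS
    on_road: bool              # assumed to come from some map-matching step


def likely_in_vehicle(ctx: DeviceContext) -> bool:
    """Return True if either heuristic suggests the device is in a vehicle."""
    if ctx.paired_with_vehicle:
        return True
    return ctx.speed_mph > SPEED_THRESHOLD_MPH and ctx.on_road


def trigger_state(ctx: DeviceContext) -> str:
    """Keep the voice trigger active in a vehicle; otherwise use a default policy."""
    return "active" if likely_in_vehicle(ctx) else "default"
```

For example, a device that is unpaired but moving at 35 miles per hour along a road would be treated as likely in a vehicle, while the same speed off-road (e.g., on a train) would not.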
[0130] In some implementations, the device detects whether the device is stored (e.g., in a pocket, purse, bag, a drawer, or the like) by determining whether it is in a substantially enclosed space. In some implementations, the device uses light sensors (e.g., dedicated ambient light sensors and/or cameras) to determine that it is stored. For example, in some implementations, the device is likely being stored if light sensors detect little or no light. In some implementations, the time of day and/or location of the device are also considered. For example, if the light sensors detect low light levels when high light levels would be expected (e.g., during the day), the device may be in storage and the voice trigger system 400 is not needed. Thus, the voice trigger system 400 is placed in a low-power or standby state.
[0131] In some implementations, the difference in light detected by sensors located on opposite faces of a device can be used to determine its position, and hence whether or not it is stored. Specifically, users are likely to attempt to activate a voice trigger when the device is resting on a table or surface rather than when it is being stored in a pocket or bag. But when a device is lying face-down (or face-up) on a surface such as a table or desk, one surface of the device will be occluded so that little or no light reaches that surface, while the other surface will be exposed to ambient light. Thus, if light sensors on the front and back face of a device detect significantly different light levels, the device determines that it is not being stored. On the other hand, if light sensors on opposite faces detect the same or similar light levels, the device determines that it is being stored in a substantially enclosed space. Also, if the light sensors both detect a low light level during the daytime (or when the device would expect to be in a bright environment), the device determines with a greater confidence that it is being stored.
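The two-sensor comparison in the preceding paragraph can be sketched as follows. This is a minimal illustration that follows the rule exactly as stated in the text (differing levels mean not stored, similar levels mean stored). The function name, the `similar_ratio` and `dark_lux` thresholds, and the two-level confidence label are assumptions chosen for the example.

```python
def storage_estimate(front_lux, back_lux, expect_bright,
                     similar_ratio=0.5, dark_lux=5.0):
    """Return (is_stored, confidence) from front/back light readings.

    similar_ratio and dark_lux are illustrative thresholds, not values
    from the described implementations.
    """
    brighter = max(front_lux, back_lux)
    dimmer = min(front_lux, back_lux)
    both_dark = brighter <= dark_lux
    # Readings count as "similar" if both are dark or if the dimmer face
    # sees at least similar_ratio of the light on the brighter face.
    similar = both_dark or dimmer >= similar_ratio * brighter
    if not similar:
        # One face occluded, the other lit: resting on a surface.
        return (False, "normal")
    if both_dark and expect_bright:
        # Dark on both faces when brightness is expected (e.g., daytime).
        return (True, "high")
    return (True, "normal")
```

With these example thresholds, readings of 300 and 2 lux indicate a device resting on a surface, while 1 and 1 lux during the day indicate storage with higher confidence.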
[0132] In some implementations, other techniques are used (instead of or in addition to light sensors) to determine whether the device is stored. For example, in some implementations, the device emits one or more sounds (e.g., tones, clicks, pings, etc.) from a speaker or transducer (e.g., speaker 228), and monitors one or more microphones or transducers (e.g., microphone 230) to detect echoes of the emitted sound(s). (In some implementations, the device emits inaudible signals, such as sound outside of the human hearing range.) From the echoes, the device determines characteristics of the surrounding environment. For example, a relatively large environment (e.g., a room or a vehicle) will reflect the sound differently than a relatively small, enclosed environment (e.g., a pocket, purse, bag, drawer, or the like).
[0133] In some implementations, the voice trigger system 400 operates differently if it is near other devices (such as other devices that have voice triggers and/or speech-based services) than if it is not near other devices. This may be useful, for example, to shut down or decrease the sensitivity of the voice trigger system 400 when many devices are close together so that if one person utters a trigger word, other surrounding devices are not triggered as well. In some implementations, a device determines proximity to other devices using RFID, near-field communications, infrared/acoustic signals, or the like.
[0134] Voice triggers are particularly useful when a device is being operated in a hands-free mode, such as when the user is driving. In such cases, users often use external audio systems, such as wired or wireless headsets, watches with speakers and/or
microphones, a vehicle's built-in microphones and speakers, etc., to free themselves from having to hold a device near their face to make a call or dictate text inputs. For example, wireless headsets and vehicle audio systems may connect to an electronic device using BLUETOOTH communications, or any other appropriate wireless communication. However, it may be inefficient for a voice trigger to monitor audio received via a wireless audio accessory because of the power required to maintain an open audio channel with the wireless accessory. In particular, a wireless headset may hold enough charge in its battery to provide a few hours of continuous talk-time, and it is therefore preferable to reserve the battery for when the headset is needed for actual communication, instead of using it to simply monitor ambient audio and wait for a possible trigger sound. Moreover, wired external headset accessories may require significantly more power than on-board microphones alone, and keeping the headset microphone active will deplete the device's battery charge. This is especially true considering that the ambient audio received by the wireless or wired headset will typically consist mostly of silence or irrelevant sounds. Thus, in some implementations,
the voice trigger system 400 monitors audio from the microphone 230 on the device even when the device is coupled to an external microphone (wired or wireless). Then, when the voice trigger detects the trigger word, the device initializes an active audio link with the external microphone in order to receive subsequent sound inputs (such as a command to a voice-based digital assistant) via the external microphone rather than the on-device microphone 230.
[0135] When certain conditions are met, though, an active communication link can be maintained between an external audio system 416 (which may be communicatively coupled to the device 104 via wires or wirelessly) and the device so that the voice trigger system 400 can listen for a trigger sound via the external audio system 416 instead of (or in addition to) the on-device microphone 230. For example, in some implementations, characteristics of the motion of the electronic device and/or the external audio system 416 (e.g., as determined by accelerometers, gyroscopes, etc. on the respective devices) are used to determine whether the voice trigger system 400 should monitor ambient sound using the on-device microphone 230 or an external microphone 418. Specifically, the difference between the motion of the device and the external audio system 416 provides information about whether the external audio system 416 is actually in use. For example, if both the device and a wireless headset are moving (or not moving) substantially identically, it may be determined that the headset is not in use or is not being worn. This may occur, for example, because both devices are near to each other and idle (e.g., sitting on a table or stored in a pocket, bag, purse, drawer, etc.). Accordingly, under these conditions, the voice trigger system 400 monitors the on-device microphone, because it is unlikely that the headset is actually being used. If there is a difference in motion between the wireless headset and the device, however, it is determined that the headset is being worn by a user. These conditions may occur, for example, because the device has been set down (e.g., on a surface or in a bag), while the headset is being worn on the user's head (which will likely move at least a small amount, even when the wearer is relatively still). 
Under these conditions, because it is likely that the headset is being worn, the voice trigger system 400 maintains an active communication link and monitors the microphone 418 of the headset instead of (or in addition to) the on-device microphone 230. And because this technique focuses on the difference in the motion of the device and the headset, motion that is common to both devices can be canceled out. This may be useful, for example, when a user is using a headset in a moving vehicle, where the device (e.g., a cellular phone) is resting in a cup holder, empty seat, or in the user's pocket, and the headset is worn on the user's head. Once the motion that is common to both devices is canceled out (e.g., the vehicle's motion), the relative motion of the headset as compared to the device (if any) can be determined in order to determine whether the headset is likely in use (or whether the headset is not being worn). While the above discussion refers to wireless headsets, similar techniques are applied to wired headsets as well.
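The differential-motion idea above can be sketched as follows. This is a hedged illustration only: it assumes the two accelerometer traces are already time-aligned and expressed in a common coordinate frame, which is a significant simplification, and the function names and the `threshold` value are hypothetical.

```python
import math


def relative_motion_rms(device_accel, headset_accel):
    """RMS magnitude of the sample-wise difference of two 3-axis traces.

    Subtracting the traces cancels motion common to both devices
    (e.g., a vehicle's motion), leaving only relative motion.
    """
    pairs = list(zip(device_accel, headset_accel))
    if not pairs:
        return 0.0
    total = 0.0
    for (dx, dy, dz), (hx, hy, hz) in pairs:
        total += (hx - dx) ** 2 + (hy - dy) ** 2 + (hz - dz) ** 2
    return math.sqrt(total / len(pairs))


def headset_likely_worn(device_accel, headset_accel, threshold=0.05):
    """Treat the headset as worn if it moves relative to the device."""
    return relative_motion_rms(device_accel, headset_accel) > threshold
```

If both traces are identical (e.g., both devices sitting in the same bag in a moving car), the difference is zero and the on-device microphone would be monitored. A small head movement on top of the shared vehicle motion produces a nonzero difference.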
[0136] Because people's voices vary greatly, it may be necessary or beneficial to tune a voice trigger to improve its accuracy in recognizing the voice of a particular user. Also, people's voices may change over time, for example, because of illnesses, natural voice changes relating to aging or hormonal changes, and the like. Thus, in some implementations, the voice trigger system 400 is able to adapt its voice and/or sound recognition profiles for a particular user or group of users (e.g., by using an adaptive speech recognition model).
[0137] As described above, sound detectors (e.g., the sound-type detector 404 and/or the trigger sound detector 406) may be configured to compare a representation of a sound input (e.g., the sound or utterance provided by a user) to one or more reference representations. For example, if an input representation matches the reference representation to a predetermined confidence level, the sound detector will determine that the sound input corresponds to a predetermined type of sound (e.g., the sound-type detector 404), or that the sound input includes predetermined content (e.g., the trigger sound detector 406). In order to tune the voice trigger system 400, in some implementations, the device adjusts the reference representation to which the input representation is compared. In some implementations, the reference representation is adjusted (or created) as part of a voice enrollment or "training" procedure, where a user utters the trigger sound several times so that the device can adjust (or create) the reference representation. The device can then create a reference representation using that person's actual voice.
[0138] In some implementations, the device uses trigger sounds that are received under normal use conditions to adjust the reference representation. For example, after a successful voice triggering event (e.g., where the sound input was found to satisfy all of the triggering criteria) the device will use information from the sound input to adjust and/or tune the reference representation. In some implementations, only sound inputs that were determined to satisfy all or some of the triggering criteria with a certain confidence level are used to adjust the reference representation. Thus, when the voice trigger is less confident that a sound input corresponds to or includes a trigger sound, that voice input may be ignored for the purposes of adjusting the reference representation. On the other hand, in some implementations, sound inputs that satisfied the voice trigger system 400 to a lower confidence are used to adjust the reference representation.
[0139] In some implementations, the device 104 iteratively adjusts the reference representation (using these or other techniques) as more and more sound inputs are received so that slight changes in a user's voice over time can be accommodated. For example, in some implementations, the device 104 (and/or associated devices or services) adjusts the reference representation after each successful triggering event. In some implementations, the device 104 analyzes the sound input associated with each successful triggering event and determines if the reference representations should be adjusted based on that input (e.g., if certain conditions are met), and only adjusts the reference representation if it is appropriate to do so. In some implementations, the device 104 maintains a moving average of the reference representation over time.
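One way to picture the confidence-gated, iterative adjustment described in the preceding paragraphs is an exponential moving average over feature vectors. This is a sketch under assumptions: the text says only that a moving average is maintained, so the exponential form, the `alpha` smoothing factor, the `min_confidence` gate, and the plain-list representation are all illustrative choices.

```python
def update_reference(reference, observed, confidence,
                     min_confidence=0.8, alpha=0.1):
    """Return a new reference vector nudged toward the observed input.

    Inputs below min_confidence (or with mismatched dimensions) are
    ignored, so the reference is returned unchanged in those cases.
    """
    if confidence < min_confidence or len(reference) != len(observed):
        return list(reference)
    # Exponential moving average: keep (1 - alpha) of the old value and
    # blend in alpha of the newly observed value, per dimension.
    return [(1.0 - alpha) * r + alpha * o
            for r, o in zip(reference, observed)]
```

A small `alpha` lets the reference follow slow changes in a user's voice while limiting the effect of any single utterance.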
[0140] In some implementations, the voice trigger system 400 detects sounds that do not satisfy one or more of the triggering criteria (e.g., as determined by one or more of the sound detectors), but that may actually be attempts by an authorized user to do so. For example, voice trigger system 400 may be configured to respond to a trigger phrase such as "Hey, SIRI", but if a user's voice has changed (e.g., due to sickness, age, accent/inflection changes, etc.), the voice trigger system 400 may not recognize the user's attempt to activate the device. (This may also occur when the voice trigger system 400 has not been properly tuned for that user's particular voice, such as when the voice trigger system 400 is set to default conditions and/or the user has not performed an initialization or training procedure to customize the voice trigger system 400 for his or her voice.) If the voice trigger system 400 does not respond to the user's first attempt to activate the voice trigger, the user is likely to repeat the trigger phrase. The device detects that these repeated sound inputs are similar to one another, and/or that they are similar to the trigger phrase (though not similar enough to cause the voice trigger system 400 to activate the speech-based service). If such conditions are met, the device determines that the sound inputs correspond to valid attempts to activate the voice trigger system 400. Accordingly, in some implementations, the voice trigger system 400 uses those received sound inputs to adjust one or more aspects of the voice trigger system 400 so that similar utterances by the user will be accepted as valid triggers in the future. In some implementations, these sound inputs are used to adapt the voice trigger system 400 only if certain conditions or combinations of conditions are met. For example, in some implementations, the sound inputs are used to adapt the voice trigger system 400 when a predetermined number of sound inputs are received in succession (e.g., 2, 3, 4, 5, or any other appropriate number), when the sound inputs are sufficiently similar to the reference representation, when the sound inputs are sufficiently similar to each other, when the sound inputs are close together (e.g., when they are received within a predetermined time period and/or at or near a predetermined interval), and/or any combination of these or other conditions.
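The conditions listed above can be combined in many ways. The following sketch shows one possible combination in which all of them must hold. The use of cosine similarity over feature vectors, the function names, and every threshold value are assumptions made for illustration and are not specified in the text.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def should_adapt(attempts, reference, min_count=3, max_gap_s=5.0,
                 min_ref_sim=0.6, min_pair_sim=0.9):
    """Decide whether recent failed attempts look like genuine trigger tries.

    attempts: list of (timestamp_seconds, feature_vector), oldest first.
    """
    if len(attempts) < min_count:
        return False
    recent = attempts[-min_count:]
    # Consecutive attempts must be close in time and similar to each other.
    for (t0, f0), (t1, f1) in zip(recent, recent[1:]):
        if t1 - t0 > max_gap_s or cosine(f0, f1) < min_pair_sim:
            return False
    # Each attempt must also be reasonably close to the reference.
    return all(cosine(f, reference) >= min_ref_sim for _, f in recent)
```

A less conservative variant could require only a subset of these checks, consistent with the "and/or any combination" language above.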
[0141] In some cases, the voice trigger system 400 may detect one or more sound inputs that do not satisfy one or more of the triggering criteria, followed by a manual initiation of the speech-based service (e.g., by pressing a button or icon). In some implementations, the voice trigger system 400 determines that, because the speech-based service was initiated shortly after the sound inputs were received, the sound inputs actually corresponded to failed voice triggering attempts. Accordingly, the voice trigger system 400 uses those received sound inputs to adjust one or more aspects of the voice trigger system 400 so that utterances by the user will be accepted as valid triggers in the future, as described above.
[0142] While the adaptation techniques described above refer to adjusting a reference representation, other aspects of the trigger sound detecting techniques may be adjusted in the same or similar manner in addition to or instead of adjusting the reference representation. For example, in some implementations, the device adjusts how sound inputs are filtered and/or what filters are applied to sound inputs, such as to focus on and/or eliminate certain frequencies or ranges of frequencies of a sound input. In some implementations, the device adjusts an algorithm that is used to compare the input representation with the reference representation. For example, in some implementations, one or more terms of a mathematical function used to determine the difference between an input representation and a reference representation are changed, added, or removed, or a different mathematical function is substituted.
[0143] In some implementations, adaptation techniques such as those described above require more resources than the voice trigger system 400 is able to or is configured to provide. In particular, the sound detectors may not have, or have access to, the amount or the
types of processors, data, or memory that are necessary to perform the iterative adaptation of a reference representation and/or a sound detection algorithm (or any other appropriate aspect of the voice trigger system 400). Thus, in some implementations, one or more of the above described adaptation techniques are performed by a more powerful processor, such as an application processor (e.g., the processor(s) 204), or by a different device (e.g., the server system 108). However, the voice trigger system 400 is designed to operate even when the application processor is in a standby mode. Thus, the sound inputs which are to be used to adapt the voice trigger system 400 are received when the application processor is not active and cannot process the sound input. Accordingly, in some implementations, the sound input is stored by the device so that it can be further processed and/or analyzed after it is received. In some implementations, the sound input is stored in the memory buffer 414 of the audio subsystem 226. In some implementations, the sound input is stored in system memory (e.g., memory 250, Figure 2) using direct memory access (DMA) techniques (including, for example, using a DMA engine so that data can be copied or moved without requiring the application processor to be initiated). The stored sound input is then provided to or accessed by the application processor (or the server system 108, or another appropriate device) once it is initiated so that the application processor can execute one or more of the adaptation techniques described above.
[0144] Figures 5A and 5B are flow diagrams representing methods for obtaining training data to update an adaptive speech recognition model, according to certain implementations. The methods are, optionally, governed by instructions that are stored in a computer memory or non-transitory computer readable storage medium (e.g., memory 250 of client device 104, memory 302 associated with the digital assistant system 300) and that are executed by one or more processors of one or more computer systems of a digital assistant system, including, but not limited to, the server system 108, and/or the user device 104a. The computer readable storage medium may include a magnetic or optical disk storage device, solid state storage devices such as Flash memory, or other non-volatile memory device or devices. The computer readable instructions stored on the computer readable storage medium may include one or more of: source code, assembly language code, object code, or other instruction format that is interpreted by one or more processors. In various implementations, some operations in each method may be combined and/or the order of some operations may be changed from the order shown in the figures. Also, in some implementations, operations shown in separate figures and/or discussed in association with separate methods may be combined to form other methods, and operations shown in the same figure and/or discussed in association with the same method may be separated into different methods. Moreover, in some implementations, one or more operations in the methods are performed by modules of the digital assistant system 300 and/or an electronic device (e.g., the user device 104), including, for example, the natural language processing module 332, the dialogue flow processing module 334, the audio subsystem 226, the noise detector 402, the sound-type detector 404, the trigger sound detector 406, the speech-based service 408, and/or any sub-modules thereof.
[0145] Figures 5A-5B illustrate a method 500 of obtaining training data to update an adaptive speech recognition model, according to some implementations. In some
implementations, the method 500 is performed at a system including one or more processors and memory storing instructions for execution by the one or more processors (e.g., server system 108 in Figure 1). The system determines (502) that a first user of a first mobile communication device (e.g., a mobile telephone) is engaged in a call over a communications network (e.g., communications network(s) 110 in Figure 1). For example, a device (e.g., device 104a in Figure 1) receives a user request (e.g., via audio subsystem 226 in Figure 2) to initiate a call. In another example, a device receives a call request through a communications network (e.g., via communication subsystem(s) 224 in Figure 2). In some implementations, the call is a mobile telephone call (e.g., telephony service 122-5 in Figure 1). In some implementations, the call is a multimedia communication. In some implementations, the call is a VoIP communication. In some implementations, the call comprises an interaction between the first user of the first mobile communication device and a second mobile device. In some implementations, the call comprises a conversation between the first user of the first mobile communication device and a user of a second device. As an example, a call may comprise a conversation between a user of device 104a and a user of device 104b in Figure 1.
[0146] The system provides (504) an adaptive speech recognition model. In some implementations, the system provides a speaker-independent model (e.g., a canonical model). For example, in some implementations, the system determines that a speaker-dependent model has not been stored for a corresponding user and, in accordance with that determination, provides an initial speaker-independent model. In another example, the system determines that the first mobile communication device is not associated with any stored speaker-dependent models and, in accordance with that determination, provides a speaker-independent model. In some implementations, the system provides a speaker-dependent model associated with a user of the first mobile communication device. For example, the system determines that a stored speaker-dependent model corresponds to the user and, in accordance with that determination, provides the stored speaker-dependent model.
[0147] The system taps (506) into an outbound audio channel of the first mobile communication device to obtain a call audio signal corresponding to audio input from one or more microphones of the first mobile communication device. In some implementations, tapping into the outbound audio channel includes tapping into a baseband unit (e.g., baseband subsystem 412 in Figure 4) of the first mobile communication device. In some implementations, tapping into the outbound audio channel includes tapping into an audio DSP (e.g., audio DSP 416 in Figure 4) of the first mobile communication device. In some implementations, tapping into the outbound audio channel includes tapping into an application processor (e.g., application processor 418 in Figure 4) of the first mobile communication device.
[0148] In some implementations, prior to tapping into the outbound audio channel, the system converts the audio signal from an analog audio signal to a digital audio signal. For example, in some implementations, the system converts the audio signal in a codec (e.g., codec 410 in Figure 4) prior to tapping into the outbound audio channel in an application processor (e.g., application processor 418 in Figure 4). In some implementations, prior to tapping into the outbound audio channel, the system determines that the mobile communication device is in an adaptive-speaker-training mode. For example, in some implementations, prior to tapping into the outbound audio channel, the system sends the user of a mobile communication device a request to enter into a speaker-training mode, the user accepts the request, and in accordance with the user's acceptance, the device enters a speaker-training mode. In this example, if the user does not want the system to tap into the outbound audio channel (e.g., the user has a problem affecting the user's voice and/or the user is not the primary user of the mobile device), then the user can reject the request to enter into speaker-training mode. In some implementations, the tapped audio is rendered through the application processor where an embedded recognition engine is used to recognize sound units for updating model statistics.
[0149] In some embodiments, tapping into the outbound audio channel comprises utilizing (508) a voice activity detector (VAD) to determine when there is active speech on the outbound audio channel; in accordance with a determination that there is active speech on the outbound audio channel, tapping into the outbound audio channel; and in accordance with a determination that there is not active speech on the outbound audio channel, forgoing tapping into the outbound audio channel. For example, in some embodiments, the system utilizes a VAD included within a voice trigger system (e.g., voice trigger system 400 in Figure 4). Thus, in some implementations, if the outbound audio channel is picking up ambient sound (e.g., traffic noise) but not human speech, the system will utilize the VAD to determine that there is no active speech on the outbound audio channel.
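The VAD-gated tapping step can be sketched as follows. Note the substitution: a simple RMS energy threshold stands in for the voice activity detector here. An energy threshold cannot by itself distinguish speech from loud ambient noise such as traffic, so a practical VAD would rely on additional features. The function names and the `energy_threshold` value are assumptions for this example.

```python
import math


def frame_rms(frame):
    """Root-mean-square amplitude of one frame of audio samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def is_active(frame, energy_threshold=0.02):
    """Stand-in VAD decision: energy above a fixed threshold."""
    return frame_rms(frame) >= energy_threshold


def tap_active_frames(frames, energy_threshold=0.02):
    """Keep only frames the stand-in VAD marks as active; forgo the rest."""
    return [f for f in frames if is_active(f, energy_threshold)]
```

Only the frames retained by `tap_active_frames` would be passed on as candidate training data.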
[0150] The system updates (510) the adaptive speech recognition model with training data (e.g., one or more speaker-dependent sound units) derived from the call audio signal. In some implementations, prior to updating the adaptive speech recognition model, the system determines that the call has ended. In some implementations, updating the adaptive speech recognition model comprises replacing the adaptive speech recognition model with a new adaptive speech recognition model generated from the training data. For example, in some of these implementations, the system discards the provided adaptive speech recognition model and stores a new adaptive speech recognition model generated from the training data obtained during the call.
[0151] In some implementations, updating the adaptive speech recognition model comprises generating a speaker-dependent model from the data, comparing the speaker-dependent model to the adaptive speech recognition model, and updating the adaptive speech recognition model based on the comparison. In some of these implementations, updating the adaptive speech recognition model based on the comparison includes directly adapting the model parameters. In some other implementations, updating the adaptive speech recognition model based on the comparison includes applying linear transforms (e.g., an MLLR transform) for a set of Gaussians to the model parameters. In some implementations, updating the adaptive speech recognition model based on the comparison includes aligning a user's speech to phonetic sound units. These alignments are subsequently used to update each sound unit in the adaptive speech recognition model. In some implementations, the system updates the adaptive speech recognition model with training data only if the amount of training data reaches a predetermined threshold (e.g., batch adaptation). In some of these implementations, the system updates the adaptive speech recognition model with training data derived from the call audio signal and discards any prior data.
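As an illustration of the linear-transform option, an MLLR mean transform maps each Gaussian mean mu in a set to A·mu + b using a shared matrix A and bias b. The sketch below only applies a given transform. Estimating A and b from the adaptation data is the substantive part of MLLR and is outside the scope of this example. The function name and the plain-list representation are assumptions.

```python
def apply_mllr_to_means(means, A, b):
    """Apply mu' = A @ mu + b to every mean vector in a set of Gaussians.

    means: list of mean vectors (lists of floats), all of dimension d
    A:     d x d matrix as a list of rows
    b:     bias vector of length d
    """
    transformed = []
    for mu in means:
        new_mu = [
            sum(A[i][j] * mu[j] for j in range(len(mu))) + b[i]
            for i in range(len(b))
        ]
        transformed.append(new_mu)
    return transformed
```

Because one transform is shared by many Gaussians, comparatively little speaker data may be enough to shift the whole set, in contrast with directly re-estimating each parameter.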
[0152] In some implementations, the system utilizes the adaptive speech recognition model to authenticate a user on a device (e.g., the first mobile communication device). For example, in some of these implementations, voice trigger system 400 includes voice authentication module 426 and voice authentication module 426 utilizes the adaptive speech recognition model to authenticate the user. In some of these implementations, the adaptive speech recognition model comprises a Gaussian mixture model that models the unique characteristics of the associated user. A statistical likelihood measure is used to render an authentication decision based on how well the user's voice matches the adaptive speech recognition model versus a second model calculated from large amounts of anti-user data.
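One common form of such a likelihood measure is an average log-likelihood ratio between the user's Gaussian mixture model and the second (background) model. The sketch below assumes diagonal covariances, represents each model as a `(weights, means, variances)` tuple, and uses an illustrative `threshold`. None of these details are specified in the text.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """log p(x) under a diagonal-covariance Gaussian mixture model."""
    comp_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        comp_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    m = max(comp_logs)
    return m + math.log(sum(math.exp(c - m) for c in comp_logs))


def authenticate(frames, user_gmm, background_gmm, threshold=0.0):
    """Accept if the mean per-frame log-likelihood ratio exceeds threshold."""
    if not frames:
        return False
    llr = sum(
        gmm_log_likelihood(f, *user_gmm) - gmm_log_likelihood(f, *background_gmm)
        for f in frames
    ) / len(frames)
    return llr > threshold
```

Raising `threshold` trades fewer false accepts for more false rejects.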
[0153] In some embodiments, updating the adaptive speech recognition model comprises comparing (512) the call audio signal with the adaptive speech recognition model, generating a confidence score based on the comparison; in accordance with a determination that the confidence score is at or above a predetermined threshold, updating the adaptive speech recognition model with the training data derived from the call audio signal; and, in accordance with a determination that the confidence score is below the predetermined threshold, forgoing the updating of the adaptive speech recognition model. For example, a confidence score below the predetermined threshold may indicate that a user is not associated with the provided adaptive speech recognition model or that the user is sick or that conditions have temporarily altered the user's voice. In these examples, forgoing the updating of the adaptive speech recognition model prevents a potential decrease in the model's accuracy which may result from updating the model with the training data.
[0154] In some embodiments, the training data derived from the call audio signal includes (514) one or more speaker-dependent sound units (e.g., phonemes). In these embodiments, updating the adaptive speech recognition model comprises comparing at least a subset of the one or more speaker-dependent sound units to the adaptive speech recognition model, generating one or more adaptive speech vectors based on the comparison, and modifying the adaptive speech recognition model based on at least a subset of the one or more adaptive speech vectors.
[0155] In some embodiments, the system stores (516) the adaptive speech recognition model in memory. In some implementations, the memory is a component of the first mobile device (e.g., memory 250 in Figure 2). In some implementations, the memory is a component of a server (e.g., memory 302 in Figure 3A).
[0156] In some embodiments, the system modifies (518) the adaptive speech recognition model with training data derived from audio user interaction with a digital assistant. In some implementations, the adaptive speech recognition model is updated with training data obtained both from calls and from audio user interactions with a digital assistant. For example, the adaptive speech recognition model is updated when a user makes a call and is then updated again when the user audibly interacts with a digital assistant.
[0157] In some embodiments, after said updating, the system receives (520) invocation of the digital assistant, receives speech input from a second user, generates speech-to-text output corresponding to the speech input, and provides the speech-to-text output to the digital assistant. For example, a user (e.g., a user of device 104a in Figure 1) invokes a digital assistant (e.g., digital assistant 102a in Figure 1) and the system uses the updated adaptive speech recognition model to generate the speech-to-text output (e.g., in STT processing module 330 in Figure 3B).
[0158] In some embodiments, generating speech-to-text output corresponding to the speech input comprises comparing (522) the speech input with the adaptive speech recognition model; in accordance with a determination that the second user is the same as the first user, performing automatic speech recognition using the adaptive speech recognition model to generate speech-to-text output; and in accordance with a determination that the second user is distinct from the first user, performing automatic speech recognition using a speaker-independent model to generate speech-to-text output. For example, a user (e.g., a user of device 104a in Figure 1) invokes a digital assistant (e.g., digital assistant 102a in Figure 1) and if the updated adaptive speech recognition model corresponds to the user then the system uses the updated adaptive speech recognition model to generate the speech-to-text output. Conversely, in this example, if the updated adaptive speech recognition model does not correspond to the user then the system uses a speaker-independent model to generate speech-to-text output.
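The model selection described above amounts to a simple dispatch, sketched below. The function and parameter names are hypothetical, and the speaker check and both recognizers are passed in as placeholder callables because the description does not prescribe how they are implemented.

```python
from typing import Callable, Sequence

Recognizer = Callable[[Sequence[float]], str]


def speech_to_text(speech_input: Sequence[float],
                   matches_first_user: Callable[[Sequence[float]], bool],
                   adaptive: Recognizer,
                   speaker_independent: Recognizer) -> str:
    """Choose a recognizer based on whether the speaker matches the adaptive model."""
    if matches_first_user(speech_input):
        # Second user is the same as the first user: use the adapted model.
        return adaptive(speech_input)
    # Second user is distinct from the first user: fall back to the generic model.
    return speaker_independent(speech_input)
```

Falling back to the speaker-independent model keeps recognition quality reasonable for other speakers, whose voices the adapted model was never trained on.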
[0159] In accordance with some implementations, Figure 6 shows a functional block diagram of a system 600 configured in accordance with the principles of the invention as described above. The functional blocks of the device may be implemented by hardware, software, or a combination of hardware and software to carry out the principles of the invention. It is understood by persons of skill in the art that the functional blocks described in Figure 6 may be combined or separated into sub-blocks to implement the principles of the invention as described above. Therefore, the description herein may support any possible combination or separation or further definition of the functional blocks described herein.
[0160] As shown in Figure 6, system 600 includes sound receiving unit 602 configured to receive sound input. System 600 also includes processing unit 604 coupled to sound receiving unit 602. In some implementations, voice activity detecting unit 606 is coupled to sound receiving unit 602 and processing unit 604. In some implementations, processing unit 604 includes determining unit 608, model providing unit 610, tapping unit 612, model updating unit 614, model modifying unit 616, speech-to-text (STT) generating unit 618, STT providing unit 620, speech comparing unit 622, storing unit 624, score generating unit 626, and speech vector generating unit 628. In some implementations, model updating unit 614 is the same as model modifying unit 616. In some implementations, STT generating unit 618 corresponds to STT processing module 330. In some implementations, STT providing unit 620 corresponds to STT processing module 330. In some implementations, storing unit 624 corresponds to memory interface 202. In some implementations, storing unit 624 corresponds to memory 302. In some implementations, model providing unit 610 corresponds to digital assistant client module 264. In some implementations, model providing unit 610 corresponds to I/O processing module 328.
[0161] Processing unit 604 is configured to: determine (e.g., with determining unit 608) that a first user of a first mobile communication device is engaged in a call over a communications network; provide an adaptive speech recognition model; tap (e.g., with tapping unit 612) into an outbound audio channel of the first mobile communication device to obtain a call audio signal corresponding to audio input from one or more microphones of the first mobile communication device; and update (e.g., with model updating unit 614) the adaptive speech recognition model with training data derived from the call audio signal.
[0162] In some implementations, processing unit 604 is part of said first mobile communication device having said one or more microphones. In some implementations, processing unit 604 is part of a server system distinct from said first mobile communication device.
[0163] In some implementations, processing unit 604 is further configured to modify (e.g., with model modifying unit 616) the adaptive speech recognition model with training data derived from audio user interaction with a digital assistant.
[0164] In some implementations, processing unit 604 is further configured to, after said updating, receive invocation of the digital assistant, receive speech input from a second user, generate (e.g., with STT generating unit 618) speech-to-text output corresponding to the speech input, and provide (e.g., with STT providing unit 620) the speech-to-text output to the digital assistant.
[0165] In some implementations, generating speech-to-text (e.g., with STT generating unit 618) output corresponding to the speech input comprises: comparing (e.g., with speech comparing unit 622) the speech input with the adaptive speech recognition model; in accordance with a determination that the second user is the same as the first user, performing automatic speech recognition using the adaptive speech recognition model to generate speech-to-text output; and in accordance with a determination that the second user is distinct from the first user, performing automatic speech recognition using a speaker-independent model to generate speech-to-text output.
[0166] In some implementations, processing unit 604 is further configured to store (e.g., with storing unit 624) the adaptive speech recognition model in memory.
[0167] In some implementations, tapping (e.g., with tapping unit 612) into the outbound audio channel comprises: utilizing a voice activity detector to determine when there is active speech on the outbound audio channel; in accordance with a determination that there is active speech on the outbound audio channel, tapping into the outbound audio channel; and in accordance with a determination that there is not active speech on the outbound audio channel, forgoing tapping into the outbound audio channel.
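A minimal sketch of this gating follows. The energy-threshold voice activity detector, the frame representation (lists of samples in the range -1.0 to 1.0), and the threshold value are assumptions chosen for illustration; the description does not specify how the voice activity detector operates.

```python
from typing import Iterable, List, Sequence


def is_active_speech(frame: Sequence[float], energy_threshold: float = 0.01) -> bool:
    """Hypothetical voice activity detector: mean squared amplitude vs. a threshold."""
    if not frame:
        return False
    energy = sum(s * s for s in frame) / len(frame)
    return energy >= energy_threshold


def tap_outbound_channel(frames: Iterable[Sequence[float]],
                         energy_threshold: float = 0.01) -> List[List[float]]:
    """Collect only the outbound frames in which active speech is detected."""
    tapped: List[List[float]] = []
    for frame in frames:
        if is_active_speech(frame, energy_threshold):
            tapped.append(list(frame))  # active speech: tap into the channel
        # otherwise: forgo tapping for this frame
    return tapped
```

Skipping silent frames in this way avoids deriving training data from segments that contain no speech from the first user.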
[0168] In some implementations, updating (e.g., with model updating unit 614) the adaptive speech recognition model comprises: comparing (e.g., with speech comparing unit 622) the call audio signal with the adaptive speech recognition model; generating (e.g., with score generating unit 626) a confidence score based on the comparison; in accordance with a determination that the confidence score is at or above a predetermined threshold, updating the adaptive speech recognition model with the training data derived from the call audio signal; and in accordance with a determination that the confidence score is below the predetermined threshold, forgoing to update the adaptive speech recognition model.
[0169] In some implementations, the training data derived from the call audio signal includes one or more speaker-dependent sound units; and updating (e.g., with model updating unit 614) the adaptive speech recognition model comprises: comparing at least a subset of the one or more speaker-dependent sound units to the adaptive speech recognition model; generating (e.g., with speech vector generating unit 628) one or more adaptive speech vectors based on the comparison; and modifying the adaptive speech recognition model based on at least a subset of the one or more adaptive speech vectors.
[0170] The foregoing description, for purpose of explanation, has been described with reference to specific implementations. However, the illustrative discussions above are not intended to be exhaustive or to limit the disclosed implementations to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The implementations were chosen and described in order to best explain the principles and practical applications of the disclosed ideas, to thereby enable others skilled in the art to best utilize them with various modifications as are suited to the particular use contemplated.
[0171] It will be understood that, although the terms "first," "second," etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first sound detector could be termed a second sound detector, and, similarly, a second sound detector could be termed a first sound detector, without changing the meaning of the description, so long as all occurrences of the "first sound detector" are renamed consistently and all occurrences of the "second sound detector" are renamed consistently. The first sound detector and the second sound detector are both sound detectors, but they are not the same sound detector.
[0172] The terminology used herein is for the purpose of describing particular implementations only and is not intended to be limiting of the claims. As used in the description of the implementations and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
[0173] As used herein, the term "if" may be construed to mean "when" or "upon" or "in response to determining" or "in accordance with a determination" or "in response to detecting," that a stated condition precedent is true, depending on the context. Similarly, the phrase "if it is determined [that a stated condition precedent is true]" or "if [a stated condition precedent is true]" or "when [a stated condition precedent is true]" may be construed to mean "upon determining" or "upon a determination that" or "in response to determining" or "in accordance with a determination" or "upon detecting" or "in response to detecting" that the stated condition precedent is true, depending on the context.
Claims
1. A machine-implemented method, comprising:
determining that a first user of a first mobile communication device is engaged in a call over a communications network; providing an adaptive speech recognition model; tapping into an outbound audio channel of the first mobile communication device to obtain a call audio signal corresponding to audio input from one or more microphones of the first mobile communication device; and updating the adaptive speech recognition model with training data derived from the call audio signal.
2. The method of claim 1, wherein the method is performed by said first mobile communication device having said one or more microphones.
3. The method of claim 1, wherein the method is performed by a server system distinct from said first mobile communication device.
4. The method of any of claims 1-3, further comprising modifying the adaptive speech recognition model with training data derived from audio user interaction with a digital assistant.
5. The method of claim 4, further comprising, after said updating:
receiving invocation of the digital assistant;
receiving speech input from a second user;
generating speech-to-text output corresponding to the speech input; and
providing the speech-to-text output to the digital assistant.
6. The method of claim 5, wherein generating speech-to-text output corresponding to the speech input comprises:
comparing the speech input with the adaptive speech recognition model;
in accordance with a determination that the second user is the same as the first user, performing automatic speech recognition using the adaptive speech recognition model to generate speech-to-text output; and
in accordance with a determination that the second user is distinct from the first user, performing automatic speech recognition using a speaker-independent model to generate speech-to-text output.
7. The method of any of claims 1-6, further comprising storing the adaptive speech recognition model in memory.
8. The method of any of claims 1-7, wherein tapping into the outbound audio channel comprises:
utilizing a voice activity detector to determine when there is active speech on the outbound audio channel; in accordance with a determination that there is active speech on the outbound audio channel, tapping into the outbound audio channel; and in accordance with a determination that there is not active speech on the outbound audio channel, forgoing tapping into the outbound audio channel.
9. The method of any of claims 1-8, wherein updating the adaptive speech recognition model comprises:
comparing the call audio signal with the adaptive speech recognition model;
generating a confidence score based on the comparison;
in accordance with a determination that the confidence score is at or above a predetermined threshold, updating the adaptive speech recognition model with the training data derived from the call audio signal; and
in accordance with a determination that the confidence score is below the predetermined threshold, forgoing to update the adaptive speech recognition model.
10. The method of any of claims 1-9, wherein the training data derived from the call audio signal includes one or more speaker-dependent sound units; and
updating the adaptive speech recognition model comprises: comparing at least a subset of the one or more speaker-dependent sound units to the adaptive speech recognition model; generating one or more adaptive speech vectors based on the comparison; and modifying the adaptive speech recognition model based on at least a subset of the one or more adaptive speech vectors.
11. A system, comprising:
one or more processors;
memory; and
one or more programs stored in the memory, the one or more programs comprising instructions to: determine that a first user of a first mobile communication device is engaged in a call over a communications network; provide an adaptive speech recognition model; tap into an outbound audio channel of the first mobile communication device to obtain a call audio signal corresponding to audio input from one or more microphones of the first mobile communication device; and update the adaptive speech recognition model with training data derived from the call audio signal.
12. The system of claim 11, wherein the system comprises said first mobile communication device having said one or more microphones.
13. The system of claim 11, wherein the system comprises a server system distinct from said first mobile communication device.
14. The system of any of claims 11-13, the one or more programs further comprising instructions to modify the adaptive speech recognition model with training data derived from audio user interaction with a digital assistant.
15. The system of claim 14, the one or more programs further comprising instructions for, after said updating:
receiving invocation of the digital assistant;
receiving speech input from a second user;
generating speech-to-text output; and
providing the speech-to-text output to the digital assistant.
16. The system of claim 15, wherein generating speech-to-text output comprises:
comparing the speech input with the adaptive speech recognition model;
in accordance with a determination that the second user is the same as the first user, performing automatic speech recognition using the adaptive speech recognition model to generate speech-to-text output; and in accordance with a determination that the second user is distinct from the first user, performing automatic speech recognition using a speaker-independent model to generate speech-to-text output.
17. The system of any of claims 11-16, the one or more programs further comprising instructions to store the adaptive speech recognition model in memory.
18. The system of any of claims 11-17, wherein tapping into the outbound audio channel comprises:
utilizing a voice activity detector to determine when there is active speech on the outbound audio channel; in accordance with a determination that there is active speech on the outbound audio channel, tapping into the outbound audio channel; and
in accordance with a determination that there is not active speech on the outbound audio channel, forgoing tapping into the outbound audio channel.
19. The system of any of claims 11-18, wherein updating the adaptive speech recognition model comprises:
comparing the call audio signal with the adaptive speech recognition model;
generating a confidence score based on the comparison;
in accordance with a determination that the confidence score is at or above a predetermined threshold, updating the adaptive speech recognition model with the training data derived from the call audio signal; and
in accordance with a determination that the confidence score is below the predetermined threshold, forgoing to update the adaptive speech recognition model.
20. The system of any of claims 11-19, wherein the training data derived from the call audio signal includes one or more speaker-dependent sound units; and
updating the adaptive speech recognition model comprises: comparing at least a subset of the one or more speaker-dependent sound units to the adaptive speech recognition model; generating one or more adaptive speech vectors based on the comparison; and modifying the adaptive speech recognition model based on at least a subset of the one or more adaptive speech vectors.
21. A non-transitory computer readable storage medium storing one or more programs configured for execution by a device, the one or more programs comprising instructions to:
determine that a first user of a first mobile communication device is engaged in a call over a communications network;
provide an adaptive speech recognition model; tap into an outbound audio channel of the first mobile communication device to obtain a call audio signal corresponding to audio input from one or more microphones of the first mobile communication device; and update the adaptive speech recognition model with training data derived from the call audio signal.
22. The non-transitory computer readable storage medium of claim 21, wherein the one or more programs are configured so that the device, when executing the one or more programs, performs the method of any of claims 1-10.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361799479P | 2013-03-15 | 2013-03-15 | |
US61/799,479 | 2013-03-15 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014144579A1 (en) | 2014-09-18 |
Family
ID=50819945
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2014/029050 WO2014144579A1 (en) | 2013-03-15 | 2014-03-14 | System and method for updating an adaptive speech recognition model |
Country Status (2)
Country | Link |
---|---|
US (1) | US9697822B1 (en) |
WO (1) | WO2014144579A1 (en) |
Cited By (149)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016191318A1 (en) * | 2015-05-27 | 2016-12-01 | Google Inc. | Context-sensitive dynamic update of voice to text model in a voice-enabled electronic device |
WO2017044160A1 (en) * | 2015-09-08 | 2017-03-16 | Apple Inc. | Zero latency digital assistant |
US9668024B2 (en) | 2014-06-30 | 2017-05-30 | Apple Inc. | Intelligent automated assistant for TV user interactions |
ES2644887A1 (en) * | 2016-05-31 | 2017-11-30 | Xesol I Mas D Mas I, S.L. | Method of interaction through voice for during communication driving vehicles and device that implements it (Machine-translation by Google Translate, not legally binding) |
US9865248B2 (en) | 2008-04-05 | 2018-01-09 | Apple Inc. | Intelligent text-to-speech conversion |
US9870196B2 (en) | 2015-05-27 | 2018-01-16 | Google Llc | Selective aborting of online processing of voice inputs in a voice-enabled electronic device |
US9934775B2 (en) | 2016-05-26 | 2018-04-03 | Apple Inc. | Unit-selection text-to-speech synthesis based on predicted concatenation parameters |
US20180102125A1 (en) * | 2016-10-12 | 2018-04-12 | Samsung Electronics Co., Ltd. | Electronic device and method for controlling the same |
US9966060B2 (en) | 2013-06-07 | 2018-05-08 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
US9971774B2 (en) | 2012-09-19 | 2018-05-15 | Apple Inc. | Voice-based media searching |
US9972304B2 (en) | 2016-06-03 | 2018-05-15 | Apple Inc. | Privacy preserving distributed evaluation framework for embedded personalized systems |
US9986419B2 (en) | 2014-09-30 | 2018-05-29 | Apple Inc. | Social reminders |
US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple, Inc. | Intelligent automated assistant for media exploration |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10049675B2 (en) | 2010-02-25 | 2018-08-14 | Apple Inc. | User profiling for voice input processing |
US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
US10079014B2 (en) | 2012-06-08 | 2018-09-18 | Apple Inc. | Name recognition system |
US10083697B2 (en) | 2015-05-27 | 2018-09-25 | Google Llc | Local persisting of data for selectively offline capable voice action in a voice-enabled electronic device |
US10083690B2 (en) | 2014-05-30 | 2018-09-25 | Apple Inc. | Better resolution when referencing to concepts |
US10089072B2 (en) | 2016-06-11 | 2018-10-02 | Apple Inc. | Intelligent device arbitration and control |
US10108612B2 (en) | 2008-07-31 | 2018-10-23 | Apple Inc. | Mobile device having human language translation capability with positional feedback |
US10192552B2 (en) | 2016-06-10 | 2019-01-29 | Apple Inc. | Digital assistant providing whispered speech |
CN109302528A (en) * | 2018-08-21 | 2019-02-01 | 努比亚技术有限公司 | A kind of photographic method, mobile terminal and computer readable storage medium |
CN109412900A (en) * | 2018-12-04 | 2019-03-01 | 腾讯科技(深圳)有限公司 | A kind of network state knows the method and device of method for distinguishing, model training |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
US10269345B2 (en) | 2016-06-11 | 2019-04-23 | Apple Inc. | Intelligent task discovery |
US10297253B2 (en) | 2016-06-11 | 2019-05-21 | Apple Inc. | Application integration with a digital assistant |
US10303715B2 (en) | 2017-05-16 | 2019-05-28 | Apple Inc. | Intelligent automated assistant for media exploration |
US10311871B2 (en) | 2015-03-08 | 2019-06-04 | Apple Inc. | Competing devices responding to voice triggers |
US10311144B2 (en) | 2017-05-16 | 2019-06-04 | Apple Inc. | Emoji word sense disambiguation |
US10318871B2 (en) | 2005-09-08 | 2019-06-11 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US10332543B1 (en) | 2018-03-12 | 2019-06-25 | Cypress Semiconductor Corporation | Systems and methods for capturing noise for pattern recognition processing |
US10332518B2 (en) | 2017-05-09 | 2019-06-25 | Apple Inc. | User interface for correcting recognition errors |
US10354011B2 (en) | 2016-06-09 | 2019-07-16 | Apple Inc. | Intelligent automated assistant in a home environment |
US10356243B2 (en) | 2015-06-05 | 2019-07-16 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US10366158B2 (en) | 2015-09-29 | 2019-07-30 | Apple Inc. | Efficient word encoding for recurrent neural network language models |
US10381016B2 (en) | 2008-01-03 | 2019-08-13 | Apple Inc. | Methods and apparatus for altering audio output signals |
US10395654B2 (en) | 2017-05-11 | 2019-08-27 | Apple Inc. | Text normalization based on a data-driven learning network |
US10403278B2 (en) | 2017-05-16 | 2019-09-03 | Apple Inc. | Methods and systems for phonetic matching in digital assistant services |
US10403283B1 (en) | 2018-06-01 | 2019-09-03 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
US10410637B2 (en) | 2017-05-12 | 2019-09-10 | Apple Inc. | User-specific acoustic models |
US10417344B2 (en) | 2014-05-30 | 2019-09-17 | Apple Inc. | Exemplar-based natural language processing |
US10417405B2 (en) | 2011-03-21 | 2019-09-17 | Apple Inc. | Device access using voice authentication |
US10417266B2 (en) | 2017-05-09 | 2019-09-17 | Apple Inc. | Context-aware ranking of intelligent response suggestions |
US10431204B2 (en) | 2014-09-11 | 2019-10-01 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
US10438595B2 (en) | 2014-09-30 | 2019-10-08 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
US10445429B2 (en) | 2017-09-21 | 2019-10-15 | Apple Inc. | Natural language understanding using vocabularies with compressed serialized tries |
US10446143B2 (en) | 2016-03-14 | 2019-10-15 | Apple Inc. | Identification of voice inputs providing credentials |
US10453443B2 (en) | 2014-09-30 | 2019-10-22 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US10474753B2 (en) | 2016-09-07 | 2019-11-12 | Apple Inc. | Language identification using recurrent neural networks |
US10482874B2 (en) | 2017-05-15 | 2019-11-19 | Apple Inc. | Hierarchical belief states for digital assistants |
US10490187B2 (en) | 2016-06-10 | 2019-11-26 | Apple Inc. | Digital assistant providing automated status report |
US10497365B2 (en) | 2014-05-30 | 2019-12-03 | Apple Inc. | Multi-command single utterance input method |
US10496705B1 (en) | 2018-06-03 | 2019-12-03 | Apple Inc. | Accelerated task performance |
US10509862B2 (en) | 2016-06-10 | 2019-12-17 | Apple Inc. | Dynamic phrase expansion of language input |
US10521466B2 (en) | 2016-06-11 | 2019-12-31 | Apple Inc. | Data driven natural language event detection and classification |
US10529332B2 (en) | 2015-03-08 | 2020-01-07 | Apple Inc. | Virtual assistant activation |
US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
US10592604B2 (en) | 2018-03-12 | 2020-03-17 | Apple Inc. | Inverse text normalization for automatic speech recognition |
US10636424B2 (en) | 2017-11-30 | 2020-04-28 | Apple Inc. | Multi-turn canned dialog |
US10643611B2 (en) | 2008-10-02 | 2020-05-05 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
US10657328B2 (en) | 2017-06-02 | 2020-05-19 | Apple Inc. | Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling |
US10657961B2 (en) | 2013-06-08 | 2020-05-19 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US10684703B2 (en) | 2018-06-01 | 2020-06-16 | Apple Inc. | Attention aware virtual assistant dismissal |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10699717B2 (en) | 2014-05-30 | 2020-06-30 | Apple Inc. | Intelligent assistant for home automation |
US10706841B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Task flow identification based on user intent |
US10714117B2 (en) | 2013-02-07 | 2020-07-14 | Apple Inc. | Voice trigger for a digital assistant |
US10726832B2 (en) | 2017-05-11 | 2020-07-28 | Apple Inc. | Maintaining privacy of personal information |
US10733993B2 (en) | 2016-06-10 | 2020-08-04 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10733982B2 (en) | 2018-01-08 | 2020-08-04 | Apple Inc. | Multi-directional dialog |
US10733375B2 (en) | 2018-01-31 | 2020-08-04 | Apple Inc. | Knowledge-based framework for improving natural language understanding |
US10741185B2 (en) | 2010-01-18 | 2020-08-11 | Apple Inc. | Intelligent automated assistant |
US10748546B2 (en) | 2017-05-16 | 2020-08-18 | Apple Inc. | Digital assistant services based on device capabilities |
US10755051B2 (en) | 2017-09-29 | 2020-08-25 | Apple Inc. | Rule-based natural language processing |
US10755703B2 (en) | 2017-05-11 | 2020-08-25 | Apple Inc. | Offline personal assistant |
US10769385B2 (en) | 2013-06-09 | 2020-09-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
US10789959B2 (en) | 2018-03-02 | 2020-09-29 | Apple Inc. | Training speaker recognition models for digital assistants |
US10789945B2 (en) | 2017-05-12 | 2020-09-29 | Apple Inc. | Low-latency intelligent automated assistant |
US10791176B2 (en) | 2017-05-12 | 2020-09-29 | Apple Inc. | Synchronization and task delegation of a digital assistant |
US10795541B2 (en) | 2009-06-05 | 2020-10-06 | Apple Inc. | Intelligent organization of tasks items |
US10810274B2 (en) | 2017-05-15 | 2020-10-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
US10818288B2 (en) | 2018-03-26 | 2020-10-27 | Apple Inc. | Natural assistant interaction |
US10839159B2 (en) | 2018-09-28 | 2020-11-17 | Apple Inc. | Named entity normalization in a spoken dialog system |
US10892996B2 (en) | 2018-06-01 | 2021-01-12 | Apple Inc. | Variable latency device coordination |
US10909331B2 (en) | 2018-03-30 | 2021-02-02 | Apple Inc. | Implicit identification of translation payload with neural machine translation |
US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
US10984780B2 (en) | 2018-05-21 | 2021-04-20 | Apple Inc. | Global semantic word embeddings using bi-directional recurrent neural networks |
US11010561B2 (en) | 2018-09-27 | 2021-05-18 | Apple Inc. | Sentiment prediction from textual data |
US11010127B2 (en) | 2015-06-29 | 2021-05-18 | Apple Inc. | Virtual assistant for media playback |
US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
US11023513B2 (en) | 2007-12-20 | 2021-06-01 | Apple Inc. | Method and apparatus for searching using an active ontology |
US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
US11048473B2 (en) | 2013-06-09 | 2021-06-29 | Apple Inc. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
US11070949B2 (en) | 2015-05-27 | 2021-07-20 | Apple Inc. | Systems and methods for proactively identifying and surfacing relevant content on an electronic device with a touch-sensitive display |
US11069336B2 (en) | 2012-03-02 | 2021-07-20 | Apple Inc. | Systems and methods for name pronunciation |
US11080012B2 (en) | 2009-06-05 | 2021-08-03 | Apple Inc. | Interface for a virtual digital assistant |
CN113261056A (en) * | 2019-12-04 | 2021-08-13 | 谷歌有限责任公司 | Speaker perception using speaker-dependent speech models |
US11120372B2 (en) | 2011-06-03 | 2021-09-14 | Apple Inc. | Performing actions associated with task items that represent tasks to perform |
US11127397B2 (en) | 2015-05-27 | 2021-09-21 | Apple Inc. | Device voice control |
US11133008B2 (en) | 2014-05-30 | 2021-09-28 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US11140099B2 (en) | 2019-05-21 | 2021-10-05 | Apple Inc. | Providing message response suggestions |
US11145294B2 (en) | 2018-05-07 | 2021-10-12 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
JP2021529978A (en) * | 2018-05-10 | 2021-11-04 | エル ソルー カンパニー, リミテッド (Llsollu Co., Ltd.) | Artificial intelligence service method and equipment for it |
US11170166B2 (en) | 2018-09-28 | 2021-11-09 | Apple Inc. | Neural typographical error modeling via generative adversarial networks |
US11204787B2 (en) | 2017-01-09 | 2021-12-21 | Apple Inc. | Application integration with a digital assistant |
US11217251B2 (en) | 2019-05-06 | 2022-01-04 | Apple Inc. | Spoken notifications |
US11227589B2 (en) | 2016-06-06 | 2022-01-18 | Apple Inc. | Intelligent list reading |
US11231904B2 (en) | 2015-03-06 | 2022-01-25 | Apple Inc. | Reducing response latency of intelligent automated assistants |
US11237797B2 (en) | 2019-05-31 | 2022-02-01 | Apple Inc. | User activity shortcut suggestions |
US11269678B2 (en) | 2012-05-15 | 2022-03-08 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
US11281993B2 (en) | 2016-12-05 | 2022-03-22 | Apple Inc. | Model and ensemble compression for metric learning |
US11289073B2 (en) | 2019-05-31 | 2022-03-29 | Apple Inc. | Device text to speech |
US11301477B2 (en) | 2017-05-12 | 2022-04-12 | Apple Inc. | Feedback analysis of a digital assistant |
US11307752B2 (en) | 2019-05-06 | 2022-04-19 | Apple Inc. | User configurable task triggers |
US11314370B2 (en) | 2013-12-06 | 2022-04-26 | Apple Inc. | Method for extracting salient dialog usage from live data |
US11350253B2 (en) | 2011-06-03 | 2022-05-31 | Apple Inc. | Active transport based notifications |
US11348573B2 (en) | 2019-03-18 | 2022-05-31 | Apple Inc. | Multimodality in digital assistant systems |
US11360641B2 (en) | 2019-06-01 | 2022-06-14 | Apple Inc. | Increasing the relevance of new available information |
US11388291B2 (en) | 2013-03-14 | 2022-07-12 | Apple Inc. | System and method for processing voicemail |
US11386266B2 (en) | 2018-06-01 | 2022-07-12 | Apple Inc. | Text correction |
US11423908B2 (en) | 2019-05-06 | 2022-08-23 | Apple Inc. | Interpreting spoken requests |
US11462215B2 (en) | 2018-09-28 | 2022-10-04 | Apple Inc. | Multi-modal inputs for voice commands |
US11467802B2 (en) | 2017-05-11 | 2022-10-11 | Apple Inc. | Maintaining privacy of personal information |
US11468282B2 (en) | 2015-05-15 | 2022-10-11 | Apple Inc. | Virtual assistant in a communication session |
US11475884B2 (en) | 2019-05-06 | 2022-10-18 | Apple Inc. | Reducing digital assistant latency when a language is incorrectly determined |
US11475898B2 (en) | 2018-10-26 | 2022-10-18 | Apple Inc. | Low-latency multi-speaker speech recognition |
US11488406B2 (en) | 2019-09-25 | 2022-11-01 | Apple Inc. | Text detection using global geometry estimators |
US11496600B2 (en) | 2019-05-31 | 2022-11-08 | Apple Inc. | Remote execution of machine-learned models |
US11495218B2 (en) | 2018-06-01 | 2022-11-08 | Apple Inc. | Virtual assistant operation in multi-device environments |
US11532306B2 (en) | 2017-05-16 | 2022-12-20 | Apple Inc. | Detecting a trigger of a digital assistant |
EP4123639A3 (en) * | 2021-11-08 | 2023-02-22 | Apollo Intelligent Connectivity (Beijing) Technology Co., Ltd. | Wake-up control for a speech controlled device |
US11638059B2 (en) | 2019-01-04 | 2023-04-25 | Apple Inc. | Content playback on multiple devices |
US11657813B2 (en) | 2019-05-31 | 2023-05-23 | Apple Inc. | Voice identification in digital assistant systems |
US11671920B2 (en) | 2007-04-03 | 2023-06-06 | Apple Inc. | Method and system for operating a multifunction portable electronic device using voice-activation |
US11696060B2 (en) | 2020-07-21 | 2023-07-04 | Apple Inc. | User identification using headphones |
US11755276B2 (en) | 2020-05-12 | 2023-09-12 | Apple Inc. | Reducing description length based on confidence |
US11765209B2 (en) | 2020-05-11 | 2023-09-19 | Apple Inc. | Digital assistant hardware abstraction |
US11790914B2 (en) | 2019-06-01 | 2023-10-17 | Apple Inc. | Methods and user interfaces for voice-based control of electronic devices |
US11798547B2 (en) | 2013-03-15 | 2023-10-24 | Apple Inc. | Voice activated device for use with a voice-based digital assistant |
US11809483B2 (en) | 2015-09-08 | 2023-11-07 | Apple Inc. | Intelligent automated assistant for media search and playback |
US11838734B2 (en) | 2020-07-20 | 2023-12-05 | Apple Inc. | Multi-device audio adjustment coordination |
US11853536B2 (en) | 2015-09-08 | 2023-12-26 | Apple Inc. | Intelligent automated assistant in a media environment |
US11886805B2 (en) | 2015-11-09 | 2024-01-30 | Apple Inc. | Unconventional virtual assistant interactions |
US11914848B2 (en) | 2020-05-11 | 2024-02-27 | Apple Inc. | Providing relevant data items based on context |
Families Citing this family (58)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9634855B2 (en) | 2010-05-13 | 2017-04-25 | Alexander Poltorak | Electronic personal interactive device that determines topics of interest using a conversational agent |
US9953630B1 (en) * | 2013-05-31 | 2018-04-24 | Amazon Technologies, Inc. | Language recognition for device settings |
CN110503963B (en) * | 2014-04-24 | 2022-10-04 | 日本电信电话株式会社 | Decoding method, decoding device, and recording medium |
US10001760B1 (en) * | 2014-09-30 | 2018-06-19 | Hrl Laboratories, Llc | Adaptive control system capable of recovering from unexpected situations |
US11231826B2 (en) * | 2015-03-08 | 2022-01-25 | Google Llc | Annotations in software applications for invoking dialog system functions |
JP6235757B2 (en) * | 2015-03-19 | 2017-11-22 | 株式会社東芝 | Dialog data collection system, dialog data collection method, dialog data collection program, dialog data collection support device, dialog data collection support method, and dialog data collection support program |
BR112017021673B1 (en) * | 2015-04-10 | 2023-02-14 | Honor Device Co., Ltd | VOICE CONTROL METHOD, COMPUTER READABLE NON-TRANSITORY MEDIUM AND TERMINAL |
US20170154273A1 (en) * | 2015-11-30 | 2017-06-01 | Seematics Systems Ltd | System and method for automatically updating inference models |
US10229672B1 (en) | 2015-12-31 | 2019-03-12 | Google Llc | Training acoustic models using connectionist temporal classification |
KR102501083B1 (en) * | 2016-02-05 | 2023-02-17 | 삼성전자 주식회사 | Method for voice detection and electronic device using the same |
JP6605995B2 (en) * | 2016-03-16 | 2019-11-13 | 株式会社東芝 | Speech recognition error correction apparatus, method and program |
US11086593B2 (en) | 2016-08-26 | 2021-08-10 | Bragi GmbH | Voice assistant for wireless earpieces |
KR102562287B1 (en) * | 2016-10-14 | 2023-08-02 | 삼성전자주식회사 | Electronic device and audio signal processing method thereof |
GB2555661A (en) * | 2016-11-07 | 2018-05-09 | Cirrus Logic Int Semiconductor Ltd | Methods and apparatus for biometric authentication in an electronic device |
RU2646380C1 (en) * | 2016-12-22 | 2018-03-02 | Общество с ограниченной ответственностью "Аби Продакшн" | Using user-verified data for training confidence models |
KR102643501B1 (en) * | 2016-12-26 | 2024-03-06 | 현대자동차주식회사 | Dialogue processing apparatus, vehicle having the same and dialogue processing method |
US10360916B2 (en) * | 2017-02-22 | 2019-07-23 | Plantronics, Inc. | Enhanced voiceprint authentication |
KR102017244B1 (en) * | 2017-02-27 | 2019-10-21 | 한국전자통신연구원 | Method and apparatus for performance improvement in spontaneous speech recognition |
CN107122179A (en) * | 2017-03-31 | 2017-09-01 | 阿里巴巴集团控股有限公司 | The function control method and device of voice |
US10178432B2 (en) * | 2017-05-18 | 2019-01-08 | Sony Corporation | Identity-based face and voice recognition to regulate content rights and parental controls using consumer profiles |
US9900556B1 (en) * | 2017-06-28 | 2018-02-20 | The Travelers Indemnity Company | Systems and methods for virtual co-location |
WO2019014425A1 (en) | 2017-07-13 | 2019-01-17 | Pindrop Security, Inc. | Zero-knowledge multiparty secure sharing of voiceprints |
US10204624B1 (en) * | 2017-08-14 | 2019-02-12 | Lenovo (Singapore) Pte. Ltd. | False positive wake word |
US10417339B2 (en) * | 2017-08-18 | 2019-09-17 | Kyocera Document Solutions Inc. | Suggestion of alternate user input using different user interface |
CN111164676A (en) * | 2017-11-15 | 2020-05-15 | 英特尔公司 | Speech model personalization via environmental context capture |
WO2019126881A1 (en) * | 2017-12-29 | 2019-07-04 | Fluent.Ai Inc. | System and method for tone recognition in spoken languages |
WO2019128550A1 (en) * | 2017-12-31 | 2019-07-04 | Midea Group Co., Ltd. | Method and system for controlling home assistant devices |
CN108198552B (en) * | 2018-01-18 | 2021-02-02 | 深圳市大疆创新科技有限公司 | Voice control method and video glasses |
KR102629385B1 (en) * | 2018-01-25 | 2024-01-25 | 삼성전자주식회사 | Application processor including low power voice trigger system with direct path for barge-in, electronic device including the same and method of operating the same |
US10623403B1 (en) | 2018-03-22 | 2020-04-14 | Pindrop Security, Inc. | Leveraging multiple audio channels for authentication |
US10665244B1 (en) | 2018-03-22 | 2020-05-26 | Pindrop Security, Inc. | Leveraging multiple audio channels for authentication |
US20190362709A1 (en) * | 2018-05-25 | 2019-11-28 | Motorola Mobility Llc | Offline Voice Enrollment |
US11348588B2 (en) * | 2018-08-20 | 2022-05-31 | Samsung Electronics Co., Ltd. | Electronic device and operation method for performing speech recognition |
CN109117233A (en) * | 2018-08-22 | 2019-01-01 | 百度在线网络技术(北京)有限公司 | Method and apparatus for handling information |
US10885899B2 (en) * | 2018-10-09 | 2021-01-05 | Motorola Mobility Llc | Retraining voice model for trigger phrase using training data collected during usage |
US11527265B2 (en) * | 2018-11-02 | 2022-12-13 | BriefCam Ltd. | Method and system for automatic object-aware video or audio redaction |
KR20200063521A (en) | 2018-11-28 | 2020-06-05 | 삼성전자주식회사 | Electronic device and control method thereof |
KR102206181B1 (en) * | 2018-12-19 | 2021-01-22 | 엘지전자 주식회사 | Terminal and operating method thereof |
US20220036751A1 (en) * | 2018-12-31 | 2022-02-03 | 4S Medical Research Private Limited | A method and a device for providing a performance indication to a hearing and speech impaired person learning speaking skills |
US11295726B2 (en) * | 2019-04-08 | 2022-04-05 | International Business Machines Corporation | Synthetic narrowband data generation for narrowband automatic speech recognition systems |
US11545148B2 (en) * | 2019-06-18 | 2023-01-03 | Roku, Inc. | Do not disturb functionality for voice responsive devices |
WO2021006920A1 (en) * | 2019-07-09 | 2021-01-14 | Google Llc | On-device speech synthesis of textual segments for training of on-device speech recognition model |
US10915227B1 (en) | 2019-08-07 | 2021-02-09 | Bank Of America Corporation | System for adjustment of resource allocation based on multi-channel inputs |
TWI727521B (en) * | 2019-11-27 | 2021-05-11 | 瑞昱半導體股份有限公司 | Dynamic speech recognition method and apparatus therefor |
US11258750B2 (en) * | 2019-12-19 | 2022-02-22 | Honeywell International Inc. | Systems and methods for unified data and voice messages management |
US11544458B2 (en) | 2020-01-17 | 2023-01-03 | Apple Inc. | Automatic grammar detection and correction |
CN113658596A (en) * | 2020-04-29 | 2021-11-16 | 扬智科技股份有限公司 | Semantic identification method and semantic identification device |
US11810578B2 (en) | 2020-05-11 | 2023-11-07 | Apple Inc. | Device arbitration for digital assistant-based intercom systems |
DE102020119980B3 (en) * | 2020-07-29 | 2021-11-18 | Otto-von-Guericke-Universität Magdeburg, Körperschaft des öffentlichen Rechts | Language assistance system, method and computer program for language-based support |
US11615795B2 (en) * | 2020-08-03 | 2023-03-28 | HCL America Inc. | Method and system for providing secured access to services rendered by a digital voice assistant |
US11181988B1 (en) | 2020-08-31 | 2021-11-23 | Apple Inc. | Incorporating user feedback into text prediction models via joint reward planning |
US11829720B2 (en) | 2020-09-01 | 2023-11-28 | Apple Inc. | Analysis and validation of language models |
US11783827B2 (en) | 2020-11-06 | 2023-10-10 | Apple Inc. | Determining suggested subsequent user actions during digital assistant interaction |
US11620990B2 (en) * | 2020-12-11 | 2023-04-04 | Google Llc | Adapting automated speech recognition parameters based on hotword properties |
US11935168B1 (en) | 2021-05-14 | 2024-03-19 | Apple Inc. | Selective amplification of voice and interactive language simulator |
US11797766B2 (en) | 2021-05-21 | 2023-10-24 | Apple Inc. | Word prediction with multiple overlapping contexts |
US20230343235A1 (en) * | 2022-04-22 | 2023-10-26 | 617 Education Inc. | Systems and methods for grapheme-phoneme correspondence learning |
US11908473B2 (en) | 2022-05-10 | 2024-02-20 | Apple Inc. | Task modification after task initiation |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7272224B1 (en) | 2003-03-03 | 2007-09-18 | Apple Inc. | Echo cancellation |
US20130006633A1 (en) * | 2011-07-01 | 2013-01-03 | Qualcomm Incorporated | Learning speech models for mobile device users |
Family Cites Families (2713)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US1559320A (en) | 1924-11-17 | 1925-10-27 | Albert A Hirsh | Tooth cleaner |
US2180522A (en) | 1938-11-01 | 1939-11-21 | Henne Isabelle | Dental floss throw-away unit and method of making same |
US3828132A (en) | 1970-10-30 | 1974-08-06 | Bell Telephone Labor Inc | Speech synthesis by concatenation of formant encoded words |
US3710321A (en) | 1971-01-18 | 1973-01-09 | Ibm | Machine recognition of lexical symbols |
US3704345A (en) | 1971-03-19 | 1972-11-28 | Bell Telephone Labor Inc | Conversion of printed text into synthetic speech |
US3979557A (en) | 1974-07-03 | 1976-09-07 | International Telephone And Telegraph Corporation | Speech processor system for pitch period extraction using prediction filters |
US4013085A (en) | 1974-07-17 | 1977-03-22 | Wright Charles E | Dental cleaning means and method of manufacture therefor |
US4108211A (en) | 1975-04-28 | 1978-08-22 | Fuji Photo Optical Co., Ltd. | Articulated, four-way bendable tube structure |
US4107784A (en) | 1975-12-22 | 1978-08-15 | Bemmelen Henri M Van | Management control terminal method and apparatus |
US4090216A (en) | 1976-05-26 | 1978-05-16 | Gte Sylvania Incorporated | Ambient light contrast and color control circuit |
BG24190A1 (en) | 1976-09-08 | 1978-01-10 | Antonov | Method of synthesis of speech and device for effecting same |
US4081631A (en) | 1976-12-08 | 1978-03-28 | Motorola, Inc. | Dual purpose, weather resistant data terminal keyboard assembly including audio porting |
US4384169A (en) | 1977-01-21 | 1983-05-17 | Forrest S. Mozer | Method and apparatus for speech synthesizing |
US4159536A (en) | 1977-04-08 | 1979-06-26 | Willard E. Kehoe | Portable electronic language translation device |
GB1545406A (en) | 1977-12-16 | 1979-05-10 | Ibm | Keyboard apparatus |
US4181821A (en) | 1978-10-31 | 1980-01-01 | Bell Telephone Laboratories, Incorporated | Multiple template speech recognition system |
JPS597120B2 (en) | 1978-11-24 | 1984-02-16 | 日本電気株式会社 | speech analysis device |
US4241286A (en) | 1979-01-04 | 1980-12-23 | Mack Gordon | Welding helmet lens assembly |
US4253477A (en) | 1979-08-02 | 1981-03-03 | Eichman John J | Dental floss holder |
JPS5681900A (en) | 1979-12-10 | 1981-07-04 | Nippon Electric Co | Voice synthesizer |
US4310721A (en) | 1980-01-23 | 1982-01-12 | The United States Of America As Represented By The Secretary Of The Army | Half duplex integral vocoder modem system |
US4348553A (en) | 1980-07-02 | 1982-09-07 | International Business Machines Corporation | Parallel pattern verifier with dynamic time warping |
JPS5741731A (en) | 1980-08-25 | 1982-03-09 | Fujitsu Ltd | Coordinate input device |
US4332464A (en) | 1980-09-22 | 1982-06-01 | Xerox Corporation | Interactive user-machine interface method and apparatus for copier/duplicator |
NZ199001A (en) | 1981-01-30 | 1984-02-03 | Mobil Oil Corp | Alkylation of aromatic compounds using catalyst with metal component and a zeolite |
JPS57178295A (en) | 1981-04-27 | 1982-11-02 | Nippon Electric Co | Continuous word recognition apparatus |
US4495644A (en) | 1981-04-27 | 1985-01-22 | Quest Automation Public Limited Company | Apparatus for signature verification |
US4433377A (en) | 1981-06-29 | 1984-02-21 | Eustis Mary S | Data processing with format varying |
US4386345A (en) | 1981-09-22 | 1983-05-31 | Sperry Corporation | Color and brightness tracking in a cathode ray tube display system |
DE3382796T2 (en) | 1982-06-11 | 1996-03-28 | Mitsubishi Electric Corp | Intermediate image coding device. |
US4451849A (en) | 1982-06-23 | 1984-05-29 | Rca Corporation | Plural operating mode ambient light responsive television picture control |
US4485439A (en) | 1982-07-27 | 1984-11-27 | S.A. Analis | Standard hardware-software interface for connecting any instrument which provides a digital output stream with any digital host computer |
US4513379A (en) | 1982-09-07 | 1985-04-23 | General Electric Company | Customization window for a computer numerical control system |
JPS5957336A (en) | 1982-09-27 | 1984-04-02 | Toshiba Corp | Picture display device |
US4555775B1 (en) | 1982-10-07 | 1995-12-05 | Bell Telephone Labor Inc | Dynamic generation and overlaying of graphic windows for multiple active program storage areas |
US4587670A (en) | 1982-10-15 | 1986-05-06 | At&T Bell Laboratories | Hidden Markov model speech recognition arrangement |
US4831551A (en) | 1983-01-28 | 1989-05-16 | Texas Instruments Incorporated | Speaker-dependent connected speech word recognizer |
US4688195A (en) | 1983-01-28 | 1987-08-18 | Texas Instruments Incorporated | Natural-language interface generating system |
US4586158A (en) | 1983-02-22 | 1986-04-29 | International Business Machines Corp. | Screen management system |
DE3381300D1 (en) | 1983-03-31 | 1990-04-12 | Ibm | IMAGE ROOM MANAGEMENT AND PLAYBACK IN A PART OF THE SCREEN OF A VIRTUAL MULTIFUNCTIONAL TERMINAL. |
US4654875A (en) | 1983-05-23 | 1987-03-31 | The Research Foundation Of State University Of New York | System to achieve automatic recognition of linguistic strings |
SE8303123L (en) | 1983-06-02 | 1984-12-03 | Fixfabriken Ab | PARTY ARRANGEMENTS |
US4618984A (en) | 1983-06-08 | 1986-10-21 | International Business Machines Corporation | Adaptive automatic discrete utterance recognition |
JPS603056A (en) | 1983-06-21 | 1985-01-09 | Toshiba Corp | Information rearranging device |
US4611346A (en) | 1983-09-29 | 1986-09-09 | International Business Machines Corporation | Method and apparatus for character recognition accommodating diacritical marks |
DE3335358A1 (en) | 1983-09-29 | 1985-04-11 | Siemens AG, 1000 Berlin und 8000 München | METHOD FOR DETERMINING LANGUAGE SPECTRES FOR AUTOMATIC VOICE RECOGNITION AND VOICE ENCODING |
US4797930A (en) | 1983-11-03 | 1989-01-10 | Texas Instruments Incorporated | Constructed syllable pitch patterns from phonological linguistic unit string data |
US4802223A (en) | 1983-11-03 | 1989-01-31 | Texas Instruments Incorporated | Low data rate speech encoding employing syllable pitch patterns |
US5164900A (en) | 1983-11-14 | 1992-11-17 | Colman Bernath | Method and device for phonetically encoding Chinese textual data for data processing entry |
US5212638A (en) | 1983-11-14 | 1993-05-18 | Colman Bernath | Alphabetic keyboard arrangement for typing Mandarin Chinese phonetic data |
US4680805A (en) | 1983-11-17 | 1987-07-14 | Texas Instruments Incorporated | Method and apparatus for recognition of discontinuous text |
US4589022A (en) | 1983-11-28 | 1986-05-13 | General Electric Company | Brightness control system for CRT video display |
JPS60116072A (en) | 1983-11-29 | 1985-06-22 | N K B:Kk | Information furnishing system |
US4736296A (en) | 1983-12-26 | 1988-04-05 | Hitachi, Ltd. | Method and apparatus of intelligent guidance in natural language |
US4726065A (en) | 1984-01-26 | 1988-02-16 | Horst Froessl | Image manipulation by speech signals |
US4955047A (en) | 1984-03-26 | 1990-09-04 | Dytel Corporation | Automated attendant with direct inward system access |
US4811243A (en) | 1984-04-06 | 1989-03-07 | Racine Marsh V | Computer aided coordinate digitizing system |
US4692941A (en) | 1984-04-10 | 1987-09-08 | First Byte | Real-time text-to-speech conversion system |
US4709390A (en) | 1984-05-04 | 1987-11-24 | American Telephone And Telegraph Company, At&T Bell Laboratories | Speech message code modifying arrangement |
JPH067397Y2 (en) | 1984-07-30 | 1994-02-23 | カシオ計算機株式会社 | Document input device |
JPH0724055B2 (en) | 1984-07-31 | 1995-03-15 | 株式会社日立製作所 | Word division processing method |
US4783807A (en) | 1984-08-27 | 1988-11-08 | John Marley | System and method for sound recognition with feature selection synchronized to voice pitch |
JP2607457B2 (en) | 1984-09-17 | 1997-05-07 | 株式会社東芝 | Pattern recognition device |
JPS61105671A (en) | 1984-10-29 | 1986-05-23 | Hitachi Ltd | Natural language processing device |
US4718094A (en) | 1984-11-19 | 1988-01-05 | International Business Machines Corp. | Speech recognition system |
US5165007A (en) | 1985-02-01 | 1992-11-17 | International Business Machines Corporation | Feneme-based Markov models for words |
US4783804A (en) | 1985-03-21 | 1988-11-08 | American Telephone And Telegraph Company, At&T Bell Laboratories | Hidden Markov model speech recognition arrangement |
US4944013A (en) | 1985-04-03 | 1990-07-24 | British Telecommunications Public Limited Company | Multi-pulse speech coder |
US4670848A (en) | 1985-04-10 | 1987-06-02 | Standard Systems Corporation | Artificial intelligence system |
US4658425A (en) | 1985-04-19 | 1987-04-14 | Shure Brothers, Inc. | Microphone actuation control system suitable for teleconference systems |
US4833712A (en) | 1985-05-29 | 1989-05-23 | International Business Machines Corporation | Automatic generation of simple Markov model stunted baseforms for words in a vocabulary |
US4819271A (en) | 1985-05-29 | 1989-04-04 | International Business Machines Corporation | Constructing Markov model word baseforms from multiple utterances by concatenating model sequences for word segments |
US4698625A (en) | 1985-05-30 | 1987-10-06 | International Business Machines Corp. | Graphic highlight adjacent a pointing cursor |
US4829583A (en) | 1985-06-03 | 1989-05-09 | Sino Business Machines, Inc. | Method and apparatus for processing ideographic characters |
US5067158A (en) | 1985-06-11 | 1991-11-19 | Texas Instruments Incorporated | Linear predictive residual representation via non-iterative spectral reconstruction |
US5175803A (en) | 1985-06-14 | 1992-12-29 | Yeh Victor C | Method and apparatus for data processing and word processing in Chinese using a phonetic Chinese language |
US4713775A (en) | 1985-08-21 | 1987-12-15 | Teknowledge, Incorporated | Intelligent assistant for using and operating computer system capabilities to solve problems |
EP0218859A3 (en) | 1985-10-11 | 1989-09-06 | International Business Machines Corporation | Signal processor communication interface |
US5133023A (en) | 1985-10-15 | 1992-07-21 | The Palantir Corporation | Means for resolving ambiguities in text based upon character context |
US4754489A (en) | 1985-10-15 | 1988-06-28 | The Palantir Corporation | Means for resolving ambiguities in text based upon character context |
US4655233A (en) | 1985-11-04 | 1987-04-07 | Laughlin Patrick E | Dental flossing tool |
US4776016A (en) | 1985-11-21 | 1988-10-04 | Position Orientation Systems, Inc. | Voice control system |
NL8503304A (en) | 1985-11-29 | 1987-06-16 | Philips Nv | METHOD AND APPARATUS FOR SEGMENTING AN ELECTRIC SIGNAL FROM AN ACOUSTIC SIGNAL, FOR EXAMPLE, A VOICE SIGNAL. |
JPH0833744B2 (en) | 1986-01-09 | 1996-03-29 | 株式会社東芝 | Speech synthesizer |
US4680429A (en) | 1986-01-15 | 1987-07-14 | Tektronix, Inc. | Touch panel |
US4807752A (en) | 1986-01-21 | 1989-02-28 | Placontrol Corporation | Dental floss holders and package assembly of same |
US4724542A (en) | 1986-01-22 | 1988-02-09 | International Business Machines Corporation | Automatic reference adaptation during dynamic signature verification |
US5128752A (en) | 1986-03-10 | 1992-07-07 | Kohorn H Von | System and method for generating and redeeming tokens |
US5759101A (en) | 1986-03-10 | 1998-06-02 | Response Reward Systems L.C. | Central and remote evaluation of responses of participatory broadcast audience with automatic crediting and couponing |
US5032989A (en) | 1986-03-19 | 1991-07-16 | Realpro, Ltd. | Real estate search and location system and method |
DE3779351D1 (en) | 1986-03-28 | 1992-07-02 | American Telephone And Telegraph Co., New York, N.Y., Us | |
JPS62235998A (en) | 1986-04-05 | 1987-10-16 | シャープ株式会社 | Syllable identification system |
JPH0814822B2 (en) | 1986-04-30 | 1996-02-14 | カシオ計算機株式会社 | Command input device |
US4903305A (en) | 1986-05-12 | 1990-02-20 | Dragon Systems, Inc. | Method for representing word models for use in speech recognition |
US4837798A (en) | 1986-06-02 | 1989-06-06 | American Telephone And Telegraph Company | Communication system having unified messaging |
GB8618665D0 (en) | 1986-07-31 | 1986-09-10 | British Telecomm | Graphical workstation |
US4790028A (en) | 1986-09-12 | 1988-12-06 | Westinghouse Electric Corp. | Method and apparatus for generating variably scaled displays |
US5765131A (en) | 1986-10-03 | 1998-06-09 | British Telecommunications Public Limited Company | Language translation system and method |
EP0262938B1 (en) | 1986-10-03 | 1993-12-15 | BRITISH TELECOMMUNICATIONS public limited company | Language translation system |
US4837831A (en) | 1986-10-15 | 1989-06-06 | Dragon Systems, Inc. | Method for creating and using multiple-word sound models in speech recognition |
US5083268A (en) | 1986-10-15 | 1992-01-21 | Texas Instruments Incorporated | System and method for parsing natural language by unifying lexical features of words |
WO1988002975A1 (en) | 1986-10-16 | 1988-04-21 | Mitsubishi Denki Kabushiki Kaisha | Amplitude-adapted vector quantizer |
US5123103A (en) | 1986-10-17 | 1992-06-16 | Hitachi, Ltd. | Method and system of retrieving program specification and linking the specification by concept to retrieval request for reusing program parts |
US4829576A (en) | 1986-10-21 | 1989-05-09 | Dragon Systems, Inc. | Voice recognition system |
US4887212A (en) | 1986-10-29 | 1989-12-12 | International Business Machines Corporation | Parser for natural language text |
US4833718A (en) | 1986-11-18 | 1989-05-23 | First Byte | Compression of stored waveforms for artificial speech |
US4852168A (en) | 1986-11-18 | 1989-07-25 | Sprague Richard P | Compression of stored waveforms for artificial speech |
US4727354A (en) | 1987-01-07 | 1988-02-23 | Unisys Corporation | System for selecting best fit vector code in vector quantization encoding |
US4827520A (en) | 1987-01-16 | 1989-05-02 | Prince Corporation | Voice actuated control system for use in a vehicle |
US5179627A (en) | 1987-02-10 | 1993-01-12 | Dictaphone Corporation | Digital dictation system |
US4965763A (en) | 1987-03-03 | 1990-10-23 | International Business Machines Corporation | Computer method for automatic extraction of commonly specified information from business correspondence |
JP2595235B2 (en) | 1987-03-18 | 1997-04-02 | 富士通株式会社 | Speech synthesizer |
US4755811A (en) | 1987-03-24 | 1988-07-05 | Tektronix, Inc. | Touch controlled zoom of waveform displays |
US4803729A (en) | 1987-04-03 | 1989-02-07 | Dragon Systems, Inc. | Speech recognition method |
US5027408A (en) | 1987-04-09 | 1991-06-25 | Kroeker John P | Speech-recognition circuitry employing phoneme estimation |
US5125030A (en) | 1987-04-13 | 1992-06-23 | Kokusai Denshin Denwa Co., Ltd. | Speech signal coding/decoding system based on the type of speech signal |
US5644727A (en) | 1987-04-15 | 1997-07-01 | Proprietary Financial Products, Inc. | System for the operation and management of one or more financial accounts through the use of a digital communication and computation system for exchange, investment and borrowing |
AT386947B (en) | 1987-04-17 | 1988-11-10 | Rochus Marxer | TENSIONABLE THREAD, CONTAINER FOR THIS THREAD, AND HOLDER FOR DENTAL CARE, ESPECIALLY FOR CLEANING THE DENTAL SPACES |
JPS63285598A (en) | 1987-05-18 | 1988-11-22 | ケイディディ株式会社 | Phoneme connection type parameter rule synthesization system |
EP0293259A3 (en) | 1987-05-29 | 1990-03-07 | Kabushiki Kaisha Toshiba | Voice recognition system used in telephone apparatus |
US5231670A (en) | 1987-06-01 | 1993-07-27 | Kurzweil Applied Intelligence, Inc. | Voice controlled system and method for generating text from a voice controlled input |
CA1265623A (en) | 1987-06-11 | 1990-02-06 | Eddy Lee | Method of facilitating computer sorting |
DE3723078A1 (en) | 1987-07-11 | 1989-01-19 | Philips Patentverwaltung | METHOD FOR DETECTING CONTINUOUSLY SPOKEN WORDS |
CA1288516C (en) | 1987-07-31 | 1991-09-03 | Leendert M. Bijnagte | Apparatus and method for communicating textual and image information between a host computer and a remote display terminal |
US4974191A (en) | 1987-07-31 | 1990-11-27 | Syntellect Software Inc. | Adaptive natural language computer interface system |
US4827518A (en) | 1987-08-06 | 1989-05-02 | Bell Communications Research, Inc. | Speaker verification system using integrated circuit cards |
CA1280215C (en) | 1987-09-28 | 1991-02-12 | Eddy Lee | Multilingual ordered data retrieval system |
JP2602847B2 (en) | 1987-09-29 | 1997-04-23 | 株式会社日立製作所 | Multimedia mail system |
US5022081A (en) | 1987-10-01 | 1991-06-04 | Sharp Kabushiki Kaisha | Information recognition system |
WO1989003573A1 (en) | 1987-10-09 | 1989-04-20 | Sound Entertainment, Inc. | Generating speech from digitally stored coarticulated speech segments |
JPH01102599A (en) | 1987-10-12 | 1989-04-20 | Internatl Business Mach Corp <Ibm> | Voice recognition |
US4852173A (en) | 1987-10-29 | 1989-07-25 | International Business Machines Corporation | Design and construction of a binary-tree system for language modelling |
EP0314908B1 (en) | 1987-10-30 | 1992-12-02 | International Business Machines Corporation | Automatic determination of labels and markov word models in a speech recognition system |
US5072452A (en) | 1987-10-30 | 1991-12-10 | International Business Machines Corporation | Automatic determination of labels and Markov word models in a speech recognition system |
US4914586A (en) | 1987-11-06 | 1990-04-03 | Xerox Corporation | Garbage collector for hypermedia systems |
US4992972A (en) | 1987-11-18 | 1991-02-12 | International Business Machines Corporation | Flexible context searchable on-line information system with help files and modules for on-line computer system documentation |
US4908867A (en) | 1987-11-19 | 1990-03-13 | British Telecommunications Public Limited Company | Speech synthesis |
US5220657A (en) | 1987-12-02 | 1993-06-15 | Xerox Corporation | Updating local copy of shared data in a collaborative system |
JP2739945B2 (en) | 1987-12-24 | 1998-04-15 | 株式会社東芝 | Voice recognition method |
US5053758A (en) | 1988-02-01 | 1991-10-01 | Sperry Marine Inc. | Touchscreen control panel with sliding touch control |
US4984177A (en) | 1988-02-05 | 1991-01-08 | Advanced Products And Technologies, Inc. | Voice language translator |
GB2219178A (en) | 1988-02-11 | 1989-11-29 | Benchmark Technologies | State machine controlled video processor |
US5194950A (en) | 1988-02-29 | 1993-03-16 | Mitsubishi Denki Kabushiki Kaisha | Vector quantizer |
US5079723A (en) | 1988-03-04 | 1992-01-07 | Xerox Corporation | Touch dialogue user interface for reproduction machines |
US4994966A (en) | 1988-03-31 | 1991-02-19 | Emerson & Stern Associates, Inc. | System and method for natural language parsing by initiating processing prior to entry of complete sentences |
FI80536C (en) | 1988-04-15 | 1990-06-11 | Nokia Mobira Oy | Matrix display |
US4914590A (en) | 1988-05-18 | 1990-04-03 | Emhart Industries, Inc. | Natural language understanding system |
US4975975A (en) | 1988-05-26 | 1990-12-04 | Gtx Corporation | Hierarchical parametric apparatus and method for recognizing drawn characters |
US5315689A (en) | 1988-05-27 | 1994-05-24 | Kabushiki Kaisha Toshiba | Speech recognition system having word-based and phoneme-based recognition means |
US5029211A (en) | 1988-05-30 | 1991-07-02 | Nec Corporation | Speech analysis and synthesis system |
US5111423A (en) | 1988-07-21 | 1992-05-05 | Altera Corporation | Programmable interface for computer system peripheral circuit card |
FR2636163B1 (en) | 1988-09-02 | 1991-07-05 | Hamon Christian | METHOD AND DEVICE FOR SYNTHESIZING SPEECH BY ADDING-COVERING WAVEFORMS |
US5161102A (en) | 1988-09-09 | 1992-11-03 | Compaq Computer Corporation | Computer interface for the configuration of computer system and circuit boards |
US5257387A (en) | 1988-09-09 | 1993-10-26 | Compaq Computer Corporation | Computer implemented method and apparatus for dynamic and automatic configuration of a computer system and circuit boards including computer resource allocation conflict resolution |
US5353432A (en) | 1988-09-09 | 1994-10-04 | Compaq Computer Corporation | Interactive method for configuration of computer system and circuit boards with user specification of system resources and computer resolution of resource conflicts |
US4839853A (en) | 1988-09-15 | 1989-06-13 | Bell Communications Research, Inc. | Computer information retrieval using latent semantic structure |
JPH0286397A (en) | 1988-09-22 | 1990-03-27 | Nippon Telegr & Teleph Corp <Ntt> | Microphone array |
JPH0293597A (en) | 1988-09-30 | 1990-04-04 | Nippon I B M Kk | Speech recognition device |
US5201034A (en) | 1988-09-30 | 1993-04-06 | Hitachi Ltd. | Interactive intelligent interface |
US4905163A (en) | 1988-10-03 | 1990-02-27 | Minnesota Mining & Manufacturing Company | Intelligent optical navigator dynamic information presentation and navigation system |
US5282265A (en) | 1988-10-04 | 1994-01-25 | Canon Kabushiki Kaisha | Knowledge information processing system |
US4918723A (en) | 1988-10-07 | 1990-04-17 | Jerry R. Iggulden | Keyboard to facsimile machine transmission system |
DE3837590A1 (en) | 1988-11-05 | 1990-05-10 | Ant Nachrichtentech | PROCESS FOR REDUCING THE DATA RATE OF DIGITAL IMAGE DATA |
DE68913669T2 (en) | 1988-11-23 | 1994-07-21 | Digital Equipment Corp | Pronunciation of names by a synthesizer. |
US5027110A (en) | 1988-12-05 | 1991-06-25 | At&T Bell Laboratories | Arrangement for simultaneously displaying on one or more display terminals a series of images |
JPH02153415A (en) | 1988-12-06 | 1990-06-13 | Hitachi Ltd | Keyboard device |
US5027406A (en) | 1988-12-06 | 1991-06-25 | Dragon Systems, Inc. | Method for interactive speech recognition and training |
GB8828796D0 (en) | 1988-12-09 | 1989-01-18 | British Telecomm | Data compression |
US4935954A (en) | 1988-12-28 | 1990-06-19 | At&T Company | Automated message retrieval system |
US5127055A (en) | 1988-12-30 | 1992-06-30 | Kurzweil Applied Intelligence, Inc. | Speech recognition apparatus & method having dynamic reference pattern adaptation |
DE3853885T2 (en) | 1988-12-30 | 1995-09-14 | Ezel Inc | Vectorization process. |
US5293448A (en) | 1989-10-02 | 1994-03-08 | Nippon Telegraph And Telephone Corporation | Speech analysis-synthesis method and apparatus therefor |
US5047614A (en) | 1989-01-23 | 1991-09-10 | Bianco James S | Method and apparatus for computer-aided shopping |
JP2574892B2 (en) | 1989-02-15 | 1997-01-22 | 株式会社日立製作所 | Load sharing control method for automobile |
US5086792A (en) | 1989-02-16 | 1992-02-11 | Placontrol Corp. | Dental floss loop devices, and methods of manufacture and packaging same |
US4928307A (en) | 1989-03-02 | 1990-05-22 | Acs Communications | Time dependent, variable amplitude threshold output circuit for frequency variant and frequency invariant signal discrimination |
SE466029B (en) | 1989-03-06 | 1991-12-02 | Ibm Svenska Ab | DEVICE AND PROCEDURE FOR ANALYSIS OF NATURAL LANGUAGES IN A COMPUTER-BASED INFORMATION PROCESSING SYSTEM |
JPH0636156B2 (en) | 1989-03-13 | 1994-05-11 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Voice recognizer |
JP2763322B2 (en) | 1989-03-13 | 1998-06-11 | キヤノン株式会社 | Audio processing method |
US5033087A (en) | 1989-03-14 | 1991-07-16 | International Business Machines Corp. | Method and apparatus for the automatic determination of phonological rules as for a continuous speech recognition system |
JPH0782544B2 (en) | 1989-03-24 | 1995-09-06 | インターナショナル・ビジネス・マシーンズ・コーポレーション | DP matching method and apparatus using multi-template |
US5003577A (en) | 1989-04-05 | 1991-03-26 | At&T Bell Laboratories | Voice and data interface to a voice-mail service system |
US4977598A (en) | 1989-04-13 | 1990-12-11 | Texas Instruments Incorporated | Efficient pruning algorithm for hidden markov model speech recognition |
US5197005A (en) | 1989-05-01 | 1993-03-23 | Intelligent Business Systems | Database retrieval system having a natural language interface |
US4994983A (en) | 1989-05-02 | 1991-02-19 | Itt Corporation | Automatic speech recognition system using seed templates |
US5287448A (en) | 1989-05-04 | 1994-02-15 | Apple Computer, Inc. | Method and apparatus for providing help information to users of computers |
JP2904283B2 (en) | 1989-05-22 | 1999-06-14 | マツダ株式会社 | Multiplex transmission equipment for vehicles |
US4953106A (en) | 1989-05-23 | 1990-08-28 | At&T Bell Laboratories | Technique for drawing directed graphs |
US5010574A (en) | 1989-06-13 | 1991-04-23 | At&T Bell Laboratories | Vector quantizer search arrangement |
JPH03163623A (en) | 1989-06-23 | 1991-07-15 | Articulate Syst Inc | Voice control computer interface |
JP2527817B2 (en) | 1989-07-14 | 1996-08-28 | シャープ株式会社 | Subject association device and word association device |
JP2940005B2 (en) | 1989-07-20 | 1999-08-25 | 日本電気株式会社 | Audio coding device |
JPH03113578A (en) | 1989-09-27 | 1991-05-14 | Fujitsu Ltd | Graphic output processing system |
US5091945A (en) | 1989-09-28 | 1992-02-25 | At&T Bell Laboratories | Source dependent channel coding with error protection |
US5276616A (en) | 1989-10-16 | 1994-01-04 | Sharp Kabushiki Kaisha | Apparatus for automatically generating index |
CA2027705C (en) | 1989-10-17 | 1994-02-15 | Masami Akamine | Speech coding system utilizing a recursive computation technique for improvement in processing speed |
US5075896A (en) | 1989-10-25 | 1991-12-24 | Xerox Corporation | Character and phoneme recognition based on probability clustering |
US4980916A (en) | 1989-10-26 | 1990-12-25 | General Electric Company | Method for improving speech quality in code excited linear predictive speech coding |
US5020112A (en) | 1989-10-31 | 1991-05-28 | At&T Bell Laboratories | Image recognition method using two-dimensional stochastic grammars |
DE69028072T2 (en) | 1989-11-06 | 1997-01-09 | Canon Kk | Method and device for speech synthesis |
US5220639A (en) | 1989-12-01 | 1993-06-15 | National Science Council | Mandarin speech input method for Chinese computers and a mandarin speech recognition machine |
US5021971A (en) | 1989-12-07 | 1991-06-04 | Unisys Corporation | Reflective binary encoder for vector quantization |
US5179652A (en) | 1989-12-13 | 1993-01-12 | Anthony I. Rozmanith | Method and apparatus for storing, transmitting and retrieving graphical and tabular data |
US5077669A (en) | 1989-12-27 | 1991-12-31 | International Business Machines Corporation | Method for quasi-key search within a national language support (nls) data processing system |
US5091790A (en) | 1989-12-29 | 1992-02-25 | Morton Silverberg | Multipurpose computer accessory for facilitating facsimile communication |
EP0438662A2 (en) | 1990-01-23 | 1991-07-31 | International Business Machines Corporation | Apparatus and method of grouping utterances of a phoneme into context-dependent categories based on sound-similarity for automatic speech recognition |
US5175814A (en) | 1990-01-30 | 1992-12-29 | Digital Equipment Corporation | Direct manipulation interface for boolean information retrieval |
US5218700A (en) | 1990-01-30 | 1993-06-08 | Allen Beechick | Apparatus and method for sorting a list of items |
US5255386A (en) | 1990-02-08 | 1993-10-19 | International Business Machines Corporation | Method and apparatus for intelligent help that matches the semantic similarity of the inferred intent of query or command to a best-fit predefined command intent |
CH681573A5 (en) | 1990-02-13 | 1993-04-15 | Astral | Automatic teller arrangement involving bank computers - is operated by user data card carrying personal data, account information and transaction records |
DE69133296T2 (en) | 1990-02-22 | 2004-01-29 | Nec Corp | speech |
US5067503A (en) | 1990-03-21 | 1991-11-26 | Stile Thomas W | Dental apparatus for flossing teeth |
US5266949A (en) | 1990-03-29 | 1993-11-30 | Nokia Mobile Phones Ltd. | Lighted electronic keyboard |
US5299284A (en) | 1990-04-09 | 1994-03-29 | Arizona Board Of Regents, Acting On Behalf Of Arizona State University | Pattern classification using linear programming |
US5125022A (en) | 1990-05-15 | 1992-06-23 | Vcs Industries, Inc. | Method for recognizing alphanumeric strings spoken over a telephone network |
US5127043A (en) | 1990-05-15 | 1992-06-30 | Vcs Industries, Inc. | Simultaneous speaker-independent voice recognition and verification over a telephone network |
US5301109A (en) | 1990-06-11 | 1994-04-05 | Bell Communications Research, Inc. | Computerized cross-language document retrieval using latent semantic indexing |
JP3266246B2 (en) | 1990-06-15 | 2002-03-18 | インターナシヨナル・ビジネス・マシーンズ・コーポレーシヨン | Natural language analysis apparatus and method, and knowledge base construction method for natural language analysis |
US5202952A (en) | 1990-06-22 | 1993-04-13 | Dragon Systems, Inc. | Large-vocabulary continuous speech prefiltering and processing system |
EP0464712A3 (en) | 1990-06-28 | 1993-01-13 | Kabushiki Kaisha Toshiba | Display/input control system for software keyboard in information processing apparatus having integral display/input device |
DE4023318A1 (en) | 1990-07-21 | 1992-02-20 | Fraunhofer Ges Forschung | METHOD FOR PERFORMING A VARIABLE DIALOG WITH TECHNICAL DEVICES |
US5175536A (en) | 1990-08-01 | 1992-12-29 | Westinghouse Electric Corp. | Apparatus and method for adapting cards designed for a VME bus for use in a VXI bus system |
US5103498A (en) | 1990-08-02 | 1992-04-07 | Tandy Corporation | Intelligent help system |
JPH0493894A (en) | 1990-08-03 | 1992-03-26 | Canon Inc | Method and device for character processing |
DE69131819T2 (en) | 1990-08-09 | 2000-04-27 | Semantic Compaction System Pit | COMMUNICATION SYSTEM WITH TEXT MESSAGE DETECTION BASED ON CONCEPTS THAT ARE ENTERED BY KEYBOARD ICONS |
GB9017600D0 (en) | 1990-08-10 | 1990-09-26 | British Aerospace | An assembly and method for binary tree-searched vector quantisation data compression processing |
DE4126902C2 (en) | 1990-08-15 | 1996-06-27 | Ricoh Kk | Speech interval - detection unit |
US5404295A (en) | 1990-08-16 | 1995-04-04 | Katz; Boris | Method and apparatus for utilizing annotations to facilitate computer retrieval of database material |
US5309359A (en) | 1990-08-16 | 1994-05-03 | Boris Katz | Method and apparatus for generating and utilizing annotations to facilitate computer text retrieval |
US5297170A (en) | 1990-08-21 | 1994-03-22 | Codex Corporation | Lattice and trellis-coded quantization |
US5400434A (en) | 1990-09-04 | 1995-03-21 | Matsushita Electric Industrial Co., Ltd. | Voice source for synthetic speech system |
EP0473864A1 (en) | 1990-09-04 | 1992-03-11 | International Business Machines Corporation | Method and apparatus for paraphrasing information contained in logical forms |
JPH0833739B2 (en) | 1990-09-13 | 1996-03-29 | 三菱電機株式会社 | Pattern expression model learning device |
US5119079A (en) | 1990-09-17 | 1992-06-02 | Xerox Corporation | Touch screen user interface with expanding touch locations for a reprographic machine |
US5216747A (en) | 1990-09-20 | 1993-06-01 | Digital Voice Systems, Inc. | Voiced/unvoiced estimation of an acoustic signal |
US5276794A (en) | 1990-09-25 | 1994-01-04 | Grid Systems Corporation | Pop-up keyboard system for entering handwritten data into computer generated forms |
US5164982A (en) | 1990-09-27 | 1992-11-17 | Radish Communications Systems, Inc. | Telecommunication display system |
US5305205A (en) | 1990-10-23 | 1994-04-19 | Weber Maria L | Computer-assisted transcription apparatus |
US5128672A (en) | 1990-10-30 | 1992-07-07 | Apple Computer, Inc. | Dynamic predictive keyboard |
US5325298A (en) | 1990-11-07 | 1994-06-28 | Hnc, Inc. | Methods for generating or revising context vectors for a plurality of word stems |
US5317507A (en) | 1990-11-07 | 1994-05-31 | Gallant Stephen I | Method for document retrieval and for word sense disambiguation using neural networks |
US5260697A (en) | 1990-11-13 | 1993-11-09 | Wang Laboratories, Inc. | Computer with separate display plane and user interface processor |
US5450523A (en) | 1990-11-15 | 1995-09-12 | Matsushita Electric Industrial Co., Ltd. | Training module for estimating mixture Gaussian densities for speech unit models in speech recognition systems |
US5247579A (en) | 1990-12-05 | 1993-09-21 | Digital Voice Systems, Inc. | Methods for speech transmission |
US5345536A (en) | 1990-12-21 | 1994-09-06 | Matsushita Electric Industrial Co., Ltd. | Method of speech recognition |
US5127053A (en) | 1990-12-24 | 1992-06-30 | General Electric Company | Low-complexity method for improving the performance of autocorrelation-based pitch detectors |
US5133011A (en) | 1990-12-26 | 1992-07-21 | International Business Machines Corporation | Method and apparatus for linear vocal control of cursor position |
US5210689A (en) | 1990-12-28 | 1993-05-11 | Semantic Compaction Systems | System and method for automatically selecting among a plurality of input modes |
US5196838A (en) | 1990-12-28 | 1993-03-23 | Apple Computer, Inc. | Intelligent scrolling |
US5497319A (en) | 1990-12-31 | 1996-03-05 | Trans-Link International Corp. | Machine translation and telecommunications system |
JPH04236624A (en) | 1991-01-18 | 1992-08-25 | Sony Corp | Control system |
US5712949A (en) | 1991-01-29 | 1998-01-27 | Sony Corporation | Disc reproduction system with sequential reproduction of audio and image data |
FI88345C (en) | 1991-01-29 | 1993-04-26 | Nokia Mobile Phones Ltd | Illuminated keyboard |
US5268990A (en) | 1991-01-31 | 1993-12-07 | Sri International | Method for recognizing speech using linguistically-motivated hidden Markov models |
US5369577A (en) | 1991-02-01 | 1994-11-29 | Wang Laboratories, Inc. | Text searching system |
US5613056A (en) | 1991-02-19 | 1997-03-18 | Bright Star Technology, Inc. | Advanced tools for speech synchronized animation |
US5167004A (en) | 1991-02-28 | 1992-11-24 | Texas Instruments Incorporated | Temporal decorrelation method for robust speaker verification |
GB9105367D0 (en) | 1991-03-13 | 1991-04-24 | Univ Strathclyde | Computerised information-retrieval database systems |
EP0505621A3 (en) | 1991-03-28 | 1993-06-02 | International Business Machines Corporation | Improved message recognition employing integrated speech and handwriting information |
US5212821A (en) | 1991-03-29 | 1993-05-18 | At&T Bell Laboratories | Machine-based learning system |
US5327342A (en) | 1991-03-31 | 1994-07-05 | Roy Prannoy L | Method and apparatus for generating personalized handwriting |
KR100318330B1 (en) | 1991-04-08 | 2002-04-22 | 가나이 쓰도무 | Monitoring device |
JP2970964B2 (en) | 1991-09-18 | 1999-11-02 | 株式会社日立製作所 | Monitoring device |
US5303406A (en) | 1991-04-29 | 1994-04-12 | Motorola, Inc. | Noise squelch circuit with adaptive noise shaping |
US5274771A (en) | 1991-04-30 | 1993-12-28 | Hewlett-Packard Company | System for configuring an input/output board in a computer |
US5367640A (en) | 1991-04-30 | 1994-11-22 | Hewlett-Packard Company | System for configuring an input/output board in a computer |
US5341466A (en) | 1991-05-09 | 1994-08-23 | New York University | Fractal computer user centerface with zooming capability |
JP3123558B2 (en) | 1991-05-09 | 2001-01-15 | ソニー株式会社 | Information input processing device and method |
US5202828A (en) | 1991-05-15 | 1993-04-13 | Apple Computer, Inc. | User interface system having programmable user interface elements |
US5500905A (en) | 1991-06-12 | 1996-03-19 | Microelectronics And Computer Technology Corporation | Pattern recognition neural network with saccade-like operation |
US5241619A (en) | 1991-06-25 | 1993-08-31 | Bolt Beranek And Newman Inc. | Word dependent N-best search method |
US5475587A (en) | 1991-06-28 | 1995-12-12 | Digital Equipment Corporation | Method and apparatus for efficient morphological text analysis using a high-level language for compact specification of inflectional paradigms |
US5293452A (en) | 1991-07-01 | 1994-03-08 | Texas Instruments Incorporated | Voice log-in using spoken name input |
US5442780A (en) | 1991-07-11 | 1995-08-15 | Mitsubishi Denki Kabushiki Kaisha | Natural language database retrieval system using virtual tables to convert parsed input phrases into retrieval keys |
US5477451A (en) | 1991-07-25 | 1995-12-19 | International Business Machines Corp. | Method and system for natural language translation |
US5687077A (en) | 1991-07-31 | 1997-11-11 | Universal Dynamics Limited | Method and apparatus for adaptive control |
JPH05197389A (en) | 1991-08-13 | 1993-08-06 | Toshiba Corp | Voice recognition device |
US5278980A (en) | 1991-08-16 | 1994-01-11 | Xerox Corporation | Iterative technique for phrase query formation and an information retrieval system employing same |
US5450522A (en) | 1991-08-19 | 1995-09-12 | U S West Advanced Technologies, Inc. | Auditory model for parametrization of speech |
US5326270A (en) | 1991-08-29 | 1994-07-05 | Introspect Technologies, Inc. | System and method for assessing an individual's task-processing style |
US5199077A (en) | 1991-09-19 | 1993-03-30 | Xerox Corporation | Wordspotting for voice editing and indexing |
DE4131387A1 (en) | 1991-09-20 | 1993-03-25 | Siemens Ag | METHOD FOR RECOGNIZING PATTERNS IN TIME VARIANTS OF MEASURING SIGNALS |
US5488727A (en) | 1991-09-30 | 1996-01-30 | International Business Machines Corporation | Methods to support multimethod function overloading with compile-time type checking |
JP2662120B2 (en) | 1991-10-01 | 1997-10-08 | インターナショナル・ビジネス・マシーンズ・コーポレイション | Speech recognition device and processing unit for speech recognition |
JPH05108065A (en) | 1991-10-15 | 1993-04-30 | Kawai Musical Instr Mfg Co Ltd | Automatic performance device |
JP3155577B2 (en) | 1991-10-16 | 2001-04-09 | キヤノン株式会社 | Character recognition method and device |
US5222146A (en) | 1991-10-23 | 1993-06-22 | International Business Machines Corporation | Speech recognition apparatus having a speech coder outputting acoustic prototype ranks |
US5371853A (en) | 1991-10-28 | 1994-12-06 | University Of Maryland At College Park | Method and system for CELP speech coding and codebook for use therewith |
US5757979A (en) | 1991-10-30 | 1998-05-26 | Fuji Electric Co., Ltd. | Apparatus and method for nonlinear normalization of image |
KR940002854B1 (en) | 1991-11-06 | 1994-04-04 | 한국전기통신공사 | Sound synthesizing system |
US5386494A (en) | 1991-12-06 | 1995-01-31 | Apple Computer, Inc. | Method and apparatus for controlling a speech recognition function using a cursor control device |
JPH05165459A (en) | 1991-12-19 | 1993-07-02 | Toshiba Corp | Enlarging display system |
US5475796A (en) | 1991-12-20 | 1995-12-12 | Nec Corporation | Pitch pattern generation apparatus |
US6081750A (en) | 1991-12-23 | 2000-06-27 | Hoffberg; Steven Mark | Ergonomic man-machine interface incorporating adaptive pattern recognition based control system |
US5903454A (en) | 1991-12-23 | 1999-05-11 | Hoffberg; Linda Irene | Human-factored interface incorporating adaptive pattern recognition based controller apparatus |
US5502790A (en) | 1991-12-24 | 1996-03-26 | Oki Electric Industry Co., Ltd. | Speech recognition method and system using triphones, diphones, and phonemes |
US5349645A (en) | 1991-12-31 | 1994-09-20 | Matsushita Electric Industrial Co., Ltd. | Word hypothesizer for continuous speech decoding using stressed-vowel centered bidirectional tree searches |
JPH05188994A (en) | 1992-01-07 | 1993-07-30 | Sony Corp | Noise suppression device |
US5392419A (en) | 1992-01-24 | 1995-02-21 | Hewlett-Packard Company | Language identification system and method for a peripheral unit |
US5357431A (en) | 1992-01-27 | 1994-10-18 | Fujitsu Limited | Character string retrieval system using index and unit for making the index |
US5274818A (en) | 1992-02-03 | 1993-12-28 | Thinking Machines Corporation | System and method for compiling a fine-grained array based source program onto a coarse-grained hardware |
US5267345A (en) | 1992-02-10 | 1993-11-30 | International Business Machines Corporation | Speech recognition apparatus which predicts word classes from context and words from word classes |
US5621806A (en) | 1992-02-14 | 1997-04-15 | Texas Instruments Incorporated | Apparatus and methods for determining the relative displacement of an object |
US5412735A (en) | 1992-02-27 | 1995-05-02 | Central Institute For The Deaf | Adaptive noise reduction circuit for a sound reproduction system |
EP0559349B1 (en) | 1992-03-02 | 1999-01-07 | AT&T Corp. | Training method and apparatus for speech recognition |
US6055514A (en) | 1992-03-20 | 2000-04-25 | Wren; Stephen Corey | System for marketing foods and services utilizing computerized central and remote facilities |
US5353376A (en) | 1992-03-20 | 1994-10-04 | Texas Instruments Incorporated | System and method for improved speech acquisition for hands-free voice telecommunication in a noisy environment |
US5333266A (en) | 1992-03-27 | 1994-07-26 | International Business Machines Corporation | Method and apparatus for message handling in computer systems |
US5440615A (en) | 1992-03-31 | 1995-08-08 | At&T Corp. | Language selection for voice messaging system |
US5757358A (en) | 1992-03-31 | 1998-05-26 | The United States Of America As Represented By The Secretary Of The Navy | Method and apparatus for enhancing computer-user selection of computer-displayed objects through dynamic selection area and constant visual feedback |
US5390236A (en) | 1992-03-31 | 1995-02-14 | Klausner Patent Technologies | Telephone answering device linking displayed data with recorded audio message |
US5283818A (en) | 1992-03-31 | 1994-02-01 | Klausner Patent Technologies | Telephone answering device linking displayed data with recorded audio message |
CA2088080C (en) | 1992-04-02 | 1997-10-07 | Enrico Luigi Bocchieri | Automatic speech recognizer |
US5317647A (en) | 1992-04-07 | 1994-05-31 | Apple Computer, Inc. | Constrained attribute grammars for syntactic pattern recognition |
JPH05293126A (en) | 1992-04-15 | 1993-11-09 | Matsushita Electric Works Ltd | Dental floss |
US5412804A (en) | 1992-04-30 | 1995-05-02 | Oracle Corporation | Extending the semantics of the outer join operator for un-nesting queries to a data base |
US5745873A (en) | 1992-05-01 | 1998-04-28 | Massachusetts Institute Of Technology | Speech recognition using final decision based on tentative decisions |
US5369575A (en) | 1992-05-15 | 1994-11-29 | International Business Machines Corporation | Constrained natural language interface for a computer system |
US5377103A (en) | 1992-05-15 | 1994-12-27 | International Business Machines Corporation | Constrained natural language interface for a computer that employs a browse function |
US5293584A (en) | 1992-05-21 | 1994-03-08 | International Business Machines Corporation | Speech recognition system for natural language translation |
US5463696A (en) | 1992-05-27 | 1995-10-31 | Apple Computer, Inc. | Recognition system and method for user inputs to a computer system |
US5477447A (en) | 1992-05-27 | 1995-12-19 | Apple Computer, Incorporated | Method and apparatus for providing computer-implemented assistance |
US5390281A (en) | 1992-05-27 | 1995-02-14 | Apple Computer, Inc. | Method and apparatus for deducing user intent and providing computer implemented services |
US5434777A (en) | 1992-05-27 | 1995-07-18 | Apple Computer, Inc. | Method and apparatus for processing natural language |
US5734789A (en) | 1992-06-01 | 1998-03-31 | Hughes Electronics | Voiced, unvoiced or noise modes in a CELP vocoder |
JP2795058B2 (en) | 1992-06-03 | 1998-09-10 | 松下電器産業株式会社 | Time series signal processing device |
US5543588A (en) | 1992-06-08 | 1996-08-06 | Synaptics, Incorporated | Touch pad driven handheld computing device |
US5502774A (en) | 1992-06-09 | 1996-03-26 | International Business Machines Corporation | Automatic recognition of a consistent message using multiple complementary sources of information |
AU4013693A (en) | 1992-06-16 | 1993-12-23 | Honeywell Inc. | A method for utilizing a low resolution touch screen system in a high resolution graphics environment |
JPH064093A (en) | 1992-06-18 | 1994-01-14 | Matsushita Electric Ind Co Ltd | Hmm generating device, hmm storage device, likelihood calculating device, and recognizing device |
US5333275A (en) | 1992-06-23 | 1994-07-26 | Wheatley Barbara J | System and method for time aligning speech |
US5325297A (en) | 1992-06-25 | 1994-06-28 | System Of Multiple-Colored Images For Internationally Listed Estates, Inc. | Computer implemented method and system for storing and retrieving textual data and compressed image data |
US5835732A (en) | 1993-10-28 | 1998-11-10 | Elonex Ip Holdings, Ltd. | Miniature digital assistant having enhanced host communication |
JPH0619965A (en) | 1992-07-01 | 1994-01-28 | Canon Inc | Natural language processor |
US5303308A (en) | 1992-07-07 | 1994-04-12 | Gn Netcom A/S | Audio frequency signal compressing system |
JP3230319B2 (en) | 1992-07-09 | 2001-11-19 | ソニー株式会社 | Sound reproduction device |
US5625554A (en) | 1992-07-20 | 1997-04-29 | Xerox Corporation | Finite-state transduction of related word forms for text indexing and retrieval |
US5325462A (en) | 1992-08-03 | 1994-06-28 | International Business Machines Corporation | System and method for speech synthesis employing improved formant composition |
US5999908A (en) | 1992-08-06 | 1999-12-07 | Abelow; Daniel H. | Customer-based product design module |
JPH0669954A (en) | 1992-08-18 | 1994-03-11 | Fujitsu Ltd | Message supersession notice system |
US5412806A (en) | 1992-08-20 | 1995-05-02 | Hewlett-Packard Company | Calibration of logical cost formulae for queries in a heterogeneous DBMS using synthetic database |
GB9220404D0 (en) | 1992-08-20 | 1992-11-11 | Nat Security Agency | Method of identifying, retrieving and sorting documents |
US5305768A (en) | 1992-08-24 | 1994-04-26 | Product Development (Zgs) Ltd. | Dental flosser units and method of making same |
US5425108A (en) | 1992-09-04 | 1995-06-13 | Industrial Technology Research Institute | Mobile type of automatic identification system for a car plate |
DE4229577A1 (en) | 1992-09-04 | 1994-03-10 | Daimler Benz Ag | Method for speech recognition with which an adaptation of microphone and speech characteristics is achieved |
US5333236A (en) | 1992-09-10 | 1994-07-26 | International Business Machines Corporation | Speech recognizer having a speech coder for an acoustic match based on context-dependent speech-transition acoustic models |
US5982352A (en) | 1992-09-18 | 1999-11-09 | Pryor; Timothy R. | Method for providing human input to a computer |
US5384893A (en) | 1992-09-23 | 1995-01-24 | Emerson & Stern Associates, Inc. | Method and apparatus for speech synthesis based on prosodic analysis |
FR2696036B1 (en) | 1992-09-24 | 1994-10-14 | France Telecom | Method of measuring resemblance between sound samples and device for implementing this method. |
JPH0772840B2 (en) | 1992-09-29 | 1995-08-02 | 日本アイ・ビー・エム株式会社 | Speech model configuration method, speech recognition method, speech recognition device, and speech model training method |
JP2779886B2 (en) | 1992-10-05 | 1998-07-23 | 日本電信電話株式会社 | Wideband audio signal restoration method |
JP2851977B2 (en) | 1992-10-14 | 1999-01-27 | シャープ株式会社 | Playback device |
US5758313A (en) | 1992-10-16 | 1998-05-26 | Mobile Information Systems, Inc. | Method and apparatus for tracking vehicle location |
US5353374A (en) | 1992-10-19 | 1994-10-04 | Loral Aerospace Corporation | Low bit rate voice transmission for use in a noisy environment |
US5636325A (en) | 1992-11-13 | 1997-06-03 | International Business Machines Corporation | Speech synthesis and analysis of dialects |
US5983179A (en) | 1992-11-13 | 1999-11-09 | Dragon Systems, Inc. | Speech recognition system which turns its voice response on for confirmation when it has been turned off without confirmation |
US6092043A (en) | 1992-11-13 | 2000-07-18 | Dragon Systems, Inc. | Apparatuses and method for training and operating speech recognition systems |
DE69327774T2 (en) | 1992-11-18 | 2000-06-21 | Canon Information Syst Inc | Processor for converting data into speech and sequence control for this |
US5455888A (en) | 1992-12-04 | 1995-10-03 | Northern Telecom Limited | Speech bandwidth extension method and apparatus |
US5465401A (en) | 1992-12-15 | 1995-11-07 | Texas Instruments Incorporated | Communication system and methods for enhanced information transfer |
US5335276A (en) | 1992-12-16 | 1994-08-02 | Texas Instruments Incorporated | Communication system and methods for enhanced information transfer |
AU5803394A (en) | 1992-12-17 | 1994-07-04 | Bell Atlantic Network Services, Inc. | Mechanized directory assistance |
US5533182A (en) | 1992-12-22 | 1996-07-02 | International Business Machines Corporation | Aural position indicating mechanism for viewable objects |
US5412756A (en) | 1992-12-22 | 1995-05-02 | Mitsubishi Denki Kabushiki Kaisha | Artificial intelligence software shell for plant operation simulation |
WO1994015286A1 (en) | 1992-12-23 | 1994-07-07 | Taligent, Inc. | Object oriented framework system |
US5373566A (en) | 1992-12-24 | 1994-12-13 | Motorola, Inc. | Neural network-based diacritical marker recognition system and method |
FR2700055B1 (en) | 1992-12-30 | 1995-01-27 | Sextant Avionique | Method for denoising vector speech and device for implementing it. |
US6311157B1 (en) | 1992-12-31 | 2001-10-30 | Apple Computer, Inc. | Assigning meanings to utterances in a speech recognition system |
US5734791A (en) | 1992-12-31 | 1998-03-31 | Apple Computer, Inc. | Rapid tree-based method for vector quantization |
US5390279A (en) | 1992-12-31 | 1995-02-14 | Apple Computer, Inc. | Partitioning speech rules by context for speech recognition |
US5463725A (en) | 1992-12-31 | 1995-10-31 | International Business Machines Corp. | Data processing system graphical user interface which emulates printed material |
US5613036A (en) | 1992-12-31 | 1997-03-18 | Apple Computer, Inc. | Dynamic categories for a speech recognition system |
US5384892A (en) | 1992-12-31 | 1995-01-24 | Apple Computer, Inc. | Dynamic language model for speech recognition |
US5335011A (en) | 1993-01-12 | 1994-08-02 | Bell Communications Research, Inc. | Sound localization system for teleconferencing using self-steering microphone arrays |
JP2752309B2 (en) | 1993-01-19 | 1998-05-18 | 松下電器産業株式会社 | Display device |
US5642466A (en) | 1993-01-21 | 1997-06-24 | Apple Computer, Inc. | Intonation adjustment in text-to-speech systems |
US5490234A (en) | 1993-01-21 | 1996-02-06 | Apple Computer, Inc. | Waveform blending technique for text-to-speech system |
US5878396A (en) | 1993-01-21 | 1999-03-02 | Apple Computer, Inc. | Method and apparatus for synthetic speech in facial animation |
US6122616A (en) | 1993-01-21 | 2000-09-19 | Apple Computer, Inc. | Method and apparatus for diphone aliasing |
EP0609030B1 (en) | 1993-01-26 | 1999-06-09 | Sun Microsystems, Inc. | Method and apparatus for browsing information in a computer database |
US5491758A (en) | 1993-01-27 | 1996-02-13 | International Business Machines Corporation | Automatic handwriting recognition using both static and dynamic parameters |
US5890122A (en) | 1993-02-08 | 1999-03-30 | Microsoft Corporation | Voice-controlled computer simulateously displaying application menu and list of available commands |
US5449368A (en) | 1993-02-18 | 1995-09-12 | Kuzmak; Lubomyr I. | Laparoscopic adjustable gastric banding device and method for implantation and removal thereof |
US5864844A (en) | 1993-02-18 | 1999-01-26 | Apple Computer, Inc. | System and method for enhancing a user interface with a computer based training tool |
US5473728A (en) | 1993-02-24 | 1995-12-05 | The United States Of America As Represented By The Secretary Of The Navy | Training of homoscedastic hidden Markov models for automatic speech recognition |
US5467425A (en) | 1993-02-26 | 1995-11-14 | International Business Machines Corporation | Building scalable N-gram language models using maximum likelihood maximum entropy N-gram models |
CA2091658A1 (en) | 1993-03-15 | 1994-09-16 | Matthew Lennig | Method and apparatus for automation of directory assistance using speech recognition |
CA2119397C (en) | 1993-03-19 | 2007-10-02 | Kim E.A. Silverman | Improved automated voice synthesis employing enhanced prosodic treatment of text, spelling of text and rate of annunciation |
JPH06274586A (en) | 1993-03-22 | 1994-09-30 | Mitsubishi Electric Corp | Displaying system |
US6055531A (en) | 1993-03-24 | 2000-04-25 | Engate Incorporated | Down-line transcription system having context sensitive searching capability |
ES2139066T3 (en) | 1993-03-26 | 2000-02-01 | British Telecomm | Conversion of text to a wave form. |
US5536902A (en) | 1993-04-14 | 1996-07-16 | Yamaha Corporation | Method of and apparatus for analyzing and synthesizing a sound by extracting and controlling a sound parameter |
US5444823A (en) | 1993-04-16 | 1995-08-22 | Compaq Computer Corporation | Intelligent search engine for associated on-line documentation having questionless case-based knowledge base |
US6496793B1 (en) | 1993-04-21 | 2002-12-17 | Borland Software Corporation | System and methods for national language support with embedded locale-specific language driver identifiers |
CA2095452C (en) | 1993-05-04 | 1997-03-18 | Phillip J. Beaudet | Dynamic hierarchical selection menu |
US5428731A (en) | 1993-05-10 | 1995-06-27 | Apple Computer, Inc. | Interactive multimedia delivery engine |
US5860064A (en) | 1993-05-13 | 1999-01-12 | Apple Computer, Inc. | Method and apparatus for automatic generation of vocal emotion in a synthetic text-to-speech system |
EP0626635B1 (en) | 1993-05-24 | 2003-03-05 | Sun Microsystems, Inc. | Improved graphical user interface with method for interfacing to remote devices |
US5652897A (en) | 1993-05-24 | 1997-07-29 | Unisys Corporation | Robust language processor for segmenting and parsing-language containing multiple instructions |
JPH06332617A (en) | 1993-05-25 | 1994-12-02 | Pfu Ltd | Display method in touch panel input device |
US5710922A (en) | 1993-06-02 | 1998-01-20 | Apple Computer, Inc. | Method for synchronizing and archiving information between computer systems |
WO1994029788A1 (en) | 1993-06-15 | 1994-12-22 | Honeywell Inc. | A method for utilizing a low resolution touch screen system in a high resolution graphics environment |
KR950001695A (en) | 1993-06-18 | 1995-01-03 | 오오가 노리오 | Disc player |
US5481739A (en) | 1993-06-23 | 1996-01-02 | Apple Computer, Inc. | Vector quantization using thresholds |
US5574823A (en) | 1993-06-23 | 1996-11-12 | Her Majesty The Queen In Right Of Canada As Represented By The Minister Of Communications | Frequency selective harmonic coding |
US5515475A (en) | 1993-06-24 | 1996-05-07 | Northern Telecom Limited | Speech recognition method using a two-pass search |
JPH0756933A (en) | 1993-06-24 | 1995-03-03 | Xerox Corp | Method for retrieval of document |
JP3685812B2 (en) | 1993-06-29 | 2005-08-24 | ソニー株式会社 | Audio signal transmitter / receiver |
JP2648558B2 (en) | 1993-06-29 | 1997-09-03 | インターナショナル・ビジネス・マシーンズ・コーポレイション | Information selection device and information selection method |
US5860075A (en) | 1993-06-30 | 1999-01-12 | Matsushita Electric Industrial Co., Ltd. | Document data filing apparatus for generating visual attribute values of document data to be filed |
US5973676A (en) | 1993-06-30 | 1999-10-26 | Kabushiki Kaisha Toshiba | Input apparatus suitable for portable electronic device |
US5794207A (en) | 1996-09-04 | 1998-08-11 | Walker Asset Management Limited Partnership | Method and apparatus for a cryptographically assisted commercial network system designed to facilitate buyer-driven conditional purchase offers |
AU7323694A (en) | 1993-07-07 | 1995-02-06 | Inference Corporation | Case-based organizing and querying of a database |
JPH0736882A (en) | 1993-07-19 | 1995-02-07 | Fujitsu Ltd | Dictionary retrieving device |
US5818182A (en) | 1993-08-13 | 1998-10-06 | Apple Computer, Inc. | Removable media ejection system |
US5495604A (en) | 1993-08-25 | 1996-02-27 | Asymetrix Corporation | Method and apparatus for the modeling and query of database structures using natural language-like constructs |
US5619694A (en) | 1993-08-26 | 1997-04-08 | Nec Corporation | Case database storage/retrieval system |
US5940811A (en) | 1993-08-27 | 1999-08-17 | Affinity Technology Group, Inc. | Closed loop financial transaction method and apparatus |
US5377258A (en) | 1993-08-30 | 1994-12-27 | National Medical Research Council | Method and apparatus for an automated and interactive behavioral guidance system |
US5627939A (en) | 1993-09-03 | 1997-05-06 | Microsoft Corporation | Speech recognition system and method employing data compression |
US5500937A (en) | 1993-09-08 | 1996-03-19 | Apple Computer, Inc. | Method and apparatus for editing an inked object while simultaneously displaying its recognized object |
US5568540A (en) | 1993-09-13 | 1996-10-22 | Active Voice Corporation | Method and apparatus for selecting and playing a voice mail message |
US6594688B2 (en) | 1993-10-01 | 2003-07-15 | Collaboration Properties, Inc. | Dedicated echo canceler for a workstation |
US5689641A (en) | 1993-10-01 | 1997-11-18 | Vicor, Inc. | Multimedia collaboration system arrangement for routing compressed AV signal through a participant site without decompressing the AV signal |
US5873056A (en) | 1993-10-12 | 1999-02-16 | The Syracuse University | Natural language processing system for semantic vector representation which accounts for lexical ambiguity |
JP2986345B2 (en) | 1993-10-18 | 1999-12-06 | インターナショナル・ビジネス・マシーンズ・コーポレイション | Voice recording indexing apparatus and method |
US5708659A (en) | 1993-10-20 | 1998-01-13 | Lsi Logic Corporation | Method for hashing in a packet network switching system |
JP3697276B2 (en) | 1993-10-27 | 2005-09-21 | ゼロックス コーポレイション | Image display method, image display apparatus, and image scaling method |
US5422656A (en) | 1993-11-01 | 1995-06-06 | International Business Machines Corp. | Personal communicator having improved contrast control for a liquid crystal, touch sensitive display |
JP2813728B2 (en) | 1993-11-01 | 1998-10-22 | インターナショナル・ビジネス・マシーンズ・コーポレイション | Personal communication device with zoom / pan function |
US5977950A (en) | 1993-11-29 | 1999-11-02 | Motorola, Inc. | Manually controllable cursor in a virtual image |
AU1303595A (en) | 1993-12-14 | 1995-07-03 | Apple Computer, Inc. | Method and apparatus for transferring data between a computer and a peripheral storage device |
US5578808A (en) | 1993-12-22 | 1996-11-26 | Datamark Services, Inc. | Data card that can be used for transactions involving separate card issuers |
ZA948426B (en) | 1993-12-22 | 1995-06-30 | Qualcomm Inc | Distributed voice recognition system |
US5384671A (en) | 1993-12-23 | 1995-01-24 | Quantum Corporation | PRML sampled data channel synchronous servo detector |
WO1995017711A1 (en) | 1993-12-23 | 1995-06-29 | Diacom Technologies, Inc. | Method and apparatus for implementing user feedback |
JP2610114B2 (en) | 1993-12-30 | 1997-05-14 | インターナショナル・ビジネス・マシーンズ・コーポレイション | Pointing system, computer system and force response method |
US5621859A (en) | 1994-01-19 | 1997-04-15 | Bbn Corporation | Single tree method for grammar directed, very large vocabulary speech recognizer |
US5577164A (en) | 1994-01-28 | 1996-11-19 | Canon Kabushiki Kaisha | Incorrect voice command recognition prevention and recovery processing method and apparatus |
US5583993A (en) | 1994-01-31 | 1996-12-10 | Apple Computer, Inc. | Method and apparatus for synchronously sharing data among computer |
US5577135A (en) | 1994-03-01 | 1996-11-19 | Apple Computer, Inc. | Handwriting signal processing front-end for handwriting recognizers |
AU684872B2 (en) | 1994-03-10 | 1998-01-08 | Cable And Wireless Plc | Communication system |
US5548507A (en) | 1994-03-14 | 1996-08-20 | International Business Machines Corporation | Language identification process using coded language words |
US5724406A (en) | 1994-03-22 | 1998-03-03 | Ericsson Messaging Systems, Inc. | Call processing system and method for providing a variety of messaging services |
US5584024A (en) | 1994-03-24 | 1996-12-10 | Software Ag | Interactive database query system and method for prohibiting the selection of semantically incorrect query parameters |
US5574824A (en) | 1994-04-11 | 1996-11-12 | The United States Of America As Represented By The Secretary Of The Air Force | Analysis/synthesis-based microphone array speech enhancer with variable signal distortion |
CH689410A5 (en) | 1994-04-21 | 1999-03-31 | Info Byte Ag | Method and apparatus for voice-activated remote control of electrical loads. |
GB9408042D0 (en) | 1994-04-22 | 1994-06-15 | Hewlett Packard Co | Device for managing voice data |
US5642519A (en) | 1994-04-29 | 1997-06-24 | Sun Microsystems, Inc. | Speech interpreter with a unified grammer compiler |
US5786803A (en) | 1994-05-09 | 1998-07-28 | Apple Computer, Inc. | System and method for adjusting the illumination characteristics of an output device |
US5670985A (en) | 1994-05-09 | 1997-09-23 | Apple Computer, Inc. | System and method for adjusting the output of an output device to compensate for ambient illumination |
US5828768A (en) | 1994-05-11 | 1998-10-27 | Noise Cancellation Technologies, Inc. | Multimedia personal computer with active noise reduction and piezo speakers |
US5596260A (en) | 1994-05-13 | 1997-01-21 | Apple Computer, Inc. | Apparatus and method for determining a charge of a battery |
JPH07320079A (en) | 1994-05-20 | 1995-12-08 | Nippon Telegr & Teleph Corp <Ntt> | Method and device for partial enlargement display of figure |
JPH07320051A (en) | 1994-05-20 | 1995-12-08 | Nippon Telegr & Teleph Corp <Ntt> | Method and device for enlargement and reduction display in optional area of graphic |
DE69520302T2 (en) | 1994-05-25 | 2001-08-09 | Victor Company Of Japan | Data transfer device with variable transmission rate |
JPH07325591A (en) | 1994-05-31 | 1995-12-12 | Nec Corp | Method and device for generating imitated musical sound performance environment |
US5535121A (en) | 1994-06-01 | 1996-07-09 | Mitsubishi Electric Research Laboratories, Inc. | System for correcting auxiliary verb sequences |
US5521816A (en) | 1994-06-01 | 1996-05-28 | Mitsubishi Electric Research Laboratories, Inc. | Word inflection correction system |
US5537317A (en) | 1994-06-01 | 1996-07-16 | Mitsubishi Electric Research Laboratories Inc. | System for correcting grammer based parts on speech probability |
US5485372A (en) | 1994-06-01 | 1996-01-16 | Mitsubishi Electric Research Laboratories, Inc. | System for underlying spelling recovery |
US5477448A (en) | 1994-06-01 | 1995-12-19 | Mitsubishi Electric Research Laboratories, Inc. | System for correcting improper determiners |
US5644656A (en) | 1994-06-07 | 1997-07-01 | Massachusetts Institute Of Technology | Method and apparatus for automated text recognition |
US5493677A (en) | 1994-06-08 | 1996-02-20 | Systems Research & Applications Corporation | Generation, archiving, and retrieval of digital images with evoked suggestion-set captions and natural language interface |
US5812697A (en) | 1994-06-10 | 1998-09-22 | Nippon Steel Corporation | Method and apparatus for recognizing hand-written characters using a weighting dictionary |
US5675819A (en) | 1994-06-16 | 1997-10-07 | Xerox Corporation | Document information retrieval using global word co-occurrence patterns |
JPH0869470A (en) | 1994-06-21 | 1996-03-12 | Canon Inc | Natural language processing device and method |
US5948040A (en) | 1994-06-24 | 1999-09-07 | Delorme Publishing Co. | Travel reservation information and planning system |
US5610812A (en) | 1994-06-24 | 1997-03-11 | Mitsubishi Electric Information Technology Center America, Inc. | Contextual tagger utilizing deterministic finite state transducer |
US5581484A (en) | 1994-06-27 | 1996-12-03 | Prince; Kevin R. | Finger mounted computer input device |
DE69533479T2 (en) | 1994-07-01 | 2005-09-22 | Palm Computing, Inc., Los Altos | Character set with characters from multiple bars and handwriting identification system |
US6442523B1 (en) | 1994-07-22 | 2002-08-27 | Steven H. Siegel | Method for the auditory navigation of text |
US5568536A (en) | 1994-07-25 | 1996-10-22 | International Business Machines Corporation | Selective reconfiguration method and apparatus in a multiple application personal communications device |
CN1059303C (en) | 1994-07-25 | 2000-12-06 | 国际商业机器公司 | Apparatus and method for marking text on a display screen in a personal communications device |
JP3359745B2 (en) | 1994-07-29 | 2002-12-24 | シャープ株式会社 | Moving image reproducing device and moving image recording device |
JP3586777B2 (en) | 1994-08-17 | 2004-11-10 | 富士通株式会社 | Voice input device |
JP3565453B2 (en) | 1994-08-23 | 2004-09-15 | キヤノン株式会社 | Image input / output device |
US6137476A (en) | 1994-08-25 | 2000-10-24 | International Business Machines Corp. | Data mouse |
JPH0877173A (en) | 1994-09-01 | 1996-03-22 | Fujitsu Ltd | System and method for correcting character string |
US5559301A (en) | 1994-09-15 | 1996-09-24 | Korg, Inc. | Touchscreen interface having pop-up variable adjustment displays for controllers and audio processing systems |
DE69524340T2 (en) | 1994-09-22 | 2002-08-14 | Aisin Aw Co | Touch display for an information entry system |
GB9419388D0 (en) | 1994-09-26 | 1994-11-09 | Canon Kk | Speech analysis |
JP3027321B2 (en) | 1994-09-27 | 2000-04-04 | 財団法人工業技術研究院 | Method and apparatus for online recognition of unrestricted handwritten alphanumeric characters |
US5799268A (en) | 1994-09-28 | 1998-08-25 | Apple Computer, Inc. | Method for extracting knowledge from online documentation and creating a glossary, index, help database or the like |
IT1266943B1 (en) | 1994-09-29 | 1997-01-21 | Cselt Centro Studi Lab Telecom | Voice synthesis procedure by concatenation and partial overlapping of wave forms. |
US5682539A (en) | 1994-09-29 | 1997-10-28 | Conrad; Donovan | Anticipated meaning natural language interface |
US5715468A (en) | 1994-09-30 | 1998-02-03 | Budzinski; Robert Lucius | Memory system for storing and retrieving experience and knowledge with natural language |
GB2293667B (en) | 1994-09-30 | 1998-05-27 | Intermation Limited | Database management system |
US5777614A (en) | 1994-10-14 | 1998-07-07 | Hitachi, Ltd. | Editing support system including an interactive interface |
US5661787A (en) | 1994-10-27 | 1997-08-26 | Pocock; Michael H. | System for on-demand remote access to a self-generating audio recording, storage, indexing and transaction system |
US5845255A (en) | 1994-10-28 | 1998-12-01 | Advanced Health Med-E-Systems Corporation | Prescription management system |
JPH08138321A (en) | 1994-11-11 | 1996-05-31 | Pioneer Electron Corp | Disc player |
US5613122A (en) | 1994-11-14 | 1997-03-18 | Object Technology Licensing Corp. | Object-oriented operating system |
US5652884A (en) | 1994-11-14 | 1997-07-29 | Object Technology Licensing Corp. | Method and apparatus for dynamic update of an existing object in an object editor |
US5577241A (en) | 1994-12-07 | 1996-11-19 | Excite, Inc. | Information retrieval system and method with implementation extensible query architecture |
US5748974A (en) | 1994-12-13 | 1998-05-05 | International Business Machines Corporation | Multimodal natural language interface for cross-application tasks |
DE4445023A1 (en) | 1994-12-16 | 1996-06-20 | Thomson Brandt Gmbh | Vibration resistant player with reduced energy consumption |
JPH08185265A (en) | 1994-12-28 | 1996-07-16 | Fujitsu Ltd | Touch panel controller |
US5682475A (en) | 1994-12-30 | 1997-10-28 | International Business Machines Corporation | Method and system for variable password access |
US5774859A (en) | 1995-01-03 | 1998-06-30 | Scientific-Atlanta, Inc. | Information system having a speech interface |
US5794050A (en) | 1995-01-04 | 1998-08-11 | Intelligent Text Processing, Inc. | Natural language understanding system |
US5835077A (en) | 1995-01-13 | 1998-11-10 | Remec, Inc. | Computer control device |
US5634084A (en) | 1995-01-20 | 1997-05-27 | Centigram Communications Corporation | Abbreviation and acronym/initialism expansion procedures for a text to speech reader |
DE69637733D1 (en) | 1995-02-13 | 2008-12-11 | Intertrust Tech Corp | Systems and method for safe transmission |
US5565888A (en) | 1995-02-17 | 1996-10-15 | International Business Machines Corporation | Method and apparatus for improving visibility and selectability of icons |
JPH08227341A (en) | 1995-02-22 | 1996-09-03 | Mitsubishi Electric Corp | User interface |
US6009237A (en) | 1995-02-24 | 1999-12-28 | Hitachi Ltd. | Optical disk and optical disk reproduction apparatus |
US5748512A (en) | 1995-02-28 | 1998-05-05 | Microsoft Corporation | Adjusting keyboard |
US5543897A (en) | 1995-03-07 | 1996-08-06 | Eastman Kodak Company | Reproduction apparatus having touch screen operator interface and auxiliary keyboard |
US5701400A (en) | 1995-03-08 | 1997-12-23 | Amado; Carlos Armando | Method and apparatus for applying if-then-else rules to data sets in a relational data base and generating from the results of application of said rules a database of diagnostics linked to said data sets to aid executive analysis of financial data |
US5564446A (en) | 1995-03-27 | 1996-10-15 | Wiltshire; Curtis B. | Dental floss device and applicator assembly |
US5749081A (en) | 1995-04-06 | 1998-05-05 | Firefly Network, Inc. | System and method for recommending items to a user |
EP0820626B1 (en) | 1995-04-12 | 2001-10-10 | BRITISH TELECOMMUNICATIONS public limited company | Waveform speech synthesis |
US5616876A (en) | 1995-04-19 | 1997-04-01 | Microsoft Corporation | System and methods for selecting music on the basis of subjective content |
US5943049A (en) | 1995-04-27 | 1999-08-24 | Casio Computer Co., Ltd. | Image processor for displayed message, balloon, and character's face |
US5642464A (en) | 1995-05-03 | 1997-06-24 | Northern Telecom Limited | Methods and apparatus for noise conditioning in digital speech compression systems using linear predictive coding |
US5812698A (en) | 1995-05-12 | 1998-09-22 | Synaptics, Inc. | Handwriting recognition system and method |
US5708822A (en) | 1995-05-31 | 1998-01-13 | Oracle Corporation | Methods and apparatus for thematic parsing of discourse |
TW338815B (en) | 1995-06-05 | 1998-08-21 | Motorola Inc | Method and apparatus for character recognition of handwritten input |
US6268859B1 (en) | 1995-06-06 | 2001-07-31 | Apple Computer, Inc. | Method and system for rendering overlapping opaque graphical objects in graphic imaging systems |
US5920327A (en) | 1995-06-06 | 1999-07-06 | Microsoft Corporation | Multiple resolution data display |
US5664055A (en) | 1995-06-07 | 1997-09-02 | Lucent Technologies Inc. | CS-ACELP speech compression system with adaptive pitch prediction filter gain based on a measure of periodicity |
US6496182B1 (en) | 1995-06-07 | 2002-12-17 | Microsoft Corporation | Method and system for providing touch-sensitive screens for the visually impaired |
US5991441A (en) | 1995-06-07 | 1999-11-23 | Wang Laboratories, Inc. | Real time handwriting recognition system |
FI99072C (en) | 1995-06-08 | 1997-09-25 | Nokia Telecommunications Oy | A method for issuing delivery confirmations of message deliveries over a telephone network |
JP3385146B2 (en) | 1995-06-13 | 2003-03-10 | シャープ株式会社 | Conversational sentence translator |
US6330538B1 (en) | 1995-06-13 | 2001-12-11 | British Telecommunications Public Limited Company | Phonetic unit duration adjustment for text-to-speech system |
US5710886A (en) | 1995-06-16 | 1998-01-20 | Sellectsoft, L.C. | Electric couponing method and apparatus |
JP3284832B2 (en) | 1995-06-22 | 2002-05-20 | セイコーエプソン株式会社 | Speech recognition dialogue processing method and speech recognition dialogue device |
JPH0918585A (en) | 1995-07-03 | 1997-01-17 | Matsushita Electric Ind Co Ltd | Voice mail system |
JPH0916598A (en) | 1995-07-03 | 1997-01-17 | Fujitsu Ltd | System and method for character string correction using error pattern |
US6038533A (en) | 1995-07-07 | 2000-03-14 | Lucent Technologies Inc. | System and method for selecting training text |
US5684513A (en) | 1995-07-17 | 1997-11-04 | Decker; Mark Randall | Electronic luminescence keyboard system for a portable device |
US5760760A (en) | 1995-07-17 | 1998-06-02 | Dell Usa, L.P. | Intelligent LCD brightness control system |
US5949961A (en) | 1995-07-19 | 1999-09-07 | International Business Machines Corporation | Word syllabification in speech synthesis system |
US5999895A (en) | 1995-07-24 | 1999-12-07 | Forest; Donald K. | Sound operated menu method and apparatus |
US5864815A (en) | 1995-07-31 | 1999-01-26 | Microsoft Corporation | Method and system for displaying speech recognition status information in a visual notification area |
KR0183726B1 (en) | 1995-07-31 | 1999-04-15 | 윤종용 | Cd regenerative apparatus regenerating signal from cd ok and video cd |
US5724985A (en) | 1995-08-02 | 1998-03-10 | Pacesetter, Inc. | User interface for an implantable medical device using an integrated digitizer display screen |
JPH0955792A (en) | 1995-08-11 | 1997-02-25 | Ricoh Co Ltd | Voice mail system |
US6026388A (en) | 1995-08-16 | 2000-02-15 | Textwise, Llc | User interface and other enhancements for natural language information retrieval system and method |
US5835721A (en) | 1995-08-21 | 1998-11-10 | Apple Computer, Inc. | Method and system for data transmission over a network link between computers with the ability to withstand temporary interruptions |
JP3697748B2 (en) | 1995-08-21 | 2005-09-21 | セイコーエプソン株式会社 | Terminal, voice recognition device |
WO1997008685A2 (en) | 1995-08-28 | 1997-03-06 | Philips Electronics N.V. | Method and system for pattern recognition based on dynamically constructing a subset of reference vectors |
KR19990044066A (en) | 1995-09-02 | 1999-06-25 | 에이지마. 헨리 | Loudspeaker with panel acoustic radiation element |
US5570324A (en) | 1995-09-06 | 1996-10-29 | Northrop Grumman Corporation | Underwater sound localization system |
US5712957A (en) | 1995-09-08 | 1998-01-27 | Carnegie Mellon University | Locating and correcting erroneously recognized portions of utterances by rescoring based on two n-best lists |
US5855000A (en) | 1995-09-08 | 1998-12-29 | Carnegie Mellon University | Method and apparatus for correcting and repairing machine-transcribed input using independent or cross-modal secondary input |
DE19533541C1 (en) | 1995-09-11 | 1997-03-27 | Daimler Benz Aerospace Ag | Method for the automatic control of one or more devices by voice commands or by voice dialog in real time and device for executing the method |
WO1997010586A1 (en) | 1995-09-14 | 1997-03-20 | Ericsson Inc. | System for adaptively filtering audio signals to enhance speech intelligibility in noisy environmental conditions |
US5790978A (en) | 1995-09-15 | 1998-08-04 | Lucent Technologies, Inc. | System and method for determining pitch contours |
US5737734A (en) | 1995-09-15 | 1998-04-07 | Infonautics Corporation | Query word relevance adjustment in a search of an information retrieval system |
US6173261B1 (en) | 1998-09-30 | 2001-01-09 | At&T Corp | Grammar fragment acquisition using syntactic and semantic clustering |
JPH0981320A (en) | 1995-09-20 | 1997-03-28 | Matsushita Electric Ind Co Ltd | Pen input type selection input device and method therefor |
US5771276A (en) | 1995-10-10 | 1998-06-23 | Ast Research, Inc. | Voice templates for interactive voice mail and voice response system |
US5884323A (en) | 1995-10-13 | 1999-03-16 | 3Com Corporation | Extendible method and apparatus for synchronizing files on two different computer systems |
US6560707B2 (en) | 1995-11-06 | 2003-05-06 | Xerox Corporation | Multimedia coordination system |
US5799276A (en) | 1995-11-07 | 1998-08-25 | Accent Incorporated | Knowledge-based speech recognition system and methods having frame length computed based upon estimated pitch period of vocalic intervals |
JPH09146708A (en) | 1995-11-09 | 1997-06-06 | Internatl Business Mach Corp <Ibm> | Driving method for touch panel and touch input method |
JP3152871B2 (en) | 1995-11-10 | 2001-04-03 | 富士通株式会社 | Dictionary search apparatus and method for performing a search using a lattice as a key |
US5794237A (en) | 1995-11-13 | 1998-08-11 | International Business Machines Corporation | System and method for improving problem source identification in computer systems employing relevance feedback and statistical source ranking |
US5799279A (en) | 1995-11-13 | 1998-08-25 | Dragon Systems, Inc. | Continuous speech recognition of text and commands |
US6064959A (en) | 1997-03-28 | 2000-05-16 | Dragon Systems, Inc. | Error correction in speech recognition |
US5802526A (en) | 1995-11-15 | 1998-09-01 | Microsoft Corporation | System and method for graphically displaying and navigating through an interactive voice response menu |
US5801692A (en) | 1995-11-30 | 1998-09-01 | Microsoft Corporation | Audio-visual user interface controls |
US6240384B1 (en) | 1995-12-04 | 2001-05-29 | Kabushiki Kaisha Toshiba | Speech synthesis method |
US5987401A (en) | 1995-12-08 | 1999-11-16 | Apple Computer, Inc. | Language translation for real-time text-based conversations |
US5880731A (en) | 1995-12-14 | 1999-03-09 | Microsoft Corporation | Use of avatars with automatic gesturing and bounded interaction in on-line chat session |
US5761640A (en) | 1995-12-18 | 1998-06-02 | Nynex Science & Technology, Inc. | Name and address processor |
US5706442A (en) | 1995-12-20 | 1998-01-06 | Block Financial Corporation | System for on-line financial services using distributed objects |
JPH09179719A (en) | 1995-12-26 | 1997-07-11 | Nec Corp | Voice synthesizer |
US5787422A (en) | 1996-01-11 | 1998-07-28 | Xerox Corporation | Method and apparatus for information accesss employing overlapping clusters |
WO1997026612A1 (en) | 1996-01-17 | 1997-07-24 | Personal Agents, Inc. | Intelligent agents for electronic commerce |
US6119101A (en) | 1996-01-17 | 2000-09-12 | Personal Agents, Inc. | Intelligent agents for electronic commerce |
US6125356A (en) | 1996-01-18 | 2000-09-26 | Rosefaire Development, Ltd. | Portable sales presentation system with selective scripted seller prompts |
US6011585A (en) | 1996-01-19 | 2000-01-04 | Apple Computer, Inc. | Apparatus and method for rotating the display orientation of a captured image |
JPH09265731A (en) | 1996-01-24 | 1997-10-07 | Sony Corp | Speech reproducing device and its method, speech recording device and its method, speech recording and reproducing system, speech data transfer method, information receiving device, and reproducing device |
US5987404A (en) | 1996-01-29 | 1999-11-16 | International Business Machines Corporation | Statistical natural language understanding using hidden clumpings |
US5729694A (en) | 1996-02-06 | 1998-03-17 | The Regents Of The University Of California | Speech coding, reconstruction and recognition using acoustics and electromagnetic waves |
US6535610B1 (en) | 1996-02-07 | 2003-03-18 | Morgan Stanley & Co. Incorporated | Directional microphone utilizing spaced apart omni-directional microphones |
US6076088A (en) | 1996-02-09 | 2000-06-13 | Paik; Woojin | Information extraction system and method using concept relation concept (CRC) triples |
US5737487A (en) | 1996-02-13 | 1998-04-07 | Apple Computer, Inc. | Speaker adaptation based on lateral tying for large-vocabulary continuous speech recognition |
US5864868A (en) | 1996-02-13 | 1999-01-26 | Contois; David C. | Computer control system and user interface for media playing devices |
US5835893A (en) | 1996-02-15 | 1998-11-10 | Atr Interpreting Telecommunications Research Labs | Class-based word clustering for speech recognition using a three-level balanced hierarchical similarity |
FI102343B (en) | 1996-02-20 | 1998-11-13 | Sonera Oyj | System and method for transmitting data |
GB2310559B (en) | 1996-02-23 | 2000-09-20 | Nokia Mobile Phones Ltd | Audio output apparatus for a mobile communication device |
US5864855A (en) | 1996-02-26 | 1999-01-26 | The United States Of America As Represented By The Secretary Of The Army | Parallel document clustering process |
EP0823112B1 (en) | 1996-02-27 | 2002-05-02 | Koninklijke Philips Electronics N.V. | Method and apparatus for automatic speech segmentation into phoneme-like units |
US5895448A (en) | 1996-02-29 | 1999-04-20 | Nynex Science And Technology, Inc. | Methods and apparatus for generating and using speaker independent garbage models for speaker dependent speech recognition purpose |
US6226533B1 (en) | 1996-02-29 | 2001-05-01 | Sony Corporation | Voice messaging transceiver message duration indicator and method |
US5842165A (en) | 1996-02-29 | 1998-11-24 | Nynex Science & Technology, Inc. | Methods and apparatus for generating and using garbage models for speaker dependent speech recognition purposes |
US6069622A (en) | 1996-03-08 | 2000-05-30 | Microsoft Corporation | Method and system for generating comic panels |
GB9605216D0 (en) | 1996-03-12 | 1996-05-15 | Ncr Int Inc | Display system and method of moving a cursor of the display system |
JP3160707B2 (en) | 1996-03-22 | 2001-04-25 | 富士通株式会社 | Data transmitting / receiving device, data transmitting device, and data receiving device |
JPH09265457A (en) | 1996-03-29 | 1997-10-07 | Hitachi Ltd | On-line conversation system |
WO1997037346A1 (en) | 1996-03-29 | 1997-10-09 | British Telecommunications Public Limited Company | Speech processing |
US5901287A (en) | 1996-04-01 | 1999-05-04 | The Sabre Group Inc. | Information aggregation and synthesization system |
US5867799A (en) | 1996-04-04 | 1999-02-02 | Lang; Andrew K. | Information system and method for filtering a massive flow of information entities to meet user information classification needs |
US5790671A (en) | 1996-04-04 | 1998-08-04 | Ericsson Inc. | Method for automatically adjusting audio response for improved intelligibility |
US6173194B1 (en) | 1996-04-15 | 2001-01-09 | Nokia Mobile Phones Limited | Mobile terminal having improved user interface |
US5987140A (en) | 1996-04-26 | 1999-11-16 | Verifone, Inc. | System, method and article of manufacture for secure network electronic payment and credit collection |
US5963924A (en) | 1996-04-26 | 1999-10-05 | Verifone, Inc. | System, method and article of manufacture for the use of payment instrument holders and payment instruments in network electronic commerce |
US5913193A (en) | 1996-04-30 | 1999-06-15 | Microsoft Corporation | Method and system of runtime acoustic unit selection for speech synthesis |
US5857184A (en) | 1996-05-03 | 1999-01-05 | Walden Media, Inc. | Language and method for creating, organizing, and retrieving data from a database |
US5828999A (en) | 1996-05-06 | 1998-10-27 | Apple Computer, Inc. | Method and system for deriving a large-span semantic language model for large-vocabulary recognition systems |
FR2748342B1 (en) | 1996-05-06 | 1998-07-17 | France Telecom | Method and device for filtering a speech signal by equalization, using a statistical model of this signal |
US5917487A (en) | 1996-05-10 | 1999-06-29 | Apple Computer, Inc. | Data-driven method and system for drawing user interface objects |
US5826261A (en) | 1996-05-10 | 1998-10-20 | Spencer; Graham | System and method for querying multiple, distributed databases by selective sharing of local relative significance information for terms related to the query |
US6366883B1 (en) | 1996-05-15 | 2002-04-02 | Atr Interpreting Telecommunications | Concatenation of speech segments by use of a speech synthesizer |
US5758314A (en) | 1996-05-21 | 1998-05-26 | Sybase, Inc. | Client/server database system with methods for improved soundex processing in a heterogeneous language environment |
US5727950A (en) | 1996-05-22 | 1998-03-17 | Netsage Corporation | Agent based instruction system and method |
US6556712B1 (en) | 1996-05-23 | 2003-04-29 | Apple Computer, Inc. | Methods and apparatus for handwriting recognition |
US5848386A (en) | 1996-05-28 | 1998-12-08 | Ricoh Company, Ltd. | Method and system for translating documents using different translation resources for different portions of the documents |
US5850480A (en) | 1996-05-30 | 1998-12-15 | Scan-Optics, Inc. | OCR error correction methods and apparatus utilizing contextual comparison |
JP2856390B2 (en) | 1996-07-26 | 1999-02-10 | 株式会社日立製作所 | Information recording medium and recording / reproducing method using the same |
US5966533A (en) | 1996-06-11 | 1999-10-12 | Excite, Inc. | Method and system for dynamically synthesizing a computer program by differentially resolving atoms based on user context data |
US5915249A (en) | 1996-06-14 | 1999-06-22 | Excite, Inc. | System and method for accelerated query evaluation of very large full-text databases |
US5987132A (en) | 1996-06-17 | 1999-11-16 | Verifone, Inc. | System, method and article of manufacture for conditionally accepting a payment method utilizing an extensible, flexible architecture |
US5832433A (en) | 1996-06-24 | 1998-11-03 | Nynex Science And Technology, Inc. | Speech synthesis method for operator assistance telecommunications calls comprising a plurality of text-to-speech (TTS) devices |
JP2973944B2 (en) | 1996-06-26 | 1999-11-08 | 富士ゼロックス株式会社 | Document processing apparatus and document processing method |
US5912952A (en) | 1996-06-27 | 1999-06-15 | At&T Corp | Voice response unit with a visual menu interface |
US5825881A (en) | 1996-06-28 | 1998-10-20 | Allsoft Distributing Inc. | Public network merchandising system |
US5802466A (en) | 1996-06-28 | 1998-09-01 | Mci Communications Corporation | Personal communication device voice mail notification apparatus and method |
US6070147A (en) | 1996-07-02 | 2000-05-30 | Tecmark Services, Inc. | Customer identification and marketing analysis systems |
US6054990A (en) | 1996-07-05 | 2000-04-25 | Tran; Bao Q. | Computer system with handwriting annotation |
US5915238A (en) | 1996-07-16 | 1999-06-22 | Tjaden; Gary S. | Personalized audio information delivery system |
CA2261262C (en) | 1996-07-22 | 2007-08-21 | Cyva Research Corporation | Personal information security and exchange tool |
US5862223A (en) | 1996-07-24 | 1999-01-19 | Walker Asset Management Limited Partnership | Method and apparatus for a cryptographically-assisted commercial network system designed to facilitate and support expert-based commerce |
US6453281B1 (en) | 1996-07-30 | 2002-09-17 | Vxi Corporation | Portable audio database device with icon-based graphical user-interface |
KR100260760B1 (en) | 1996-07-31 | 2000-07-01 | 모리 하루오 | Information display system with touch panel |
US5818924A (en) | 1996-08-02 | 1998-10-06 | Siemens Business Communication Systems, Inc. | Combined keypad and protective cover |
US5797008A (en) | 1996-08-09 | 1998-08-18 | Digital Equipment Corporation | Memory storing an integrated index of database records |
US5765168A (en) | 1996-08-09 | 1998-06-09 | Digital Equipment Corporation | Method for maintaining an index |
US7113958B1 (en) | 1996-08-12 | 2006-09-26 | Battelle Memorial Institute | Three-dimensional display of document set |
US5818451A (en) | 1996-08-12 | 1998-10-06 | International Business Machines Corporation | Computer programmed soft keyboard system, method and apparatus having user input displacement |
US6298174B1 (en) | 1996-08-12 | 2001-10-02 | Battelle Memorial Institute | Three-dimensional display of document set |
US7191135B2 (en) | 1998-04-08 | 2007-03-13 | Symbol Technologies, Inc. | Speech recognition system and method for employing the same |
US6216102B1 (en) | 1996-08-19 | 2001-04-10 | International Business Machines Corporation | Natural language determination using partial words |
US5822730A (en) | 1996-08-22 | 1998-10-13 | Dragon Systems, Inc. | Lexical tree pre-filtering in speech recognition |
US5950123A (en) | 1996-08-26 | 1999-09-07 | Telefonaktiebolaget L M | Cellular telephone network support of audible information delivery to visually impaired subscribers |
CA2264167A1 (en) | 1996-08-28 | 1998-03-05 | Via, Inc. | Touch screen systems and methods |
US5999169A (en) | 1996-08-30 | 1999-12-07 | International Business Machines Corporation | Computer graphical user interface method and system for supporting multiple two-dimensional movement inputs |
US5850629A (en) | 1996-09-09 | 1998-12-15 | Matsushita Electric Industrial Co., Ltd. | User interface controller for text-to-speech synthesizer |
US5878393A (en) | 1996-09-09 | 1999-03-02 | Matsushita Electric Industrial Co., Ltd. | High quality concatenative reading system |
US5745116A (en) | 1996-09-09 | 1998-04-28 | Motorola, Inc. | Intuitive gesture-based graphical user interface |
EP0829811A1 (en) | 1996-09-11 | 1998-03-18 | Nippon Telegraph And Telephone Corporation | Method and system for information retrieval |
US6035267A (en) | 1996-09-26 | 2000-03-07 | Mitsubishi Denki Kabushiki Kaisha | Interactive processing apparatus having natural language interfacing capability, utilizing goal frames, and judging action feasibility |
US5876396A (en) | 1996-09-27 | 1999-03-02 | Baxter International Inc. | System method and container for holding and delivering a solution |
US6181935B1 (en) | 1996-09-27 | 2001-01-30 | Software.Com, Inc. | Mobility extended telephone application programming interface and method of use |
US5794182A (en) | 1996-09-30 | 1998-08-11 | Apple Computer, Inc. | Linear predictive speech encoding systems with efficient combination pitch coefficients computation |
US6208932B1 (en) | 1996-09-30 | 2001-03-27 | Mazda Motor Corporation | Navigation apparatus |
US5721827A (en) | 1996-10-02 | 1998-02-24 | James Logan | System for electrically distributing personalized information |
US5732216A (en) | 1996-10-02 | 1998-03-24 | Internet Angles, Inc. | Audio message exchange system |
US6199076B1 (en) | 1996-10-02 | 2001-03-06 | James Logan | Audio program player including a dynamic program selection controller |
US20020120925A1 (en) | 2000-03-28 | 2002-08-29 | Logan James D. | Audio and video program recording, editing and playback systems using metadata |
US20070026852A1 (en) | 1996-10-02 | 2007-02-01 | James Logan | Multimedia telephone system |
US5913203A (en) | 1996-10-03 | 1999-06-15 | Jaesent Inc. | System and method for pseudo cash transactions |
US5930769A (en) | 1996-10-07 | 1999-07-27 | Rose; Andrea | System and method for fashion shopping |
US7051096B1 (en) | 1999-09-02 | 2006-05-23 | Citicorp Development Center, Inc. | System and method for providing global self-service financial transaction terminals with worldwide web content, centralized management, and local and remote administration |
US6073033A (en) | 1996-11-01 | 2000-06-06 | Telxon Corporation | Portable telephone with integrated heads-up display and data terminal functions |
EP0840396B1 (en) | 1996-11-04 | 2003-02-19 | Molex Incorporated | Electrical connector for telephone handset |
US6233318B1 (en) | 1996-11-05 | 2001-05-15 | Comverse Network Systems, Inc. | System for accessing multimedia mailboxes and messages over the internet and via telephone |
US5956667A (en) | 1996-11-08 | 1999-09-21 | Research Foundation Of State University Of New York | System and methods for frame-based augmentative communication |
US5918303A (en) | 1996-11-25 | 1999-06-29 | Yamaha Corporation | Performance setting data selecting apparatus |
US5836771A (en) | 1996-12-02 | 1998-11-17 | Ho; Chi Fai | Learning method and system based on questioning |
US5875427A (en) | 1996-12-04 | 1999-02-23 | Justsystem Corp. | Voice-generating/document making apparatus voice-generating/document making method and computer-readable medium for storing therein a program having a computer execute voice-generating/document making sequence |
US6665639B2 (en) | 1996-12-06 | 2003-12-16 | Sensory, Inc. | Speech recognition in consumer electronic products |
US6078914A (en) | 1996-12-09 | 2000-06-20 | Open Text Corporation | Natural language meta-search system and method |
JP3349905B2 (en) | 1996-12-10 | 2002-11-25 | 松下電器産業株式会社 | Voice synthesis method and apparatus |
US6023676A (en) | 1996-12-12 | 2000-02-08 | Dspc Israel, Ltd. | Keyword recognition system and method |
US6157935A (en) | 1996-12-17 | 2000-12-05 | Tran; Bao Q. | Remote data access and management system |
US5839106A (en) | 1996-12-17 | 1998-11-17 | Apple Computer, Inc. | Large-vocabulary speech recognition using an integrated syntactic and semantic statistical language model |
US5926789A (en) | 1996-12-19 | 1999-07-20 | Bell Communications Research, Inc. | Audio-based wide area information system |
US6177931B1 (en) | 1996-12-19 | 2001-01-23 | Index Systems, Inc. | Systems and methods for displaying and recording control interface with television programs, video, advertising information and program scheduling information |
US5966126A (en) | 1996-12-23 | 1999-10-12 | Szabo; Andrew J. | Graphic user interface for database system |
US5739451A (en) | 1996-12-27 | 1998-04-14 | Franklin Electronic Publishers, Incorporated | Hand held electronic music encyclopedia with text and note structure search |
US5932869A (en) | 1996-12-27 | 1999-08-03 | Graphic Technology, Inc. | Promotional system with magnetic stripe and visual thermo-reversible print surfaced medium |
US6111562A (en) | 1997-01-06 | 2000-08-29 | Intel Corporation | System for generating an audible cue indicating the status of a display object |
US7787647B2 (en) | 1997-01-13 | 2010-08-31 | Micro Ear Technology, Inc. | Portable system for programming hearing aids |
JP3579204B2 (en) | 1997-01-17 | 2004-10-20 | 富士通株式会社 | Document summarizing apparatus and method |
US5933477A (en) | 1997-01-22 | 1999-08-03 | Lucent Technologies Inc. | Changing-urgency-dependent message or call delivery |
US5815225A (en) | 1997-01-22 | 1998-09-29 | Gateway 2000, Inc. | Lighting apparatus for a portable computer with illumination apertures |
US5953541A (en) | 1997-01-24 | 1999-09-14 | Tegic Communications, Inc. | Disambiguating system for disambiguating ambiguous input sequences by displaying objects associated with the generated input sequences in the order of decreasing frequency of use |
US6684376B1 (en) | 1997-01-27 | 2004-01-27 | Unisys Corporation | Method and apparatus for selecting components within a circuit design database |
US6006274A (en) | 1997-01-30 | 1999-12-21 | 3Com Corporation | Method and apparatus using a pass through personal computer connected to both a local communication link and a computer network for indentifying and synchronizing a preferred computer with a portable computer |
US5924068A (en) | 1997-02-04 | 1999-07-13 | Matsushita Electric Industrial Co. Ltd. | Electronic news reception apparatus that selectively retains sections and searches by keyword or index for text to speech conversion |
EP0863469A3 (en) | 1997-02-10 | 2002-01-09 | Nippon Telegraph And Telephone Corporation | Scheme for automatic data conversion definition generation according to data feature in visual multidimensional data analysis tool |
US5926769A (en) | 1997-02-18 | 1999-07-20 | Nokia Mobile Phones Limited | Cellular telephone having simplified user interface for storing and retrieving telephone numbers |
US5930783A (en) | 1997-02-21 | 1999-07-27 | Nec Usa, Inc. | Semantic and cognition based image retrieval |
US5941944A (en) | 1997-03-03 | 1999-08-24 | Microsoft Corporation | Method for providing a substitute for a requested inaccessible object by identifying substantially similar objects using weights corresponding to object features |
US5930801A (en) | 1997-03-07 | 1999-07-27 | Xerox Corporation | Shared-data environment in which each file has independent security properties |
US6076051A (en) | 1997-03-07 | 2000-06-13 | Microsoft Corporation | Information retrieval utilizing semantic representation of text |
US6144377A (en) | 1997-03-11 | 2000-11-07 | Microsoft Corporation | Providing access to user interface elements of legacy application programs |
US6604124B1 (en) | 1997-03-13 | 2003-08-05 | A:\Scribes Corporation | Systems and methods for automatically managing work flow based on tracking job step completion status |
US6260013B1 (en) | 1997-03-14 | 2001-07-10 | Lernout & Hauspie Speech Products N.V. | Speech recognition system employing discriminatively trained models |
AU6566598A (en) | 1997-03-20 | 1998-10-12 | Schlumberger Technologies, Inc. | System and method of transactional taxation using secure stored data devices |
DE19712632A1 (en) | 1997-03-26 | 1998-10-01 | Thomson Brandt Gmbh | Method and device for remote voice control of devices |
US6097391A (en) | 1997-03-31 | 2000-08-01 | Menai Corporation | Method and apparatus for graphically manipulating objects |
US6041127A (en) | 1997-04-03 | 2000-03-21 | Lucent Technologies Inc. | Steerable and variable first-order differential microphone array |
US5822743A (en) | 1997-04-08 | 1998-10-13 | 1215627 Ontario Inc. | Knowledge-based information retrieval system |
US6954899B1 (en) | 1997-04-14 | 2005-10-11 | Novint Technologies, Inc. | Human-computer interface including haptically controlled interactions |
US5912951A (en) | 1997-04-17 | 1999-06-15 | At&T Corp | Voice mail system with multi-retrieval mailboxes |
JP3704925B2 (en) | 1997-04-22 | 2005-10-12 | トヨタ自動車株式会社 | Mobile terminal device and medium recording voice output program thereof |
US5970474A (en) | 1997-04-24 | 1999-10-19 | Sears, Roebuck And Co. | Registry information system for shoppers |
US7321783B2 (en) | 1997-04-25 | 2008-01-22 | Minerva Industries, Inc. | Mobile entertainment and communication device |
US6073036A (en) | 1997-04-28 | 2000-06-06 | Nokia Mobile Phones Limited | Mobile station with touch input having automatic symbol magnification function |
US5895464A (en) | 1997-04-30 | 1999-04-20 | Eastman Kodak Company | Computer program product and a method for using natural language for the description, search and retrieval of multi-media objects |
US6233545B1 (en) | 1997-05-01 | 2001-05-15 | William E. Datig | Universal machine translator of arbitrary languages utilizing epistemic moments |
US6226614B1 (en) | 1997-05-21 | 2001-05-01 | Nippon Telegraph And Telephone Corporation | Method and apparatus for editing/creating synthetic speech message and recording medium with the method recorded thereon |
US5930751A (en) | 1997-05-30 | 1999-07-27 | Lucent Technologies Inc. | Method of implicit confirmation for automatic speech recognition |
US6803905B1 (en) | 1997-05-30 | 2004-10-12 | International Business Machines Corporation | Touch sensitive apparatus and method for improved visual feedback |
US6582342B2 (en) | 1999-01-12 | 2003-06-24 | Epm Development Systems Corporation | Audible electronic exercise monitor |
DE69816185T2 (en) | 1997-06-12 | 2004-04-15 | Hewlett-Packard Co. (N.D.Ges.D.Staates Delaware), Palo Alto | Image processing method and device |
US5930754A (en) | 1997-06-13 | 1999-07-27 | Motorola, Inc. | Method, device and article of manufacture for neural-network based orthography-phonetics transformation |
US6415250B1 (en) | 1997-06-18 | 2002-07-02 | Novell, Inc. | System and method for identifying language using morphologically-based techniques |
US6138098A (en) | 1997-06-30 | 2000-10-24 | Lernout & Hauspie Speech Products N.V. | Command parsing and rewrite system |
EP1008084A1 (en) | 1997-07-02 | 2000-06-14 | Philippe J. M. Coueignoux | System and method for the secure discovery, exploitation and publication of information |
JP3593241B2 (en) | 1997-07-02 | 2004-11-24 | 株式会社日立製作所 | How to restart the computer |
CA2242065C (en) | 1997-07-03 | 2004-12-14 | Henry C.A. Hyde-Thomson | Unified messaging system with automatic language identification for text-to-speech conversion |
EP0889626A1 (en) | 1997-07-04 | 1999-01-07 | Octel Communications Corporation | Unified messaging system with automatic language identifacation for text-to-speech conversion |
EP1010175A4 (en) | 1997-07-09 | 2005-06-22 | Advanced Audio Devices Llc | Optical storage device |
US6587404B1 (en) | 1997-07-09 | 2003-07-01 | Advanced Audio Devices, Llc | Optical storage device capable of recording a set of sound tracks on a compact disc |
JP3224760B2 (en) | 1997-07-10 | 2001-11-05 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Voice mail system, voice synthesizing apparatus, and methods thereof |
US5860063A (en) | 1997-07-11 | 1999-01-12 | At&T Corp | Automated meaningful phrase clustering |
US5940841A (en) | 1997-07-11 | 1999-08-17 | International Business Machines Corporation | Parallel file system with extended file attributes |
US20020138254A1 (en) | 1997-07-18 | 2002-09-26 | Takehiko Isaka | Method and apparatus for processing speech signals |
US5933822A (en) | 1997-07-22 | 1999-08-03 | Microsoft Corporation | Apparatus and methods for an information retrieval system that employs natural language processing of search results to improve overall precision |
US6356864B1 (en) | 1997-07-25 | 2002-03-12 | University Technology Corporation | Methods for analysis and evaluation of the semantic content of a writing based on vector length |
JPH1145241A (en) | 1997-07-28 | 1999-02-16 | Just Syst Corp | Japanese syllabary-chinese character conversion system and computer-readable recording medium where programs making computer function as means of same system is recorded |
US5974146A (en) | 1997-07-30 | 1999-10-26 | Huntington Bancshares Incorporated | Real time bank-centric universal payment system |
US6904110B2 (en) | 1997-07-31 | 2005-06-07 | Francois Trans | Channel equalization system and method |
WO1999006804A1 (en) | 1997-07-31 | 1999-02-11 | Kyoyu Corporation | Voice monitoring system using laser beam |
JPH1153384A (en) | 1997-08-05 | 1999-02-26 | Mitsubishi Electric Corp | Device and method for keyword extraction and computer readable storage medium storing keyword extraction program |
US6016476A (en) | 1997-08-11 | 2000-01-18 | International Business Machines Corporation | Portable information and transaction processing system and method utilizing biometric authorization and digital certificate security |
US5943052A (en) | 1997-08-12 | 1999-08-24 | Synaptics, Incorporated | Method and apparatus for scroll bar control |
US5895466A (en) | 1997-08-19 | 1999-04-20 | At&T Corp | Automated natural language understanding customer service system |
JP3516328B2 (en) | 1997-08-22 | 2004-04-05 | 株式会社日立製作所 | Information communication terminal equipment |
US6081774A (en) | 1997-08-22 | 2000-06-27 | Novell, Inc. | Natural language information retrieval system and method |
US7385359B2 (en) | 1997-08-26 | 2008-06-10 | Philips Solid-State Lighting Solutions, Inc. | Information systems |
US5974412A (en) | 1997-09-24 | 1999-10-26 | Sapient Health Network | Intelligent query system for automatically indexing information in a database and automatically categorizing users |
US6404876B1 (en) | 1997-09-25 | 2002-06-11 | Gte Intelligent Network Services Incorporated | System and method for voice activated dialing and routing under open access network control |
US7046813B1 (en) | 1997-09-25 | 2006-05-16 | Fumio Denda | Auditory sense training method and sound processing method for auditory sense training |
CN100334530C (en) | 1997-09-25 | 2007-08-29 | 蒂吉通信系统公司 | Reduced keyboard disambiguating systems |
US6169911B1 (en) | 1997-09-26 | 2001-01-02 | Sun Microsystems, Inc. | Graphical user interface for a portable telephone |
US6574661B1 (en) | 1997-09-26 | 2003-06-03 | Mci Communications Corporation | Integrated proxy interface for web based telecommunication toll-free network management using a network manager for downloading a call routing tree to client |
US6023684A (en) | 1997-10-01 | 2000-02-08 | Security First Technologies, Inc. | Three tier financial transaction system with cache memory |
US6493652B1 (en) | 1997-10-02 | 2002-12-10 | Personal Electronic Devices, Inc. | Monitoring activity of a user in locomotion on foot |
US6611789B1 (en) | 1997-10-02 | 2003-08-26 | Personal Electronic Devices, Inc. | Monitoring activity of a user in locomotion on foot |
US6882955B1 (en) | 1997-10-02 | 2005-04-19 | Fitsense Technology, Inc. | Monitoring activity of a user in locomotion on foot |
US6898550B1 (en) | 1997-10-02 | 2005-05-24 | Fitsense Technology, Inc. | Monitoring activity of a user in locomotion on foot |
US6560903B1 (en) | 2000-03-07 | 2003-05-13 | Personal Electronic Devices, Inc. | Ambulatory foot pod |
US6336365B1 (en) | 1999-08-24 | 2002-01-08 | Personal Electronic Devices, Inc. | Low-cost accelerometer |
US6018705A (en) | 1997-10-02 | 2000-01-25 | Personal Electronic Devices, Inc. | Measuring foot contact time and foot loft time of a person in locomotion |
US6163769A (en) | 1997-10-02 | 2000-12-19 | Microsoft Corporation | Text-to-speech using clustered context-dependent phoneme-based units |
US6298314B1 (en) | 1997-10-02 | 2001-10-02 | Personal Electronic Devices, Inc. | Detecting the starting and stopping of movement of a person on foot |
US6122340A (en) | 1998-10-01 | 2000-09-19 | Personal Electronic Devices, Inc. | Detachable foot mount for electronic device |
US6385662B1 (en) | 1997-10-03 | 2002-05-07 | Ericsson Inc. | Method of processing information using a personal communication assistant |
DE69820222T2 (en) * | 1997-10-07 | 2004-09-30 | Koninklijke Philips Electronics N.V. | METHOD AND DEVICE FOR ACTIVATING A LANGUAGE-CONTROLLED FUNCTION IN A MULTIPLE NETWORK THROUGH BOTH SPEAKER-DEPENDENT AND SPEAKER-INDEPENDENT LANGUAGE RECOGNITION |
EP0979497A1 (en) | 1997-10-08 | 2000-02-16 | Koninklijke Philips Electronics N.V. | Vocabulary and/or language model training |
US7027568B1 (en) | 1997-10-10 | 2006-04-11 | Verizon Services Corp. | Personal message service with enhanced text to speech synthesis |
KR100238189B1 (en) | 1997-10-16 | 2000-01-15 | 윤종용 | Multi-language tts device and method |
US6035336A (en) | 1997-10-17 | 2000-03-07 | International Business Machines Corporation | Audio ticker system and method for presenting push information including pre-recorded audio |
JP4267081B2 (en) | 1997-10-20 | 2009-05-27 | コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ | Pattern recognition registration in distributed systems |
US6304846B1 (en) | 1997-10-22 | 2001-10-16 | Texas Instruments Incorporated | Singing voice synthesis |
DE69712485T2 (en) | 1997-10-23 | 2002-12-12 | Sony Int Europe Gmbh | Voice interface for a home network |
GB2330670B (en) | 1997-10-24 | 2002-09-11 | Sony Uk Ltd | Data processing |
US5990887A (en) | 1997-10-30 | 1999-11-23 | International Business Machines Corp. | Method and system for efficient network desirable chat feedback over a communication network |
US6108627A (en) | 1997-10-31 | 2000-08-22 | Nortel Networks Corporation | Automatic transcription tool |
US6230322B1 (en) | 1997-11-05 | 2001-05-08 | Sony Corporation | Music channel graphical user interface |
US6182028B1 (en) | 1997-11-07 | 2001-01-30 | Motorola, Inc. | Method, device and system for part-of-speech disambiguation |
US5896321A (en) | 1997-11-14 | 1999-04-20 | Microsoft Corporation | Text completion system for a miniature computer |
US6034621A (en) | 1997-11-18 | 2000-03-07 | Lucent Technologies, Inc. | Wireless remote synchronization of data between PC and PDA |
US5943670A (en) | 1997-11-21 | 1999-08-24 | International Business Machines Corporation | System and method for categorizing objects in combined categories |
KR100287366B1 (en) | 1997-11-24 | 2001-04-16 | 윤순조 | Portable device for reproducing sound by mpeg and method thereof |
US5960422A (en) | 1997-11-26 | 1999-09-28 | International Business Machines Corporation | System and method for optimized source selection in an information retrieval system |
US6047255A (en) | 1997-12-04 | 2000-04-04 | Nortel Networks Corporation | Method and system for producing speech signals |
US6026375A (en) | 1997-12-05 | 2000-02-15 | Nortel Networks Corporation | Method and apparatus for processing orders from customers in a mobile environment |
US6163809A (en) | 1997-12-08 | 2000-12-19 | Microsoft Corporation | System and method for preserving delivery status notification when moving from a native network to a foreign network |
US6983138B1 (en) | 1997-12-12 | 2006-01-03 | Richard J. Helferich | User interface for message access |
US6295541B1 (en) | 1997-12-16 | 2001-09-25 | Starfish Software, Inc. | System and methods for synchronizing two or more datasets |
US6064963A (en) | 1997-12-17 | 2000-05-16 | Opus Telecom, L.L.C. | Automatic key word or phrase speech recognition for the corrections industry |
US6064960A (en) | 1997-12-18 | 2000-05-16 | Apple Computer, Inc. | Method and apparatus for improved duration modeling of phonemes |
US6094649A (en) | 1997-12-22 | 2000-07-25 | Partnet, Inc. | Keyword searches of structured databases |
US6310400B1 (en) | 1997-12-29 | 2001-10-30 | Intel Corporation | Apparatus for capacitively coupling electronic devices |
US6116907A (en) | 1998-01-13 | 2000-09-12 | Sorenson Vision, Inc. | System and method for encoding and retrieving visual signals |
US6064767A (en) | 1998-01-16 | 2000-05-16 | Regents Of The University Of California | Automatic language identification by stroke geometry analysis |
JP3216084B2 (en) | 1998-01-19 | 2001-10-09 | 株式会社ネットワークコミュニティクリエイション | Chat screen display method |
US20020002039A1 (en) | 1998-06-12 | 2002-01-03 | Safi Qureshey | Network-enabled audio device |
US20060033724A1 (en) | 2004-07-30 | 2006-02-16 | Apple Computer, Inc. | Virtual input device placement on a touch screen user interface |
US8479122B2 (en) | 2004-07-30 | 2013-07-02 | Apple Inc. | Gestures for touch sensitive input devices |
US6782510B1 (en) | 1998-01-27 | 2004-08-24 | John N. Gross | Word checking tool for controlling the language content in documents using dictionaries with modifyable status fields |
JP2938420B2 (en) | 1998-01-30 | 1999-08-23 | インターナショナル・ビジネス・マシーンズ・コーポレイション | Function selection method and apparatus, storage medium storing control program for selecting functions, object operation method and apparatus, storage medium storing control program for operating objects, storage medium storing composite icon |
US6035303A (en) | 1998-02-02 | 2000-03-07 | International Business Machines Corporation | Object management system for digital libraries |
US6216131B1 (en) | 1998-02-06 | 2001-04-10 | Starfish Software, Inc. | Methods for mapping data fields from one data set to another in a data processing environment |
US6226403B1 (en) | 1998-02-09 | 2001-05-01 | Motorola, Inc. | Handwritten character recognition using multi-resolution models |
US6421707B1 (en) | 1998-02-13 | 2002-07-16 | Lucent Technologies Inc. | Wireless multi-media messaging communications method and apparatus |
US6249606B1 (en) | 1998-02-19 | 2001-06-19 | Mindmaker, Inc. | Method and system for gesture category recognition and training using a feature vector |
US6623529B1 (en) | 1998-02-23 | 2003-09-23 | David Lakritz | Multilingual electronic document translation, management, and delivery system |
US20020080163A1 (en) | 1998-02-23 | 2002-06-27 | Morey Dale D. | Information retrieval system |
US6345250B1 (en) | 1998-02-24 | 2002-02-05 | International Business Machines Corp. | Developing voice response applications from pre-recorded voice and stored text-to-speech prompts |
US5995590A (en) | 1998-03-05 | 1999-11-30 | International Business Machines Corporation | Method and apparatus for a communication device for use by a hearing impaired/mute or deaf person or in silent environments |
US6356920B1 (en) | 1998-03-09 | 2002-03-12 | X-Aware, Inc | Dynamic, hierarchical data exchange system |
JP3854713B2 (en) | 1998-03-10 | 2006-12-06 | キヤノン株式会社 | Speech synthesis method and apparatus and storage medium |
US6173287B1 (en) | 1998-03-11 | 2001-01-09 | Digital Equipment Corporation | Technique for ranking multimedia annotations of interest |
US6272456B1 (en) | 1998-03-19 | 2001-08-07 | Microsoft Corporation | System and method for identifying the language of written text having a plurality of different length n-gram profiles |
JP4562910B2 (en) | 1998-03-23 | 2010-10-13 | マイクロソフト コーポレーション | Operating system application program interface |
US6963871B1 (en) | 1998-03-25 | 2005-11-08 | Language Analysis Systems, Inc. | System and method for adaptive multi-cultural searching and matching of personal names |
US6675233B1 (en) | 1998-03-26 | 2004-01-06 | O2 Micro International Limited | Audio controller for portable electronic devices |
US6335962B1 (en) | 1998-03-27 | 2002-01-01 | Lucent Technologies Inc. | Apparatus and method for grouping and prioritizing voice messages for convenient playback |
US6195641B1 (en) | 1998-03-27 | 2001-02-27 | International Business Machines Corp. | Network universal spoken language vocabulary |
US6026393A (en) | 1998-03-31 | 2000-02-15 | Casebank Technologies Inc. | Configuration knowledge as an aid to case retrieval |
US6233559B1 (en) | 1998-04-01 | 2001-05-15 | Motorola, Inc. | Speech control of multiple applications using applets |
US6151401A (en) | 1998-04-09 | 2000-11-21 | Compaq Computer Corporation | Planar speaker for multimedia laptop PCs |
US6173279B1 (en) | 1998-04-09 | 2001-01-09 | At&T Corp. | Method of using a natural language interface to retrieve information from one or more data resources |
US7194471B1 (en) | 1998-04-10 | 2007-03-20 | Ricoh Company, Ltd. | Document classification system and method for classifying a document according to contents of the document |
US6018711A (en) | 1998-04-21 | 2000-01-25 | Nortel Networks Corporation | Communication system user interface with animated representation of time remaining for input to recognizer |
US6240303B1 (en) | 1998-04-23 | 2001-05-29 | Motorola Inc. | Voice recognition button for mobile telephones |
US6088731A (en) | 1998-04-24 | 2000-07-11 | Associative Computing, Inc. | Intelligent assistant for use with a local computer and with the internet |
KR100454541B1 (en) | 1998-04-27 | 2004-11-03 | 산요덴키가부시키가이샤 | Method and system of handwritten-character recognition |
AU3717099A (en) | 1998-04-27 | 1999-11-16 | British Telecommunications Public Limited Company | Database access tool |
US6081780A (en) | 1998-04-28 | 2000-06-27 | International Business Machines Corporation | TTS and prosody based authoring system |
US6016471A (en) | 1998-04-29 | 2000-01-18 | Matsushita Electric Industrial Co., Ltd. | Method and apparatus using decision trees to generate and score multiple pronunciations for a spelled word |
US6029132A (en) | 1998-04-30 | 2000-02-22 | Matsushita Electric Industrial Co. | Method for letter-to-sound in text-to-speech synthesis |
US6931255B2 (en) | 1998-04-29 | 2005-08-16 | Telefonaktiebolaget L M Ericsson (Publ) | Mobile terminal with a text-to-speech converter |
US5891180A (en) | 1998-04-29 | 1999-04-06 | Medtronic Inc. | Interrogation of an implantable medical device using audible sound communication |
US6222347B1 (en) | 1998-04-30 | 2001-04-24 | Apple Computer, Inc. | System for charging portable computer's battery using both the dynamically determined power available based on power consumed by sub-system devices and power limits from the battery |
US5998972A (en) | 1998-04-30 | 1999-12-07 | Apple Computer, Inc. | Method and apparatus for rapidly charging a battery of a portable computing device |
US6285786B1 (en) | 1998-04-30 | 2001-09-04 | Motorola, Inc. | Text recognizer and method using non-cumulative character scoring in a forward search |
US6343267B1 (en) | 1998-04-30 | 2002-01-29 | Matsushita Electric Industrial Co., Ltd. | Dimensionality reduction for speaker normalization and speaker and environment adaptation using eigenvoice techniques |
US6076060A (en) | 1998-05-01 | 2000-06-13 | Compaq Computer Corporation | Computer method and apparatus for translating text to sound |
US6144938A (en) | 1998-05-01 | 2000-11-07 | Sun Microsystems, Inc. | Voice user interface with personality |
US6297818B1 (en) | 1998-05-08 | 2001-10-02 | Apple Computer, Inc. | Graphical user interface having sound effects for operating control elements and dragging objects |
JPH11327870A (en) | 1998-05-15 | 1999-11-30 | Fujitsu Ltd | Device for reading-aloud document, reading-aloud control method and recording medium |
US6438523B1 (en) | 1998-05-20 | 2002-08-20 | John A. Oberteuffer | Processing handwritten and hand-drawn input and speech input |
FI981154A (en) | 1998-05-25 | 1999-11-26 | Nokia Mobile Phones Ltd | Voice identification procedure and apparatus |
US6424983B1 (en) | 1998-05-26 | 2002-07-23 | Global Information Research And Technologies, Llc | Spelling and grammar checking system |
US6101470A (en) | 1998-05-26 | 2000-08-08 | International Business Machines Corporation | Methods for generating pitch and duration contours in a text to speech system |
US20070094222A1 (en) | 1998-05-28 | 2007-04-26 | Lawrence Au | Method and system for using voice input for performing network functions |
US6778970B2 (en) | 1998-05-28 | 2004-08-17 | Lawrence Au | Topological methods to organize semantic network data flows for conversational applications |
US7711672B2 (en) | 1998-05-28 | 2010-05-04 | Lawrence Au | Semantic network methods to disambiguate natural language meaning |
US7266365B2 (en) | 1998-05-29 | 2007-09-04 | Research In Motion Limited | System and method for delayed transmission of bundled command messages |
JP3180764B2 (en) | 1998-06-05 | 2001-06-25 | 日本電気株式会社 | Speech synthesizer |
US6563769B1 (en) | 1998-06-11 | 2003-05-13 | Koninklijke Philips Electronics N.V. | Virtual jukebox |
US6411932B1 (en) | 1998-06-12 | 2002-06-25 | Texas Instruments Incorporated | Rule-based learning of word pronunciations from training corpora |
US5969283A (en) | 1998-06-17 | 1999-10-19 | Looney Productions, Llc | Music organizer and entertainment center |
US6542171B1 (en) | 1998-07-08 | 2003-04-01 | Nippon Telegraph And Telephone Corporation | Scheme for graphical user interface using polygonal-shaped slider |
US6144958A (en) | 1998-07-15 | 2000-11-07 | Amazon.Com, Inc. | System and method for correcting spelling errors in search queries |
US6105865A (en) | 1998-07-17 | 2000-08-22 | Hardesty; Laurence Daniel | Financial transaction system with retirement saving benefit |
US6421708B2 (en) | 1998-07-31 | 2002-07-16 | Glenayre Electronics, Inc. | World wide web access for voice mail and page |
US6389114B1 (en) | 1998-08-06 | 2002-05-14 | At&T Corp. | Method and apparatus for relaying communication |
JP3865946B2 (en) | 1998-08-06 | 2007-01-10 | 富士通株式会社 | CHARACTER MESSAGE COMMUNICATION SYSTEM, CHARACTER MESSAGE COMMUNICATION DEVICE, CHARACTER MESSAGE COMMUNICATION SERVER, COMPUTER-READABLE RECORDING MEDIUM CONTAINING CHARACTER MESSAGE COMMUNICATION PROGRAM, COMPUTER-READABLE RECORDING MEDIUM RECORDING CHARACTER MESSAGE COMMUNICATION MANAGEMENT PROGRAM Message communication management method |
US6169538B1 (en) | 1998-08-13 | 2001-01-02 | Motorola, Inc. | Method and apparatus for implementing a graphical user interface keyboard and a text buffer on electronic devices |
US6359970B1 (en) | 1998-08-14 | 2002-03-19 | Maverick Consulting Services, Inc. | Communications control method and apparatus |
US6490563B2 (en) | 1998-08-17 | 2002-12-03 | Microsoft Corporation | Proofreading with text to speech feedback |
US6493428B1 (en) | 1998-08-18 | 2002-12-10 | Siemens Information & Communication Networks, Inc | Text-enhanced voice menu system |
CN1254877A (en) | 1998-08-24 | 2000-05-31 | 世韩情报系统株式会社 | Portable MP3 player with multiple functions |
US6542584B1 (en) | 1998-08-31 | 2003-04-01 | Intel Corporation | Digital telephone system with automatic voice mail redirection |
US6208964B1 (en) | 1998-08-31 | 2001-03-27 | Nortel Networks Limited | Method and apparatus for providing unsupervised adaptation of transcriptions |
US6173263B1 (en) | 1998-08-31 | 2001-01-09 | At&T Corp. | Method and system for performing concatenative speech synthesis using half-phonemes |
US6359572B1 (en) | 1998-09-03 | 2002-03-19 | Microsoft Corporation | Dynamic keyboard |
US6271835B1 (en) | 1998-09-03 | 2001-08-07 | Nortel Networks Limited | Touch-screen input device |
US6141644A (en) | 1998-09-04 | 2000-10-31 | Matsushita Electric Industrial Co., Ltd. | Speaker verification and speaker identification based on eigenvoices |
US6684185B1 (en) | 1998-09-04 | 2004-01-27 | Matsushita Electric Industrial Co., Ltd. | Small footprint language and vocabulary independent word recognizer using registration by word spelling |
US6499013B1 (en) | 1998-09-09 | 2002-12-24 | One Voice Technologies, Inc. | Interactive user interface using speech recognition and natural language processing |
US6434524B1 (en) | 1998-09-09 | 2002-08-13 | One Voice Technologies, Inc. | Object interactive user interface using speech recognition and natural language processing |
DE29825146U1 (en) | 1998-09-11 | 2005-08-18 | Püllen, Rainer | Audio on demand system |
US6792082B1 (en) | 1998-09-11 | 2004-09-14 | Comverse Ltd. | Voice mail system with personal assistant provisioning |
US6266637B1 (en) | 1998-09-11 | 2001-07-24 | International Business Machines Corporation | Phrase splicing and variable substitution using a trainable speech synthesizer |
US6594673B1 (en) | 1998-09-15 | 2003-07-15 | Microsoft Corporation | Visualizations for collaborative information |
JP2000099225A (en) | 1998-09-18 | 2000-04-07 | Sony Corp | Device and method for processing information and distribution medium |
US6317831B1 (en) | 1998-09-21 | 2001-11-13 | Openwave Systems Inc. | Method and apparatus for establishing a secure connection over a one-way data path |
US9037451B2 (en) | 1998-09-25 | 2015-05-19 | Rpx Corporation | Systems and methods for multiple mode voice and data communications using intelligently bridged TDM and packet buses and methods for implementing language capabilities using the same |
US6154551A (en) | 1998-09-25 | 2000-11-28 | Frenkel; Anatoly | Microphone having linear optical transducers |
WO2000019697A1 (en) | 1998-09-28 | 2000-04-06 | Varicom Communications Ltd. | A method of sending and forwarding e-mail messages to a telephone |
AU1097300A (en) | 1998-09-30 | 2000-04-17 | Brian Gladstein | Graphic user interface for navigation in speech recognition system grammars |
JP2000105595A (en) | 1998-09-30 | 2000-04-11 | Victor Co Of Japan Ltd | Singing device and recording medium |
US6324511B1 (en) | 1998-10-01 | 2001-11-27 | Mindmaker, Inc. | Method of and apparatus for multi-modal information presentation to computer users with dyslexia, reading disabilities or visual impairment |
US7137126B1 (en) | 1998-10-02 | 2006-11-14 | International Business Machines Corporation | Conversational computing via conversational virtual machine |
US7003463B1 (en) | 1998-10-02 | 2006-02-21 | International Business Machines Corporation | System and method for providing network coordinated conversational services |
US6275824B1 (en) | 1998-10-02 | 2001-08-14 | Ncr Corporation | System and method for managing data privacy in a database management system |
US6161087A (en) | 1998-10-05 | 2000-12-12 | Lernout & Hauspie Speech Products N.V. | Speech-recognition-assisted selective suppression of silent and filled speech pauses during playback of an audio recording |
US6360237B1 (en) | 1998-10-05 | 2002-03-19 | Lernout & Hauspie Speech Products N.V. | Method and system for performing text edits during audio recording playback |
GB9821969D0 (en) | 1998-10-08 | 1998-12-02 | Canon Kk | Apparatus and method for processing natural language |
WO2000022820A1 (en) | 1998-10-09 | 2000-04-20 | Sarnoff Corporation | Method and apparatus for providing vcr-type controls for compressed digital video sequences |
US6928614B1 (en) | 1998-10-13 | 2005-08-09 | Visteon Global Technologies, Inc. | Mobile office with speech recognition |
DE19847419A1 (en) | 1998-10-14 | 2000-04-20 | Philips Corp Intellectual Pty | Procedure for the automatic recognition of a spoken utterance |
GB2342802B (en) | 1998-10-14 | 2003-04-16 | Picturetel Corp | Method and apparatus for indexing conference content |
US6487663B1 (en) | 1998-10-19 | 2002-11-26 | Realnetworks, Inc. | System and method for regulating the transmission of media data |
JP2000122781A (en) | 1998-10-20 | 2000-04-28 | Sony Corp | Processor and method for information processing and provision medium |
US6768979B1 (en) | 1998-10-22 | 2004-07-27 | Sony Corporation | Apparatus and method for noise attenuation in a speech recognition system |
US6453292B2 (en) | 1998-10-28 | 2002-09-17 | International Business Machines Corporation | Command boundary identifier for conversational natural language |
JP3551044B2 (en) | 1998-10-29 | 2004-08-04 | 松下電器産業株式会社 | Facsimile machine |
US6208971B1 (en) | 1998-10-30 | 2001-03-27 | Apple Computer, Inc. | Method and apparatus for command recognition using data-driven semantic inference |
US6292778B1 (en) | 1998-10-30 | 2001-09-18 | Lucent Technologies Inc. | Task-independent utterance verification with subword-based minimum verification error training |
US6321092B1 (en) | 1998-11-03 | 2001-11-20 | Signal Soft Corporation | Multiple input data management for wireless location-based applications |
US6839669B1 (en) | 1998-11-05 | 2005-01-04 | Scansoft, Inc. | Performing actions identified in recognized speech |
US6469732B1 (en) | 1998-11-06 | 2002-10-22 | Vtel Corporation | Acoustic source location using a microphone array |
US6519565B1 (en) | 1998-11-10 | 2003-02-11 | Voice Security Systems, Inc. | Method of comparing utterances for security control |
US6446076B1 (en) | 1998-11-12 | 2002-09-03 | Accenture Llp. | Voice interactive web-based agent system responsive to a user location for prioritizing and formatting information |
EP1138038B1 (en) | 1998-11-13 | 2005-06-22 | Lernout & Hauspie Speech Products N.V. | Speech synthesis using concatenation of speech waveforms |
US6421305B1 (en) | 1998-11-13 | 2002-07-16 | Sony Corporation | Personal music device with a graphical display for contextual information |
US6606599B2 (en) | 1998-12-23 | 2003-08-12 | Interactive Speech Technologies, Llc | Method for integrating computing processes with an interface controlled by voice actuated grammars |
IL127073A0 (en) | 1998-11-15 | 1999-09-22 | Tiktech Software Ltd | Software translation system and method |
JP2002530761A (en) | 1998-11-17 | 2002-09-17 | ルノー・アンド・オスピー・スピーチ・プロダクツ・ナームローゼ・ベンノートシャープ | Improved part-of-speech tagging method and apparatus |
US20030069873A1 (en) | 1998-11-18 | 2003-04-10 | Kevin L. Fox | Multiple engine information retrieval and visualization system |
US6122614A (en) | 1998-11-20 | 2000-09-19 | Custom Speech Usa, Inc. | System and method for automating transcription services |
US6298321B1 (en) | 1998-11-23 | 2001-10-02 | Microsoft Corporation | Trie compression using substates and utilizing pointers to replace or merge identical, reordered states |
US6144939A (en) | 1998-11-25 | 2000-11-07 | Matsushita Electric Industrial Co., Ltd. | Formant-based speech synthesizer employing demi-syllable concatenation with independent cross fade in the filter parameter and source domains |
US6260016B1 (en) | 1998-11-25 | 2001-07-10 | Matsushita Electric Industrial Co., Ltd. | Speech synthesis employing prosody templates |
US6246981B1 (en) | 1998-11-25 | 2001-06-12 | International Business Machines Corporation | Natural language task-oriented dialog manager and method |
US6292772B1 (en) | 1998-12-01 | 2001-09-18 | Justsystem Corporation | Method for identifying the language of individual words |
US7082397B2 (en) | 1998-12-01 | 2006-07-25 | Nuance Communications, Inc. | System for and method of creating and browsing a voice web |
US6260024B1 (en) | 1998-12-02 | 2001-07-10 | Gary Shkedy | Method and apparatus for facilitating buyer-driven purchase orders on a commercial network system |
US7319957B2 (en) | 2004-02-11 | 2008-01-15 | Tegic Communications, Inc. | Handwriting and voice input with automatic correction |
US7881936B2 (en) | 1998-12-04 | 2011-02-01 | Tegic Communications, Inc. | Multimodal disambiguation of speech recognition |
US7712053B2 (en) | 1998-12-04 | 2010-05-04 | Tegic Communications, Inc. | Explicit character filtering of ambiguous text entry |
US7679534B2 (en) | 1998-12-04 | 2010-03-16 | Tegic Communications, Inc. | Contextual prediction of user words and user actions |
US6317707B1 (en) | 1998-12-07 | 2001-11-13 | At&T Corp. | Automatic clustering of tokens from a corpus for grammar acquisition |
US20030187925A1 (en) | 1998-12-08 | 2003-10-02 | Inala Suman Kumar | Software engine for enabling proxy chat-room interaction |
US6177905B1 (en) | 1998-12-08 | 2001-01-23 | Avaya Technology Corp. | Location-triggered reminder for mobile user devices |
US6460015B1 (en) | 1998-12-15 | 2002-10-01 | International Business Machines Corporation | Method, system and computer program product for automatic character transliteration in a text string object |
JP2000181993A (en) | 1998-12-16 | 2000-06-30 | Fujitsu Ltd | Character recognition method and device |
US6308149B1 (en) | 1998-12-16 | 2001-10-23 | Xerox Corporation | Grouping words with equivalent substrings by automatic clustering based on suffix relationships |
US6523172B1 (en) | 1998-12-17 | 2003-02-18 | Evolutionary Technologies International, Inc. | Parser translator system and method |
US6363342B2 (en) | 1998-12-18 | 2002-03-26 | Matsushita Electric Industrial Co., Ltd. | System for developing word-pronunciation pairs |
GB9827930D0 (en) | 1998-12-19 | 1999-02-10 | Symbian Ltd | Keyboard system for a computing device with correction of key based input errors |
US6259436B1 (en) | 1998-12-22 | 2001-07-10 | Ericsson Inc. | Apparatus and method for determining selection of touchable items on a computer touchscreen by an imprecise touch |
CA2284304A1 (en) | 1998-12-22 | 2000-06-22 | Nortel Networks Corporation | Communication systems and methods employing automatic language identification |
US6460029B1 (en) | 1998-12-23 | 2002-10-01 | Microsoft Corporation | System for improving search text |
US6191939B1 (en) | 1998-12-23 | 2001-02-20 | Gateway, Inc. | Keyboard illumination via reflection of LCD light |
FR2787902B1 (en) | 1998-12-23 | 2004-07-30 | France Telecom | MODEL AND METHOD FOR IMPLEMENTING A RATIONAL DIALOGUE AGENT, SERVER AND MULTI-AGENT SYSTEM FOR IMPLEMENTATION |
US6167369A (en) | 1998-12-23 | 2000-12-26 | Xerox Company | Automatic language identification using both N-gram and word information |
US6513063B1 (en) | 1999-01-05 | 2003-01-28 | Sri International | Accessing network-based electronic information through scripted online interfaces using spoken input |
US6757718B1 (en) | 1999-01-05 | 2004-06-29 | Sri International | Mobile navigation of network-based electronic information using spoken input |
US7036128B1 (en) | 1999-01-05 | 2006-04-25 | Sri International Offices | Using a community of distributed electronic agents to support a highly mobile, ambient computing environment |
US6523061B1 (en) | 1999-01-05 | 2003-02-18 | Sri International, Inc. | System, method, and article of manufacture for agent-based navigation in a speech-based data navigation system |
US6851115B1 (en) | 1999-01-05 | 2005-02-01 | Sri International | Software-based architecture for communication and cooperation among distributed electronic agents |
US6742021B1 (en) | 1999-01-05 | 2004-05-25 | Sri International, Inc. | Navigating network-based electronic information using spoken input with multimodal error feedback |
US7152070B1 (en) | 1999-01-08 | 2006-12-19 | The Regents Of The University Of California | System and method for integrating and accessing multiple data sources within a data warehouse architecture |
JP2000206982A (en) | 1999-01-12 | 2000-07-28 | Toshiba Corp | Speech synthesizer and machine readable recording medium which records sentence to speech converting program |
US6179432B1 (en) | 1999-01-12 | 2001-01-30 | Compaq Computer Corporation | Lighting system for a keyboard |
CA2351411C (en) | 1999-01-19 | 2003-03-18 | Integra5 Communications, Inc. | Method and apparatus for selecting and displaying multi-media messages |
US6598054B2 (en) | 1999-01-26 | 2003-07-22 | Xerox Corporation | System and method for clustering data objects in a collection |
US6385586B1 (en) | 1999-01-28 | 2002-05-07 | International Business Machines Corporation | Speech recognition text-based language conversion and text-to-speech in a client-server configuration to enable language translation devices |
US6360227B1 (en) | 1999-01-29 | 2002-03-19 | International Business Machines Corporation | System and method for generating taxonomies with applications to content-based recommendations |
US6282507B1 (en) | 1999-01-29 | 2001-08-28 | Sony Corporation | Method and apparatus for interactive source language expression recognition and alternative hypothesis presentation and selection |
US7966078B2 (en) | 1999-02-01 | 2011-06-21 | Steven Hoffberg | Network media appliance system and method |
US6505183B1 (en) | 1999-02-04 | 2003-01-07 | Authoria, Inc. | Human resource knowledge modeling and delivery system |
WO2000046701A1 (en) | 1999-02-08 | 2000-08-10 | Huntsman Ici Chemicals Llc | Method for retrieving semantically distant analogies |
US6332175B1 (en) | 1999-02-12 | 2001-12-18 | Compaq Computer Corporation | Low power system and method for playing compressed audio data |
US6377530B1 (en) | 1999-02-12 | 2002-04-23 | Compaq Computer Corporation | System and method for playing compressed audio data |
US6983251B1 (en) | 1999-02-15 | 2006-01-03 | Sharp Kabushiki Kaisha | Information selection apparatus selecting desired information from plurality of audio information by mainly using audio |
US6606632B1 (en) | 1999-02-19 | 2003-08-12 | Sun Microsystems, Inc. | Transforming transient contents of object-oriented database into persistent textual form according to grammar that includes keywords and syntax |
US6961699B1 (en) | 1999-02-19 | 2005-11-01 | Custom Speech Usa, Inc. | Automated transcription system and method using two speech converting instances and computer-assisted correction |
GB2347239B (en) | 1999-02-22 | 2003-09-24 | Nokia Mobile Phones Ltd | A communication terminal having a predictive editor application |
US6317718B1 (en) | 1999-02-26 | 2001-11-13 | Accenture Properties (2) B.V. | System, method and article of manufacture for location-based filtering for shopping agent in the physical world |
US6462778B1 (en) | 1999-02-26 | 2002-10-08 | Sony Corporation | Methods and apparatus for associating descriptive data with digital image files |
GB9904662D0 (en) | 1999-03-01 | 1999-04-21 | Canon Kk | Natural language search method and apparatus |
US20020013852A1 (en) | 2000-03-03 | 2002-01-31 | Craig Janik | System for providing content, management, and interactivity for thin client devices |
AU777693B2 (en) | 1999-03-05 | 2004-10-28 | Canon Kabushiki Kaisha | Database annotation and retrieval |
US6356905B1 (en) | 1999-03-05 | 2002-03-12 | Accenture Llp | System, method and article of manufacture for mobile communication utilizing an interface support framework |
DE50006493D1 (en) | 1999-03-08 | 2004-06-24 | Siemens Ag | METHOD AND ARRANGEMENT FOR DETERMINING A FEATURE DESCRIPTION OF A VOICE SIGNAL |
US6374217B1 (en) | 1999-03-12 | 2002-04-16 | Apple Computer, Inc. | Fast update implementation for efficient latent semantic language modeling |
US6185533B1 (en) | 1999-03-15 | 2001-02-06 | Matsushita Electric Industrial Co., Ltd. | Generation and synthesis of prosody templates |
US6928404B1 (en) | 1999-03-17 | 2005-08-09 | International Business Machines Corporation | System and methods for acoustic and language modeling for automatic speech recognition with large vocabularies |
US6584464B1 (en) | 1999-03-19 | 2003-06-24 | Ask Jeeves, Inc. | Grammar template query system |
US6862710B1 (en) | 1999-03-23 | 2005-03-01 | Insightful Corporation | Internet navigation using soft hyperlinks |
US6510406B1 (en) | 1999-03-23 | 2003-01-21 | Mathsoft, Inc. | Inverse inference engine for high performance web search |
US6469712B1 (en) | 1999-03-25 | 2002-10-22 | International Business Machines Corporation | Projected audio for computer displays |
WO2000058946A1 (en) | 1999-03-26 | 2000-10-05 | Koninklijke Philips Electronics N.V. | Client-server speech recognition |
US6041023A (en) | 1999-03-29 | 2000-03-21 | Lakhansingh; Cynthia | Portable digital radio and compact disk player |
US6671672B1 (en) | 1999-03-30 | 2003-12-30 | Nuance Communications | Voice authentication system having cognitive recall mechanism for password verification |
US6954902B2 (en) | 1999-03-31 | 2005-10-11 | Sony Corporation | Information sharing processing method, information sharing processing program storage medium, information sharing processing apparatus, and information sharing processing system |
US6377928B1 (en) | 1999-03-31 | 2002-04-23 | Sony Corporation | Voice recognition for animated agent-based navigation |
US7761296B1 (en) | 1999-04-02 | 2010-07-20 | International Business Machines Corporation | System and method for rescoring N-best hypotheses of an automatic speech recognition system |
US6356854B1 (en) | 1999-04-05 | 2002-03-12 | Delphi Technologies, Inc. | Holographic object position and type sensing system and method |
WO2000060435A2 (en) | 1999-04-07 | 2000-10-12 | Rensselaer Polytechnic Institute | System and method for accessing personal information |
US6631346B1 (en) | 1999-04-07 | 2003-10-07 | Matsushita Electric Industrial Co., Ltd. | Method and apparatus for natural language parsing using multiple passes and tags |
US6631186B1 (en) | 1999-04-09 | 2003-10-07 | Sbc Technology Resources, Inc. | System and method for implementing and accessing call forwarding services |
US6647260B2 (en) | 1999-04-09 | 2003-11-11 | Openwave Systems Inc. | Method and system facilitating web based provisioning of two-way mobile communications devices |
US6408272B1 (en) | 1999-04-12 | 2002-06-18 | General Magic, Inc. | Distributed voice user interface |
US6538665B2 (en) | 1999-04-15 | 2003-03-25 | Apple Computer, Inc. | User interface for presenting media information |
US6502194B1 (en) | 1999-04-16 | 2002-12-31 | Synetix Technologies | System for playback of network audio material on demand |
JP3789756B2 (en) | 1999-04-19 | 2006-06-28 | 三洋電機株式会社 | Mobile phone equipment |
JP3711411B2 (en) | 1999-04-19 | 2005-11-02 | 沖電気工業株式会社 | Speech synthesizer |
US6463413B1 (en) * | 1999-04-20 | 2002-10-08 | Matsushita Electric Industrial Co., Ltd. | Speech recognition training for small hardware devices |
US7558381B1 (en) | 1999-04-22 | 2009-07-07 | Agere Systems Inc. | Retrieval of deleted voice messages in voice messaging system |
JP2000305585A (en) | 1999-04-23 | 2000-11-02 | Oki Electric Ind Co Ltd | Speech synthesizing device |
US6924828B1 (en) | 1999-04-27 | 2005-08-02 | Surfnotes | Method and apparatus for improved information representation |
US6697780B1 (en) | 1999-04-30 | 2004-02-24 | At&T Corp. | Method and apparatus for rapid acoustic unit selection from a large speech corpus |
GB9910448D0 (en) | 1999-05-07 | 1999-07-07 | Ensigma Ltd | Cancellation of non-stationary interfering signals for speech recognition |
US6766295B1 (en) * | 1999-05-10 | 2004-07-20 | Nuance Communications | Adaptation of a speech recognition system across multiple remote sessions with a speaker |
US6741264B1 (en) | 1999-05-11 | 2004-05-25 | Gific Corporation | Method of generating an audible indication of data stored in a database |
US6928149B1 (en) | 1999-05-17 | 2005-08-09 | Interwoven, Inc. | Method and apparatus for a user controlled voicemail management system |
US6161944A (en) | 1999-05-18 | 2000-12-19 | Micron Electronics, Inc. | Retractable keyboard illumination device |
US7286115B2 (en) | 2000-05-26 | 2007-10-23 | Tegic Communications, Inc. | Directional input system with automatic correction |
US7030863B2 (en) | 2000-05-26 | 2006-04-18 | America Online, Incorporated | Virtual keyboard system with automatic correction |
ATE443946T1 (en) | 1999-05-27 | 2009-10-15 | Tegic Communications Inc | KEYBOARD SYSTEM WITH AUTOMATIC CORRECTION |
US7821503B2 (en) | 2003-04-09 | 2010-10-26 | Tegic Communications, Inc. | Touch screen and graphical user interface |
WO2000073936A1 (en) | 1999-05-28 | 2000-12-07 | Sehda, Inc. | Phrase-based dialogue modeling with particular application to creating recognition grammars for voice-controlled user interfaces |
US20020032564A1 (en) | 2000-04-19 | 2002-03-14 | Farzad Ehsani | Phrase-based dialogue modeling with particular application to creating a recognition grammar for a voice-controlled user interface |
JP2000339137A (en) | 1999-05-31 | 2000-12-08 | Sanyo Electric Co Ltd | Electronic mail receiving system |
US6728675B1 (en) | 1999-06-03 | 2004-04-27 | International Business Machines Corporation | Data processor controlled display system with audio identifiers for overlapping windows in an interactive graphical user interface |
US6931384B1 (en) | 1999-06-04 | 2005-08-16 | Microsoft Corporation | System and method providing utility-based decision making about clarification dialog given communicative uncertainty |
US6598039B1 (en) | 1999-06-08 | 2003-07-22 | Albert-Inc. S.A. | Natural language interface for searching database |
US6701305B1 (en) | 1999-06-09 | 2004-03-02 | The Boeing Company | Methods, apparatus and computer program products for information retrieval and document classification utilizing a multidimensional subspace |
US7711565B1 (en) | 1999-06-10 | 2010-05-04 | Gazdzinski Robert F | “Smart” elevator system and method |
US7093693B1 (en) | 1999-06-10 | 2006-08-22 | Gazdzinski Robert F | Elevator access control system and method |
US8065155B1 (en) | 1999-06-10 | 2011-11-22 | Gazdzinski Robert F | Adaptive advertising apparatus and methods |
US6615175B1 (en) | 1999-06-10 | 2003-09-02 | Robert F. Gazdzinski | “Smart” elevator system and method |
US6611802B2 (en) | 1999-06-11 | 2003-08-26 | International Business Machines Corporation | Method and system for proofreading and correcting dictated text |
US6658577B2 (en) | 1999-06-14 | 2003-12-02 | Apple Computer, Inc. | Breathing status LED indicator |
US6711585B1 (en) | 1999-06-15 | 2004-03-23 | Kanisa Inc. | System and method for implementing a knowledge management system |
US6401065B1 (en) | 1999-06-17 | 2002-06-04 | International Business Machines Corporation | Intelligent keyboard interface with use of human language processing |
US7190883B2 (en) | 1999-06-18 | 2007-03-13 | Intel Corporation | Systems and methods for fast random access and backward playback of video frames using decoded frame cache |
KR19990073234A (en) | 1999-06-24 | 1999-10-05 | 이영만 | MP3 data transmission and reception device |
JP2001014306A (en) | 1999-06-30 | 2001-01-19 | Sony Corp | Method and device for electronic document processing, and recording medium where electronic document processing program is recorded |
AUPQ138199A0 (en) | 1999-07-02 | 1999-07-29 | Telstra R & D Management Pty Ltd | A search system |
US6615176B2 (en) | 1999-07-13 | 2003-09-02 | International Business Machines Corporation | Speech enabling labeless controls in an existing graphical user interface |
US6442518B1 (en) | 1999-07-14 | 2002-08-27 | Compaq Information Technologies Group, L.P. | Method for refining time alignments of closed captions |
US6904405B2 (en) | 1999-07-17 | 2005-06-07 | Edwin A. Suominen | Message recognition using shared language model |
JP2003520983A (en) | 1999-07-21 | 2003-07-08 | アバイア テクノロジー コーポレーション | Improved text-to-speech conversion |
US6332138B1 (en) | 1999-07-23 | 2001-12-18 | Merck & Co., Inc. | Text influenced molecular indexing system and computer-implemented and/or computer-assisted method for same |
JP3361291B2 (en) | 1999-07-23 | 2003-01-07 | コナミ株式会社 | Speech synthesis method, speech synthesis device, and computer-readable medium recording speech synthesis program |
IL131135A0 (en) | 1999-07-27 | 2001-01-28 | Electric Lighthouse Software L | A method and system for electronic mail |
US6421672B1 (en) | 1999-07-27 | 2002-07-16 | Verizon Services Corp. | Apparatus for and method of disambiguation of directory listing searches utilizing multiple selectable secondary search keys |
US6628808B1 (en) | 1999-07-28 | 2003-09-30 | Datacard Corporation | Apparatus and method for verifying a scanned image |
US6553263B1 (en) | 1999-07-30 | 2003-04-22 | Advanced Bionics Corporation | Implantable pulse generators using rechargeable zero-volt technology lithium-ion batteries |
US6493667B1 (en) | 1999-08-05 | 2002-12-10 | International Business Machines Corporation | Enhanced likelihood computation using regression in a speech recognition system |
US7743188B2 (en) | 1999-08-12 | 2010-06-22 | Palm, Inc. | Method and apparatus for accessing a contacts database and telephone services |
US7451177B1 (en) | 1999-08-12 | 2008-11-11 | Avintaquin Capital, Llc | System for and method of implementing a closed loop response architecture for electronic commerce |
US7007239B1 (en) | 2000-09-21 | 2006-02-28 | Palm, Inc. | Method and apparatus for accessing a contacts database and telephone services |
US6721802B1 (en) | 1999-08-12 | 2004-04-13 | Point2 Technologies Inc. | Method, apparatus and program for the central storage of standardized image data |
US9167073B2 (en) | 1999-08-12 | 2015-10-20 | Hewlett-Packard Development Company, L.P. | Method and apparatus for accessing a contacts database and telephone services |
US8064886B2 (en) | 1999-08-12 | 2011-11-22 | Hewlett-Packard Development Company, L.P. | Control mechanisms for mobile devices |
US7069220B2 (en) | 1999-08-13 | 2006-06-27 | International Business Machines Corporation | Method for determining and maintaining dialog focus in a conversational speech system |
JP2001056233A (en) | 1999-08-17 | 2001-02-27 | Arex:Kk | On-vehicle voice information service device and voice information service system utilizing the same |
US6622121B1 (en) | 1999-08-20 | 2003-09-16 | International Business Machines Corporation | Testing speech recognition systems using test data generated by text-to-speech conversion |
US6792086B1 (en) | 1999-08-24 | 2004-09-14 | Microstrategy, Inc. | Voice network access provider system and method |
EP1079387A3 (en) | 1999-08-26 | 2003-07-09 | Matsushita Electric Industrial Co., Ltd. | Mechanism for storing information about recorded television broadcasts |
US6324512B1 (en) | 1999-08-26 | 2001-11-27 | Matsushita Electric Industrial Co., Ltd. | System and method for allowing family members to access TV contents and program media recorder over telephone or internet |
US6601234B1 (en) | 1999-08-31 | 2003-07-29 | Accenture Llp | Attribute dictionary in a business logic services environment |
US6912499B1 (en) | 1999-08-31 | 2005-06-28 | Nortel Networks Limited | Method and apparatus for training a multilingual speech model set |
US6697824B1 (en) | 1999-08-31 | 2004-02-24 | Accenture Llp | Relationship management in an E-commerce application framework |
US6671856B1 (en) | 1999-09-01 | 2003-12-30 | International Business Machines Corporation | Method, system, and program for determining boundaries in a string using a dictionary |
US6470347B1 (en) | 1999-09-01 | 2002-10-22 | International Business Machines Corporation | Method, system, program, and data structure for a dense array storing character strings |
GB2353927B (en) | 1999-09-06 | 2004-02-11 | Nokia Mobile Phones Ltd | User interface for text to speech conversion |
US6675169B1 (en) | 1999-09-07 | 2004-01-06 | Microsoft Corporation | Method and system for attaching information to words of a trie |
US6448986B1 (en) | 1999-09-07 | 2002-09-10 | Spotware Technologies Llc | Method and system for displaying graphical objects on a display screen |
US6421717B1 (en) | 1999-09-10 | 2002-07-16 | Avantgo, Inc. | System, method, and computer program product for customizing channels, content, and data for mobile devices |
US7127403B1 (en) | 1999-09-13 | 2006-10-24 | Microstrategy, Inc. | System and method for personalizing an interactive voice broadcast of a voice service based on particulars of a request |
US6885734B1 (en) | 1999-09-13 | 2005-04-26 | Microstrategy, Incorporated | System and method for the creation and automatic deployment of personalized, dynamic and interactive inbound and outbound voice services, with real-time interactive voice database queries |
US6633932B1 (en) | 1999-09-14 | 2003-10-14 | Texas Instruments Incorporated | Method and apparatus for using a universal serial bus to provide power to a portable electronic device |
DE19943875A1 (en) | 1999-09-14 | 2001-03-15 | Thomson Brandt Gmbh | Voice control system with a microphone array |
US6918677B2 (en) | 1999-09-15 | 2005-07-19 | Michael Shipman | Illuminated keyboard |
US6217183B1 (en) | 1999-09-15 | 2001-04-17 | Michael Shipman | Keyboard having illuminated keys |
US6601026B2 (en) | 1999-09-17 | 2003-07-29 | Discern Communications, Inc. | Information retrieval by natural language querying |
US6453315B1 (en) | 1999-09-22 | 2002-09-17 | Applied Semantics, Inc. | Meaning-based information organization and retrieval |
US7925610B2 (en) | 1999-09-22 | 2011-04-12 | Google Inc. | Determining a meaning of a knowledge item using document-based information |
US6463128B1 (en) | 1999-09-29 | 2002-10-08 | Denso Corporation | Adjustable coding detection in a portable telephone |
US6879957B1 (en) | 1999-10-04 | 2005-04-12 | William H. Pechter | Method for producing a speech rendition of text from diphone sounds |
US6868385B1 (en) | 1999-10-05 | 2005-03-15 | Yomobile, Inc. | Method and apparatus for the provision of information signals based upon speech recognition |
US6789231B1 (en) | 1999-10-05 | 2004-09-07 | Microsoft Corporation | Method and system for providing alternatives for text derived from stochastic input sources |
US6192253B1 (en) | 1999-10-06 | 2001-02-20 | Motorola, Inc. | Wrist-carried radiotelephone |
US6505175B1 (en) | 1999-10-06 | 2003-01-07 | Goldman, Sachs & Co. | Order centric tracking system |
US6625583B1 (en) | 1999-10-06 | 2003-09-23 | Goldman, Sachs & Co. | Handheld trading system interface |
ATE230917T1 (en) | 1999-10-07 | 2003-01-15 | Zlatan Ribic | Method and arrangement for recording sound signals |
US7020685B1 (en) | 1999-10-08 | 2006-03-28 | Openwave Systems Inc. | Method and apparatus for providing internet content to SMS-based wireless devices |
US7219123B1 (en) | 1999-10-08 | 2007-05-15 | At Road, Inc. | Portable browser device with adaptive personalization capability |
US6192340B1 (en) | 1999-10-19 | 2001-02-20 | Max Abecassis | Integration of music from a personal library with real-time information |
US7176372B2 (en) | 1999-10-19 | 2007-02-13 | Medialab Solutions Llc | Interactive digital music recorder and player |
US6353794B1 (en) | 1999-10-19 | 2002-03-05 | Ar Group, Inc. | Air travel information and computer data compilation, retrieval and display method and system |
AU8030300A (en) | 1999-10-19 | 2001-04-30 | Sony Electronics Inc. | Natural language interface control system |
US6970915B1 (en) | 1999-11-01 | 2005-11-29 | Tellme Networks, Inc. | Streaming content over a telephone interface |
US6807574B1 (en) | 1999-10-22 | 2004-10-19 | Tellme Networks, Inc. | Method and apparatus for content personalization over a telephone interface |
AU2299701A (en) | 1999-10-22 | 2001-04-30 | Tellme Networks, Inc. | Streaming content over a telephone interface |
US6473630B1 (en) | 1999-10-22 | 2002-10-29 | Sony Corporation | Method and apparatus for powering a wireless headset used with a personal electronic device |
JP2001125896A (en) | 1999-10-26 | 2001-05-11 | Victor Co Of Japan Ltd | Natural language interactive system |
US7310600B1 (en) | 1999-10-28 | 2007-12-18 | Canon Kabushiki Kaisha | Language recognition using a similarity measure |
US6772195B1 (en) | 1999-10-29 | 2004-08-03 | Electronic Arts, Inc. | Chat clusters for a virtual world application |
GB2355834A (en) | 1999-10-29 | 2001-05-02 | Nokia Mobile Phones Ltd | Speech recognition |
US6725190B1 (en) | 1999-11-02 | 2004-04-20 | International Business Machines Corporation | Method and system for speech reconstruction from speech recognition features, pitch and voicing with resampled basis functions providing reconstruction of the spectral envelope |
WO2001033569A1 (en) | 1999-11-02 | 2001-05-10 | Iomega Corporation | Portable audio playback device and removable disk drive |
US6535983B1 (en) | 1999-11-08 | 2003-03-18 | 3Com Corporation | System and method for signaling and detecting request for power over ethernet |
US6546262B1 (en) | 1999-11-12 | 2003-04-08 | Altec Lansing Technologies, Inc. | Cellular telephone accessory device for a personal computer system |
US7050977B1 (en) | 1999-11-12 | 2006-05-23 | Phoenix Solutions, Inc. | Speech-enabled server for internet website and method |
US6633846B1 (en) | 1999-11-12 | 2003-10-14 | Phoenix Solutions, Inc. | Distributed realtime speech recognition system |
US7392185B2 (en) | 1999-11-12 | 2008-06-24 | Phoenix Solutions, Inc. | Speech based learning/training system using semantic decoding |
US6615172B1 (en) | 1999-11-12 | 2003-09-02 | Phoenix Solutions, Inc. | Intelligent query engine for processing voice based queries |
US9076448B2 (en) | 1999-11-12 | 2015-07-07 | Nuance Communications, Inc. | Distributed real time speech recognition system |
US6665640B1 (en) | 1999-11-12 | 2003-12-16 | Phoenix Solutions, Inc. | Interactive speech based learning/training system formulating search queries based on natural language parsing of recognized user queries |
US7725307B2 (en) | 1999-11-12 | 2010-05-25 | Phoenix Solutions, Inc. | Query engine for processing voice based queries including semantic decoding |
DE19955720C2 (en) | 1999-11-16 | 2002-04-11 | Hosseinzadeh Dolkhani Boris | Method and portable training device for performing training |
JP2001148899A (en) | 1999-11-19 | 2001-05-29 | Matsushita Electric Ind Co Ltd | Communication system, hearing aid, and adjustment method for the hearing aid |
US7412643B1 (en) | 1999-11-23 | 2008-08-12 | International Business Machines Corporation | Method and apparatus for linking representation and realization data |
US6532446B1 (en) | 1999-11-24 | 2003-03-11 | Openwave Systems Inc. | Server based speech recognition user interface for wireless devices |
US6526382B1 (en) | 1999-12-07 | 2003-02-25 | Comverse, Inc. | Language-oriented user interfaces for voice activated services |
US6755743B1 (en) | 1999-12-08 | 2004-06-29 | Kabushiki Kaisha Sega Enterprises | Communication game system and processing method thereof |
US6340937B1 (en) | 1999-12-09 | 2002-01-22 | Matej Stepita-Klauco | System and method for mapping multiple identical consecutive keystrokes to replacement characters |
US20010030660A1 (en) | 1999-12-10 | 2001-10-18 | Roustem Zainoulline | Interactive graphical user interface and method for previewing media products |
US7024363B1 (en) | 1999-12-14 | 2006-04-04 | International Business Machines Corporation | Methods and apparatus for contingent transfer and execution of spoken language interfaces |
GB2357395A (en) | 1999-12-14 | 2001-06-20 | Nokia Mobile Phones Ltd | Message exchange between wireless terminals |
US6377925B1 (en) | 1999-12-16 | 2002-04-23 | Interactive Solutions, Inc. | Electronic translator for assisting communications |
US6978127B1 (en) | 1999-12-16 | 2005-12-20 | Koninklijke Philips Electronics N.V. | Hand-ear user interface for hand-held device |
US7434177B1 (en) | 1999-12-20 | 2008-10-07 | Apple Inc. | User interface for providing consolidation and access |
US6760412B1 (en) | 1999-12-21 | 2004-07-06 | Nortel Networks Limited | Remote reminder of scheduled events |
US20060184886A1 (en) | 1999-12-22 | 2006-08-17 | Urbanpixel Inc. | Spatial chat in a multiple browser environment |
US6397186B1 (en) | 1999-12-22 | 2002-05-28 | Ambush Interactive, Inc. | Hands-free, voice-operated remote control transmitter |
US6526395B1 (en) | 1999-12-31 | 2003-02-25 | Intel Corporation | Application of personality models and interaction with synthetic characters in a computing system |
US20010042107A1 (en) | 2000-01-06 | 2001-11-15 | Palm Stephen R. | Networked audio player transport protocol and architecture |
US7024366B1 (en) | 2000-01-10 | 2006-04-04 | Delphi Technologies, Inc. | Speech recognition with user specific adaptive voice feedback |
US6556983B1 (en) | 2000-01-12 | 2003-04-29 | Microsoft Corporation | Methods and apparatus for finding semantic information, such as usage logs, similar to a query using a pattern lattice data space |
KR100865247B1 (en) | 2000-01-13 | 2008-10-27 | 디지맥 코포레이션 | Authenticating metadata and embedding metadata in watermarks of media signals |
US6546388B1 (en) | 2000-01-14 | 2003-04-08 | International Business Machines Corporation | Metadata search results ranking system |
US6701294B1 (en) | 2000-01-19 | 2004-03-02 | Lucent Technologies, Inc. | User interface for translating natural language inquiries into database queries and data presentations |
US20020055934A1 (en) | 2000-01-24 | 2002-05-09 | Lipscomb Kenneth O. | Dynamic management and organization of media assets in a media player device |
US6732142B1 (en) | 2000-01-25 | 2004-05-04 | International Business Machines Corporation | Method and apparatus for audible presentation of web page content |
US6751621B1 (en) | 2000-01-27 | 2004-06-15 | Manning & Napier Information Services, Llc. | Construction of trainable semantic vectors and clustering, classification, and searching using trainable semantic vectors |
US6269712B1 (en) | 2000-01-28 | 2001-08-07 | John Zentmyer | Automotive full locking differential |
US6813607B1 (en) | 2000-01-31 | 2004-11-02 | International Business Machines Corporation | Translingual visual speech synthesis |
US6829603B1 (en) | 2000-02-02 | 2004-12-07 | International Business Machines Corp. | System, method and program product for interactive natural dialog |
US20030028380A1 (en) | 2000-02-02 | 2003-02-06 | Freeland Warwick Peter | Speech system |
US20010041021A1 (en) | 2000-02-04 | 2001-11-15 | Boyle Dennis J. | System and method for synchronization of image data between a handheld device and a computer |
GB2359177A (en) | 2000-02-08 | 2001-08-15 | Nokia Corp | Orientation sensitive display and selection mechanism |
US7149964B1 (en) | 2000-02-09 | 2006-12-12 | Microsoft Corporation | Creation and delivery of customized content |
US6871346B1 (en) | 2000-02-11 | 2005-03-22 | Microsoft Corp. | Back-end decoupled management model and management system utilizing same |
US6895558B1 (en) | 2000-02-11 | 2005-05-17 | Microsoft Corporation | Multi-access mode electronic personal assistant |
US6640098B1 (en) | 2000-02-14 | 2003-10-28 | Action Engine Corporation | System for obtaining service-related information for local interactive wireless devices |
US6606388B1 (en) | 2000-02-17 | 2003-08-12 | Arboretum Systems, Inc. | Method and system for enhancing audio signals |
GB2365676B (en) | 2000-02-18 | 2004-06-23 | Sensei Ltd | Mobile telephone with improved man-machine interface |
US6850775B1 (en) | 2000-02-18 | 2005-02-01 | Phonak Ag | Fitting system |
US6760754B1 (en) | 2000-02-22 | 2004-07-06 | At&T Corp. | System, method and apparatus for communicating via sound messages and personal sound identifiers |
US20010056342A1 (en) | 2000-02-24 | 2001-12-27 | Piehn Thomas Barry | Voice enabled digital camera and language translator |
US20020055844A1 (en) | 2000-02-25 | 2002-05-09 | L'esperance Lauren | Speech user interface for portable personal devices |
WO2001063382A2 (en) | 2000-02-25 | 2001-08-30 | Synquiry Technologies, Ltd. | Conceptual factoring and unification of graphs representing semantic models |
AU2001243321A1 (en) | 2000-02-28 | 2001-09-12 | C.G.I. Technologies, Llc | Staged image delivery system |
US6934394B1 (en) | 2000-02-29 | 2005-08-23 | Logitech Europe S.A. | Universal four-channel surround sound speaker system for multimedia computer audio sub-systems |
US6490560B1 (en) | 2000-03-01 | 2002-12-03 | International Business Machines Corporation | Method and system for non-intrusive speaker verification using behavior models |
US6519566B1 (en) | 2000-03-01 | 2003-02-11 | International Business Machines Corporation | Method for hands-free operation of a pointer |
US6248946B1 (en) | 2000-03-01 | 2001-06-19 | Ijockey, Inc. | Multimedia content delivery system and method |
US6720980B1 (en) | 2000-03-01 | 2004-04-13 | Microsoft Corporation | Method and system for embedding voice notes |
US6895380B2 (en) | 2000-03-02 | 2005-05-17 | Electro Standards Laboratories | Voice actuation with contextual learning for intelligent machine control |
US6449620B1 (en) | 2000-03-02 | 2002-09-10 | Nimble Technology, Inc. | Method and apparatus for generating information pages using semi-structured data stored in a structured manner |
US6597345B2 (en) | 2000-03-03 | 2003-07-22 | Jetway Technologies Ltd. | Multifunctional keypad on touch screen |
US6466654B1 (en) | 2000-03-06 | 2002-10-15 | Avaya Technology Corp. | Personal virtual assistant with semantic tagging |
EP1275042A2 (en) | 2000-03-06 | 2003-01-15 | Kanisa Inc. | A system and method for providing an intelligent multi-step dialog with a user |
US6757362B1 (en) | 2000-03-06 | 2004-06-29 | Avaya Technology Corp. | Personal virtual assistant |
US6721489B1 (en) | 2000-03-08 | 2004-04-13 | Phatnoise, Inc. | Play list manager |
US6477488B1 (en) | 2000-03-10 | 2002-11-05 | Apple Computer, Inc. | Method for dynamic context scope selection in hybrid n-gram+LSA language modeling |
US6615220B1 (en) | 2000-03-14 | 2003-09-02 | Oracle International Corporation | Method and mechanism for data consolidation |
US8645137B2 (en) | 2000-03-16 | 2014-02-04 | Apple Inc. | Fast, language-independent method for user authentication by voice |
US7634528B2 (en) | 2000-03-16 | 2009-12-15 | Microsoft Corporation | Harnessing information about the timing of a user's client-server interactions to enhance messaging and collaboration services |
US6260011B1 (en) | 2000-03-20 | 2001-07-10 | Microsoft Corporation | Methods and apparatus for automatically synchronizing electronic audio files with electronic text files |
US6510417B1 (en) | 2000-03-21 | 2003-01-21 | America Online, Inc. | System and method for voice access to internet-based information |
US6757646B2 (en) | 2000-03-22 | 2004-06-29 | Insightful Corporation | Extended functionality for an inverse inference engine based web search |
GB2366009B (en) | 2000-03-22 | 2004-07-21 | Canon Kk | Natural language machine interface |
US6658389B1 (en) | 2000-03-24 | 2003-12-02 | Ahmet Alpdemir | System, method, and business model for speech-interactive information system having business self-promotion, audio coupon and rating features |
US20020035474A1 (en) | 2000-07-18 | 2002-03-21 | Ahmet Alpdemir | Voice-interactive marketplace providing time and money saving benefits and real-time promotion publishing and feedback |
US6934684B2 (en) | 2000-03-24 | 2005-08-23 | Dialsurf, Inc. | Voice-interactive marketplace providing promotion and promotion tracking, loyalty reward and redemption, and other features |
US6272464B1 (en) | 2000-03-27 | 2001-08-07 | Lucent Technologies Inc. | Method and apparatus for assembling a prediction list of name pronunciation variations for use during speech recognition |
US6918086B2 (en) | 2000-03-28 | 2005-07-12 | Ariel S. Rogson | Method and apparatus for updating database of automatic spelling corrections |
US7187947B1 (en) | 2000-03-28 | 2007-03-06 | Affinity Labs, Llc | System and method for communicating selected information to an electronic device |
US6694297B2 (en) | 2000-03-30 | 2004-02-17 | Fujitsu Limited | Text information read-out device and music/voice reproduction device incorporating the same |
US6304844B1 (en) | 2000-03-30 | 2001-10-16 | Verbaltek, Inc. | Spelling speech recognition apparatus and method for communications |
US20010029455A1 (en) | 2000-03-31 | 2001-10-11 | Chin Jeffrey J. | Method and apparatus for providing multilingual translation over a network |
JP2001282279A (en) | 2000-03-31 | 2001-10-12 | Canon Inc | Voice information processor, and its method and storage medium |
US6704015B1 (en) | 2000-03-31 | 2004-03-09 | Ge Mortgage Holdings, Llc | Methods and apparatus for providing a quality control management system |
JP3728172B2 (en) | 2000-03-31 | 2005-12-21 | キヤノン株式会社 | Speech synthesis method and apparatus |
US7039588B2 (en) | 2000-03-31 | 2006-05-02 | Canon Kabushiki Kaisha | Synthesis unit selection apparatus and method, and storage medium |
AU2001248741A1 (en) | 2000-04-03 | 2001-10-15 | Yamaha Corporation | Portable appliance, power saving method and sound volume compensating method, and storage medium |
NL1014847C1 (en) | 2000-04-05 | 2001-10-08 | Minos B V I O | Rapid data transfer from suppliers of goods and services to clients via eg Internet using hierarchical menu system |
US7177798B2 (en) | 2000-04-07 | 2007-02-13 | Rensselaer Polytechnic Institute | Natural language interface using constrained intermediate dictionary of results |
US7124164B1 (en) | 2001-04-17 | 2006-10-17 | Chemtob Helen J | Method and apparatus for providing group interaction via communications networks |
US7478129B1 (en) | 2000-04-18 | 2009-01-13 | Helen Jeanne Chemtob | Method and apparatus for providing group interaction via communications networks |
US6721734B1 (en) | 2000-04-18 | 2004-04-13 | Claritech Corporation | Method and apparatus for information management using fuzzy typing |
US6976090B2 (en) | 2000-04-20 | 2005-12-13 | Actona Technologies Ltd. | Differentiated content and application delivery via internet |
US7194186B1 (en) | 2000-04-21 | 2007-03-20 | Vulcan Patents Llc | Flexible marking of recording data by a recording unit |
US6963841B2 (en) | 2000-04-21 | 2005-11-08 | Lessac Technology, Inc. | Speech training method with alternative proper pronunciation database |
US6865533B2 (en) | 2000-04-21 | 2005-03-08 | Lessac Technology Inc. | Text to speech |
US6917373B2 (en) | 2000-12-28 | 2005-07-12 | Microsoft Corporation | Context sensitive labels for an electronic device |
US7107204B1 (en) | 2000-04-24 | 2006-09-12 | Microsoft Corporation | Computer-aided writing system and method with cross-language writing wizard |
US6829607B1 (en) | 2000-04-24 | 2004-12-07 | Microsoft Corporation | System and method for facilitating user input by automatically providing dynamically generated completion information |
US6810379B1 (en) | 2000-04-24 | 2004-10-26 | Sensory, Inc. | Client/server architecture for text-to-speech synthesis |
US7058888B1 (en) | 2000-04-25 | 2006-06-06 | Microsoft Corporation | Multi-modal text editing correction |
WO2001084535A2 (en) | 2000-05-02 | 2001-11-08 | Dragon Systems, Inc. | Error correction in speech recognition |
US7162482B1 (en) | 2000-05-03 | 2007-01-09 | Musicmatch, Inc. | Information retrieval engine |
US6784901B1 (en) | 2000-05-09 | 2004-08-31 | There | Method, system and computer program product for the delivery of a chat message in a 3D multi-user environment |
DE60122708D1 (en) | 2000-05-11 | 2006-10-12 | Nes Stewart Irvine | ZERO CLICK |
US8024419B2 (en) | 2000-05-12 | 2011-09-20 | Sony Corporation | Method and system for remote access of personal music |
KR100867760B1 (en) | 2000-05-15 | 2008-11-10 | 소니 가부시끼 가이샤 | Reproducing apparatus, reproducing method and recording medium |
US8463912B2 (en) | 2000-05-23 | 2013-06-11 | Media Farm, Inc. | Remote displays in mobile communication networks |
JP3728177B2 (en) | 2000-05-24 | 2005-12-21 | キヤノン株式会社 | Audio processing system, apparatus, method, and storage medium |
AU2001263397A1 (en) | 2000-05-24 | 2001-12-03 | Stars 1-To-1 | Interactive voice communication method and system for information and entertainment |
FR2809509B1 (en) | 2000-05-26 | 2003-09-12 | Bull Sa | System and method for internationalizing the content of tagged documents in a computer system |
US6910007B2 (en) | 2000-05-31 | 2005-06-21 | At&T Corp | Stochastic modeling of spectral adjustment for high quality pitch modification |
EP1160764A1 (en) | 2000-06-02 | 2001-12-05 | Sony France S.A. | Morphological categories for voice synthesis |
US6754504B1 (en) | 2000-06-10 | 2004-06-22 | Motorola, Inc. | Method and apparatus for controlling environmental conditions using a personal area network |
US6889361B1 (en) | 2000-06-13 | 2005-05-03 | International Business Machines Corporation | Educational spell checker |
US6839742B1 (en) | 2000-06-14 | 2005-01-04 | Hewlett-Packard Development Company, L.P. | World wide contextual navigation |
DE10030105A1 (en) | 2000-06-19 | 2002-01-03 | Bosch Gmbh Robert | Speech recognition device |
US20020042707A1 (en) | 2000-06-19 | 2002-04-11 | Gang Zhao | Grammar-packaged parsing |
US6680675B1 (en) | 2000-06-21 | 2004-01-20 | Fujitsu Limited | Interactive to-do list item notification system including GPS interface |
US6591379B1 (en) | 2000-06-23 | 2003-07-08 | Microsoft Corporation | Method and system for injecting an exception to recover unsaved data |
WO2002001401A1 (en) | 2000-06-26 | 2002-01-03 | Onerealm Inc. | Method and apparatus for normalizing and converting structured content |
US6336727B1 (en) | 2000-06-27 | 2002-01-08 | International Business Machines Corporation | Pointing device keyboard light |
JP3573688B2 (en) | 2000-06-28 | 2004-10-06 | 松下電器産業株式会社 | Similar document search device and related keyword extraction device |
JP2002014954A (en) | 2000-06-28 | 2002-01-18 | Toshiba Corp | Chinese language inputting and converting processing device and method, and recording medium |
JP3524846B2 (en) | 2000-06-29 | 2004-05-10 | 株式会社Ssr | Document feature extraction method and apparatus for text mining |
US6823311B2 (en) | 2000-06-29 | 2004-11-23 | Fujitsu Limited | Data processing system for vocalizing web content |
JP2002083152A (en) | 2000-06-30 | 2002-03-22 | Victor Co Of Japan Ltd | Contents download system, portable terminal player, and contents provider |
DE10031008A1 (en) | 2000-06-30 | 2002-01-10 | Nokia Mobile Phones Ltd | Procedure for assembling sentences for speech output |
US7277855B1 (en) | 2000-06-30 | 2007-10-02 | At&T Corp. | Personalized text-to-speech services |
US6684187B1 (en) | 2000-06-30 | 2004-01-27 | At&T Corp. | Method and system for preselection of suitable units for concatenative speech |
US6691111B2 (en) | 2000-06-30 | 2004-02-10 | Research In Motion Limited | System and method for implementing a natural language user interface |
US6505158B1 (en) | 2000-07-05 | 2003-01-07 | At&T Corp. | Synthesis-based pre-selection of suitable units for concatenative speech |
US6662023B1 (en) | 2000-07-06 | 2003-12-09 | Nokia Mobile Phones Ltd. | Method and apparatus for controlling and securing mobile phones that are lost, stolen or misused |
US6240362B1 (en) | 2000-07-10 | 2001-05-29 | Iap Intermodal, Llc | Method to schedule a vehicle in real-time to transport freight and passengers |
JP3949356B2 (en) | 2000-07-12 | 2007-07-25 | 三菱電機株式会社 | Spoken dialogue system |
US7389225B1 (en) | 2000-10-18 | 2008-06-17 | Novell, Inc. | Method and mechanism for superpositioning state vectors in a semantic abstract |
TW521266B (en) | 2000-07-13 | 2003-02-21 | Verbaltek Inc | Perceptual phonetic feature speech recognition system and method |
US6598021B1 (en) | 2000-07-13 | 2003-07-22 | Craig R. Shambaugh | Method of modifying speech to provide a user selectable dialect |
US7672952B2 (en) | 2000-07-13 | 2010-03-02 | Novell, Inc. | System and method of semantic correlation of rich content |
US6621892B1 (en) | 2000-07-14 | 2003-09-16 | America Online, Inc. | System and method for converting electronic mail text to audio for telephonic delivery |
US7289102B2 (en) | 2000-07-17 | 2007-10-30 | Microsoft Corporation | Method and apparatus using multiple sensors in a device with a display |
US7139709B2 (en) | 2000-07-20 | 2006-11-21 | Microsoft Corporation | Middleware layer between speech related applications and engines |
US7143040B2 (en) | 2000-07-20 | 2006-11-28 | British Telecommunications Public Limited Company | Interactive dialogues |
SE516658C2 (en) | 2000-07-21 | 2002-02-12 | Ericsson Telefon Ab L M | Procedure and Device for Enhanced Short Message Services |
US20060143007A1 (en) | 2000-07-24 | 2006-06-29 | Koh V E | User interaction with voice information services |
US7308408B1 (en) | 2000-07-24 | 2007-12-11 | Microsoft Corporation | Providing services for an information processing system using an audio interface |
JP2002041276A (en) | 2000-07-24 | 2002-02-08 | Sony Corp | Interactive operation-supporting system, interactive operation-supporting method and recording medium |
US6789094B2 (en) | 2000-07-25 | 2004-09-07 | Sun Microsystems, Inc. | Method and apparatus for providing extended file attributes in an extended attribute namespace |
KR20020009276A (en) | 2000-07-25 | 2002-02-01 | 구자홍 | A mobile phone equipped with audio player and method for providing a MP3 file to mobile phone |
DE60133902D1 (en) | 2000-07-28 | 2008-06-19 | Siemens Vdo Automotive Corp | |
US7092928B1 (en) | 2000-07-31 | 2006-08-15 | Quantum Leap Research, Inc. | Intelligent portal engine |
US7853664B1 (en) | 2000-07-31 | 2010-12-14 | Landmark Digital Services Llc | Method and system for purchasing pre-recorded music |
US20020013784A1 (en) | 2000-07-31 | 2002-01-31 | Swanson Raymond H. | Audio data transmission system and method of operation thereof |
US6714221B1 (en) | 2000-08-03 | 2004-03-30 | Apple Computer, Inc. | Depicting and setting scroll amount |
JP2002055935A (en) | 2000-08-07 | 2002-02-20 | Sony Corp | Apparatus and method for information processing, service providing system, and recording medium |
US20020015064A1 (en) | 2000-08-07 | 2002-02-07 | Robotham John S. | Gesture-based user interface to multi-level and multi-modal sets of bit-maps |
US6778951B1 (en) | 2000-08-09 | 2004-08-17 | Concerto Software, Inc. | Information retrieval method with natural language interface |
US20020120697A1 (en) | 2000-08-14 | 2002-08-29 | Curtis Generous | Multi-channel messaging system and method |
AU2001285023A1 (en) | 2000-08-17 | 2002-02-25 | Mobileum, Inc. | Method and system for wireless voice channel/data channel integration |
JP4197220B2 (en) | 2000-08-17 | 2008-12-17 | アルパイン株式会社 | Operating device |
WO2002017069A1 (en) | 2000-08-21 | 2002-02-28 | Yahoo! Inc. | Method and system of interpreting and presenting web content using a voice browser |
JP3075809U (en) | 2000-08-23 | 2001-03-06 | 新世代株式会社 | Karaoke microphone |
US7024407B2 (en) | 2000-08-24 | 2006-04-04 | Content Analyst Company, Llc | Word sense disambiguation |
US6766320B1 (en) | 2000-08-24 | 2004-07-20 | Microsoft Corporation | Search engine with natural language-based robust parsing for user query and relevance feedback learning |
NL1016056C2 (en) | 2000-08-30 | 2002-03-15 | Koninkl Kpn Nv | Method and system for personalization of digital information |
US7062488B1 (en) | 2000-08-30 | 2006-06-13 | Richard Reisman | Task/domain segmentation in applying feedback to command control |
US6529586B1 (en) | 2000-08-31 | 2003-03-04 | Oracle Cable, Inc. | System and method for gathering, personalized rendering, and secure telephonic transmission of audio data |
DE10042944C2 (en) | 2000-08-31 | 2003-03-13 | Siemens Ag | Grapheme-phoneme conversion |
US6799098B2 (en) | 2000-09-01 | 2004-09-28 | Beltpack Corporation | Remote control system for a locomotive using voice commands |
US6556971B1 (en) | 2000-09-01 | 2003-04-29 | Snap-On Technologies, Inc. | Computer-implemented speech recognition system training |
GB2366940B (en) | 2000-09-06 | 2004-08-11 | Ericsson Telefon Ab L M | Text language detection |
US20050030175A1 (en) | 2003-08-07 | 2005-02-10 | Wolfe Daniel G. | Security apparatus, system, and method |
WO2002021438A2 (en) | 2000-09-07 | 2002-03-14 | Koninklijke Philips Electronics N.V. | Image matching |
JP2002082893A (en) | 2000-09-07 | 2002-03-22 | Hiroyuki Tarumi | Terminal with chatting means, editing device, chat server and recording medium |
GB2366542B (en) | 2000-09-09 | 2004-02-18 | Ibm | Keyboard illumination for computing devices having backlit displays |
US7095733B1 (en) | 2000-09-11 | 2006-08-22 | Yahoo! Inc. | Voice integrated VOIP system |
US6603837B1 (en) | 2000-09-11 | 2003-08-05 | Kinera, Inc. | Method and system to provide a global integrated messaging services distributed network with personalized international roaming |
WO2002023796A1 (en) | 2000-09-11 | 2002-03-21 | Sentrycom Ltd. | A biometric-based system and method for enabling authentication of electronic messages sent over a network |
JP3784289B2 (en) | 2000-09-12 | 2006-06-07 | 松下電器産業株式会社 | Media editing method and apparatus |
US7236932B1 (en) | 2000-09-12 | 2007-06-26 | Avaya Technology Corp. | Method of and apparatus for improving productivity of human reviewers of automatically transcribed documents generated by media conversion systems |
US20040205671A1 (en) | 2000-09-13 | 2004-10-14 | Tatsuya Sukehiro | Natural-language processing system |
US7287009B1 (en) | 2000-09-14 | 2007-10-23 | Raanan Liebermann | System and a method for carrying out personal and business transactions |
AU2001290882A1 (en) | 2000-09-15 | 2002-03-26 | Lernout And Hauspie Speech Products N.V. | Fast waveform synchronization for concatenation and time-scale modification of speech |
HRP20000624A2 (en) | 2000-09-20 | 2001-04-30 | Grabar Ivan | Mp3 jukebox |
JP3818428B2 (en) | 2000-09-21 | 2006-09-06 | 株式会社セガ | Character communication device |
US7813915B2 (en) | 2000-09-25 | 2010-10-12 | Fujitsu Limited | Apparatus for reading a plurality of documents and a method thereof |
US6999914B1 (en) | 2000-09-28 | 2006-02-14 | Manning And Napier Information Services Llc | Device and method of determining emotive index corresponding to a message |
AU2001295080A1 (en) | 2000-09-29 | 2002-04-08 | Professorq, Inc. | Natural-language voice-activated personal assistant |
US6836760B1 (en) | 2000-09-29 | 2004-12-28 | Apple Computer, Inc. | Use of semantic inference and context-free grammar with speech recognition system |
US6999932B1 (en) | 2000-10-10 | 2006-02-14 | Intel Corporation | Language independent voice-based search system |
US20020046315A1 (en) | 2000-10-13 | 2002-04-18 | Interactive Objects, Inc. | System and method for mapping interface functionality to codec functionality in a portable audio device |
WO2002031625A2 (en) | 2000-10-13 | 2002-04-18 | Cytaq, Inc. | A system and method of translating a universal query language to sql |
US7043422B2 (en) | 2000-10-13 | 2006-05-09 | Microsoft Corporation | Method and apparatus for distribution-based language model adaptation |
US7219058B1 (en) | 2000-10-13 | 2007-05-15 | At&T Corp. | System and method for processing speech recognition results |
US7149695B1 (en) | 2000-10-13 | 2006-12-12 | Apple Computer, Inc. | Method and apparatus for speech recognition using semantic inference and word agglomeration |
US7574272B2 (en) | 2000-10-13 | 2009-08-11 | Eric Paul Gibbs | System and method for data transfer optimization in a portable audio device |
US7457750B2 (en) * | 2000-10-13 | 2008-11-25 | At&T Corp. | Systems and methods for dynamic re-configurable speech recognition |
US6947728B2 (en) | 2000-10-13 | 2005-09-20 | Matsushita Electric Industrial Co., Ltd. | Mobile phone with music reproduction function, music data reproduction method by mobile phone with music reproduction function, and the program thereof |
US20020151297A1 (en) | 2000-10-14 | 2002-10-17 | Donald Remboski | Context aware wireless communication device and method |
US6757365B1 (en) | 2000-10-16 | 2004-06-29 | Tellme Networks, Inc. | Instant messaging via telephone interfaces |
GB2386724A (en) | 2000-10-16 | 2003-09-24 | Tangis Corp | Dynamically determining appropriate computer interfaces |
US6990450B2 (en) | 2000-10-19 | 2006-01-24 | Qwest Communications International Inc. | System and method for converting text-to-voice |
US6862568B2 (en) | 2000-10-19 | 2005-03-01 | Qwest Communications International, Inc. | System and method for converting text-to-voice |
KR100726582B1 (en) | 2000-10-25 | 2007-06-11 | 주식회사 케이티 | The Method for Providing Multi-National Character Keyboard by Location Validation of Wireless Communication Terminal |
US6832194B1 (en) | 2000-10-26 | 2004-12-14 | Sensory, Incorporated | Audio recognition peripheral system |
US7027974B1 (en) | 2000-10-27 | 2006-04-11 | Science Applications International Corporation | Ontology-based parser for natural language processing |
IL139347A0 (en) | 2000-10-30 | 2001-11-25 | | Speech generating system and method |
KR100902966B1 (en) | 2000-10-30 | 2009-06-15 | 마이크로소프트 코포레이션 | Method and system for mapping strings for comparison |
US6980953B1 (en) * | 2000-10-31 | 2005-12-27 | International Business Machines Corp. | Real-time remote transcription or translation service |
US6970935B1 (en) | 2000-11-01 | 2005-11-29 | International Business Machines Corporation | Conversational networking via transport, coding and control conversational protocols |
US6934756B2 (en) | 2000-11-01 | 2005-08-23 | International Business Machines Corporation | Conversational networking via transport, coding and control conversational protocols |
US7006969B2 (en) | 2000-11-02 | 2006-02-28 | At&T Corp. | System and method of pattern recognition in very high-dimensional space |
JP2002149187A (en) | 2000-11-07 | 2002-05-24 | Sony Corp | Device and method for recognizing voice and recording medium |
US6918091B2 (en) | 2000-11-09 | 2005-07-12 | Change Tools, Inc. | User definable interface system, method and computer program product |
ATE297588T1 (en) | 2000-11-14 | 2005-06-15 | Ibm | ADJUSTING PHONETIC CONTEXT TO IMPROVE SPEECH RECOGNITION |
US7653691B2 (en) | 2000-11-15 | 2010-01-26 | Pacific Datavision Inc. | Systems and methods for communicating using voice messages |
US6807536B2 (en) | 2000-11-16 | 2004-10-19 | Microsoft Corporation | Methods and systems for computing singular value decompositions of matrices and low rank approximations of matrices |
US6957076B2 (en) | 2000-11-22 | 2005-10-18 | Denso Corporation | Location specific reminders for wireless mobiles |
US20020152076A1 (en) | 2000-11-28 | 2002-10-17 | Jonathan Kahn | System for permanent alignment of text utterances to their associated audio utterances |
US20040085162A1 (en) | 2000-11-29 | 2004-05-06 | Rajeev Agarwal | Method and apparatus for providing a mixed-initiative dialog between a user and a machine |
JP2002169581A (en) | 2000-11-29 | 2002-06-14 | Matsushita Electric Ind Co Ltd | Method and device for voice synthesis |
US6772123B2 (en) | 2000-11-30 | 2004-08-03 | 3Com Corporation | Method and system for performing speech recognition for an internet appliance using a remotely located speech recognition application |
GB0029576D0 (en) | 2000-12-02 | 2001-01-17 | Hewlett Packard Co | Voice site personality setting |
US6978239B2 (en) | 2000-12-04 | 2005-12-20 | Microsoft Corporation | Method and apparatus for speech synthesis without prosody modification |
US20020067308A1 (en) | 2000-12-06 | 2002-06-06 | Xerox Corporation | Location/time-based reminder for personal electronic devices |
US7113943B2 (en) | 2000-12-06 | 2006-09-26 | Content Analyst Company, Llc | Method for document comparison and selection |
US20020072816A1 (en) | 2000-12-07 | 2002-06-13 | Yoav Shdema | Audio system |
US7117231B2 (en) | 2000-12-07 | 2006-10-03 | International Business Machines Corporation | Method and system for the automatic generation of multi-lingual synchronized sub-titles for audiovisual data |
US20020072914A1 (en) | 2000-12-08 | 2002-06-13 | Hiyan Alshawi | Method and apparatus for creation and user-customization of speech-enabled services |
US7016847B1 (en) | 2000-12-08 | 2006-03-21 | Ben Franklin Patent Holdings L.L.C. | Open architecture for a voice user interface |
US6910186B2 (en) | 2000-12-08 | 2005-06-21 | Kyunam Kim | Graphic chatting with organizational avatars |
US7043420B2 (en) | 2000-12-11 | 2006-05-09 | International Business Machines Corporation | Trainable dynamic phrase reordering for natural language generation in conversational systems |
EP1215661A1 (en) | 2000-12-14 | 2002-06-19 | TELEFONAKTIEBOLAGET L M ERICSSON (publ) | Mobile terminal controllable by spoken utterances |
US6718331B2 (en) | 2000-12-14 | 2004-04-06 | International Business Machines Corporation | Method and apparatus for locating inter-enterprise resources using text-based strings |
WO2002050816A1 (en) | 2000-12-18 | 2002-06-27 | Koninklijke Philips Electronics N.V. | Store speech, select vocabulary to recognize word |
US20020077082A1 (en) | 2000-12-18 | 2002-06-20 | Nortel Networks Limited | Voice message presentation on personal wireless devices |
US6910004B2 (en) | 2000-12-19 | 2005-06-21 | Xerox Corporation | Method and computer system for part-of-speech tagging of incomplete sentences |
US20040190688A1 (en) | 2003-03-31 | 2004-09-30 | Timmins Timothy A. | Communications methods and systems using voiceprints |
US7197120B2 (en) | 2000-12-22 | 2007-03-27 | Openwave Systems Inc. | Method and system for facilitating mediated communication |
WO2002052863A2 (en) | 2000-12-22 | 2002-07-04 | Anthropics Technology Limited | Communication system |
US6762741B2 (en) | 2000-12-22 | 2004-07-13 | Visteon Global Technologies, Inc. | Automatic brightness control system and method for a display device using a logarithmic sensor |
EP1217609A3 (en) | 2000-12-22 | 2004-02-25 | Hewlett-Packard Company | Speech recognition |
US6738738B2 (en) | 2000-12-23 | 2004-05-18 | Tellme Networks, Inc. | Automated transformation from American English to British English |
US6973427B2 (en) | 2000-12-26 | 2005-12-06 | Microsoft Corporation | Method for adding phonetic descriptions to a speech recognition lexicon |
TW490655B (en) | 2000-12-27 | 2002-06-11 | Winbond Electronics Corp | Method and device for recognizing authorized users using voice spectrum information |
SE518418C2 (en) | 2000-12-28 | 2002-10-08 | Ericsson Telefon Ab L M | Sound-based proximity detector |
US6937986B2 (en) | 2000-12-28 | 2005-08-30 | Comverse, Inc. | Automatic dynamic speech recognition vocabulary based on external sources of information |
MXPA02008345A (en) | 2000-12-29 | 2002-12-13 | Gen Electric | Method and system for identifying repeatedly malfunctioning equipment. |
US20020133347A1 (en) | 2000-12-29 | 2002-09-19 | Eberhard Schoneburg | Method and apparatus for natural language dialog interface |
US7254773B2 (en) | 2000-12-29 | 2007-08-07 | International Business Machines Corporation | Automated spell analysis |
US7054419B2 (en) | 2001-01-02 | 2006-05-30 | Soundbite Communications, Inc. | Answering machine detection for voice message delivery method and system |
US6731312B2 (en) | 2001-01-08 | 2004-05-04 | Apple Computer, Inc. | Media player interface |
US7085723B2 (en) | 2001-01-12 | 2006-08-01 | International Business Machines Corporation | System and method for determining utterance context in a multi-context speech application |
US7249018B2 (en) | 2001-01-12 | 2007-07-24 | International Business Machines Corporation | System and method for relating syntax and semantics for a conversational speech application |
US7257537B2 (en) | 2001-01-12 | 2007-08-14 | International Business Machines Corporation | Method and apparatus for performing dialog management in a computer conversational interface |
AU2001224979A1 (en) | 2001-01-23 | 2001-05-08 | Phonak Ag | Communication method and a hearing aid system |
US20020099552A1 (en) | 2001-01-25 | 2002-07-25 | Darryl Rubin | Annotating electronic information with audio clips |
US6529608B2 (en) | 2001-01-26 | 2003-03-04 | Ford Global Technologies, Inc. | Speech recognition system |
GB2374772B (en) | 2001-01-29 | 2004-12-29 | Hewlett Packard Co | Audio user interface |
US6625576B2 (en) | 2001-01-29 | 2003-09-23 | Lucent Technologies Inc. | Method and apparatus for performing text-to-speech conversion in a client/server environment |
US7123699B2 (en) | 2001-02-01 | 2006-10-17 | Estech Systems, Inc. | Voice mail in a voice over IP telephone system |
JP2002229955A (en) | 2001-02-02 | 2002-08-16 | Matsushita Electric Ind Co Ltd | Information terminal device and authentication system |
US6964023B2 (en) | 2001-02-05 | 2005-11-08 | International Business Machines Corporation | System and method for multi-modal focus detection, referential ambiguity resolution and mood classification using multi-modal input |
US6983238B2 (en) | 2001-02-07 | 2006-01-03 | American International Group, Inc. | Methods and apparatus for globalizing software |
US20020152255A1 (en) | 2001-02-08 | 2002-10-17 | International Business Machines Corporation | Accessibility on demand |
US7698652B2 (en) | 2001-02-09 | 2010-04-13 | Koninklijke Philips Electronics N.V. | Rapid retrieval user interface designed around small displays and few buttons for searching long lists |
US7617099B2 (en) | 2001-02-12 | 2009-11-10 | FortMedia Inc. | Noise suppression by two-channel tandem spectrum modification for speech signal in an automobile |
US20020111810A1 (en) | 2001-02-15 | 2002-08-15 | Khan M. Salahuddin | Spatially built word list for automatic speech recognition program and method for formation thereof |
US7171365B2 (en) | 2001-02-16 | 2007-01-30 | International Business Machines Corporation | Tracking time using portable recorders and speech recognition |
US6622136B2 (en) | 2001-02-16 | 2003-09-16 | Motorola, Inc. | Interactive tool for semi-automatic creation of a domain model |
US7340389B2 (en) | 2001-02-16 | 2008-03-04 | Microsoft Corporation | Multilanguage UI with localized resources |
US7013289B2 (en) | 2001-02-21 | 2006-03-14 | Michel Horn | Global electronic commerce system |
US6970820B2 (en) | 2001-02-26 | 2005-11-29 | Matsushita Electric Industrial Co., Ltd. | Voice personalization of speech synthesizer |
US6804677B2 (en) | 2001-02-26 | 2004-10-12 | Ori Software Development Ltd. | Encoding semi-structured data for efficient search and browsing |
US7290039B1 (en) | 2001-02-27 | 2007-10-30 | Microsoft Corporation | Intent based processing |
US6850887B2 (en) | 2001-02-28 | 2005-02-01 | International Business Machines Corporation | Speech recognition in noisy environments |
KR100605854B1 (en) | 2001-02-28 | 2006-08-01 | 삼성전자주식회사 | Method for downloading and replaying data of mobile communication terminal |
GB2372864B (en) | 2001-02-28 | 2005-09-07 | Vox Generation Ltd | Spoken language interface |
US20030164848A1 (en) | 2001-03-01 | 2003-09-04 | International Business Machines Corporation | Method and apparatus for summarizing content of a document for a visually impaired user |
US20020122053A1 (en) | 2001-03-01 | 2002-09-05 | International Business Machines Corporation | Method and apparatus for presenting non-displayed text in Web pages |
US20020123894A1 (en) | 2001-03-01 | 2002-09-05 | International Business Machines Corporation | Processing speech recognition errors in an embedded speech recognition system |
US6721728B2 (en) | 2001-03-02 | 2004-04-13 | The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration | System, method and apparatus for discovering phrases in a database |
US7076738B2 (en) | 2001-03-02 | 2006-07-11 | Semantic Compaction Systems | Computer device, method and article of manufacture for utilizing sequenced symbols to enable programmed application and commands |
AUPR360701A0 (en) | 2001-03-06 | 2001-04-05 | Worldlingo, Inc | Seamless translation system |
US20020126097A1 (en) | 2001-03-07 | 2002-09-12 | Savolainen Sampo Jussi Pellervo | Alphanumeric data entry method and apparatus using reduced keyboard and context related dictionaries |
US7000189B2 (en) | 2001-03-08 | 2006-02-14 | International Business Machines Corporation | Dynamic data generation suitable for talking browser |
US7200558B2 (en) | 2001-03-08 | 2007-04-03 | Matsushita Electric Industrial Co., Ltd. | Prosody generating device, prosody generating method, and program |
US20020173961A1 (en) | 2001-03-09 | 2002-11-21 | Guerra Lisa M. | System, method and computer program product for dynamic, robust and fault tolerant audio output in a speech recognition framework |
US7024364B2 (en) | 2001-03-09 | 2006-04-04 | Bevocal, Inc. | System, method and computer program product for looking up business addresses and directions based on a voice dial-up session |
US20020169605A1 (en) | 2001-03-09 | 2002-11-14 | Damiba Bertrand A. | System, method and computer program product for self-verifying file content in a speech recognition framework |
US7174297B2 (en) | 2001-03-09 | 2007-02-06 | Bevocal, Inc. | System, method and computer program product for a dynamically configurable voice portal |
US7216073B2 (en) | 2001-03-13 | 2007-05-08 | Intelligate, Ltd. | Dynamic natural language understanding |
US6513008B2 (en) | 2001-03-15 | 2003-01-28 | Matsushita Electric Industrial Co., Ltd. | Method and tool for customization of speech synthesizer databases using hierarchical generalized speech templates |
US7860706B2 (en) | 2001-03-16 | 2010-12-28 | Eli Abir | Knowledge system method and apparatus |
US6448485B1 (en) | 2001-03-16 | 2002-09-10 | Intel Corporation | Method and system for embedding audio titles |
US6985858B2 (en) | 2001-03-20 | 2006-01-10 | Microsoft Corporation | Method and apparatus for removing noise from feature vectors |
US7209880B1 (en) | 2001-03-20 | 2007-04-24 | At&T Corp. | Systems and methods for dynamic re-configurable speech recognition |
US6677929B2 (en) | 2001-03-21 | 2004-01-13 | Agilent Technologies, Inc. | Optical pseudo trackball controls the operation of an appliance or machine |
JP2002351789A (en) | 2001-03-21 | 2002-12-06 | Sharp Corp | Electronic mail transmission/reception system and electronic mail transmission/reception program |
JP3925611B2 (en) | 2001-03-22 | 2007-06-06 | セイコーエプソン株式会社 | Information providing system, information providing apparatus, program, information storage medium, and user interface setting method |
US6922726B2 (en) | 2001-03-23 | 2005-07-26 | International Business Machines Corporation | Web accessibility service apparatus and method |
US7058889B2 (en) | 2001-03-23 | 2006-06-06 | Koninklijke Philips Electronics N.V. | Synchronizing text/visual information with audio playback |
FI20010644A (en) | 2001-03-28 | 2002-09-29 | Nokia Corp | Specify the language of the character sequence |
US6738743B2 (en) | 2001-03-28 | 2004-05-18 | Intel Corporation | Unified client-server distributed architectures for spoken dialogue systems |
US6834264B2 (en) | 2001-03-29 | 2004-12-21 | Provox Technologies Corporation | Method and apparatus for voice dictation and document production |
US7437670B2 (en) | 2001-03-29 | 2008-10-14 | International Business Machines Corporation | Magnifying the text of a link while still retaining browser function in the magnified display |
US6535852B2 (en) | 2001-03-29 | 2003-03-18 | International Business Machines Corporation | Training of text-to-speech systems |
US7406421B2 (en) | 2001-10-26 | 2008-07-29 | Intellisist Inc. | Systems and methods for reviewing informational content in a vehicle |
US7035794B2 (en) | 2001-03-30 | 2006-04-25 | Intel Corporation | Compressing and using a concatenative speech database in text-to-speech systems |
US6748398B2 (en) | 2001-03-30 | 2004-06-08 | Microsoft Corporation | Relevance maximizing, iteration minimizing, relevance-feedback, content-based image retrieval (CBIR) |
US6996531B2 (en) | 2001-03-30 | 2006-02-07 | Comverse Ltd. | Automated database assistance using a telephone for a speech based or text based multimedia communication mode |
US6792407B2 (en) | 2001-03-30 | 2004-09-14 | Matsushita Electric Industrial Co., Ltd. | Text selection and recording by feedback and adaptation for development of personalized text-to-speech systems |
JP3597141B2 (en) | 2001-04-03 | 2004-12-02 | 泰鈞 温 | Information input device and method, mobile phone and character input method of mobile phone |
CN1156819C (en) | 2001-04-06 | 2004-07-07 | 国际商业机器公司 | Method of producing individual characteristic speech sound from text |
US6690828B2 (en) | 2001-04-09 | 2004-02-10 | Gary Elliott Meyers | Method for representing and comparing digital images |
US6724370B2 (en) | 2001-04-12 | 2004-04-20 | International Business Machines Corporation | Touchscreen user interface |
US7155668B2 (en) | 2001-04-19 | 2006-12-26 | International Business Machines Corporation | Method and system for identifying relationships between text documents and structured variables pertaining to the text documents |
TW504916B (en) | 2001-04-24 | 2002-10-01 | Inventec Appliances Corp | Method capable of generating different input values by pressing a single key from multiple directions |
US20020161865A1 (en) | 2001-04-25 | 2002-10-31 | Gateway, Inc. | Automated network configuration of connected device |
DE60142938D1 (en) | 2001-04-25 | 2010-10-07 | Sony France Sa | Method and apparatus for identifying the type of information, e.g. for identifying the name content of a music file |
US6820055B2 (en) | 2001-04-26 | 2004-11-16 | Speche Communications | Systems and methods for automated audio transcription, translation, and transfer with text display software for manipulating the text |
GB0110326D0 (en) | 2001-04-27 | 2001-06-20 | Ibm | Method and apparatus for interoperation between legacy software and screen reader programs |
US6970881B1 (en) | 2001-05-07 | 2005-11-29 | Intelligenxia, Inc. | Concept-based method and system for dynamically analyzing unstructured information |
US7024400B2 (en) | 2001-05-08 | 2006-04-04 | Sunflare Co., Ltd. | Differential LSI space-based probabilistic document classifier |
US6654740B2 (en) | 2001-05-08 | 2003-11-25 | Sunflare Co., Ltd. | Probabilistic information retrieval based on differential latent semantic space |
US6751595B2 (en) | 2001-05-09 | 2004-06-15 | Bellsouth Intellectual Property Corporation | Multi-stage large vocabulary speech recognition system and method |
JP4369132B2 (en) | 2001-05-10 | 2009-11-18 | コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ | Background learning of speaker voice |
DE10122828A1 (en) | 2001-05-11 | 2002-11-14 | Philips Corp Intellectual Pty | Procedure for training or adapting a speech recognizer |
US20020169592A1 (en) | 2001-05-11 | 2002-11-14 | Aityan Sergey Khachatur | Open environment for real-time multilingual communication |
US7085722B2 (en) | 2001-05-14 | 2006-08-01 | Sony Computer Entertainment America Inc. | System and method for menu-driven voice control of characters in a game environment |
US6766233B2 (en) | 2001-05-15 | 2004-07-20 | Intellisist, Llc | Modular telematic control unit |
US7620363B2 (en) | 2001-05-16 | 2009-11-17 | Aol Llc | Proximity synchronization of audio content among multiple playback and storage devices |
US7730401B2 (en) | 2001-05-16 | 2010-06-01 | Synaptics Incorporated | Touch screen with user interface enhancement |
US20050024341A1 (en) | 2001-05-16 | 2005-02-03 | Synaptics, Inc. | Touch screen with user interface enhancement |
US7024460B2 (en) | 2001-07-31 | 2006-04-04 | Bytemobile, Inc. | Service-based compression of content within a network communication system |
US6775358B1 (en) | 2001-05-17 | 2004-08-10 | Oracle Cable, Inc. | Method and system for enhanced interactive playback of audio content to telephone callers |
US6944594B2 (en) | 2001-05-30 | 2005-09-13 | Bellsouth Intellectual Property Corporation | Multi-context conversational environment system and method |
AU2002314933A1 (en) * | 2001-05-30 | 2002-12-09 | Cameronsound, Inc. | Language independent and voice operated information management system |
US7020663B2 (en) | 2001-05-30 | 2006-03-28 | George M. Hay | System and method for the delivery of electronic books |
US6877003B2 (en) | 2001-05-31 | 2005-04-05 | Oracle International Corporation | Efficient collation element structure for handling large numbers of characters |
JP2002358092A (en) | 2001-06-01 | 2002-12-13 | Sony Corp | Voice synthesizing system |
GB0113570D0 (en) | 2001-06-04 | 2001-07-25 | Hewlett Packard Co | Audio-form presentation of text messages |
US20020194003A1 (en) | 2001-06-05 | 2002-12-19 | Mozer Todd F. | Client-server security system and method |
US20030056207A1 (en) | 2001-06-06 | 2003-03-20 | Claudius Fischer | Process for deploying software from a central computer system to remotely located devices |
GB0114236D0 (en) | 2001-06-12 | 2001-08-01 | Hewlett Packard Co | Artificial language generation |
US7076527B2 (en) | 2001-06-14 | 2006-07-11 | Apple Computer, Inc. | Method and apparatus for filtering email |
SE519177C2 (en) | 2001-06-14 | 2003-01-28 | Ericsson Telefon Ab L M | A mobile terminal and a method of a mobile communication system for downloading messages to the mobile terminal |
US7119267B2 (en) | 2001-06-15 | 2006-10-10 | Yamaha Corporation | Portable mixing recorder and method and program for controlling the same |
US20070016563A1 (en) | 2005-05-16 | 2007-01-18 | Nosa Omoigui | Information nervous system |
US6801604B2 (en) | 2001-06-25 | 2004-10-05 | International Business Machines Corporation | Universal IP-based and scalable architectures across conversational applications using web services for speech and audio processing resources |
US20020198714A1 (en) | 2001-06-26 | 2002-12-26 | Guojun Zhou | Statistical spoken dialog system |
US7139722B2 (en) | 2001-06-27 | 2006-11-21 | Bellsouth Intellectual Property Corporation | Location and time sensitive wireless calendaring |
CA2463922C (en) | 2001-06-27 | 2013-07-16 | 4 Media, Inc. | Improved media delivery platform |
US6671670B2 (en) | 2001-06-27 | 2003-12-30 | Telelogue, Inc. | System and method for pre-processing information used by an automated attendant |
US7752546B2 (en) | 2001-06-29 | 2010-07-06 | Thomson Licensing | Method and system for providing an acoustic interface |
US7092950B2 (en) | 2001-06-29 | 2006-08-15 | Microsoft Corporation | Method for generic object oriented description of structured data (GDL) |
US6751298B2 (en) | 2001-06-29 | 2004-06-15 | International Business Machines Corporation | Localized voice mail system |
KR100492976B1 (en) | 2001-06-29 | 2005-06-07 | 삼성전자주식회사 | Method for storing and transmitting voice mail using simple voice mail service in mobile telecommunication terminal |
US7302686B2 (en) | 2001-07-04 | 2007-11-27 | Sony Corporation | Task management system |
US7188143B2 (en) | 2001-07-06 | 2007-03-06 | Yahoo! Inc. | Messenger-controlled applications in an instant messaging environment |
US7133900B1 (en) | 2001-07-06 | 2006-11-07 | Yahoo! Inc. | Sharing and implementing instant messaging environments |
US20030013483A1 (en) | 2001-07-06 | 2003-01-16 | Ausems Michiel R. | User interface for handheld communication device |
US20030020760A1 (en) | 2001-07-06 | 2003-01-30 | Kazunori Takatsu | Method for setting a function and a setting item by selectively specifying a position in a tree-structured menu |
US6526351B2 (en) | 2001-07-09 | 2003-02-25 | Charles Lamont Whitham | Interactive multimedia tour guide |
US6604059B2 (en) | 2001-07-10 | 2003-08-05 | Koninklijke Philips Electronics N.V. | Predictive calendar |
US20050134578A1 (en) | 2001-07-13 | 2005-06-23 | Universal Electronics Inc. | System and methods for interacting with a control environment |
US7668718B2 (en) * | 2001-07-17 | 2010-02-23 | Custom Speech Usa, Inc. | Synchronized pattern recognition source data processed by manual or automatic means for creation of shared speaker-dependent speech user profile |
US6766324B2 (en) | 2001-07-20 | 2004-07-20 | International Business Machines Corporation | System and method for defining, configuring and using dynamic, persistent Java classes |
US7188085B2 (en) | 2001-07-20 | 2007-03-06 | International Business Machines Corporation | Method and system for delivering encrypted content with associated geographical-based advertisements |
US9009590B2 (en) | 2001-07-31 | 2015-04-14 | Invention Machines Corporation | Semantic processor for recognition of cause-effect relations in natural language documents |
JP2003044091A (en) | 2001-07-31 | 2003-02-14 | Ntt Docomo Inc | Voice recognition system, portable information terminal, device and method for processing audio information, and audio information processing program |
US6940958B2 (en) | 2001-08-02 | 2005-09-06 | Intel Corporation | Forwarding telephone data via email |
US20030033153A1 (en) | 2001-08-08 | 2003-02-13 | Apple Computer, Inc. | Microphone elements for a computing system |
US7185276B2 (en) | 2001-08-09 | 2007-02-27 | Voxera Corporation | System and method for dynamically translating HTML to VoiceXML intelligently |
US7987151B2 (en) | 2001-08-10 | 2011-07-26 | General Dynamics Advanced Info Systems, Inc. | Apparatus and method for problem solving using intelligent agents |
US20050022114A1 (en) | 2001-08-13 | 2005-01-27 | Xerox Corporation | Meta-document management system with personality identifiers |
US6778979B2 (en) | 2001-08-13 | 2004-08-17 | Xerox Corporation | System for automatically generating queries |
US7149813B2 (en) | 2001-08-14 | 2006-12-12 | Microsoft Corporation | Method and system for synchronizing mobile devices |
US6529592B1 (en) | 2001-08-15 | 2003-03-04 | Bellsouth Intellectual Property Corporation | Internet-based message delivery with PSTN billing |
US7920682B2 (en) | 2001-08-21 | 2011-04-05 | Byrne William J | Dynamic interactive voice interface |
US6810378B2 (en) | 2001-08-22 | 2004-10-26 | Lucent Technologies Inc. | Method and apparatus for controlling a speech synthesis system to provide multiple styles of speech |
KR100761474B1 (en) | 2001-08-23 | 2007-09-27 | 삼성전자주식회사 | Portable device and a phonetic output and filename/directoryname writing method using the same |
JP2003076464A (en) | 2001-08-27 | 2003-03-14 | Internatl Business Mach Corp <Ibm> | Computer device, keyboard and display meter |
US7774388B1 (en) | 2001-08-31 | 2010-08-10 | Margaret Runchey | Model of everything with UR-URL combination identity-identifier-addressing-indexing method, means, and apparatus |
US6813491B1 (en) | 2001-08-31 | 2004-11-02 | Openwave Systems Inc. | Method and apparatus for adapting settings of wireless communication devices in accordance with user proximity |
US7505911B2 (en) * | 2001-09-05 | 2009-03-17 | Roth Daniel L | Combined speech recognition and sound recording |
US7313526B2 (en) | 2001-09-05 | 2007-12-25 | Voice Signal Technologies, Inc. | Speech recognition using selectable recognition modes |
US6892083B2 (en) | 2001-09-05 | 2005-05-10 | Vocera Communications Inc. | Voice-controlled wireless communications system and method |
US7953447B2 (en) | 2001-09-05 | 2011-05-31 | Vocera Communications, Inc. | Voice-controlled communications system and method using a badge application |
US7809574B2 (en) | 2001-09-05 | 2010-10-05 | Voice Signal Technologies Inc. | Word recognition using choice lists |
MXPA04002235A (en) | 2001-09-10 | 2004-06-29 | Thomson Licensing Sa | Method and apparatus for creating an indexed playlist in a digital audio data player. |
BR0212418A (en) | 2001-09-11 | 2004-08-03 | Thomson Licensing Sa | Method and apparatus for activating automatic equalization mode |
US6901364B2 (en) | 2001-09-13 | 2005-05-31 | Matsushita Electric Industrial Co., Ltd. | Focused language models for improved speech input of structured documents |
EP1304680A3 (en) | 2001-09-13 | 2004-03-03 | Yamaha Corporation | Apparatus and method for synthesizing a plurality of waveforms in synchronized manner |
JP4689111B2 (en) | 2001-09-13 | 2011-05-25 | クラリオン株式会社 | Music player |
US6829018B2 (en) | 2001-09-17 | 2004-12-07 | Koninklijke Philips Electronics N.V. | Three-dimensional sound creation assisted by visual information |
US8046689B2 (en) | 2004-11-04 | 2011-10-25 | Apple Inc. | Media presentation with supplementary media |
CN100339809C (en) | 2001-09-21 | 2007-09-26 | 联想(新加坡)私人有限公司 | Input apparatus, computer apparatus, method for identifying input object, method for identifying input object in keyboard, and computer program |
US7010581B2 (en) | 2001-09-24 | 2006-03-07 | International Business Machines Corporation | Method and system for providing browser functions on a web page for client-specific accessibility |
US7403938B2 (en) | 2001-09-24 | 2008-07-22 | Iac Search & Media, Inc. | Natural language query processing |
US7062547B2 (en) | 2001-09-24 | 2006-06-13 | International Business Machines Corporation | Method and system for providing a central repository for client-specific accessibility |
US6985865B1 (en) | 2001-09-26 | 2006-01-10 | Sprint Spectrum L.P. | Method and system for enhanced response to voice commands in a voice command platform |
US7050976B1 (en) | 2001-09-26 | 2006-05-23 | Sprint Spectrum L.P. | Method and system for use of navigation history in a voice command platform |
US20050196732A1 (en) | 2001-09-26 | 2005-09-08 | Scientific Learning Corporation | Method and apparatus for automated training of language learning skills |
US6650735B2 (en) | 2001-09-27 | 2003-11-18 | Microsoft Corporation | Integrated voice access to a variety of personal information services |
US7124081B1 (en) | 2001-09-28 | 2006-10-17 | Apple Computer, Inc. | Method and apparatus for speech recognition using latent semantic adaptation |
US6948094B2 (en) | 2001-09-28 | 2005-09-20 | Intel Corporation | Method of correcting a machine check error |
US7308404B2 (en) | 2001-09-28 | 2007-12-11 | Sri International | Method and apparatus for speech recognition using a dynamic vocabulary |
JP3997459B2 (en) | 2001-10-02 | 2007-10-24 | 株式会社日立製作所 | Voice input system, voice portal server, and voice input terminal |
US7324947B2 (en) | 2001-10-03 | 2008-01-29 | Promptu Systems Corporation | Global speech user interface |
US6763089B2 (en) | 2001-10-12 | 2004-07-13 | Nortel Networks Limited | System for enabling TDD communication in a telephone network and method for using same |
US7027990B2 (en) | 2001-10-12 | 2006-04-11 | Lester Sussman | System and method for integrating the visual display of text menus for interactive voice response systems |
US7167832B2 (en) | 2001-10-15 | 2007-01-23 | At&T Corp. | Method for dialog management |
US20030074457A1 (en) | 2001-10-17 | 2003-04-17 | Kluth Michael R. | Computer system with separable input device |
CA2461214A1 (en) | 2001-10-18 | 2003-04-24 | Yeong Kuang Oon | System and method of improved recording of medical transactions |
US20030078969A1 (en) | 2001-10-19 | 2003-04-24 | Wavexpress, Inc. | Synchronous control of media in a peer-to-peer network |
US7353247B2 (en) | 2001-10-19 | 2008-04-01 | Microsoft Corporation | Querying applications using online messenger service |
US20040054535A1 (en) | 2001-10-22 | 2004-03-18 | Mackie Andrew William | System and method of processing structured text for text-to-speech synthesis |
US20030167318A1 (en) | 2001-10-22 | 2003-09-04 | Apple Computer, Inc. | Intelligent synchronization of media player with host computer |
US7084856B2 (en) | 2001-10-22 | 2006-08-01 | Apple Computer, Inc. | Mouse having a rotary dial |
US7345671B2 (en) | 2001-10-22 | 2008-03-18 | Apple Inc. | Method and apparatus for use of rotational user inputs |
US7312785B2 (en) | 2001-10-22 | 2007-12-25 | Apple Inc. | Method and apparatus for accelerated scrolling |
KR100718613B1 (en) | 2001-10-22 | 2007-05-16 | 애플 인크. | Intelligent synchronization for a media player |
ITFI20010199A1 (en) | 2001-10-22 | 2003-04-22 | Riccardo Vieri | SYSTEM AND METHOD TO TRANSFORM TEXTUAL COMMUNICATIONS INTO VOICE AND SEND THEM WITH AN INTERNET CONNECTION TO ANY TELEPHONE SYSTEM |
US7046230B2 (en) | 2001-10-22 | 2006-05-16 | Apple Computer, Inc. | Touch pad handheld device |
US6934812B1 (en) | 2001-10-22 | 2005-08-23 | Apple Computer, Inc. | Media player with instant play capability |
US6801964B1 (en) | 2001-10-25 | 2004-10-05 | Novell, Inc. | Methods and systems to fast fill media players |
US7379053B2 (en) | 2001-10-27 | 2008-05-27 | Vortant Technologies, Llc | Computer interface for navigating graphical user interface by touch |
GB2381409B (en) | 2001-10-27 | 2004-04-28 | Hewlett Packard Ltd | Asynchronous access to synchronous voice services |
US7359671B2 (en) | 2001-10-30 | 2008-04-15 | Unwired Technology Llc | Multiple channel wireless communication system |
ATE365413T1 (en) | 2001-10-30 | 2007-07-15 | Hewlett Packard Co | COMMUNICATION SYSTEM AND METHOD |
KR100438826B1 (en) | 2001-10-31 | 2004-07-05 | 삼성전자주식회사 | System for speech synthesis using a smoothing filter and method thereof |
US6912407B1 (en) | 2001-11-03 | 2005-06-28 | Susan Lee Clarke | Portable device for storing and searching telephone listings, and method and computer program product for transmitting telephone information to a portable device |
GB2388738B (en) | 2001-11-03 | 2004-06-02 | Dremedia Ltd | Time ordered indexing of audio data |
EP1311102A1 (en) | 2001-11-08 | 2003-05-14 | Hewlett-Packard Company | Streaming audio under voice control |
US7212614B1 (en) | 2001-11-09 | 2007-05-01 | At&T Corp | Voice-messaging with attachments |
US7113172B2 (en) | 2001-11-09 | 2006-09-26 | Lifescan, Inc. | Alphanumeric keypad and display system and method |
US7069213B2 (en) | 2001-11-09 | 2006-06-27 | Netbytel, Inc. | Influencing a voice recognition matching operation with user barge-in time |
FI114051B (en) | 2001-11-12 | 2004-07-30 | Nokia Corp | Procedure for compressing dictionary data |
US7181386B2 (en) | 2001-11-15 | 2007-02-20 | At&T Corp. | Systems and methods for generating weighted finite-state automata representing grammars |
NO316480B1 (en) | 2001-11-15 | 2004-01-26 | Forinnova As | Method and system for textual examination and discovery |
US7043479B2 (en) | 2001-11-16 | 2006-05-09 | Sigmatel, Inc. | Remote-directed management of media content |
JP2003150529A (en) | 2001-11-19 | 2003-05-23 | Hitachi Ltd | Information exchange method, information exchange terminal unit, information exchange server device and program |
JP3980331B2 (en) | 2001-11-20 | 2007-09-26 | 株式会社エビデンス | Multilingual conversation support system |
US7031530B2 (en) | 2001-11-27 | 2006-04-18 | Lockheed Martin Corporation | Compound classifier for pattern recognition applications |
EP1315086B1 (en) | 2001-11-27 | 2006-07-05 | Sun Microsystems, Inc. | Generation of localized software applications |
US20030115552A1 (en) | 2001-11-27 | 2003-06-19 | Jorg Jahnke | Method and system for automatic creation of multilingual immutable image files |
EP1315084A1 (en) | 2001-11-27 | 2003-05-28 | Sun Microsystems, Inc. | Method and apparatus for localizing software |
US20030101054A1 (en) | 2001-11-27 | 2003-05-29 | Ncc, Llc | Integrated system and method for electronic speech recognition and transcription |
US6816578B1 (en) | 2001-11-27 | 2004-11-09 | Nortel Networks Limited | Efficient instant messaging using a telephony interface |
JP2003163745A (en) | 2001-11-28 | 2003-06-06 | Matsushita Electric Ind Co Ltd | Telephone set, interactive responder, interactive responding terminal, and interactive response system |
US6996777B2 (en) | 2001-11-29 | 2006-02-07 | Nokia Corporation | Method and apparatus for presenting auditory icons in a mobile terminal |
US20030101045A1 (en) | 2001-11-29 | 2003-05-29 | Peter Moffatt | Method and apparatus for playing recordings of spoken alphanumeric characters |
US6766294B2 (en) | 2001-11-30 | 2004-07-20 | Dictaphone Corporation | Performance gauge for a distributed speech recognition system |
KR100437142B1 (en) | 2001-12-07 | 2004-06-25 | 에피밸리 주식회사 | Optical microphone |
US20060069567A1 (en) | 2001-12-10 | 2006-03-30 | Tischer Steven N | Methods, systems, and products for translating text to speech |
US7483832B2 (en) | 2001-12-10 | 2009-01-27 | At&T Intellectual Property I, L.P. | Method and system for customizing voice translation of text to speech |
US6791529B2 (en) | 2001-12-13 | 2004-09-14 | Koninklijke Philips Electronics N.V. | UI with graphics-assisted voice control system |
US7124085B2 (en) | 2001-12-13 | 2006-10-17 | Matsushita Electric Industrial Co., Ltd. | Constraint-based speech recognition system and method |
US7007026B2 (en) | 2001-12-14 | 2006-02-28 | Sun Microsystems, Inc. | System for controlling access to and generation of localized application values |
JP3574106B2 (en) | 2001-12-14 | 2004-10-06 | 株式会社スクウェア・エニックス | Network game system, game server device, video game device, message transmission method and display control method in network game, program, and recording medium |
US6915246B2 (en) | 2001-12-17 | 2005-07-05 | International Business Machines Corporation | Employing speech recognition and capturing customer speech to improve customer service |
US7231343B1 (en) | 2001-12-20 | 2007-06-12 | Ianywhere Solutions, Inc. | Synonyms mechanism for natural language systems |
GB2383495A (en) | 2001-12-20 | 2003-06-25 | Hewlett Packard Co | Data processing devices which communicate via short range telecommunication signals with other compatible devices |
GB2388209C (en) | 2001-12-20 | 2005-08-23 | Canon Kk | Control apparatus |
TW541517B (en) | 2001-12-25 | 2003-07-11 | Univ Nat Cheng Kung | Speech recognition system |
DE60218899T2 (en) | 2001-12-26 | 2007-12-06 | Research In Motion Ltd., Waterloo | USER INTERFACE AND METHOD FOR LOOKING AT UNIFORM COMMUNICATION EVENTS IN A MOBILE DEVICE |
US8288641B2 (en) | 2001-12-27 | 2012-10-16 | Intel Corporation | Portable hand-held music synthesizer and networking method and apparatus |
US20030125927A1 (en) | 2001-12-28 | 2003-07-03 | Microsoft Corporation | Method and system for translating instant messages |
US6690387B2 (en) | 2001-12-28 | 2004-02-10 | Koninklijke Philips Electronics N.V. | Touch-screen image scrolling system and method |
US7065485B1 (en) | 2002-01-09 | 2006-06-20 | At&T Corp | Enhancing speech intelligibility using variable-rate time-scale modification |
US20030128819A1 (en) | 2002-01-10 | 2003-07-10 | Lee Anne Yin-Fee | Method for retrieving multimedia messages from a multimedia mailbox |
US7111248B2 (en) | 2002-01-15 | 2006-09-19 | Openwave Systems Inc. | Alphanumeric information input method |
US7159174B2 (en) | 2002-01-16 | 2007-01-02 | Microsoft Corporation | Data preparation for media browsing |
US20030197736A1 (en) | 2002-01-16 | 2003-10-23 | Murphy Michael W. | User interface for character entry using a minimum number of selection keys |
JP2003223437A (en) | 2002-01-29 | 2003-08-08 | Internatl Business Mach Corp <Ibm> | Method of displaying candidate for correct word, method of checking spelling, computer device, and program |
US20030144846A1 (en) | 2002-01-31 | 2003-07-31 | Denenberg Lawrence A. | Method and system for modifying the behavior of an application based upon the application's grammar |
US7130390B2 (en) | 2002-02-01 | 2006-10-31 | Microsoft Corporation | Audio messaging system and method |
US6826515B2 (en) | 2002-02-01 | 2004-11-30 | Plantronics, Inc. | Headset noise exposure dosimeter |
US9374451B2 (en) | 2002-02-04 | 2016-06-21 | Nokia Technologies Oy | System and method for multimodal short-cuts to digital services |
US20030149567A1 (en) | 2002-02-04 | 2003-08-07 | Tony Schmitz | Method and system for using natural language in computer resource utilization analysis via a communications network |
US7139713B2 (en) | 2002-02-04 | 2006-11-21 | Microsoft Corporation | Systems and methods for managing interactions from multiple speech-enabled applications |
US7272377B2 (en) | 2002-02-07 | 2007-09-18 | At&T Corp. | System and method of ubiquitous language translation for wireless devices |
US7177814B2 (en) | 2002-02-07 | 2007-02-13 | Sap Aktiengesellschaft | Dynamic grammar for voice-enabled applications |
US20030149978A1 (en) | 2002-02-07 | 2003-08-07 | Bruce Plotnick | System and method for using a personal digital assistant as an electronic program guide |
US6690800B2 (en) | 2002-02-08 | 2004-02-10 | Andrew M. Resnick | Method and apparatus for communication operator privacy |
US7024362B2 (en) | 2002-02-11 | 2006-04-04 | Microsoft Corporation | Objective measure for estimating mean opinion score of synthesized speech |
US6901411B2 (en) | 2002-02-11 | 2005-05-31 | Microsoft Corporation | Statistical bigram correlation model for image retrieval |
JP2003233568A (en) | 2002-02-13 | 2003-08-22 | Matsushita Electric Ind Co Ltd | E-mail transmitting-receiving device and e-mail transmitting-receiving program |
US20030152203A1 (en) | 2002-02-13 | 2003-08-14 | Berger Adam L. | Message accessing |
US8249880B2 (en) | 2002-02-14 | 2012-08-21 | Intellisist, Inc. | Real-time display of system instructions |
US20030158737A1 (en) | 2002-02-15 | 2003-08-21 | Csicsatka Tibor George | Method and apparatus for incorporating additional audio information into audio data file identifying information |
DE60314929T2 (en) | 2002-02-15 | 2008-04-03 | Canon K.K. | Information processing apparatus and method with speech synthesis function |
US6895257B2 (en) | 2002-02-18 | 2005-05-17 | Matsushita Electric Industrial Co., Ltd. | Personalized agent for portable devices and cellular phone |
US7035807B1 (en) | 2002-02-19 | 2006-04-25 | Brittain John W | Sound on sound-annotations |
KR20030070179A (en) | 2002-02-21 | 2003-08-29 | 엘지전자 주식회사 | Method of the audio stream segmantation |
US20030167167A1 (en) | 2002-02-26 | 2003-09-04 | Li Gong | Intelligent personal assistants |
US7096183B2 (en) | 2002-02-27 | 2006-08-22 | Matsushita Electric Industrial Co., Ltd. | Customizing the speaking style of a speech synthesizer based on semantic analysis |
GB0204686D0 (en) | 2002-02-28 | 2002-04-17 | Koninkl Philips Electronics Nv | Interactive system using tags |
US20030167335A1 (en) | 2002-03-04 | 2003-09-04 | Vigilos, Inc. | System and method for network-based communication |
JP4039086B2 (en) | 2002-03-05 | 2008-01-30 | ソニー株式会社 | Information processing apparatus and information processing method, information processing system, recording medium, and program |
US20040054690A1 (en) | 2002-03-08 | 2004-03-18 | Hillerbrand Eric T. | Modeling and using computer resources over a heterogeneous distributed network using semantic ontologies |
US7031909B2 (en) | 2002-03-12 | 2006-04-18 | Verity, Inc. | Method and system for naming a cluster of words and phrases |
JP4150198B2 (en) | 2002-03-15 | 2008-09-17 | ソニー株式会社 | Speech synthesis method, speech synthesis apparatus, program and recording medium, and robot apparatus |
US6957183B2 (en) | 2002-03-20 | 2005-10-18 | Qualcomm Inc. | Method for robust voice recognition by analyzing redundant features of source signal |
BR0308368A (en) | 2002-03-22 | 2005-01-11 | Sony Ericsson Mobile Comm Ab | Method for entering text into an electronic communications device, and, electronic communications device |
EP1347361A1 (en) | 2002-03-22 | 2003-09-24 | Sony Ericsson Mobile Communications AB | Entering text into an electronic communications device |
KR100760666B1 (en) | 2002-03-27 | 2007-09-20 | 노키아 코포레이션 | Pattern recognition |
WO2003084196A1 (en) | 2002-03-28 | 2003-10-09 | Martin Dunsmuir | Closed-loop command and response system for automatic communications between interacting computer systems over an audio communications channel |
US6870529B1 (en) | 2002-03-28 | 2005-03-22 | Ncr Corporation | System and method for adjusting display brightness levels according to user preferences |
JP2003295882A (en) | 2002-04-02 | 2003-10-15 | Canon Inc | Text structure for speech synthesis, speech synthesizing method, speech synthesizer and computer program therefor |
US7707221B1 (en) | 2002-04-03 | 2010-04-27 | Yahoo! Inc. | Associating and linking compact disc metadata |
US20030191645A1 (en) | 2002-04-05 | 2003-10-09 | Guojun Zhou | Statistical pronunciation model for text to speech |
US7038659B2 (en) | 2002-04-06 | 2006-05-02 | Janusz Wiktor Rajkowski | Symbol encoding apparatus and method |
US7187948B2 (en) | 2002-04-09 | 2007-03-06 | Skullcandy, Inc. | Personal portable integrator for music player and mobile phone |
US7359493B1 (en) | 2002-04-11 | 2008-04-15 | Aol Llc, A Delaware Limited Liability Company | Bulk voicemail |
US7177794B2 (en) | 2002-04-12 | 2007-02-13 | Babu V Mani | System and method for writing Indian languages using English alphabet |
US20030193481A1 (en) | 2002-04-12 | 2003-10-16 | Alexander Sokolsky | Touch-sensitive input overlay for graphical user interface |
US7043474B2 (en) | 2002-04-15 | 2006-05-09 | International Business Machines Corporation | System and method for measuring image similarity based on semantic meaning |
US7073193B2 (en) | 2002-04-16 | 2006-07-04 | Microsoft Corporation | Media content descriptions |
US7197460B1 (en) | 2002-04-23 | 2007-03-27 | At&T Corp. | System for handling frequently asked questions in a natural language dialog service |
US6847966B1 (en) | 2002-04-24 | 2005-01-25 | Engenium Corporation | Method and system for optimally searching a document database using a representative semantic space |
US6877001B2 (en) | 2002-04-25 | 2005-04-05 | Mitsubishi Electric Research Laboratories, Inc. | Method and system for retrieving documents with spoken queries |
US20030200858A1 (en) | 2002-04-29 | 2003-10-30 | Jianlei Xie | Mixing MP3 audio and T T P for enhanced E-book application |
US8135115B1 (en) | 2006-11-22 | 2012-03-13 | Securus Technologies, Inc. | System and method for multi-channel recording |
US7490034B2 (en) | 2002-04-30 | 2009-02-10 | Microsoft Corporation | Lexicon with sectionalized data and method of using the same |
US8250073B2 (en) | 2002-04-30 | 2012-08-21 | University Of Southern California | Preparing and presenting content |
US6957077B2 (en) | 2002-05-06 | 2005-10-18 | Microsoft Corporation | System and method for enabling instant messaging on a mobile device |
US7221937B2 (en) | 2002-05-06 | 2007-05-22 | Research In Motion Limited | Event reminder method |
US7093199B2 (en) | 2002-05-07 | 2006-08-15 | International Business Machines Corporation | Design environment to facilitate accessible software |
US6986106B2 (en) | 2002-05-13 | 2006-01-10 | Microsoft Corporation | Correction widget |
TWI238348B (en) | 2002-05-13 | 2005-08-21 | Kyocera Corp | Portable information terminal, display control device, display control method, and recording media |
JP3574119B2 (en) | 2002-05-14 | 2004-10-06 | 株式会社スクウェア・エニックス | Network game system, video game apparatus, program, and recording medium |
US7380203B2 (en) | 2002-05-14 | 2008-05-27 | Microsoft Corporation | Natural input recognition tool |
US7136818B1 (en) | 2002-05-16 | 2006-11-14 | At&T Corp. | System and method of providing conversational visual prosody for talking heads |
US7062723B2 (en) | 2002-05-20 | 2006-06-13 | Gateway Inc. | Systems, methods and apparatus for magnifying portions of a display |
JP2003338769A (en) | 2002-05-22 | 2003-11-28 | Nec Access Technica Ltd | Portable radio terminal device |
US8611919B2 (en) | 2002-05-23 | 2013-12-17 | Wounder Gmbh., Llc | System, method, and computer program product for providing location based services and mobile e-commerce |
US7546382B2 (en) | 2002-05-28 | 2009-06-09 | International Business Machines Corporation | Methods and systems for authoring of mixed-initiative multi-modal interactions and related browsing mechanisms |
US6996575B2 (en) | 2002-05-31 | 2006-02-07 | Sas Institute Inc. | Computer-implemented system and method for text-based document processing |
US7634532B2 (en) | 2002-05-31 | 2009-12-15 | Onkyo Corporation | Network type content reproduction system |
US7522910B2 (en) | 2002-05-31 | 2009-04-21 | Oracle International Corporation | Method and apparatus for controlling data provided to a mobile device |
US7398209B2 (en) | 2002-06-03 | 2008-07-08 | Voicebox Technologies, Inc. | Systems and methods for responding to natural language speech utterance |
US7366659B2 (en) | 2002-06-07 | 2008-04-29 | Lucent Technologies Inc. | Methods and devices for selectively generating time-scaled sound signals |
US8285255B2 (en) | 2002-06-10 | 2012-10-09 | Research In Motion Limited | Voicemail user interface methods and apparatus for mobile communication devices |
US20030233230A1 (en) | 2002-06-12 | 2003-12-18 | Lucent Technologies Inc. | System and method for representing and resolving ambiguity in spoken dialogue systems |
FI118549B (en) | 2002-06-14 | 2007-12-14 | Nokia Corp | A method and system for providing audio feedback to a digital wireless terminal and a corresponding terminal and server |
US20030233237A1 (en) | 2002-06-17 | 2003-12-18 | Microsoft Corporation | Integration of speech and stylus input to provide an efficient natural input experience |
US7680649B2 (en) | 2002-06-17 | 2010-03-16 | International Business Machines Corporation | System, method, program product, and networking use for recognizing words and their parts of speech in one or more natural languages |
KR20050054874A (en) | 2002-06-17 | 2005-06-10 | 포르토 라넬리, 에스. 에이 | Enabling communication between users surfing the same web page |
US20030236663A1 (en) | 2002-06-19 | 2003-12-25 | Koninklijke Philips Electronics N.V. | Mega speaker identification (ID) system and corresponding methods therefor |
US8219608B2 (en) | 2002-06-20 | 2012-07-10 | Koninklijke Philips Electronics N.V. | Scalable architecture for web services |
US7174298B2 (en) * | 2002-06-24 | 2007-02-06 | Intel Corporation | Method and apparatus to improve accuracy of mobile speech-enabled services |
US6999066B2 (en) | 2002-06-24 | 2006-02-14 | Xerox Corporation | System for audible feedback for touch screen displays |
CN1663249A (en) | 2002-06-24 | 2005-08-31 | 松下电器产业株式会社 | Metadata preparing device, preparing method therefor and retrieving device |
US7260529B1 (en) | 2002-06-25 | 2007-08-21 | Lengen Nicholas D | Command insertion system and method for voice recognition applications |
US7233790B2 (en) | 2002-06-28 | 2007-06-19 | Openwave Systems, Inc. | Device capability based discovery, packaging and provisioning of content for wireless mobile devices |
GB0215123D0 (en) | 2002-06-28 | 2002-08-07 | Ibm | Method and apparatus for preparing a document to be read by a text-to-speech reader |
US7299033B2 (en) | 2002-06-28 | 2007-11-20 | Openwave Systems Inc. | Domain-based management of distribution of digital content from multiple suppliers to multiple wireless services subscribers |
US7065185B1 (en) | 2002-06-28 | 2006-06-20 | Bellsouth Intellectual Property Corp. | Systems and methods for providing real-time conversation using disparate communication devices |
US7656393B2 (en) | 2005-03-04 | 2010-02-02 | Apple Inc. | Electronic device having display and surrounding touch sensitive bezel for user interface and control |
US11275405B2 (en) | 2005-03-04 | 2022-03-15 | Apple Inc. | Multi-functional hand-held device |
RU2251737C2 (en) | 2002-10-18 | 2005-05-10 | Аби Софтвер Лтд. | Method for automatic recognition of language of recognized text in case of multilingual recognition |
WO2004008801A1 (en) | 2002-07-12 | 2004-01-22 | Widex A/S | Hearing aid and a method for enhancing speech intelligibility |
US7693720B2 (en) | 2002-07-15 | 2010-04-06 | Voicebox Technologies, Inc. | Mobile systems and methods for responding to natural language speech utterance |
AU2003252024A1 (en) | 2002-07-16 | 2004-02-02 | Bruce L. Horn | Computer system for automatic organization, indexing and viewing of information from multiple sources |
US20040012556A1 (en) | 2002-07-17 | 2004-01-22 | Sea-Weng Yong | Method and related device for controlling illumination of a backlight of a liquid crystal display |
US8150922B2 (en) | 2002-07-17 | 2012-04-03 | Research In Motion Limited | Voice and text group chat display management techniques for wireless mobile terminals |
US6882971B2 (en) | 2002-07-18 | 2005-04-19 | General Instrument Corporation | Method and apparatus for improving listener differentiation of talkers during a conference call |
US8947347B2 (en) | 2003-08-27 | 2015-02-03 | Sony Computer Entertainment Inc. | Controlling actions in a video game unit |
US6799226B1 (en) | 2002-07-23 | 2004-09-28 | Apple Computer, Inc. | Hot unpluggable media storage device |
EP1527398B1 (en) | 2002-07-23 | 2009-11-18 | Research In Motion Limited | Systems and methods of building and using custom word lists |
US7143028B2 (en) | 2002-07-24 | 2006-11-28 | Applied Minds, Inc. | Method and system for masking speech |
US7620547B2 (en) * | 2002-07-25 | 2009-11-17 | Sony Deutschland Gmbh | Spoken man-machine interface with speaker identification |
US20040051729A1 (en) | 2002-07-25 | 2004-03-18 | Borden George R. | Aural user interface |
US7535997B1 (en) | 2002-07-29 | 2009-05-19 | At&T Intellectual Property I, L.P. | Systems and methods for silent message delivery |
US7166791B2 (en) | 2002-07-30 | 2007-01-23 | Apple Computer, Inc. | Graphical user interface and methods of use thereof in a multimedia player |
US7194413B2 (en) | 2002-07-31 | 2007-03-20 | Deere & Company | Method of providing localized information from a single global transformation source |
TW591488B (en) | 2002-08-01 | 2004-06-11 | Tatung Co | Window scrolling method and device thereof |
US8068881B2 (en) | 2002-08-09 | 2011-11-29 | Avon Associates, Inc. | Voice controlled multimedia and communications system |
US7072686B1 (en) | 2002-08-09 | 2006-07-04 | Avon Associates, Inc. | Voice controlled multimedia and communications device |
US20040210634A1 (en) | 2002-08-23 | 2004-10-21 | Miguel Ferrer | Method enabling a plurality of computer users to communicate via a set of interconnected terminals |
US6950502B1 (en) | 2002-08-23 | 2005-09-27 | Bellsouth Intellectual Property Corp. | Enhanced scheduled messaging system |
US20050086605A1 (en) | 2002-08-23 | 2005-04-21 | Miguel Ferrer | Method and apparatus for online advertising |
US20040036715A1 (en) | 2002-08-26 | 2004-02-26 | Peter Warren | Multi-level user help |
US7496631B2 (en) | 2002-08-27 | 2009-02-24 | Aol Llc | Delivery of an electronic communication using a lifespan |
GB2392592B (en) | 2002-08-27 | 2004-07-07 | 20 20 Speech Ltd | Speech synthesis apparatus and method |
EP1604350A4 (en) | 2002-09-06 | 2007-11-21 | Voice Signal Technologies Inc | Methods, systems, and programming for performing speech recognition |
US20040049391A1 (en) | 2002-09-09 | 2004-03-11 | Fuji Xerox Co., Ltd. | Systems and methods for dynamic reading fluency proficiency assessment |
WO2004025938A1 (en) | 2002-09-09 | 2004-03-25 | Vertu Ltd | Cellular radio telephone |
US20040125922A1 (en) | 2002-09-12 | 2004-07-01 | Specht Jeffrey L. | Communications device with sound masking system |
US7047193B1 (en) | 2002-09-13 | 2006-05-16 | Apple Computer, Inc. | Unsupervised data-driven pronunciation modeling |
US20040054534A1 (en) | 2002-09-13 | 2004-03-18 | Junqua Jean-Claude | Client-server voice customization |
US6907397B2 (en) | 2002-09-16 | 2005-06-14 | Matsushita Electric Industrial Co., Ltd. | System and method of media file access and retrieval using speech recognition |
US7103157B2 (en) | 2002-09-17 | 2006-09-05 | International Business Machines Corporation | Audio quality when streaming audio to non-streaming telephony devices |
US7194697B2 (en) | 2002-09-24 | 2007-03-20 | Microsoft Corporation | Magnification engine |
US7328155B2 (en) | 2002-09-25 | 2008-02-05 | Toyota Infotechnology Center Co., Ltd. | Method and system for speech recognition using grammar weighted based upon location information |
US7260190B2 (en) | 2002-09-26 | 2007-08-21 | International Business Machines Corporation | System and method for managing voicemails using metadata |
US7434167B2 (en) | 2002-09-30 | 2008-10-07 | Microsoft Corporation | Accessibility system and method |
EP1550033A2 (en) | 2002-09-30 | 2005-07-06 | Ning-Ping Chan | Pointer initiated instant bilingual annotation on textual information in an electronic document |
US20040061717A1 (en) | 2002-09-30 | 2004-04-01 | Menon Rama R. | Mechanism for voice-enabling legacy internet content for use with multi-modal browsers |
CA2406047A1 (en) | 2002-09-30 | 2004-03-30 | Ali Solehdin | A graphical user interface for digital media and network portals using detail-in-context lenses |
MXPA05002322A (en) | 2002-09-30 | 2005-06-08 | Microsoft Corp | System and method for making user interface elements known to an application and user. |
US7123696B2 (en) | 2002-10-04 | 2006-10-17 | Frederick Lowe | Method and apparatus for generating and distributing personalized media clips |
US6925438B2 (en) | 2002-10-08 | 2005-08-02 | Motorola, Inc. | Method and apparatus for providing an animated display with translated speech |
US7467087B1 (en) | 2002-10-10 | 2008-12-16 | Gillick Laurence S | Training and using pronunciation guessers in speech recognition |
US20040073428A1 (en) | 2002-10-10 | 2004-04-15 | Igor Zlokarnik | Apparatus, methods, and programming for speech synthesis via bit manipulations of compressed database |
US7124082B2 (en) | 2002-10-11 | 2006-10-17 | Twisted Innovations | Phonetic speech-to-text-to-speech system and method |
US7136874B2 (en) | 2002-10-16 | 2006-11-14 | Microsoft Corporation | Adaptive menu system for media players |
US7054888B2 (en) | 2002-10-16 | 2006-05-30 | Microsoft Corporation | Optimizing media player memory during rendering |
US7373612B2 (en) | 2002-10-21 | 2008-05-13 | Battelle Memorial Institute | Multidimensional structured data visualization method and apparatus, text visualization method and apparatus, method and apparatus for visualizing and graphically navigating the world wide web, method and apparatus for visualizing hierarchies |
US7519534B2 (en) | 2002-10-31 | 2009-04-14 | Agiletv Corporation | Speech controlled access to content on a presentation medium |
JP2004152063A (en) | 2002-10-31 | 2004-05-27 | Nec Corp | Structuring method, structuring device and structuring program of multimedia contents, and providing method thereof |
US20040218451A1 (en) | 2002-11-05 | 2004-11-04 | Said Joe P. | Accessible user interface and navigation system and method |
US20040086120A1 (en) | 2002-11-06 | 2004-05-06 | Akins Glendon L. | Selecting and downloading content to a portable player |
US7152033B2 (en) | 2002-11-12 | 2006-12-19 | Motorola, Inc. | Method, system and module for multi-modal data fusion |
US7003099B1 (en) | 2002-11-15 | 2006-02-21 | Fortmedia, Inc. | Small array microphone for acoustic echo cancellation and noise suppression |
US7796977B2 (en) | 2002-11-18 | 2010-09-14 | Research In Motion Limited | Voice mailbox configuration methods and apparatus for mobile communication devices |
KR100477796B1 (en) | 2002-11-21 | 2005-03-22 | 주식회사 팬택앤큐리텔 | Apparatus for switching hand free mode by responding to velocity and method thereof |
US7386799B1 (en) | 2002-11-21 | 2008-06-10 | Forterra Systems, Inc. | Cinematic techniques in avatar-centric communication during a multi-user online simulation |
AU2003293071A1 (en) | 2002-11-22 | 2004-06-18 | Roy Rosser | Autonomous response engine |
WO2004049110A2 (en) | 2002-11-22 | 2004-06-10 | Transclick, Inc. | Language translation system and method |
US7457745B2 (en) * | 2002-12-03 | 2008-11-25 | Hrl Laboratories, Llc | Method and apparatus for fast on-line automatic speaker/environment adaptation for speech/speaker recognition in the presence of changing environments |
US7684985B2 (en) | 2002-12-10 | 2010-03-23 | Richard Dominach | Techniques for disambiguating speech input using multimodal interfaces |
US7386449B2 (en) | 2002-12-11 | 2008-06-10 | Voice Enabling Systems Technology Inc. | Knowledge-based flexible natural speech dialogue system |
US7177817B1 (en) | 2002-12-12 | 2007-02-13 | Tuvox Incorporated | Automatic generation of voice content for a voice response system |
US7353139B1 (en) | 2002-12-13 | 2008-04-01 | Garmin Ltd. | Portable apparatus with performance monitoring and audio entertainment features |
US7797064B2 (en) | 2002-12-13 | 2010-09-14 | Stephen Loomis | Apparatus and method for skipping songs without delay |
WO2004061850A1 (en) | 2002-12-17 | 2004-07-22 | Thomson Licensing S.A. | Method for tagging and displaying songs in a digital audio player |
FR2848688A1 (en) | 2002-12-17 | 2004-06-18 | France Telecom | Text language identifying device for linguistic analysis of text, has analyzing unit to analyze chain characters of words extracted from one text, where each chain is completed so that each time chains are found in word |
US20040121761A1 (en) | 2002-12-19 | 2004-06-24 | Abinash Tripathy | Method and apparatus for processing voicemail messages |
US20040205151A1 (en) | 2002-12-19 | 2004-10-14 | Sprigg Stephen A. | Triggering event processing |
JP3974511B2 (en) | 2002-12-19 | 2007-09-12 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Computer system for generating data structure for information retrieval, method therefor, computer-executable program for generating data structure for information retrieval, computer-readable storage medium storing the computer-executable program for generating data structure for information retrieval, information retrieval system, and graphical user interface system |
US20040203520A1 (en) | 2002-12-20 | 2004-10-14 | Tom Schirtzinger | Apparatus and method for application control in an electronic device |
DE60231844D1 (en) | 2002-12-20 | 2009-05-14 | Nokia Corp | NEW RELEASE INFORMATION WITH META INFORMATION |
AU2003283783A1 (en) | 2002-12-20 | 2005-05-11 | Koninklijke Philips Electronics N.V. | Video content detection |
JP2004205605A (en) | 2002-12-24 | 2004-07-22 | Yamaha Corp | Speech and musical piece reproducing device and sequence data format |
US20040124583A1 (en) | 2002-12-26 | 2004-07-01 | Landis Mark T. | Board game method and device |
US6927763B2 (en) | 2002-12-30 | 2005-08-09 | Motorola, Inc. | Method and system for providing a disambiguated keypad |
US20040127198A1 (en) | 2002-12-30 | 2004-07-01 | Roskind James A. | Automatically changing a mobile device configuration based on environmental condition |
GB2396927A (en) | 2002-12-30 | 2004-07-07 | Digital Fidelity Ltd | Media file distribution system |
KR20040062289A (en) | 2003-01-02 | 2004-07-07 | 삼성전자주식회사 | Portable computer and control method thereof |
US7956766B2 (en) | 2003-01-06 | 2011-06-07 | Panasonic Corporation | Apparatus operating system |
US7194699B2 (en) | 2003-01-14 | 2007-03-20 | Microsoft Corporation | Animating images to reflect user selection |
US7522735B2 (en) | 2003-01-14 | 2009-04-21 | Timothy Dale Van Tassel | Electronic circuit with spring reverberation effect and improved output controllability |
US7382358B2 (en) | 2003-01-16 | 2008-06-03 | Forword Input, Inc. | System and method for continuous stroke word-based text input |
US7266189B1 (en) | 2003-01-27 | 2007-09-04 | Cisco Technology, Inc. | Who said that? teleconference speaker identification apparatus and method |
US7593868B2 (en) | 2003-01-29 | 2009-09-22 | Innovation Interactive Llc | Systems and methods for providing contextual advertising information via a communication network |
US8285537B2 (en) | 2003-01-31 | 2012-10-09 | Comverse, Inc. | Recognition of proper nouns using native-language pronunciation |
US20040162741A1 (en) | 2003-02-07 | 2004-08-19 | David Flaxer | Method and apparatus for product lifecycle management in a distributed environment enabled by dynamic business process composition and execution by rule inference |
US7606714B2 (en) | 2003-02-11 | 2009-10-20 | Microsoft Corporation | Natural language classification within an automated response system |
US20040160419A1 (en) | 2003-02-11 | 2004-08-19 | Terradigital Systems Llc. | Method for entering alphanumeric characters into a graphical user interface |
US7617094B2 (en) | 2003-02-28 | 2009-11-10 | Palo Alto Research Center Incorporated | Methods, apparatus, and products for identifying a conversation |
US7805299B2 (en) | 2004-03-01 | 2010-09-28 | Coifman Robert E | Method and apparatus for improving the transcription accuracy of speech recognition software |
US7809565B2 (en) | 2003-03-01 | 2010-10-05 | Coifman Robert E | Method and apparatus for improving the transcription accuracy of speech recognition software |
US7426468B2 (en) | 2003-03-01 | 2008-09-16 | Coifman Robert E | Method and apparatus for improving the transcription accuracy of speech recognition software |
SG135918A1 (en) | 2003-03-03 | 2007-10-29 | Xrgomics Pte Ltd | Unambiguous text input method for touch screens and reduced keyboard systems |
US7529671B2 (en) | 2003-03-04 | 2009-05-05 | Microsoft Corporation | Block synchronous decoding |
JP4828091B2 (en) | 2003-03-05 | 2011-11-30 | ヒューレット・パッカード・カンパニー | Clustering method program and apparatus |
US20040186713A1 (en) | 2003-03-06 | 2004-09-23 | Gomas Steven W. | Content delivery and speech system and apparatus for the blind and print-handicapped |
US7103852B2 (en) | 2003-03-10 | 2006-09-05 | International Business Machines Corporation | Dynamic resizing of clickable areas of touch screen applications |
US6980949B2 (en) | 2003-03-14 | 2005-12-27 | Sonum Technologies, Inc. | Natural language processor |
US7835504B1 (en) | 2003-03-16 | 2010-11-16 | Palm, Inc. | Telephone number parsing and linking |
US9274576B2 (en) | 2003-03-17 | 2016-03-01 | Callahan Cellular L.L.C. | System and method for activation of portable and mobile media player devices for wireless LAN services |
US20040186714A1 (en) | 2003-03-18 | 2004-09-23 | Aurilab, Llc | Speech recognition improvement through post-processing |
US7062223B2 (en) | 2003-03-18 | 2006-06-13 | Phonak Communications Ag | Mobile transceiver and electronic module for controlling the transceiver |
US20040183833A1 (en) | 2003-03-19 | 2004-09-23 | Chua Yong Tong | Keyboard error reduction method and apparatus |
US20060217967A1 (en) | 2003-03-20 | 2006-09-28 | Doug Goertzen | System and methods for storing and presenting personal information |
US7496498B2 (en) | 2003-03-24 | 2009-02-24 | Microsoft Corporation | Front-end architecture for a multi-lingual text-to-speech system |
FR2853127A1 (en) | 2003-03-25 | 2004-10-01 | France Telecom | DISTRIBUTED SPEECH RECOGNITION SYSTEM |
US7280968B2 (en) | 2003-03-25 | 2007-10-09 | International Business Machines Corporation | Synthetically generated speech responses including prosodic characteristics of speech inputs |
US7146319B2 (en) | 2003-03-31 | 2006-12-05 | Novauris Technologies Ltd. | Phonetically based speech recognition system and method |
EP1465047A1 (en) | 2003-04-03 | 2004-10-06 | Deutsche Thomson-Brandt Gmbh | Method for presenting menu buttons |
US7729542B2 (en) | 2003-04-04 | 2010-06-01 | Carnegie Mellon University | Using edges and corners for character input |
US7941009B2 (en) | 2003-04-08 | 2011-05-10 | The Penn State Research Foundation | Real-time computerized annotation of pictures |
US7394947B2 (en) | 2003-04-08 | 2008-07-01 | The Penn State Research Foundation | System and method for automatic linguistic indexing of images by a statistical modeling approach |
US20070136064A1 (en) | 2003-04-16 | 2007-06-14 | Carroll David W | Mobile personal computer with movement sensor |
US7463727B2 (en) | 2003-04-18 | 2008-12-09 | At&T Intellectual Property I, L.P. | Caller ID messaging device |
GB2420946B (en) | 2003-04-22 | 2006-09-27 | Spinvox Ltd | A method of providing voicemails to a mobile telephone |
MXPA05011082A (en) | 2003-04-24 | 2006-05-19 | Thomson Licensing | Creation of playlists using audio identification. |
US7627343B2 (en) | 2003-04-25 | 2009-12-01 | Apple Inc. | Media player system |
US6728729B1 (en) | 2003-04-25 | 2004-04-27 | Apple Computer, Inc. | Accessing media across networks |
JP4130190B2 (en) | 2003-04-28 | 2008-08-06 | 富士通株式会社 | Speech synthesis system |
US20050033771A1 (en) | 2003-04-30 | 2005-02-10 | Schmitter Thomas A. | Contextual advertising system |
US20040220798A1 (en) | 2003-05-01 | 2004-11-04 | Visteon Global Technologies, Inc. | Remote voice identification system |
US7669134B1 (en) | 2003-05-02 | 2010-02-23 | Apple Inc. | Method and apparatus for displaying information during an instant messaging session |
US7443971B2 (en) | 2003-05-05 | 2008-10-28 | Microsoft Corporation | Computer system with do not disturb system and method |
US7496630B2 (en) | 2003-05-06 | 2009-02-24 | At&T Intellectual Property I, L.P. | Adaptive notification delivery in a multi-device environment |
US8046705B2 (en) | 2003-05-08 | 2011-10-25 | Hillcrest Laboratories, Inc. | Systems and methods for resolution consistent semantic zooming |
US7313523B1 (en) | 2003-05-14 | 2007-12-25 | Apple Inc. | Method and apparatus for assigning word prominence to new or previous information in speech synthesis |
US7421393B1 (en) | 2004-03-01 | 2008-09-02 | At&T Corp. | System for developing a dialog manager using modular spoken-dialog components |
GB2402031B (en) | 2003-05-19 | 2007-03-28 | Toshiba Res Europ Ltd | Lexical stress prediction |
DE60318181T2 (en) | 2003-05-20 | 2008-12-04 | Sony Ericsson Mobile Communications Ab | Automatic adjustment of a keyboard input mode in response to an incoming text message |
US7269544B2 (en) | 2003-05-20 | 2007-09-11 | Hewlett-Packard Development Company, L.P. | System and method for identifying special word usage in a document |
US20050045373A1 (en) | 2003-05-27 | 2005-03-03 | Joseph Born | Portable media device with audio prompt menu |
US20040242286A1 (en) | 2003-05-28 | 2004-12-02 | Benco David S. | Configurable network initiated response to mobile low battery condition |
US20040243412A1 (en) | 2003-05-29 | 2004-12-02 | Gupta Sunil K. | Adaptation of speech models in speech recognition |
US7200559B2 (en) | 2003-05-29 | 2007-04-03 | Microsoft Corporation | Semantic object synchronous understanding implemented with speech application language tags |
US8301436B2 (en) | 2003-05-29 | 2012-10-30 | Microsoft Corporation | Semantic object synchronous understanding for highly interactive interface |
WO2004110099A2 (en) | 2003-06-06 | 2004-12-16 | Gn Resound A/S | A hearing aid wireless network |
US20040252966A1 (en) | 2003-06-10 | 2004-12-16 | Holloway Marty M. | Video storage and playback system and method |
US7577568B2 (en) | 2003-06-10 | 2009-08-18 | At&T Intellectual Property Ii, L.P. | Methods and system for creating voice files using a VoiceXML application |
GB2402855A (en) | 2003-06-12 | 2004-12-15 | Seiko Epson Corp | Multiple language text to speech processing |
US7720683B1 (en) | 2003-06-13 | 2010-05-18 | Sensory, Inc. | Method and apparatus of specifying and performing speech recognition operations |
KR100634496B1 (en) | 2003-06-16 | 2006-10-13 | 삼성전자주식회사 | Input language recognition method and apparatus and method and apparatus for automatically interchanging input language modes employing the same |
WO2004111869A1 (en) | 2003-06-17 | 2004-12-23 | Kwangwoon Foundation | Exceptional pronunciation dictionary generation method for the automatic pronunciation generation in korean |
US20040259536A1 (en) | 2003-06-20 | 2004-12-23 | Keskar Dhananjay V. | Method, apparatus and system for enabling context aware notification in mobile devices |
US7559026B2 (en) | 2003-06-20 | 2009-07-07 | Apple Inc. | Video conferencing system having focus control |
US7827047B2 (en) | 2003-06-24 | 2010-11-02 | At&T Intellectual Property I, L.P. | Methods and systems for assisting scheduling with automation |
WO2005003899A2 (en) | 2003-06-24 | 2005-01-13 | Ntech Properties, Inc. | Method, system and apparatus for information delivery |
US7512884B2 (en) | 2003-06-25 | 2009-03-31 | Microsoft Corporation | System and method for switching of media presentation |
US7757182B2 (en) | 2003-06-25 | 2010-07-13 | Microsoft Corporation | Taskbar media player |
US7107296B2 (en) | 2003-06-25 | 2006-09-12 | Microsoft Corporation | Media library synchronizer |
US7428000B2 (en) | 2003-06-26 | 2008-09-23 | Microsoft Corp. | System and method for distributed meetings |
US7580551B1 (en) | 2003-06-30 | 2009-08-25 | The Research Foundation Of State University Of Ny | Method and apparatus for analyzing and/or comparing handwritten and/or biometric samples |
US7057607B2 (en) | 2003-06-30 | 2006-06-06 | Motorola, Inc. | Application-independent text entry for touch-sensitive display |
US7257585B2 (en) | 2003-07-02 | 2007-08-14 | Vibrant Media Limited | Method and system for augmenting web content |
US20060277058A1 (en) | 2003-07-07 | 2006-12-07 | J Maev Jack I | Method and apparatus for providing aftermarket service for a product |
US20080097937A1 (en) | 2003-07-10 | 2008-04-24 | Ali Hadjarian | Distributed method for integrating data mining and text categorization techniques |
US20050055433A1 (en) | 2003-07-11 | 2005-03-10 | Boban Mathew | System and method for advanced rule creation and management within an integrated virtual workspace |
US7154526B2 (en) | 2003-07-11 | 2006-12-26 | Fuji Xerox Co., Ltd. | Telepresence system and method for video teleconferencing |
US8638910B2 (en) | 2003-07-14 | 2014-01-28 | Cisco Technology, Inc. | Integration of enterprise voicemail in mobile systems |
US20050015772A1 (en) | 2003-07-16 | 2005-01-20 | Saare John E. | Method and system for device specific application optimization via a portal server |
US20070061753A1 (en) | 2003-07-17 | 2007-03-15 | Xrgomics Pte Ltd | Letter and word choice text input method for keyboards and reduced keyboard systems |
US7757173B2 (en) | 2003-07-18 | 2010-07-13 | Apple Inc. | Voice menu system |
WO2005010725A2 (en) | 2003-07-23 | 2005-02-03 | Xow, Inc. | Stop motion capture tool |
JP2005044149A (en) | 2003-07-23 | 2005-02-17 | Sanyo Electric Co Ltd | Content output device |
WO2005010866A1 (en) | 2003-07-23 | 2005-02-03 | Nexidia Inc. | Spoken word spotting queries |
JP4551635B2 (en) | 2003-07-31 | 2010-09-29 | ソニー株式会社 | Pipeline processing system and information processing apparatus |
US20050027385A1 (en) | 2003-08-01 | 2005-02-03 | Wen-Hsiang Yueh | MP3 player having a wireless earphone communication with a mobile |
US7386438B1 (en) | 2003-08-04 | 2008-06-10 | Google Inc. | Identifying language attributes through probabilistic analysis |
US7280647B2 (en) | 2003-08-07 | 2007-10-09 | Microsoft Corporation | Dynamic photo caller identification |
JP3979432B2 (en) | 2003-08-08 | 2007-09-19 | オンキヨー株式会社 | Network AV system |
US8826137B2 (en) | 2003-08-14 | 2014-09-02 | Freedom Scientific, Inc. | Screen reader having concurrent communication of non-textual information |
CA2536265C (en) | 2003-08-21 | 2012-11-13 | Idilia Inc. | System and method for processing a query |
US7475010B2 (en) | 2003-09-03 | 2009-01-06 | Lingospot, Inc. | Adaptive and scalable method for resolving natural language ambiguities |
US7539619B1 (en) | 2003-09-05 | 2009-05-26 | Spoken Translation Inc. | Speech-enabled language translation system and method enabling interactive user supervision of translation and speech recognition accuracy |
US20060253787A1 (en) | 2003-09-09 | 2006-11-09 | Fogg Brian J | Graphical messaging system |
JP2005086624A (en) | 2003-09-10 | 2005-03-31 | Aol Japan Inc | Communication system using cellular phone, cell phone, internet protocol server, and program |
US7386451B2 (en) | 2003-09-11 | 2008-06-10 | Microsoft Corporation | Optimization of an objective measure for estimating mean opinion score of synthesized speech |
GB2422518B (en) | 2003-09-11 | 2007-11-14 | Voice Signal Technologies Inc | Method and apparatus for using audio prompts in mobile communication devices |
JP4663223B2 (en) | 2003-09-11 | 2011-04-06 | パナソニック株式会社 | Arithmetic processing unit |
WO2005027485A1 (en) | 2003-09-12 | 2005-03-24 | Nokia Corporation | Method and device for handling missed calls in a mobile communications environment |
US7266495B1 (en) | 2003-09-12 | 2007-09-04 | Nuance Communications, Inc. | Method and system for learning linguistically valid word pronunciations from acoustic data |
JP2005092441A (en) | 2003-09-16 | 2005-04-07 | Aizu:Kk | Character input method |
US7411575B2 (en) | 2003-09-16 | 2008-08-12 | Smart Technologies Ulc | Gesture recognition method and touch system incorporating the same |
US7418392B1 (en) | 2003-09-25 | 2008-08-26 | Sensory, Inc. | System and method for controlling the operation of a device by voice commands |
US7460652B2 (en) | 2003-09-26 | 2008-12-02 | At&T Intellectual Property I, L.P. | VoiceXML and rule engine based switchboard for interactive voice response (IVR) services |
CN1320482C (en) | 2003-09-29 | 2007-06-06 | 摩托罗拉公司 | Natural voice pause in identification text strings |
JP4146322B2 (en) | 2003-09-30 | 2008-09-10 | カシオ計算機株式会社 | Communication system and information communication terminal |
EP1671326A1 (en) | 2003-09-30 | 2006-06-21 | Koninklijke Philips Electronics N.V. | Cache management for improving trick play performance |
US7194611B2 (en) | 2003-09-30 | 2007-03-20 | Microsoft Corporation | Method and system for navigation using media transport controls |
US20060008256A1 (en) | 2003-10-01 | 2006-01-12 | Khedouri Robert K | Audio visual player apparatus and system and method of content distribution using the same |
US6813218B1 (en) | 2003-10-06 | 2004-11-02 | The United States Of America As Represented By The Secretary Of The Navy | Buoyant device for bi-directional acousto-optic signal transfer across the air-water interface |
US9984377B2 (en) | 2003-10-06 | 2018-05-29 | Yellowpages.Com Llc | System and method for providing advertisement |
US20070162296A1 (en) | 2003-10-06 | 2007-07-12 | Utbk, Inc. | Methods and apparatuses for audio advertisements |
US10425538B2 (en) | 2003-10-06 | 2019-09-24 | Yellowpages.Com Llc | Methods and apparatuses for advertisements on mobile devices for communication connections |
US7302392B1 (en) | 2003-10-07 | 2007-11-27 | Sprint Spectrum L.P. | Voice browser with weighting of browser-level grammar to enhance usability |
US7383170B2 (en) | 2003-10-10 | 2008-06-03 | At&T Knowledge Ventures, L.P. | System and method for analyzing automatic speech recognition performance data |
KR100801396B1 (en) | 2003-10-16 | 2008-02-05 | 마츠시타 덴끼 산교 가부시키가이샤 | Video/audio recorder/reproducer, video/audio recording method and reproducing method |
US7487092B2 (en) | 2003-10-17 | 2009-02-03 | International Business Machines Corporation | Interactive debugging and tuning method for CTTS voice building |
US7409347B1 (en) | 2003-10-23 | 2008-08-05 | Apple Inc. | Data-driven global boundary optimization |
US7643990B1 (en) | 2003-10-23 | 2010-01-05 | Apple Inc. | Global boundary-centric feature extraction and associated discontinuity metrics |
WO2005041170A1 (en) | 2003-10-24 | 2005-05-06 | Nokia Corporation | Noise-dependent postfiltering |
US7155706B2 (en) | 2003-10-24 | 2006-12-26 | Microsoft Corporation | Administrative tool environment |
FI20031566A (en) | 2003-10-27 | 2005-04-28 | Nokia Corp | Select a language for word recognition |
US20070083623A1 (en) | 2003-10-30 | 2007-04-12 | Makoto Nishimura | Mobile terminal apparatus |
US20050102144A1 (en) | 2003-11-06 | 2005-05-12 | Rapoport Ezra J. | Speech synthesis |
US8074184B2 (en) | 2003-11-07 | 2011-12-06 | Microsoft Corporation | Modifying electronic documents with recognized content or other associated data |
US20050102625A1 (en) | 2003-11-07 | 2005-05-12 | Lee Yong C. | Audio tag retrieval system and method |
US7292726B2 (en) | 2003-11-10 | 2007-11-06 | Microsoft Corporation | Recognition of electronic ink with late strokes |
US7302099B2 (en) | 2003-11-10 | 2007-11-27 | Microsoft Corporation | Stroke segmentation for template-based cursive handwriting recognition |
US7561069B2 (en) | 2003-11-12 | 2009-07-14 | Legalview Assets, Limited | Notification systems and methods enabling a response to change particulars of delivery or pickup |
US7584092B2 (en) | 2004-11-15 | 2009-09-01 | Microsoft Corporation | Unsupervised learning of paraphrase/translation alternations and selective application thereof |
US7412385B2 (en) | 2003-11-12 | 2008-08-12 | Microsoft Corporation | System for identifying paraphrases using machine translation |
US20090018828A1 (en) | 2003-11-12 | 2009-01-15 | Honda Motor Co., Ltd. | Automatic Speech Recognition System |
US20050108074A1 (en) | 2003-11-14 | 2005-05-19 | Bloechl Peter E. | Method and system for prioritization of task items |
US7206391B2 (en) | 2003-12-23 | 2007-04-17 | Apptera Inc. | Method for creating and deploying system changes in a voice application system |
US8055713B2 (en) | 2003-11-17 | 2011-11-08 | Hewlett-Packard Development Company, L.P. | Email application with user voice interface |
EP1695177B1 (en) | 2003-11-19 | 2013-10-02 | Agero Connection Services, Inc. | Wirelessly delivered owner's manual |
US7310605B2 (en) | 2003-11-25 | 2007-12-18 | International Business Machines Corporation | Method and apparatus to transliterate text using a portable device |
US7447630B2 (en) | 2003-11-26 | 2008-11-04 | Microsoft Corporation | Method and apparatus for multi-sensory speech enhancement |
US20050114140A1 (en) | 2003-11-26 | 2005-05-26 | Brackett Charles C. | Method and apparatus for contextual voice cues |
KR100621092B1 (en) | 2003-11-27 | 2006-09-08 | 삼성전자주식회사 | Method and apparatus for sharing application using P2P |
US20050119890A1 (en) | 2003-11-28 | 2005-06-02 | Yoshifumi Hirose | Speech synthesis apparatus and speech synthesis method |
DE602004016681D1 (en) | 2003-12-05 | 2008-10-30 | Kenwood Corp | AUDIO DEVICE CONTROL DEVICE, AUDIO DEVICE CONTROL METHOD AND PROGRAM |
US7865354B2 (en) | 2003-12-05 | 2011-01-04 | International Business Machines Corporation | Extracting and grouping opinions from text documents |
US20050144003A1 (en) | 2003-12-08 | 2005-06-30 | Nokia Corporation | Multi-lingual speech synthesis |
JP4006395B2 (en) | 2003-12-11 | 2007-11-14 | キヤノン株式会社 | Information processing apparatus, control method therefor, and program |
US7412388B2 (en) | 2003-12-12 | 2008-08-12 | International Business Machines Corporation | Language-enhanced programming tools |
WO2005059895A1 (en) | 2003-12-16 | 2005-06-30 | Loquendo S.P.A. | Text-to-speech method and system, computer program product therefor |
JP2005181386A (en) | 2003-12-16 | 2005-07-07 | Mitsubishi Electric Corp | Device, method, and program for speech interactive processing |
US7334090B2 (en) | 2003-12-17 | 2008-02-19 | At&T Delaware Intellectual Property, Inc. | Methods, systems, and storage mediums for providing information storage services |
US7427024B1 (en) | 2003-12-17 | 2008-09-23 | Gazdzinski Mark J | Chattel management apparatus and methods |
US20050144070A1 (en) | 2003-12-23 | 2005-06-30 | Cheshire Stuart D. | Method and apparatus for advertising a user interface for configuring, controlling and/or monitoring a service |
WO2005064592A1 (en) | 2003-12-26 | 2005-07-14 | Kabushikikaisha Kenwood | Device control device, speech recognition device, agent device, on-vehicle device control device, navigation device, audio device, device control method, speech recognition method, agent processing method, on-vehicle device control method, navigation method, and audio device control method, and program |
US7404143B2 (en) | 2003-12-26 | 2008-07-22 | Microsoft Corporation | Server-based single roundtrip spell checking |
US7631276B2 (en) | 2003-12-29 | 2009-12-08 | International Business Machines Corporation | Method for indication and navigating related items |
KR20050072256A (en) | 2004-01-06 | 2005-07-11 | 엘지전자 주식회사 | Method for managing and reproducing a menu sound of high density optical disc |
US20050149510A1 (en) | 2004-01-07 | 2005-07-07 | Uri Shafrir | Concept mining and concept discovery-semantic search tool for large digital databases |
US7401300B2 (en) | 2004-01-09 | 2008-07-15 | Nokia Corporation | Adaptive user interface input device |
US7552055B2 (en) | 2004-01-10 | 2009-06-23 | Microsoft Corporation | Dialog component re-use in recognition systems |
JP2005202014A (en) | 2004-01-14 | 2005-07-28 | Sony Corp | Audio signal processor, audio signal processing method, and audio signal processing program |
US7359851B2 (en) | 2004-01-14 | 2008-04-15 | Clairvoyance Corporation | Method of identifying the language of a textual passage using short word and/or n-gram comparisons |
US7298904B2 (en) | 2004-01-14 | 2007-11-20 | International Business Machines Corporation | Method and apparatus for scaling handwritten character input for handwriting recognition |
AU2005207606B2 (en) | 2004-01-16 | 2010-11-11 | Nuance Communications, Inc. | Corpus-based speech synthesis based on segment recombination |
US8689113B2 (en) | 2004-01-22 | 2014-04-01 | Sony Corporation | Methods and apparatus for presenting content |
US20050165607A1 (en) | 2004-01-22 | 2005-07-28 | At&T Corp. | System and method to disambiguate and clarify user intention in a spoken dialog system |
DE602004017955D1 (en) | 2004-01-29 | 2009-01-08 | Daimler Ag | Method and system for voice dialogue interface |
US7383250B2 (en) | 2004-01-30 | 2008-06-03 | Research In Motion Limited | Contact query data system and method |
US7596499B2 (en) | 2004-02-02 | 2009-09-29 | Panasonic Corporation | Multilingual text-to-speech system with limited resources |
FR2865846A1 (en) | 2004-02-02 | 2005-08-05 | France Telecom | VOICE SYNTHESIS SYSTEM |
JP4274962B2 (en) | 2004-02-04 | 2009-06-10 | 株式会社国際電気通信基礎技術研究所 | Speech recognition system |
US7580866B2 (en) | 2004-02-10 | 2009-08-25 | Verizon Business Global Llc | Apparatus, methods, and computer readable medium for determining the location of a portable device in a shopping environment |
JP4262113B2 (en) | 2004-02-13 | 2009-05-13 | シチズン電子株式会社 | Backlight |
US8200475B2 (en) | 2004-02-13 | 2012-06-12 | Microsoft Corporation | Phonetic-based text input method |
KR100612839B1 (en) | 2004-02-18 | 2006-08-18 | 삼성전자주식회사 | Method and apparatus for domain-based dialog speech recognition |
US20050185598A1 (en) | 2004-02-20 | 2005-08-25 | Mika Grundstrom | System and method for device discovery |
US7505906B2 (en) | 2004-02-26 | 2009-03-17 | At&T Intellectual Property, Ii | System and method for augmenting spoken language understanding by correcting common errors in linguistic performance |
KR100462292B1 (en) | 2004-02-26 | 2004-12-17 | 엔에이치엔(주) | A method for providing search results list based on importance information and a system thereof |
US20050190970A1 (en) | 2004-02-27 | 2005-09-01 | Research In Motion Limited | Text input system for a mobile electronic device and methods thereof |
US20050195094A1 (en) | 2004-03-05 | 2005-09-08 | White Russell W. | System and method for utilizing a bicycle computer to monitor athletic performance |
US7693715B2 (en) | 2004-03-10 | 2010-04-06 | Microsoft Corporation | Generating large units of graphonemes with mutual information criterion for letter to sound conversion |
US7711129B2 (en) | 2004-03-11 | 2010-05-04 | Apple Inc. | Method and system for approximating graphic equalizers using dynamic filter order reduction |
US20050210394A1 (en) | 2004-03-16 | 2005-09-22 | Crandall Evan S | Method for providing concurrent audio-video and audio instant messaging sessions |
US7478033B2 (en) | 2004-03-16 | 2009-01-13 | Google Inc. | Systems and methods for translating Chinese pinyin to Chinese characters |
FI20045077A (en) | 2004-03-16 | 2005-09-17 | Nokia Corp | Method and apparatus for indicating size restriction of message |
US7084758B1 (en) | 2004-03-19 | 2006-08-01 | Advanced Micro Devices, Inc. | Location-based reminders |
JP4458888B2 (en) | 2004-03-22 | 2010-04-28 | 富士通株式会社 | Conference support system, minutes generation method, and computer program |
CN100346274C (en) | 2004-03-25 | 2007-10-31 | 升达科技股份有限公司 | Inputting method, control module and product with starting location and moving direction as definition |
US7571111B2 (en) | 2004-03-29 | 2009-08-04 | United Parcel Service Of America, Inc. | Computer system for monitoring actual performance to standards in real time |
US20050222973A1 (en) | 2004-03-30 | 2005-10-06 | Matthias Kaiser | Methods and systems for summarizing information |
US7409337B1 (en) | 2004-03-30 | 2008-08-05 | Microsoft Corporation | Natural language processing interface |
GB0407389D0 (en) | 2004-03-31 | 2004-05-05 | British Telecomm | Information retrieval |
US7496512B2 (en) | 2004-04-13 | 2009-02-24 | Microsoft Corporation | Refining of segmental boundaries in speech waveforms using contextual-dependent models |
JP2005311864A (en) | 2004-04-23 | 2005-11-04 | Toshiba Corp | Household appliances, adapter instrument, and household appliance system |
US20050245243A1 (en) | 2004-04-28 | 2005-11-03 | Zuniga Michael A | System and method for wireless delivery of audio content over wireless high speed data networks |
US20050246350A1 (en) | 2004-04-30 | 2005-11-03 | Opence Inc. | System and method for classifying and normalizing structured data |
US7447665B2 (en) | 2004-05-10 | 2008-11-04 | Kinetx, Inc. | System and method of self-learning conceptual mapping to organize and interpret data |
US7366461B1 (en) | 2004-05-17 | 2008-04-29 | Wendell Brown | Method and apparatus for improving the quality of a recorded broadcast audio program |
CN100524457C (en) | 2004-05-31 | 2009-08-05 | 国际商业机器公司 | Device and method for text-to-speech conversion and corpus adjustment |
US8095364B2 (en) | 2004-06-02 | 2012-01-10 | Tegic Communications, Inc. | Multimodal disambiguation of speech recognition |
US8224649B2 (en) | 2004-06-02 | 2012-07-17 | International Business Machines Corporation | Method and apparatus for remote command, control and diagnostics of systems using conversational or audio interface |
US20050273626A1 (en) | 2004-06-02 | 2005-12-08 | Steven Pearson | System and method for portable authentication |
US20050273337A1 (en) | 2004-06-02 | 2005-12-08 | Adoram Erell | Apparatus and method for synthesized audible response to an utterance in speaker-independent voice recognition |
US7673340B1 (en) | 2004-06-02 | 2010-03-02 | Clickfox Llc | System and method for analyzing system user behavior |
CA2573002A1 (en) | 2004-06-04 | 2005-12-22 | Benjamin Firooz Ghassabian | Systems to enhance data entry in mobile and fixed environment |
US20050271216A1 (en) | 2004-06-04 | 2005-12-08 | Khosrow Lashkari | Method and apparatus for loudspeaker equalization |
US7774378B2 (en) | 2004-06-04 | 2010-08-10 | Icentera Corporation | System and method for providing intelligence centers |
US7472065B2 (en) | 2004-06-04 | 2008-12-30 | International Business Machines Corporation | Generating paralinguistic phenomena via markup in text-to-speech synthesis |
JP4477428B2 (en) | 2004-06-15 | 2010-06-09 | 株式会社日立製作所 | Display control apparatus, information display apparatus including the same, display system including these, display control program, and display control method |
DE102004029203B4 (en) | 2004-06-16 | 2021-01-21 | Volkswagen Ag | Control device for a motor vehicle |
US7565104B1 (en) | 2004-06-16 | 2009-07-21 | Wendell Brown | Broadcast audio program guide |
US8321786B2 (en) | 2004-06-17 | 2012-11-27 | Apple Inc. | Routine and interface for correcting electronic text |
GB0413743D0 (en) | 2004-06-19 | 2004-07-21 | Ibm | Method and system for approximate string matching |
US20070214133A1 (en) | 2004-06-23 | 2007-09-13 | Edo Liberty | Methods for filtering data and filling in missing data using nonlinear inference |
US20050289463A1 (en) | 2004-06-23 | 2005-12-29 | Google Inc., A Delaware Corporation | Systems and methods for spell correction of non-roman characters and words |
JP4416643B2 (en) | 2004-06-29 | 2010-02-17 | キヤノン株式会社 | Multimodal input method |
US7720674B2 (en) | 2004-06-29 | 2010-05-18 | Sap Ag | Systems and methods for processing natural language queries |
US20060004570A1 (en) | 2004-06-30 | 2006-01-05 | Microsoft Corporation | Transcribing speech data with dialog context and/or recognition alternative information |
TWI248576B (en) | 2004-07-05 | 2006-02-01 | Elan Microelectronics Corp | Method for controlling rolling of scroll bar on a touch panel |
US20060007174A1 (en) | 2004-07-06 | 2006-01-12 | Chung-Yi Shen | Touch control method for a drag gesture and control module thereof |
US7228278B2 (en) | 2004-07-06 | 2007-06-05 | Voxify, Inc. | Multi-slot dialog systems and methods |
US7505795B1 (en) | 2004-07-07 | 2009-03-17 | Advanced Micro Devices, Inc. | Power save management with customized range for user configuration and tuning value based upon recent usage |
US7823123B2 (en) | 2004-07-13 | 2010-10-26 | The Mitre Corporation | Semantic system for integrating software components |
JP4652737B2 (en) | 2004-07-14 | 2011-03-16 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Word boundary probability estimation device and method, probabilistic language model construction device and method, kana-kanji conversion device and method, and unknown word model construction method |
WO2006019993A2 (en) | 2004-07-15 | 2006-02-23 | Aurilab, Llc | Distributed pattern recognition training method and system |
TWI240573B (en) | 2004-07-15 | 2005-09-21 | Ali Corp | Methods and related circuit for automatic audio volume level control |
US8036893B2 (en) | 2004-07-22 | 2011-10-11 | Nuance Communications, Inc. | Method and system for identifying and correcting accent-induced speech recognition difficulties |
TWI252049B (en) | 2004-07-23 | 2006-03-21 | Inventec Corp | Sound control system and method |
US7738637B2 (en) | 2004-07-24 | 2010-06-15 | Massachusetts Institute Of Technology | Interactive voice message retrieval |
KR100984596B1 (en) | 2004-07-30 | 2010-09-30 | 애플 인크. | Gestures for touch sensitive input devices |
US7725318B2 (en) | 2004-07-30 | 2010-05-25 | Nice Systems Inc. | System and method for improving the accuracy of audio searching |
US7788098B2 (en) | 2004-08-02 | 2010-08-31 | Nokia Corporation | Predicting tone pattern information for textual information used in telecommunication systems |
KR100875723B1 (en) | 2004-08-04 | 2008-12-24 | 천지은 | Call storage system and method |
US7508324B2 (en) | 2004-08-06 | 2009-03-24 | Daniel Suraqui | Finger activated reduced keyboard and a method for performing text input |
US7869999B2 (en) | 2004-08-11 | 2011-01-11 | Nuance Communications, Inc. | Systems and methods for selecting from multiple phonetic transcriptions for text-to-speech synthesis |
US7685118B2 (en) | 2004-08-12 | 2010-03-23 | Iwint International Holdings Inc. | Method using ontology and user query processing to solve inventor problems and user problems |
US7580363B2 (en) | 2004-08-16 | 2009-08-25 | Nokia Corporation | Apparatus and method for facilitating contact selection in communication devices |
US7912699B1 (en) | 2004-08-23 | 2011-03-22 | At&T Intellectual Property Ii, L.P. | System and method of lattice-based search for spoken utterance retrieval |
US20060048055A1 (en) | 2004-08-25 | 2006-03-02 | Jun Wu | Fault-tolerant romanized input method for non-roman characters |
US7853574B2 (en) | 2004-08-26 | 2010-12-14 | International Business Machines Corporation | Method of generating a context-inferenced search query and of sorting a result of the query |
US20060262876A1 (en) | 2004-08-26 | 2006-11-23 | Ladue Christoph K | Wave matrix mechanics method & apparatus |
US7477238B2 (en) | 2004-08-31 | 2009-01-13 | Research In Motion Limited | Handheld electronic device with text disambiguation |
KR20060022001A (en) | 2004-09-06 | 2006-03-09 | 현대모비스 주식회사 | Button mounting structure for a car audio |
US20060050865A1 (en) | 2004-09-07 | 2006-03-09 | Sbc Knowledge Ventures, Lp | System and method for adapting the level of instructional detail provided through a user interface |
US7587482B2 (en) | 2004-09-08 | 2009-09-08 | Yahoo! Inc. | Multimodal interface for mobile messaging |
US20060058999A1 (en) | 2004-09-10 | 2006-03-16 | Simon Barker | Voice model adaptation |
KR20070053246A (en) | 2004-09-14 | 2007-05-23 | 가부시키가이샤 아이.피.비. | Device for drawing document correlation diagram where documents are arranged in time series |
US7319385B2 (en) | 2004-09-17 | 2008-01-15 | Nokia Corporation | Sensor data sharing |
US20060061488A1 (en) | 2004-09-17 | 2006-03-23 | Dunton Randy R | Location based task reminder |
ITRM20040447A1 (en) | 2004-09-22 | 2004-12-22 | Link Formazione S R L | INTERACTIVE SEMINARS SUPPLY SYSTEM, AND RELATED METHOD. |
TW200629959A (en) | 2004-09-22 | 2006-08-16 | Citizen Electronics | Electro-dynamic exciter |
US20060072716A1 (en) | 2004-09-27 | 2006-04-06 | Avaya Technology Corp. | Downloadable and controllable music-on-hold |
US20060067536A1 (en) | 2004-09-27 | 2006-03-30 | Michael Culbert | Method and system for time synchronizing multiple loudspeakers |
US7716056B2 (en) | 2004-09-27 | 2010-05-11 | Robert Bosch Corporation | Method and system for interactive conversational dialogue for cognitively overloaded device users |
US20060067535A1 (en) | 2004-09-27 | 2006-03-30 | Michael Culbert | Method and system for automatically equalizing multiple loudspeakers |
US20060074660A1 (en) | 2004-09-29 | 2006-04-06 | France Telecom | Method and apparatus for enhancing speech recognition accuracy by using geographic data to filter a set of words |
KR100754385B1 (en) | 2004-09-30 | 2007-08-31 | 삼성전자주식회사 | Apparatus and method for object localization, tracking, and separation using audio and video sensors |
EP1797506A1 (en) | 2004-09-30 | 2007-06-20 | Koninklijke Philips Electronics N.V. | Automatic text correction |
US7996208B2 (en) | 2004-09-30 | 2011-08-09 | Google Inc. | Methods and systems for selecting a language for text segmentation |
JP4478939B2 (en) | 2004-09-30 | 2010-06-09 | 株式会社国際電気通信基礎技術研究所 | Audio processing apparatus and computer program therefor |
US8107401B2 (en) | 2004-09-30 | 2012-01-31 | Avaya Inc. | Method and apparatus for providing a virtual assistant to a communication participant |
US7603381B2 (en) | 2004-09-30 | 2009-10-13 | Microsoft Corporation | Contextual action publishing |
CN1755796A (en) | 2004-09-30 | 2006-04-05 | 国际商业机器公司 | Distance defining method and system based on statistic technology in text-to speech conversion |
US8099482B2 (en) | 2004-10-01 | 2012-01-17 | E-Cast Inc. | Prioritized content download for an entertainment device |
US7917554B2 (en) | 2005-08-23 | 2011-03-29 | Ricoh Co. Ltd. | Visibly-perceptible hot spots in documents |
US9100776B2 (en) | 2004-10-06 | 2015-08-04 | Intelligent Mechatronic Systems Inc. | Location based event reminder for mobile device |
US7809763B2 (en) | 2004-10-15 | 2010-10-05 | Oracle International Corporation | Method(s) for updating database object metadata |
US7684988B2 (en) | 2004-10-15 | 2010-03-23 | Microsoft Corporation | Testing and tuning of automatic speech recognition systems using synthetic inputs generated from its acoustic models |
US7543232B2 (en) | 2004-10-19 | 2009-06-02 | International Business Machines Corporation | Intelligent web based help system |
US7693719B2 (en) | 2004-10-29 | 2010-04-06 | Microsoft Corporation | Providing personalized voice font for text-to-speech applications |
US7595742B2 (en) | 2004-10-29 | 2009-09-29 | Lenovo (Singapore) Pte. Ltd. | System and method for generating language specific diacritics for different languages using a single keyboard layout |
US7362312B2 (en) | 2004-11-01 | 2008-04-22 | Nokia Corporation | Mobile communication terminal and method |
US7735012B2 (en) | 2004-11-04 | 2010-06-08 | Apple Inc. | Audio user interface for computing devices |
US7505894B2 (en) | 2004-11-04 | 2009-03-17 | Microsoft Corporation | Order model for dependency structure |
US7552046B2 (en) | 2004-11-15 | 2009-06-23 | Microsoft Corporation | Unsupervised learning of paraphrase/translation alternations and selective application thereof |
US7546235B2 (en) | 2004-11-15 | 2009-06-09 | Microsoft Corporation | Unsupervised learning of paraphrase/translation alternations and selective application thereof |
US7885844B1 (en) | 2004-11-16 | 2011-02-08 | Amazon Technologies, Inc. | Automatically generating task recommendations for human task performers |
US7650284B2 (en) | 2004-11-19 | 2010-01-19 | Nuance Communications, Inc. | Enabling voice click in a multimodal page |
JP4604178B2 (en) | 2004-11-22 | 2010-12-22 | 独立行政法人産業技術総合研究所 | Speech recognition apparatus and method, and program |
WO2006056822A1 (en) | 2004-11-23 | 2006-06-01 | Nokia Corporation | Processing a message received from a mobile cellular network |
US7702500B2 (en) | 2004-11-24 | 2010-04-20 | Blaedow Karen R | Method and apparatus for determining the meaning of natural language |
CN1609859A (en) | 2004-11-26 | 2005-04-27 | 孙斌 | Search result clustering method |
US7376645B2 (en) | 2004-11-29 | 2008-05-20 | The Intellection Group, Inc. | Multimodal natural language query system and architecture for processing voice and proximity-based queries |
GB0426347D0 (en) | 2004-12-01 | 2005-01-05 | Ibm | Methods, apparatus and computer programs for automatic speech recognition |
US20060122834A1 (en) | 2004-12-03 | 2006-06-08 | Bennett Ian M | Emotion detection device & method for use in distributed systems |
US8214214B2 (en) | 2004-12-03 | 2012-07-03 | Phoenix Solutions, Inc. | Emotion detection device and method for use in distributed systems |
US8024194B2 (en) | 2004-12-08 | 2011-09-20 | Nuance Communications, Inc. | Dynamic switching between local and remote speech rendering |
US7636657B2 (en) | 2004-12-09 | 2009-12-22 | Microsoft Corporation | Method and apparatus for automatic grammar generation from data entries |
US7853445B2 (en) | 2004-12-10 | 2010-12-14 | Deception Discovery Technologies LLC | Method and system for the automatic recognition of deceptive language |
US7218943B2 (en) | 2004-12-13 | 2007-05-15 | Research In Motion Limited | Text messaging conversation user interface functionality |
US7451397B2 (en) | 2004-12-15 | 2008-11-11 | Microsoft Corporation | System and method for automatically completing spreadsheet formulas |
US8275618B2 (en) | 2004-12-22 | 2012-09-25 | Nuance Communications, Inc. | Mobile dictation correction user interface |
US20080004881A1 (en) | 2004-12-22 | 2008-01-03 | David Attwater | Turn-taking model |
US20060143576A1 (en) | 2004-12-23 | 2006-06-29 | Gupta Anurag K | Method and system for resolving cross-modal references in user inputs |
US7987244B1 (en) | 2004-12-30 | 2011-07-26 | At&T Intellectual Property Ii, L.P. | Network repository for voice fonts |
FI20041689A0 (en) | 2004-12-30 | 2004-12-30 | Nokia Corp | Marking and / or splitting of media stream into a cellular network terminal |
US8478589B2 (en) | 2005-01-05 | 2013-07-02 | At&T Intellectual Property Ii, L.P. | Library of existing spoken dialog data for use in generating new natural language spoken dialog systems |
US7593782B2 (en) | 2005-01-07 | 2009-09-22 | Apple Inc. | Highly portable media device |
US8069422B2 (en) | 2005-01-10 | 2011-11-29 | Samsung Electronics, Co., Ltd. | Contextual task recommendation system and method for determining user's context and suggesting tasks |
US7363227B2 (en) | 2005-01-10 | 2008-04-22 | Herman Miller, Inc. | Disruption of speech understanding by adding a privacy sound thereto |
US7418389B2 (en) | 2005-01-11 | 2008-08-26 | Microsoft Corporation | Defining atom units between phone and syllable for TTS systems |
US20080189099A1 (en) | 2005-01-12 | 2008-08-07 | Howard Friedman | Customizable Delivery of Audio Information |
US8552984B2 (en) | 2005-01-13 | 2013-10-08 | 602531 British Columbia Ltd. | Method, system, apparatus and computer-readable media for directing input associated with keyboard-type device |
US7337170B2 (en) | 2005-01-18 | 2008-02-26 | International Business Machines Corporation | System and method for planning and generating queries for multi-dimensional analysis using domain models and data federation |
JP2008529345A (en) | 2005-01-20 | 2008-07-31 | ロウェ,フレデリック | System and method for generating and distributing personalized media |
US7873654B2 (en) | 2005-01-24 | 2011-01-18 | The Intellection Group, Inc. | Multimodal natural language query system for processing and analyzing voice and proximity-based queries |
US8150872B2 (en) | 2005-01-24 | 2012-04-03 | The Intellection Group, Inc. | Multimodal natural language query system for processing and analyzing voice and proximity-based queries |
US20060168507A1 (en) | 2005-01-26 | 2006-07-27 | Hansen Kim D | Apparatus, system, and method for digitally presenting the contents of a printed publication |
US20060167676A1 (en) | 2005-01-26 | 2006-07-27 | Research In Motion Limited | Method and apparatus for correction of spelling errors in text composition |
US7508373B2 (en) | 2005-01-28 | 2009-03-24 | Microsoft Corporation | Form factor and input method for language input |
US8243891B2 (en) | 2005-01-28 | 2012-08-14 | Value-Added Communications, Inc. | Voice message exchange |
EP1693784A3 (en) | 2005-01-28 | 2012-04-04 | IDMS Software Inc. | Handwritten word recognition based on geometric decomposition |
US20060174207A1 (en) | 2005-01-31 | 2006-08-03 | Sharp Laboratories Of America, Inc. | Systems and methods for implementing a user interface for multiple simultaneous instant messaging, conference and chat room sessions |
US8200700B2 (en) | 2005-02-01 | 2012-06-12 | Newsilike Media Group, Inc | Systems and methods for use of structured and unstructured distributed data |
US8045953B2 (en) | 2005-02-03 | 2011-10-25 | Research In Motion Limited | Method and apparatus for the autoselection of an emergency number in a mobile station |
GB0502259D0 (en) | 2005-02-03 | 2005-03-09 | British Telecomm | Document searching tool and method |
US7949533B2 (en) | 2005-02-04 | 2011-05-24 | Vocollect, Inc. | Methods and systems for assessing and improving the performance of a speech recognition system |
US8200495B2 (en) | 2005-02-04 | 2012-06-12 | Vocollect, Inc. | Methods and systems for considering information about an expected response when performing speech recognition |
US20060187073A1 (en) | 2005-02-18 | 2006-08-24 | Chao-Hua Lin | Energy status indicator in a portable device |
EP1693830B1 (en) | 2005-02-21 | 2017-12-20 | Harman Becker Automotive Systems GmbH | Voice-controlled data system |
EP1693829B1 (en) | 2005-02-21 | 2018-12-05 | Harman Becker Automotive Systems GmbH | Voice-controlled data system |
US8041557B2 (en) | 2005-02-24 | 2011-10-18 | Fuji Xerox Co., Ltd. | Word translation device, translation method, and computer readable medium |
US7634413B1 (en) | 2005-02-25 | 2009-12-15 | Apple Inc. | Bitrate constrained variable bitrate audio encoding |
US7788087B2 (en) | 2005-03-01 | 2010-08-31 | Microsoft Corporation | System for processing sentiment-bearing text |
US20060212415A1 (en) | 2005-03-01 | 2006-09-21 | Alejandro Backer | Query-less searching |
US20060197755A1 (en) | 2005-03-02 | 2006-09-07 | Bawany Muhammad A | Computer stylus cable system and method |
WO2005057425A2 (en) | 2005-03-07 | 2005-06-23 | Linguatec Sprachtechnologien Gmbh | Hybrid machine translation system |
KR100679044B1 (en) | 2005-03-07 | 2007-02-06 | 삼성전자주식회사 | Method and apparatus for speech recognition |
US7676026B1 (en) | 2005-03-08 | 2010-03-09 | Baxtech Asia Pte Ltd | Desktop telephony system |
US7788248B2 (en) | 2005-03-08 | 2010-08-31 | Apple Inc. | Immediate search feedback |
JP4404211B2 (en) | 2005-03-14 | 2010-01-27 | 富士ゼロックス株式会社 | Multilingual translation memory, translation method and translation program |
US7706510B2 (en) | 2005-03-16 | 2010-04-27 | Research In Motion | System and method for personalized text-to-voice synthesis |
US20060218506A1 (en) | 2005-03-23 | 2006-09-28 | Edward Srenger | Adaptive menu for a user interface |
US7565380B1 (en) | 2005-03-24 | 2009-07-21 | Netlogic Microsystems, Inc. | Memory optimized pattern searching |
US7925525B2 (en) | 2005-03-25 | 2011-04-12 | Microsoft Corporation | Smart reminders |
US20060253210A1 (en) | 2005-03-26 | 2006-11-09 | Outland Research, Llc | Intelligent Pace-Setting Portable Media Player |
EP1865404A4 (en) | 2005-03-28 | 2012-09-05 | Panasonic Corp | User interface system |
WO2006105105A2 (en) | 2005-03-28 | 2006-10-05 | Sound Id | Personal sound system |
US7529678B2 (en) | 2005-03-30 | 2009-05-05 | International Business Machines Corporation | Using a spoken utterance for disambiguation of spelling inputs into a speech recognition system |
US7721301B2 (en) | 2005-03-31 | 2010-05-18 | Microsoft Corporation | Processing files from a mobile device using voice commands |
US7664558B2 (en) | 2005-04-01 | 2010-02-16 | Apple Inc. | Efficient techniques for modifying audio playback rates |
US7716052B2 (en) | 2005-04-07 | 2010-05-11 | Nuance Communications, Inc. | Method, apparatus and computer program providing a multi-speaker database for concatenative text-to-speech synthesis |
US20080120342A1 (en) | 2005-04-07 | 2008-05-22 | Iofy Corporation | System and Method for Providing Data to be Used in a Presentation on a Device |
GB0507036D0 (en) | 2005-04-07 | 2005-05-11 | Ibm | Method and system for language identification |
US20080141180A1 (en) | 2005-04-07 | 2008-06-12 | Iofy Corporation | Apparatus and Method for Utilizing an Information Unit to Provide Navigation Features on a Device |
JP2008537225A (en) | 2005-04-11 | 2008-09-11 | テキストディガー,インコーポレイテッド | Search system and method for queries |
US7746989B2 (en) | 2005-04-12 | 2010-06-29 | Onset Technology, Ltd. | System and method for recording and attaching an audio file to an electronic message generated by a portable client device |
US20080195601A1 (en) | 2005-04-14 | 2008-08-14 | The Regents Of The University Of California | Method For Information Retrieval |
US7516123B2 (en) | 2005-04-14 | 2009-04-07 | International Business Machines Corporation | Page rank for the semantic web query |
US7471284B2 (en) | 2005-04-15 | 2008-12-30 | Microsoft Corporation | Tactile scroll bar with illuminated document position indicator |
US7627481B1 (en) | 2005-04-19 | 2009-12-01 | Apple Inc. | Adapting masking thresholds for encoding a low frequency transient signal in audio data |
US20060239419A1 (en) | 2005-04-20 | 2006-10-26 | Siemens Communications, Inc. | Selective and dynamic voicemail |
US7584093B2 (en) | 2005-04-25 | 2009-09-01 | Microsoft Corporation | Method and system for generating spelling suggestions |
US20060240866A1 (en) | 2005-04-25 | 2006-10-26 | Texas Instruments Incorporated | Method and system for controlling a portable communication device based on its orientation |
US20060242190A1 (en) | 2005-04-26 | 2006-10-26 | Content Analyst Company, Llc | Latent semantic taxonomy generation |
US20060288024A1 (en) | 2005-04-28 | 2006-12-21 | Freescale Semiconductor Incorporated | Compressed representations of tries |
US7292579B2 (en) | 2005-04-29 | 2007-11-06 | Scenera Technologies, Llc | Processing operations associated with resources on a local network |
US7684990B2 (en) | 2005-04-29 | 2010-03-23 | Nuance Communications, Inc. | Method and apparatus for multiple value confirmation and correction in spoken dialog systems |
US20060246955A1 (en) | 2005-05-02 | 2006-11-02 | Mikko Nirhamo | Mobile communication device and method therefor |
DE602005022562D1 (en) | 2005-05-03 | 2010-09-09 | Oticon As | System and method for sharing network resources between hearing aids |
US8036878B2 (en) | 2005-05-18 | 2011-10-11 | Never Wall Treuhand GmbH | Device incorporating improved text input mechanism |
US7686215B2 (en) | 2005-05-21 | 2010-03-30 | Apple Inc. | Techniques and systems for supporting podcasting |
US7886233B2 (en) | 2005-05-23 | 2011-02-08 | Nokia Corporation | Electronic text input involving word completion functionality for predicting word candidates for partial word inputs |
FR2886445A1 (en) | 2005-05-30 | 2006-12-01 | France Telecom | METHOD, DEVICE AND COMPUTER PROGRAM FOR SPEECH RECOGNITION |
WO2006129967A1 (en) | 2005-05-30 | 2006-12-07 | Daumsoft, Inc. | Conversation system and method using conversational agent |
US8041570B2 (en) | 2005-05-31 | 2011-10-18 | Robert Bosch Corporation | Dialogue management using scripts |
US7580576B2 (en) | 2005-06-02 | 2009-08-25 | Microsoft Corporation | Stroke localization and binding to electronic document |
US8300841B2 (en) | 2005-06-03 | 2012-10-30 | Apple Inc. | Techniques for presenting sound effects on a portable media player |
JP4640591B2 (en) | 2005-06-09 | 2011-03-02 | 富士ゼロックス株式会社 | Document search device |
US20060282264A1 (en) | 2005-06-09 | 2006-12-14 | Bellsouth Intellectual Property Corporation | Methods and systems for providing noise filtering using speech recognition |
WO2006133571A1 (en) | 2005-06-17 | 2006-12-21 | National Research Council Of Canada | Means and method for adapted language translation |
JP2007004633A (en) | 2005-06-24 | 2007-01-11 | Microsoft Corp | Language model generation device and language processing device using language model generated by the same |
JP4064413B2 (en) | 2005-06-27 | 2008-03-19 | 株式会社東芝 | Communication support device, communication support method, and communication support program |
US8024195B2 (en) | 2005-06-27 | 2011-09-20 | Sensory, Inc. | Systems and methods of performing speech recognition using historical information |
US7538685B1 (en) | 2005-06-28 | 2009-05-26 | Avaya Inc. | Use of auditory feedback and audio queues in the realization of a personal virtual assistant |
US8396456B2 (en) | 2005-06-28 | 2013-03-12 | Avaya Integrated Cabinet Solutions Inc. | Visual voicemail management |
US8396715B2 (en) | 2005-06-28 | 2013-03-12 | Microsoft Corporation | Confidence threshold tuning |
GB0513225D0 (en) | 2005-06-29 | 2005-08-03 | Ibm | Method and system for building and contracting a linguistic dictionary |
US7627703B2 (en) | 2005-06-29 | 2009-12-01 | Microsoft Corporation | Input device with audio capabilities |
US20070004451A1 (en) | 2005-06-30 | 2007-01-04 | C Anderson Eric | Controlling functions of a handheld multifunction device |
US7542967B2 (en) | 2005-06-30 | 2009-06-02 | Microsoft Corporation | Searching an index of media content |
US7925995B2 (en) | 2005-06-30 | 2011-04-12 | Microsoft Corporation | Integration of location logs, GPS signals, and spatial resources for identifying user activities, goals, and context |
US7433869B2 (en) | 2005-07-01 | 2008-10-07 | Ebrary, Inc. | Method and apparatus for document clustering and document sketching |
US7826945B2 (en) | 2005-07-01 | 2010-11-02 | You Zhang | Automobile speech-recognition interface |
US20070021956A1 (en) | 2005-07-19 | 2007-01-25 | Yan Qu | Method and apparatus for generating ideographic representations of letter based names |
US7613264B2 (en) | 2005-07-26 | 2009-11-03 | Lsi Corporation | Flexible sampling-rate encoder |
US20090048821A1 (en) | 2005-07-27 | 2009-02-19 | Yahoo! Inc. | Mobile language interpreter with text to speech |
US20070027732A1 (en) | 2005-07-28 | 2007-02-01 | Accu-Spatial, Llc | Context-sensitive, location-dependent information delivery at a construction site |
US7890520B2 (en) | 2005-08-01 | 2011-02-15 | Sony Corporation | Processing apparatus and associated methodology for content table generation and transfer |
US8160614B2 (en) | 2005-08-05 | 2012-04-17 | Targus Information Corporation | Automated concierge system and method |
US7640160B2 (en) | 2005-08-05 | 2009-12-29 | Voicebox Technologies, Inc. | Systems and methods for responding to natural language speech utterance |
US20070073726A1 (en) | 2005-08-05 | 2007-03-29 | Klein Eric N Jr | System and method for queuing purchase transactions |
US7362738B2 (en) | 2005-08-09 | 2008-04-22 | Deere & Company | Method and system for delivering information to a user |
US7620549B2 (en) | 2005-08-10 | 2009-11-17 | Voicebox Technologies, Inc. | System and method of supporting adaptive misrecognition in conversational speech |
US20070038609A1 (en) | 2005-08-11 | 2007-02-15 | William Wu | System and method of query paraphrasing |
US20070041361A1 (en) | 2005-08-15 | 2007-02-22 | Nokia Corporation | Apparatus and methods for implementing an in-call voice user interface using context information |
US8126716B2 (en) | 2005-08-19 | 2012-02-28 | Nuance Communications, Inc. | Method and system for collecting audio prompts in a dynamically generated voice application |
US20090076821A1 (en) | 2005-08-19 | 2009-03-19 | Gracenote, Inc. | Method and apparatus to control operation of a playback device |
WO2007025119A2 (en) | 2005-08-26 | 2007-03-01 | Veveo, Inc. | User interface for visual cooperation between text input and display device |
US20070050184A1 (en) | 2005-08-26 | 2007-03-01 | Drucker David M | Personal audio content delivery apparatus and method |
US7668825B2 (en) | 2005-08-26 | 2010-02-23 | Convera Corporation | Search system and method |
US7949529B2 (en) | 2005-08-29 | 2011-05-24 | Voicebox Technologies, Inc. | Mobile systems and methods of supporting natural language human-machine interactions |
KR100739726B1 (en) | 2005-08-30 | 2007-07-13 | 삼성전자주식회사 | Method and system for name matching and computer readable medium recording the method |
US8265939B2 (en) | 2005-08-31 | 2012-09-11 | Nuance Communications, Inc. | Hierarchical methods and apparatus for extracting user intent from spoken utterances |
US7634409B2 (en) | 2005-08-31 | 2009-12-15 | Voicebox Technologies, Inc. | Dynamic speech sharpening |
EP1919771A4 (en) | 2005-08-31 | 2010-06-09 | Intuview Ltd | Decision-support expert system and methods for real-time exploitation of documents in non-english languages |
US7443316B2 (en) | 2005-09-01 | 2008-10-28 | Motorola, Inc. | Entering a character into an electronic device |
WO2007028128A2 (en) | 2005-09-01 | 2007-03-08 | Vishal Dhawan | Voice application network platform |
US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US20070055514A1 (en) | 2005-09-08 | 2007-03-08 | Beattie Valerie L | Intelligent tutoring feedback |
US7694231B2 (en) | 2006-01-05 | 2010-04-06 | Apple Inc. | Keyboards for portable electronic devices |
US7873356B2 (en) | 2005-09-16 | 2011-01-18 | Microsoft Corporation | Search interface for mobile devices |
US7378963B1 (en) | 2005-09-20 | 2008-05-27 | Begault Durand R | Reconfigurable auditory-visual display |
JP4542974B2 (en) | 2005-09-27 | 2010-09-15 | 株式会社東芝 | Speech recognition apparatus, speech recognition method, and speech recognition program |
US7280958B2 (en) | 2005-09-30 | 2007-10-09 | Motorola, Inc. | Method and system for suppressing receiver audio regeneration |
JP4908094B2 (en) | 2005-09-30 | 2012-04-04 | 株式会社リコー | Information processing system, information processing method, and information processing program |
US7577522B2 (en) | 2005-12-05 | 2009-08-18 | Outland Research, Llc | Spatially associated personal reminder system and method |
US7930168B2 (en) | 2005-10-04 | 2011-04-19 | Robert Bosch Gmbh | Natural language processing of disfluent sentences |
CN100483399C (en) | 2005-10-09 | 2009-04-29 | 株式会社东芝 | Training transliteration model, segmentation statistic model and automatic transliterating method and device |
US20070083467A1 (en) | 2005-10-10 | 2007-04-12 | Apple Computer, Inc. | Partial encryption techniques for media data |
US9454747B2 (en) | 2005-10-11 | 2016-09-27 | Aol Inc. | Ordering of conversations based on monitored recipient user interaction with corresponding electronic messages |
US8620667B2 (en) | 2005-10-17 | 2013-12-31 | Microsoft Corporation | Flexible speech-activated command and control |
US7707032B2 (en) | 2005-10-20 | 2010-04-27 | National Cheng Kung University | Method and system for matching speech data |
US20070093277A1 (en) | 2005-10-21 | 2007-04-26 | Acco Brands Corporation Usa Llc | Updating a static image from an accessory to an electronic device to provide user feedback during interaction with the accessory |
KR101386708B1 (en) | 2005-10-21 | 2014-04-18 | 에스에프엑스 테크놀로지스 리미티드 | An electronic device configured to radiate sound and a method therefor |
US8229745B2 (en) | 2005-10-21 | 2012-07-24 | Nuance Communications, Inc. | Creating a mixed-initiative grammar from directed dialog grammars |
US7894580B2 (en) | 2005-10-26 | 2011-02-22 | Research In Motion Limited | Methods and apparatus for reliable voicemail message deletion alerts at mobile communication devices |
US7941316B2 (en) | 2005-10-28 | 2011-05-10 | Microsoft Corporation | Combined speech and alternate input modality to a mobile device |
US7729481B2 (en) | 2005-10-28 | 2010-06-01 | Yahoo! Inc. | User interface for integrating diverse methods of communication |
US7778632B2 (en) | 2005-10-28 | 2010-08-17 | Microsoft Corporation | Multi-modal device capable of automated actions |
US20070100883A1 (en) | 2005-10-31 | 2007-05-03 | Rose Daniel E | Methods for providing audio feedback during the navigation of collections of information |
US7918788B2 (en) | 2005-10-31 | 2011-04-05 | Ethicon, Inc. | Apparatus and method for providing flow to endoscope channels |
US20070098195A1 (en) | 2005-10-31 | 2007-05-03 | Holmes David W | Wireless hearing aid system and method |
US7831428B2 (en) | 2005-11-09 | 2010-11-09 | Microsoft Corporation | Speech index pruning |
US20070106674A1 (en) | 2005-11-10 | 2007-05-10 | Purusharth Agrawal | Field sales process facilitation systems and methods |
US20070106513A1 (en) | 2005-11-10 | 2007-05-10 | Boillot Marc A | Method for facilitating text to speech synthesis using a differential vocoder |
US20070112572A1 (en) | 2005-11-15 | 2007-05-17 | Fail Keith W | Method and apparatus for assisting vision impaired individuals with selecting items from a list |
US7676463B2 (en) | 2005-11-15 | 2010-03-09 | Kroll Ontrack, Inc. | Information exploration systems and method |
US8326629B2 (en) | 2005-11-22 | 2012-12-04 | Nuance Communications, Inc. | Dynamically changing voice attributes during speech synthesis based upon parameter differentiation for dialog contexts |
US7644054B2 (en) | 2005-11-23 | 2010-01-05 | Veveo, Inc. | System and method for finding desired results by incremental search using an ambiguous keypad with the input containing orthographic and typographic errors |
US20070185926A1 (en) | 2005-11-28 | 2007-08-09 | Anand Prahlad | Systems and methods for classifying and transferring information in a storage network |
US8261189B2 (en) | 2005-11-30 | 2012-09-04 | International Business Machines Corporation | Database monitor replay |
DE102005057406A1 (en) | 2005-11-30 | 2007-06-06 | Valenzuela, Carlos Alberto, Dr.-Ing. | Method for recording a sound source with time-variable directional characteristics and for playback and system for carrying out the method |
TWI298844B (en) | 2005-11-30 | 2008-07-11 | Delta Electronics Inc | User-defines speech-controlled shortcut module and method |
KR101176540B1 (en) | 2005-12-02 | 2012-08-24 | 삼성전자주식회사 | Poly-Si Thin Film Transistor and organic light emitting display adopting the same |
US8498624B2 (en) | 2005-12-05 | 2013-07-30 | At&T Intellectual Property I, L.P. | Method and apparatus for managing voicemail messages |
KR100810500B1 (en) | 2005-12-08 | 2008-03-07 | 한국전자통신연구원 | Method for enhancing usability in a spoken dialog system |
US20070136778A1 (en) | 2005-12-09 | 2007-06-14 | Ari Birger | Controller and control method for media retrieval, routing and playback |
US7800596B2 (en) | 2005-12-14 | 2010-09-21 | Research In Motion Limited | Handheld electronic device having virtual navigational input device, and associated method |
GB2433403B (en) | 2005-12-16 | 2009-06-24 | Emil Ltd | A text editing apparatus and method |
US20070211071A1 (en) | 2005-12-20 | 2007-09-13 | Benjamin Slotznick | Method and apparatus for interacting with a visually displayed document on a screen reader |
DE102005061365A1 (en) | 2005-12-21 | 2007-06-28 | Siemens Ag | Background applications e.g. home banking system, controlling method for use over e.g. user interface, involves associating transactions and transaction parameters over universal dialog specification, and universally operating applications |
US8234494B1 (en) | 2005-12-21 | 2012-07-31 | At&T Intellectual Property Ii, L.P. | Speaker-verification digital signatures |
US7996228B2 (en) | 2005-12-22 | 2011-08-09 | Microsoft Corporation | Voice initiated network operations |
US7599918B2 (en) | 2005-12-29 | 2009-10-06 | Microsoft Corporation | Dynamic search with implicit user intention mining |
US7685144B1 (en) | 2005-12-29 | 2010-03-23 | Google Inc. | Dynamically autocompleting a data entry |
US7890330B2 (en) | 2005-12-30 | 2011-02-15 | Alpine Electronics Inc. | Voice recording tool for creating database used in text to speech synthesis system |
FI20055717A0 (en) | 2005-12-30 | 2005-12-30 | Nokia Corp | Code conversion method in a mobile communication system |
US8180779B2 (en) | 2005-12-30 | 2012-05-15 | Sap Ag | System and method for using external references to validate a data object's classification / consolidation |
TWI302265B (en) | 2005-12-30 | 2008-10-21 | High Tech Comp Corp | Moving determination apparatus |
KR20070071675A (en) | 2005-12-30 | 2007-07-04 | 주식회사 팬택 | Method for performing multiple language tts process in mobile terminal |
US7673238B2 (en) | 2006-01-05 | 2010-03-02 | Apple Inc. | Portable media device with video acceleration capabilities |
US7684991B2 (en) | 2006-01-05 | 2010-03-23 | Alpine Electronics, Inc. | Digital audio file search method and apparatus using text-to-speech processing |
JP2007183864A (en) | 2006-01-10 | 2007-07-19 | Fujitsu Ltd | File retrieval method and system therefor |
US8006180B2 (en) | 2006-01-10 | 2011-08-23 | Microsoft Corporation | Spell checking in network browser based applications |
EP1977312A2 (en) | 2006-01-16 | 2008-10-08 | Zlango Ltd. | Iconic communication |
KR100673849B1 (en) | 2006-01-18 | 2007-01-24 | 주식회사 비에스이 | Condenser microphone for inserting in mainboard and portable communication device including the same |
JP4241736B2 (en) | 2006-01-19 | 2009-03-18 | 株式会社東芝 | Speech processing apparatus and method |
FR2896603B1 (en) | 2006-01-20 | 2008-05-02 | Thales Sa | METHOD AND DEVICE FOR EXTRACTING INFORMATION AND TRANSFORMING THEM INTO QUALITATIVE DATA OF A TEXTUAL DOCUMENT |
US20070174396A1 (en) | 2006-01-24 | 2007-07-26 | Cisco Technology, Inc. | Email text-to-speech conversion in sender's voice |
US7934169B2 (en) | 2006-01-25 | 2011-04-26 | Nokia Corporation | Graphical user interface, electronic device, method and computer program that uses sliders for user input |
US20070174188A1 (en) | 2006-01-25 | 2007-07-26 | Fish Robert D | Electronic marketplace that facilitates transactions between consolidated buyers and/or sellers |
US8060357B2 (en) | 2006-01-27 | 2011-11-15 | Xerox Corporation | Linguistic user interface |
US7929805B2 (en) | 2006-01-31 | 2011-04-19 | The Penn State Research Foundation | Image-based CAPTCHA generation system |
IL174107A0 (en) | 2006-02-01 | 2006-08-01 | Grois Dan | Method and system for advertising by means of a search engine over a data network |
US7818291B2 (en) | 2006-02-03 | 2010-10-19 | The General Electric Company | Data object access system and method using dedicated task object |
US8595041B2 (en) | 2006-02-07 | 2013-11-26 | Sap Ag | Task responsibility system |
EP1818837B1 (en) | 2006-02-10 | 2009-08-19 | Harman Becker Automotive Systems GmbH | System for a speech-driven selection of an audio file and method therefor |
US7836437B2 (en) | 2006-02-10 | 2010-11-16 | Microsoft Corporation | Semantic annotations for virtual objects |
US20070192293A1 (en) | 2006-02-13 | 2007-08-16 | Bing Swen | Method for presenting search results |
US8209063B2 (en) | 2006-02-13 | 2012-06-26 | Research In Motion Limited | Navigation tool with audible feedback on a handheld communication device |
US20070192027A1 (en) | 2006-02-13 | 2007-08-16 | Research In Motion Limited | Navigation tool with audible feedback on a wireless handheld communication device |
US8209181B2 (en) | 2006-02-14 | 2012-06-26 | Microsoft Corporation | Personal audio-video recorder for live meetings |
US8036894B2 (en) | 2006-02-16 | 2011-10-11 | Apple Inc. | Multi-unit approach to text-to-speech synthesis |
US7541940B2 (en) | 2006-02-16 | 2009-06-02 | International Business Machines Corporation | Proximity-based task alerts |
US20070198566A1 (en) | 2006-02-23 | 2007-08-23 | Matyas Sustik | Method and apparatus for efficient storage of hierarchical signal names |
US20070208726A1 (en) | 2006-03-01 | 2007-09-06 | Oracle International Corporation | Enhancing search results using ontologies |
US7599861B2 (en) | 2006-03-02 | 2009-10-06 | Convergys Customer Management Group, Inc. | System and method for closed loop decisionmaking in an automated care system |
KR100764174B1 (en) | 2006-03-03 | 2007-10-08 | 삼성전자주식회사 | Apparatus for providing voice dialogue service and method for operating the apparatus |
US7983910B2 (en) | 2006-03-03 | 2011-07-19 | International Business Machines Corporation | Communicating across voice and text channels with emotion preservation |
US8532678B2 (en) | 2006-03-08 | 2013-09-10 | Tomtom International B.V. | Portable GPS navigation device |
US9361299B2 (en) | 2006-03-09 | 2016-06-07 | International Business Machines Corporation | RSS content administration for rendering RSS content on a digital audio player |
US7752152B2 (en) | 2006-03-17 | 2010-07-06 | Microsoft Corporation | Using predictive user models for language modeling on a personal device with user behavior models based on statistical modeling |
ATE414975T1 (en) | 2006-03-17 | 2008-12-15 | Svox Ag | Text-to-speech synthesis |
US8185376B2 (en) | 2006-03-20 | 2012-05-22 | Microsoft Corporation | Identifying language origin of words |
DE102006037156A1 (en) | 2006-03-22 | 2007-09-27 | Volkswagen Ag | Interactive operating device and method for operating the interactive operating device |
US7720681B2 (en) | 2006-03-23 | 2010-05-18 | Microsoft Corporation | Digital voice profiles |
JP2007257336A (en) | 2006-03-23 | 2007-10-04 | Sony Corp | Information processor, information processing method and program thereof |
JP4734155B2 (en) | 2006-03-24 | 2011-07-27 | 株式会社東芝 | Speech recognition apparatus, speech recognition method, and speech recognition program |
US7930183B2 (en) | 2006-03-29 | 2011-04-19 | Microsoft Corporation | Automatic identification of dialog timing problems for an interactive speech dialog application using speech log data indicative of cases of barge-in and timing problems |
US7283072B1 (en) | 2006-03-30 | 2007-10-16 | International Business Machines Corporation | Methods of creating a dictionary for data compression |
US8244545B2 (en) | 2006-03-30 | 2012-08-14 | Microsoft Corporation | Dialog repair based on discrepancies between user model predictions and speech recognition results |
WO2007114226A1 (en) | 2006-03-31 | 2007-10-11 | Pioneer Corporation | Voice input support device, method thereof, program thereof, recording medium containing the program, and navigation device |
US7756708B2 (en) * | 2006-04-03 | 2010-07-13 | Google Inc. | Automatic language model update |
US20070233490A1 (en) | 2006-04-03 | 2007-10-04 | Texas Instruments, Incorporated | System and method for text-to-phoneme mapping with prior knowledge |
CN101449538A (en) | 2006-04-04 | 2009-06-03 | 约翰逊控制技术公司 | Text to grammar enhancements for media files |
US7870142B2 (en) | 2006-04-04 | 2011-01-11 | Johnson Controls Technology Company | Text to grammar enhancements for media files |
US7797629B2 (en) | 2006-04-05 | 2010-09-14 | Research In Motion Limited | Handheld electronic device and method for performing optimized spell checking during text entry by providing a sequentially ordered series of spell-check algorithms |
US7693717B2 (en) | 2006-04-12 | 2010-04-06 | Custom Speech Usa, Inc. | Session file modification with annotation using speech recognition or text to speech |
EP1845699B1 (en) | 2006-04-13 | 2009-11-11 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio signal decorrelator |
US7707027B2 (en) | 2006-04-13 | 2010-04-27 | Nuance Communications, Inc. | Identification and rejection of meaningless input during natural language classification |
US8046363B2 (en) | 2006-04-13 | 2011-10-25 | Lg Electronics Inc. | System and method for clustering documents |
US7475063B2 (en) | 2006-04-19 | 2009-01-06 | Google Inc. | Augmenting queries with synonyms selected using language statistics |
WO2007127695A2 (en) | 2006-04-25 | 2007-11-08 | Elmo Weber Frank | Prefernce based automatic media summarization |
US8214213B1 (en) | 2006-04-27 | 2012-07-03 | At&T Intellectual Property Ii, L.P. | Speech recognition based on pronunciation modeling |
US7676699B2 (en) | 2006-04-28 | 2010-03-09 | Microsoft Corporation | Event trace conditional logging |
US20070260595A1 (en) | 2006-05-02 | 2007-11-08 | Microsoft Corporation | Fuzzy string matching using tree data structure |
US20070260460A1 (en) | 2006-05-05 | 2007-11-08 | Hyatt Edward C | Method and system for announcing audio and video content to a user of a mobile radio terminal |
US7831786B2 (en) | 2006-05-08 | 2010-11-09 | Research In Motion Limited | Sharing memory resources of wireless portable electronic devices |
US20070265831A1 (en) | 2006-05-09 | 2007-11-15 | Itai Dinur | System-Level Correction Service |
BRPI0711317B8 (en) | 2006-05-10 | 2021-06-22 | Koninklijke Philips Nv | method for providing audible information from a defibrillator; and, automated external defibrillator |
US20070274468A1 (en) | 2006-05-11 | 2007-11-29 | Lucent Technologies, Inc. | Retrieval of voicemail |
US20070276714A1 (en) | 2006-05-15 | 2007-11-29 | Sap Ag | Business process map management |
EP1858005A1 (en) | 2006-05-19 | 2007-11-21 | Texthelp Systems Limited | Streaming speech with synchronized highlighting generated by a server |
US7779353B2 (en) | 2006-05-19 | 2010-08-17 | Microsoft Corporation | Error checking web documents |
US8032355B2 (en) | 2006-05-22 | 2011-10-04 | University Of Southern California | Socially cognizant translation by detecting and transforming elements of politeness and respect |
US20070276651A1 (en) | 2006-05-23 | 2007-11-29 | Motorola, Inc. | Grammar adaptation through cooperative client and server based speech recognition |
US20070276810A1 (en) | 2006-05-23 | 2007-11-29 | Joshua Rosen | Search Engine for Presenting User-Editable Search Listings and Ranking Search Results Based on the Same |
US7831423B2 (en) | 2006-05-25 | 2010-11-09 | Multimodal Technologies, Inc. | Replacing text representing a concept with an alternate written form of the concept |
US8423347B2 (en) | 2006-06-06 | 2013-04-16 | Microsoft Corporation | Natural language personal information management |
US7523108B2 (en) | 2006-06-07 | 2009-04-21 | Platformation, Inc. | Methods and apparatus for searching with awareness of geography and languages |
US7483894B2 (en) | 2006-06-07 | 2009-01-27 | Platformation Technologies, Inc | Methods and apparatus for entity search |
US20100257160A1 (en) | 2006-06-07 | 2010-10-07 | Yu Cao | Methods & apparatus for searching with awareness of different types of information |
US7853577B2 (en) | 2006-06-09 | 2010-12-14 | Ebay Inc. | Shopping context engine |
KR20060073574A (en) | 2006-06-09 | 2006-06-28 | 복세규 | The mobilephone user's schedule management and supplementary service applied system of speech recognition |
US7676371B2 (en) | 2006-06-13 | 2010-03-09 | Nuance Communications, Inc. | Oral modification of an ASR lexicon of an ASR engine |
US20070294263A1 (en) | 2006-06-16 | 2007-12-20 | Ericsson, Inc. | Associating independent multimedia sources into a conference call |
US20070291108A1 (en) | 2006-06-16 | 2007-12-20 | Ericsson, Inc. | Conference layout control and control protocol |
KR100776800B1 (en) | 2006-06-16 | 2007-11-19 | 한국전자통신연구원 | Method and system (apparatus) for user specific service using intelligent gadget |
US7548895B2 (en) | 2006-06-30 | 2009-06-16 | Microsoft Corporation | Communication-prompted user assistance |
US8050500B1 (en) | 2006-07-06 | 2011-11-01 | Senapps, LLC | Recognition method and system |
US20080031475A1 (en) | 2006-07-08 | 2008-02-07 | Personics Holdings Inc. | Personal audio assistant device and method |
US20080016575A1 (en) | 2006-07-14 | 2008-01-17 | Motorola, Inc. | Method and system of auto message deletion using expiration |
TWI312103B (en) | 2006-07-17 | 2009-07-11 | Asia Optical Co Inc | Image pickup systems and methods |
US20080013751A1 (en) | 2006-07-17 | 2008-01-17 | Per Hiselius | Volume dependent audio frequency gain profile |
US20080022208A1 (en) | 2006-07-18 | 2008-01-24 | Creative Technology Ltd | System and method for personalizing the user interface of audio rendering devices |
JP2008026381A (en) | 2006-07-18 | 2008-02-07 | Konica Minolta Business Technologies Inc | Image forming device |
US20080042970A1 (en) | 2006-07-24 | 2008-02-21 | Yih-Shiuan Liang | Associating a region on a surface with a sound or with another region |
US20080034044A1 (en) | 2006-08-04 | 2008-02-07 | International Business Machines Corporation | Electronic mail reader capable of adapting gender and emotions of sender |
US20080046948A1 (en) | 2006-08-07 | 2008-02-21 | Apple Computer, Inc. | Creation, management and delivery of personalized media items |
US20080040339A1 (en) | 2006-08-07 | 2008-02-14 | Microsoft Corporation | Learning question paraphrases from log data |
KR20080015567A (en) | 2006-08-16 | 2008-02-20 | 삼성전자주식회사 | Voice-enabled file information announcement system and method for portable device |
WO2008024797A2 (en) | 2006-08-21 | 2008-02-28 | Pinger, Inc. | Graphical user interface for managing voice messages |
DE102006039126A1 (en) | 2006-08-21 | 2008-03-06 | Robert Bosch Gmbh | Method for speech recognition and speech reproduction |
US20080059200A1 (en) | 2006-08-22 | 2008-03-06 | Accenture Global Services Gmbh | Multi-Lingual Telephonic Service |
US20080059190A1 (en) | 2006-08-22 | 2008-03-06 | Microsoft Corporation | Speech unit selection using HMM acoustic models |
US8239480B2 (en) | 2006-08-31 | 2012-08-07 | Sony Ericsson Mobile Communications Ab | Methods of searching using captured portions of digital audio content and additional information separate therefrom and related systems and computer program products |
US9552349B2 (en) | 2006-08-31 | 2017-01-24 | International Business Machines Corporation | Methods and apparatus for performing spelling corrections using one or more variant hash tables |
US8402499B2 (en) | 2006-08-31 | 2013-03-19 | Accenture Global Services Gmbh | Voicemail interface system and method |
US20080077393A1 (en) | 2006-09-01 | 2008-03-27 | Yuqing Gao | Virtual keyboard adaptation for multilingual input |
US7689408B2 (en) | 2006-09-01 | 2010-03-30 | Microsoft Corporation | Identifying language of origin for words using estimates of normalized appearance frequency |
US7683886B2 (en) | 2006-09-05 | 2010-03-23 | Research In Motion Limited | Disambiguated text message review function |
US8170790B2 (en) | 2006-09-05 | 2012-05-01 | Garmin Switzerland Gmbh | Apparatus for switching navigation device mode |
US8564544B2 (en) | 2006-09-06 | 2013-10-22 | Apple Inc. | Touch screen device, method, and graphical user interface for customizing display of content category icons |
US7771320B2 (en) | 2006-09-07 | 2010-08-10 | Nike, Inc. | Athletic performance sensing and/or tracking systems and methods |
TWI322610B (en) | 2006-09-08 | 2010-03-21 | Htc Corp | Handheld electronic device |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
US8036766B2 (en) | 2006-09-11 | 2011-10-11 | Apple Inc. | Intelligent audio mixing among media playback and at least one other non-playback application |
EP2082395A2 (en) | 2006-09-14 | 2009-07-29 | Google, Inc. | Integrating voice-enabled local search and contact lists |
US8027837B2 (en) | 2006-09-15 | 2011-09-27 | Apple Inc. | Using non-speech sounds during text-to-speech synthesis |
US20100004931A1 (en) | 2006-09-15 | 2010-01-07 | Bin Ma | Apparatus and method for speech utterance verification |
WO2008031625A2 (en) | 2006-09-15 | 2008-03-20 | Exbiblio B.V. | Capture and display of annotations in paper and electronic documents |
US7865282B2 (en) | 2006-09-22 | 2011-01-04 | General Motors Llc | Methods of managing communications for an in-vehicle telematics system |
JP4393494B2 (en) | 2006-09-22 | 2010-01-06 | 株式会社東芝 | Machine translation apparatus, machine translation method, and machine translation program |
US20080077384A1 (en) | 2006-09-22 | 2008-03-27 | International Business Machines Corporation | Dynamically translating a software application to a user selected target language that is not natively provided by the software application |
KR100813170B1 (en) | 2006-09-27 | 2008-03-17 | 삼성전자주식회사 | Method and system for semantic event indexing by analyzing user annotation of digital photos |
US8214208B2 (en) | 2006-09-28 | 2012-07-03 | Reqall, Inc. | Method and system for sharing portable voice profiles |
US7649454B2 (en) | 2006-09-28 | 2010-01-19 | Ektimisi Semiotics Holdings, Llc | System and method for providing a task reminder based on historical travel information |
US7528713B2 (en) | 2006-09-28 | 2009-05-05 | Ektimisi Semiotics Holdings, Llc | Apparatus and method for providing a task reminder based on travel history |
US7930197B2 (en) | 2006-09-28 | 2011-04-19 | Microsoft Corporation | Personal data mining |
US20080082338A1 (en) | 2006-09-29 | 2008-04-03 | O'neil Michael P | Systems and methods for secure voice identification and medical device interface |
US7831432B2 (en) | 2006-09-29 | 2010-11-09 | International Business Machines Corporation | Audio menus describing media contents of media players |
US20080082390A1 (en) | 2006-10-02 | 2008-04-03 | International Business Machines Corporation | Methods for Generating Auxiliary Data Operations for a Role Based Personalized Business User Workplace |
US7801721B2 (en) | 2006-10-02 | 2010-09-21 | Google Inc. | Displaying original text in a user interface with translated text |
EP1909263B1 (en) | 2006-10-02 | 2009-01-28 | Harman Becker Automotive Systems GmbH | Exploitation of language identification of media file data in speech dialog systems |
US20080091426A1 (en) | 2006-10-12 | 2008-04-17 | Rod Rempel | Adaptive context for automatic speech recognition systems |
US8041568B2 (en) | 2006-10-13 | 2011-10-18 | Google Inc. | Business listing search |
US7793228B2 (en) | 2006-10-13 | 2010-09-07 | Apple Inc. | Method, system, and graphical user interface for text entry with partial word display |
US8073681B2 (en) | 2006-10-16 | 2011-12-06 | Voicebox Technologies, Inc. | System and method for a cooperative conversational voice user interface |
WO2008050225A2 (en) | 2006-10-24 | 2008-05-02 | Edgetech America, Inc. | Method for spell-checking location-bound words within a document |
US20080109222A1 (en) | 2006-11-04 | 2008-05-08 | Edward Liu | Advertising using extracted context sensitive information and data of interest from voice/audio transmissions and recordings |
US7873517B2 (en) | 2006-11-09 | 2011-01-18 | Volkswagen Of America, Inc. | Motor vehicle with a speech interface |
US8718538B2 (en) | 2006-11-13 | 2014-05-06 | Joseph Harb | Real-time remote purchase-list capture system |
US20080114841A1 (en) | 2006-11-14 | 2008-05-15 | Lambert Daniel T | System and method for interfacing with event management software |
US7904298B2 (en) | 2006-11-17 | 2011-03-08 | Rao Ashwin P | Predictive speech-to-text input |
US8090194B2 (en) | 2006-11-21 | 2012-01-03 | Mantis Vision Ltd. | 3D geometric modeling and motion capture using both single and dual imaging |
US8010338B2 (en) | 2006-11-27 | 2011-08-30 | Sony Ericsson Mobile Communications Ab | Dynamic modification of a messaging language |
US8600760B2 (en) | 2006-11-28 | 2013-12-03 | General Motors Llc | Correcting substitution errors during automatic speech recognition by accepting a second best when first best is confusable |
US8055502B2 (en) | 2006-11-28 | 2011-11-08 | General Motors Llc | Voice dialing using a rejection reference |
WO2008067562A2 (en) | 2006-11-30 | 2008-06-05 | Rao Ashwin P | Multimodal speech recognition system |
DE602006005830D1 (en) | 2006-11-30 | 2009-04-30 | Harman Becker Automotive Sys | Interactive speech recognition system |
GB2457855B (en) | 2006-11-30 | 2011-01-12 | Nat Inst Of Advanced Ind Scien | Speech recognition system and speech recognition system program |
GB0623915D0 (en) | 2006-11-30 | 2007-01-10 | Ibm | Phonetic decoding and concatentive speech synthesis |
US8571862B2 (en) | 2006-11-30 | 2013-10-29 | Ashwin P. Rao | Multimodal interface for input of text |
US20080129520A1 (en) | 2006-12-01 | 2008-06-05 | Apple Computer, Inc. | Electronic device with enhanced audio feedback |
US8045808B2 (en) | 2006-12-04 | 2011-10-25 | Trend Micro Incorporated | Pure adversarial approach for identifying text content in images |
US8208624B2 (en) | 2006-12-05 | 2012-06-26 | Hewlett-Packard Development Company, L.P. | Hearing aid compatible mobile phone |
EP2095250B1 (en) | 2006-12-05 | 2014-11-12 | Nuance Communications, Inc. | Wireless server based text to speech email |
US20080140413A1 (en) | 2006-12-07 | 2008-06-12 | Jonathan Travis Millman | Synchronization of audio to reading |
US20080140652A1 (en) | 2006-12-07 | 2008-06-12 | Jonathan Travis Millman | Authoring tool |
US10185779B2 (en) | 2008-03-03 | 2019-01-22 | Oath Inc. | Mechanisms for content aggregation, syndication, sharing, and updating |
US7783644B1 (en) | 2006-12-13 | 2010-08-24 | Google Inc. | Query-independent entity importance in books |
EP2103178A1 (en) | 2006-12-13 | 2009-09-23 | Phonak AG | Method and system for hearing device fitting |
US20080146290A1 (en) | 2006-12-18 | 2008-06-19 | Motorola, Inc. | Changing a mute state of a voice call from a bluetooth headset |
US7552045B2 (en) | 2006-12-18 | 2009-06-23 | Nokia Corporation | Method, apparatus and computer program product for providing flexible text based language identification |
US20080147411A1 (en) | 2006-12-19 | 2008-06-19 | International Business Machines Corporation | Adaptation of a speech processing system from external input that is not directly related to sounds in an operational acoustic environment |
US8204182B2 (en) | 2006-12-19 | 2012-06-19 | Nuance Communications, Inc. | Dialect translator for a speech application environment extended for interactive text exchanges |
GB0625642D0 (en) | 2006-12-21 | 2007-01-31 | Symbian Software Ltd | Mobile sensor feedback |
US20080154600A1 (en) | 2006-12-21 | 2008-06-26 | Nokia Corporation | System, Method, Apparatus and Computer Program Product for Providing Dynamic Vocabulary Prediction for Speech Recognition |
US7991724B2 (en) | 2006-12-21 | 2011-08-02 | Support Machines Ltd. | Method and a computer program product for providing a response to a statement of a user |
ATE527652T1 (en) | 2006-12-21 | 2011-10-15 | Harman Becker Automotive Sys | Multi-level language recognition |
US20080154612A1 (en) | 2006-12-26 | 2008-06-26 | Voice Signal Technologies, Inc. | Local storage and use of search results for voice-enabled mobile communications devices |
US8019271B1 (en) | 2006-12-29 | 2011-09-13 | Nextel Communications, Inc. | Methods and systems for presenting information on mobile devices |
US8493330B2 (en) | 2007-01-03 | 2013-07-23 | Apple Inc. | Individual channel phase delay scheme |
DK2109934T3 (en) | 2007-01-04 | 2016-08-15 | Cvf Llc | Customized selection of audio profile in sound system |
US7957955B2 (en) | 2007-01-05 | 2011-06-07 | Apple Inc. | Method and system for providing word recommendations for text input |
US8074172B2 (en) | 2007-01-05 | 2011-12-06 | Apple Inc. | Method, system, and graphical user interface for providing word recommendations |
US7978176B2 (en) | 2007-01-07 | 2011-07-12 | Apple Inc. | Portrait-landscape rotation heuristics for a portable multifunction device |
US8553856B2 (en) | 2007-01-07 | 2013-10-08 | Apple Inc. | Voicemail systems and methods |
WO2008085742A2 (en) | 2007-01-07 | 2008-07-17 | Apple Inc. | Portable multifunction device, method and graphical user interface for interacting with user input elements in displayed content |
US20080165994A1 (en) | 2007-01-10 | 2008-07-10 | Magnadyne Corporation | Bluetooth enabled hearing aid |
KR100883657B1 (en) | 2007-01-26 | 2009-02-18 | 삼성전자주식회사 | Method and apparatus for searching a music using speech recognition |
JP2008185805A (en) | 2007-01-30 | 2008-08-14 | Internatl Business Mach Corp <Ibm> | Technology for creating high quality synthesis voice |
US20080189606A1 (en) | 2007-02-02 | 2008-08-07 | Michal Rybak | Handheld electronic device including predictive accent mechanism, and associated method |
US7818176B2 (en) | 2007-02-06 | 2010-10-19 | Voicebox Technologies, Inc. | System and method for selecting and presenting advertisements based on natural language processing of voice-based input |
US9465791B2 (en) | 2007-02-09 | 2016-10-11 | International Business Machines Corporation | Method and apparatus for automatic detection of spelling errors in one or more documents |
US7941133B2 (en) | 2007-02-14 | 2011-05-10 | At&T Intellectual Property I, L.P. | Methods, systems, and computer program products for schedule management based on locations of wireless devices |
US7853240B2 (en) | 2007-02-15 | 2010-12-14 | Research In Motion Limited | Emergency number selection for mobile communications device |
US20080204379A1 (en) | 2007-02-22 | 2008-08-28 | Microsoft Corporation | Display with integrated audio transducer device |
US7797265B2 (en) | 2007-02-26 | 2010-09-14 | Siemens Corporation | Document clustering that applies a locality sensitive hashing function to a feature vector to obtain a limited set of candidate clusters |
US7801728B2 (en) | 2007-02-26 | 2010-09-21 | Nuance Communications, Inc. | Document session replay for multimodal applications |
US7822608B2 (en) | 2007-02-27 | 2010-10-26 | Nuance Communications, Inc. | Disambiguating a speech recognition grammar in a multimodal application |
US7840409B2 (en) | 2007-02-27 | 2010-11-23 | Nuance Communications, Inc. | Ordering recognition results produced by an automatic speech recognition engine for a multimodal application |
CN101622664B (en) | 2007-03-02 | 2012-02-01 | 松下电器产业株式会社 | Adaptive sound source vector quantization device and adaptive sound source vector quantization method |
US20080221866A1 (en) | 2007-03-06 | 2008-09-11 | Lalitesh Katragadda | Machine Learning For Transliteration |
US20110060587A1 (en) | 2007-03-07 | 2011-03-10 | Phillips Michael S | Command and control utilizing ancillary information in a mobile voice-to-speech application |
US8886545B2 (en) | 2007-03-07 | 2014-11-11 | Vlingo Corporation | Dealing with switch latency in speech recognition |
US20080221900A1 (en) | 2007-03-07 | 2008-09-11 | Cerra Joseph P | Mobile local search environment speech processing facility |
GB0704772D0 (en) | 2007-03-12 | 2007-04-18 | Mongoose Ventures Ltd | Aural similarity measuring system for text |
US7801729B2 (en) | 2007-03-13 | 2010-09-21 | Sensory, Inc. | Using multiple attributes to create a voice search playlist |
US20080256613A1 (en) | 2007-03-13 | 2008-10-16 | Grover Noel J | Voice print identification portal |
US8924844B2 (en) | 2007-03-13 | 2014-12-30 | Visual Cues Llc | Object annotation |
JP4466666B2 (en) | 2007-03-14 | 2010-05-26 | 日本電気株式会社 | Minutes creation method, apparatus and program thereof |
US8626930B2 (en) | 2007-03-15 | 2014-01-07 | Apple Inc. | Multimedia content filtering |
US8219406B2 (en) | 2007-03-15 | 2012-07-10 | Microsoft Corporation | Speech-centric multimodal user interface design in mobile technology |
CN101636784B (en) | 2007-03-20 | 2011-12-28 | 富士通株式会社 | Speech recognition system, and speech recognition method |
US8886537B2 (en) | 2007-03-20 | 2014-11-11 | Nuance Communications, Inc. | Method and system for text-to-speech synthesis with personalized voice |
JP2008236448A (en) | 2007-03-22 | 2008-10-02 | Clarion Co Ltd | Sound signal processing device, hands-free calling device, sound signal processing method, and control program |
JP2008271481A (en) | 2007-03-27 | 2008-11-06 | Brother Ind Ltd | Telephone apparatus |
US8498628B2 (en) | 2007-03-27 | 2013-07-30 | Iocast Llc | Content delivery system and method |
JP2008250375A (en) | 2007-03-29 | 2008-10-16 | Toshiba Corp | Character input device, method, and program |
US7797269B2 (en) | 2007-03-29 | 2010-09-14 | Nokia Corporation | Method and apparatus using a context sensitive dictionary |
US8775931B2 (en) | 2007-03-30 | 2014-07-08 | Blackberry Limited | Spell check function that applies a preference to a spell check algorithm based upon extensive user selection of spell check results generated by the algorithm, and associated handheld electronic device |
US8977255B2 (en) | 2007-04-03 | 2015-03-10 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
US7920902B2 (en) | 2007-04-04 | 2011-04-05 | Carroll David W | Mobile personal audio device |
US7809610B2 (en) | 2007-04-09 | 2010-10-05 | Platformation, Inc. | Methods and apparatus for freshness and completeness of information |
DK1981253T3 (en) | 2007-04-10 | 2011-10-03 | Oticon As | User interfaces for a communication device |
US20080253577A1 (en) | 2007-04-13 | 2008-10-16 | Apple Inc. | Multi-channel sound panner |
US20100142740A1 (en) | 2007-04-16 | 2010-06-10 | Gn Resound A/S | Hearing aid wireless communication adaptor |
JP4412504B2 (en) | 2007-04-17 | 2010-02-10 | 本田技研工業株式会社 | Speech recognition apparatus, speech recognition method, and speech recognition program |
US7848924B2 (en) | 2007-04-17 | 2010-12-07 | Nokia Corporation | Method, apparatus and computer program product for providing voice conversion using temporal dynamic features |
US7953600B2 (en) | 2007-04-24 | 2011-05-31 | Novaspeech Llc | System and method for hybrid speech synthesis |
US8457946B2 (en) | 2007-04-26 | 2013-06-04 | Microsoft Corporation | Recognition architecture for generating Asian characters |
US8005664B2 (en) | 2007-04-30 | 2011-08-23 | Tachyon Technologies Pvt. Ltd. | System, method to generate transliteration and method for generating decision tree to obtain transliteration |
US7983915B2 (en) | 2007-04-30 | 2011-07-19 | Sonic Foundry, Inc. | Audio content search engine |
US8032383B1 (en) | 2007-05-04 | 2011-10-04 | Foneweb, Inc. | Speech controlled services and devices using internet |
US7899666B2 (en) | 2007-05-04 | 2011-03-01 | Expert System S.P.A. | Method and system for automatically extracting relations between concepts included in text |
US9292807B2 (en) | 2007-05-10 | 2016-03-22 | Microsoft Technology Licensing, Llc | Recommending actions based on context |
KR20090001716A (en) | 2007-05-14 | 2009-01-09 | 이병수 | System for operating of growing intelligence form cyber secretary and method thereof |
EG25474A (en) | 2007-05-21 | 2012-01-11 | Sherikat Link Letatweer Elbarmaguey At Sae | Method for translitering and suggesting arabic replacement for a given user input |
JP4203967B1 (en) | 2007-05-28 | 2009-01-07 | パナソニック株式会社 | Information search support method and information search support device |
US8762143B2 (en) | 2007-05-29 | 2014-06-24 | At&T Intellectual Property Ii, L.P. | Method and apparatus for identifying acoustic background environments based on time and speed to enhance automatic speech recognition |
US8189880B2 (en) | 2007-05-29 | 2012-05-29 | Microsoft Corporation | Interactive photo annotation based on face clustering |
US8055708B2 (en) | 2007-06-01 | 2011-11-08 | Microsoft Corporation | Multimedia spaces |
US8204238B2 (en) | 2007-06-08 | 2012-06-19 | Sensory, Inc | Systems and methods of sonic communication |
US8004493B2 (en) | 2007-06-08 | 2011-08-23 | Apple Inc. | Methods and systems for providing sensory information to devices and peripherals |
KR20080109322A (en) | 2007-06-12 | 2008-12-17 | 엘지전자 주식회사 | Method and apparatus for providing services by comprehended user's intuited intension |
WO2008151624A1 (en) | 2007-06-13 | 2008-12-18 | Widex A/S | Hearing aid system establishing a conversation group among hearing aids used by different users |
EP2153692B1 (en) | 2007-06-13 | 2010-12-08 | Widex A/S | A system and a method for establishing a conversation group among a number of hearing aids |
US20080313335A1 (en) | 2007-06-15 | 2008-12-18 | Searete Llc, A Limited Liability Corporation Of The State Of Delaware | Communicator establishing aspects with context identifying |
US8027834B2 (en) | 2007-06-25 | 2011-09-27 | Nuance Communications, Inc. | Technique for training a phonetic decision tree with limited phonetic exceptional terms |
KR100757496B1 (en) | 2007-06-26 | 2007-09-11 | 우영배 | Water tank with clean water treatment apparatus |
US7689421B2 (en) | 2007-06-27 | 2010-03-30 | Microsoft Corporation | Voice persona service for embedding text-to-speech features into software programs |
US8190627B2 (en) | 2007-06-28 | 2012-05-29 | Microsoft Corporation | Machine assisted query formulation |
US7861008B2 (en) | 2007-06-28 | 2010-12-28 | Apple Inc. | Media management and routing within an electronic device |
US8041438B2 (en) | 2007-06-28 | 2011-10-18 | Apple Inc. | Data-driven media management within an electronic device |
US9794605B2 (en) | 2007-06-28 | 2017-10-17 | Apple Inc. | Using time-stamped event entries to facilitate synchronizing data streams |
US9632561B2 (en) | 2007-06-28 | 2017-04-25 | Apple Inc. | Power-gating media decoders to reduce power consumption |
US8065624B2 (en) | 2007-06-28 | 2011-11-22 | Panasonic Corporation | Virtual keypad systems and methods |
US8290775B2 (en) | 2007-06-29 | 2012-10-16 | Microsoft Corporation | Pronunciation correction of text-to-speech systems between different spoken languages |
US8019606B2 (en) | 2007-06-29 | 2011-09-13 | Microsoft Corporation | Identification and selection of a software application via speech |
US7962344B2 (en) | 2007-06-29 | 2011-06-14 | Microsoft Corporation | Depicting a speech user interface via graphical elements |
JP4424382B2 (en) | 2007-07-04 | 2010-03-03 | ソニー株式会社 | Content reproduction apparatus and content automatic reception method |
US7617074B2 (en) | 2007-07-06 | 2009-11-10 | Microsoft Corporation | Suppressing repeated events and storing diagnostic information |
US8219399B2 (en) | 2007-07-11 | 2012-07-10 | Garmin Switzerland Gmbh | Automated speech recognition (ASR) tiling |
US8306235B2 (en) | 2007-07-17 | 2012-11-06 | Apple Inc. | Method and apparatus for using a sound sensor to adjust the audio output for a device |
CN101354746B (en) | 2007-07-23 | 2011-08-31 | 夏普株式会社 | Device and method for extracting character image |
ITFI20070177A1 (en) | 2007-07-26 | 2009-01-27 | Riccardo Vieri | System for the creation and setting of an advertising campaign deriving from the insertion of advertising messages within an exchange of messages and method for its functioning |
EP2183913A4 (en) | 2007-07-30 | 2011-06-22 | Lg Electronics Inc | Display device and speaker system for the display device |
JP2009036999A (en) | 2007-08-01 | 2009-02-19 | Infocom Corp | Interactive method using computer, interactive system, computer program and computer-readable storage medium |
CN101802812B (en) | 2007-08-01 | 2015-07-01 | 金格软件有限公司 | Automatic context sensitive language correction and enhancement using an internet corpus |
US20090043583A1 (en) | 2007-08-08 | 2009-02-12 | International Business Machines Corporation | Dynamic modification of voice selection based on user specific factors |
US7983919B2 (en) | 2007-08-09 | 2011-07-19 | At&T Intellectual Property Ii, L.P. | System and method for performing speech synthesis with a cache of phoneme sequences |
US8478598B2 (en) | 2007-08-17 | 2013-07-02 | International Business Machines Corporation | Apparatus, system, and method for voice chat transcription |
US20090055186A1 (en) | 2007-08-23 | 2009-02-26 | International Business Machines Corporation | Method to voice id tag content to ease reading for visually impaired |
KR101359715B1 (en) | 2007-08-24 | 2014-02-10 | 삼성전자주식회사 | Method and apparatus for providing mobile voice web |
US8190359B2 (en) | 2007-08-31 | 2012-05-29 | Proxpro, Inc. | Situation-aware personal information management for a mobile device |
US20090058823A1 (en) | 2007-09-04 | 2009-03-05 | Apple Inc. | Virtual Keyboards in Multi-Language Environment |
US8683378B2 (en) | 2007-09-04 | 2014-03-25 | Apple Inc. | Scrolling techniques for user interfaces |
US8683197B2 (en) | 2007-09-04 | 2014-03-25 | Apple Inc. | Method and apparatus for providing seamless resumption of video playback |
US8826132B2 (en) | 2007-09-04 | 2014-09-02 | Apple Inc. | Methods and systems for navigating content on a portable device |
US20090106397A1 (en) | 2007-09-05 | 2009-04-23 | O'keefe Sean Patrick | Method and apparatus for interactive content distribution |
US9812023B2 (en) | 2007-09-10 | 2017-11-07 | Excalibur Ip, Llc | Audible metadata |
US20090070109A1 (en) * | 2007-09-12 | 2009-03-12 | Microsoft Corporation | Speech-to-Text Transcription for Personal Communication Devices |
US20090074214A1 (en) | 2007-09-13 | 2009-03-19 | Bionica Corporation | Assistive listening system with plug in enhancement platform and communication port to download user preferred processing algorithms |
US20090076825A1 (en) | 2007-09-13 | 2009-03-19 | Bionica Corporation | Method of enhancing sound for hearing impaired individuals |
US8838760B2 (en) | 2007-09-14 | 2014-09-16 | Ricoh Co., Ltd. | Workflow-enabled provider |
KR100920267B1 (en) | 2007-09-17 | 2009-10-05 | 한국전자통신연구원 | System for voice communication analysis and method thereof |
US8706476B2 (en) | 2007-09-18 | 2014-04-22 | Ariadne Genomics, Inc. | Natural language processing method by analyzing primitive sentences, logical clauses, clause types and verbal blocks |
US8583438B2 (en) | 2007-09-20 | 2013-11-12 | Microsoft Corporation | Unnatural prosody detection in speech synthesis |
US8069051B2 (en) | 2007-09-25 | 2011-11-29 | Apple Inc. | Zero-gap playback using predictive mixing |
US20090083035A1 (en) | 2007-09-25 | 2009-03-26 | Ritchie Winson Huang | Text pre-processing for text-to-speech generation |
US8589397B2 (en) | 2007-09-28 | 2013-11-19 | Nec Corporation | Data classification method and data classification device |
US8165886B1 (en) | 2007-10-04 | 2012-04-24 | Great Northern Research LLC | Speech interface system and method for control and interaction with applications on a computing system |
US8515095B2 (en) | 2007-10-04 | 2013-08-20 | Apple Inc. | Reducing annoyance by managing the acoustic noise produced by a device |
US7995732B2 (en) | 2007-10-04 | 2011-08-09 | At&T Intellectual Property I, Lp | Managing audio in a multi-source audio environment |
US8462959B2 (en) | 2007-10-04 | 2013-06-11 | Apple Inc. | Managing acoustic noise produced by a device |
US8036901B2 (en) | 2007-10-05 | 2011-10-11 | Sensory, Incorporated | Systems and methods of performing speech recognition using sensory inputs of human position |
WO2009049049A1 (en) | 2007-10-09 | 2009-04-16 | Language Analytics Llc | Method and system for adaptive transliteration |
US8594996B2 (en) | 2007-10-17 | 2013-11-26 | Evri Inc. | NLP-based entity recognition and disambiguation |
JP2009098490A (en) | 2007-10-18 | 2009-05-07 | Kddi Corp | Device for editing speech recognition result, speech recognition device and computer program |
US8209384B2 (en) | 2007-10-23 | 2012-06-26 | Yahoo! Inc. | Persistent group-based instant messaging |
US20090112677A1 (en) | 2007-10-24 | 2009-04-30 | Rhett Randolph L | Method for automatically developing suggested optimal work schedules from unsorted group and individual task lists |
US8280885B2 (en) | 2007-10-29 | 2012-10-02 | Cornell University | System and method for automatically summarizing fine-grained opinions in digital text |
US20090112572A1 (en) | 2007-10-30 | 2009-04-30 | Karl Ola Thorn | System and method for input of text to an application operating on a device |
US7840447B2 (en) | 2007-10-30 | 2010-11-23 | Leonard Kleinrock | Pricing and auctioning of bundled items among multiple sellers and buyers |
US7983997B2 (en) | 2007-11-02 | 2011-07-19 | Florida Institute For Human And Machine Cognition, Inc. | Interactive complex task teaching system that allows for natural language input, recognizes a user's intent, and automatically performs tasks in document object model (DOM) nodes |
JP4926004B2 (en) | 2007-11-12 | 2012-05-09 | 株式会社リコー | Document processing apparatus, document processing method, and document processing program |
US7890525B2 (en) | 2007-11-14 | 2011-02-15 | International Business Machines Corporation | Foreign language abbreviation translation in an instant messaging system |
US8112280B2 (en) | 2007-11-19 | 2012-02-07 | Sensory, Inc. | Systems and methods of performing speech recognition with barge-in for use in a bluetooth system |
TWI373708B (en) | 2007-11-27 | 2012-10-01 | Htc Corp | Power management method for handheld electronic device |
US8213999B2 (en) | 2007-11-27 | 2012-07-03 | Htc Corporation | Controlling method and system for handheld communication device and recording medium using the same |
EP2226746B1 (en) | 2007-11-28 | 2012-01-11 | Fujitsu Limited | Metallic pipe managed by wireless ic tag, and the wireless ic tag |
US8140335B2 (en) | 2007-12-11 | 2012-03-20 | Voicebox Technologies, Inc. | System and method for providing a natural language voice user interface in an integrated voice navigation services environment |
US8385588B2 (en) | 2007-12-11 | 2013-02-26 | Eastman Kodak Company | Recording audio metadata for stored images |
US8275607B2 (en) | 2007-12-12 | 2012-09-25 | Microsoft Corporation | Semi-supervised part-of-speech tagging |
US20090158423A1 (en) | 2007-12-14 | 2009-06-18 | Symbol Technologies, Inc. | Locking mobile device cradle |
KR101300839B1 (en) | 2007-12-18 | 2013-09-10 | 삼성전자주식회사 | Voice query extension method and system |
US8145196B2 (en) | 2007-12-18 | 2012-03-27 | Apple Inc. | Creation and management of voicemail greetings for mobile communication devices |
US8595004B2 (en) | 2007-12-18 | 2013-11-26 | Nec Corporation | Pronunciation variation rule extraction apparatus, pronunciation variation rule extraction method, and pronunciation variation rule extraction program |
US8095680B2 (en) | 2007-12-20 | 2012-01-10 | Telefonaktiebolaget Lm Ericsson (Publ) | Real-time network transport protocol interface method and apparatus |
US10002189B2 (en) | 2007-12-20 | 2018-06-19 | Apple Inc. | Method and apparatus for searching using an active ontology |
US8675830B2 (en) | 2007-12-21 | 2014-03-18 | Bce Inc. | Method and apparatus for interrupting an active telephony session to deliver information to a subscriber |
KR20090071077A (en) | 2007-12-27 | 2009-07-01 | 엘지전자 주식회사 | Navigation apparatus and method for providing information of tbt(turn-by-turn position) |
US8219407B1 (en) | 2007-12-27 | 2012-07-10 | Great Northern Research, LLC | Method for processing the output of a speech recognizer |
US8583416B2 (en) | 2007-12-27 | 2013-11-12 | Fluential, Llc | Robust information extraction from utterances |
US8138896B2 (en) | 2007-12-31 | 2012-03-20 | Apple Inc. | Tactile feedback in an electronic device |
US7609179B2 (en) | 2008-01-08 | 2009-10-27 | International Business Machines Corporation | Method for compressed data with reduced dictionary sizes by coding value prefixes |
US8232973B2 (en) | 2008-01-09 | 2012-07-31 | Apple Inc. | Method, device, and graphical user interface providing word recommendations for text input |
US8478578B2 (en) | 2008-01-09 | 2013-07-02 | Fluential, Llc | Mobile speech-to-speech interpretation system |
US20090187577A1 (en) | 2008-01-20 | 2009-07-23 | Aviv Reznik | System and Method Providing Audio-on-Demand to a User's Personal Online Device as Part of an Online Audio Community |
ITPO20080002A1 (en) | 2008-01-22 | 2009-07-23 | Riccardo Vieri | SYSTEM AND METHOD FOR THE CONTEXTUAL ADVERTISING GENERATION DURING THE SENDING OF SMS, ITS DEVICE AND INTERFACE. |
US20090192782A1 (en) | 2008-01-28 | 2009-07-30 | William Drewes | Method for increasing the accuracy of statistical machine translation (SMT) |
US7840581B2 (en) | 2008-02-01 | 2010-11-23 | Realnetworks, Inc. | Method and system for improving the quality of deep metadata associated with media content |
KR20090085376A (en) | 2008-02-04 | 2009-08-07 | 삼성전자주식회사 | Service method and apparatus for using speech synthesis of text message |
KR101334066B1 (en) | 2008-02-11 | 2013-11-29 | 이점식 | Self-evolving Artificial Intelligent cyber robot system and offer method |
US8099289B2 (en) | 2008-02-13 | 2012-01-17 | Sensory, Inc. | Voice interface and search for electronic devices including bluetooth headsets and remote systems |
EP2094032A1 (en) | 2008-02-19 | 2009-08-26 | Deutsche Thomson OHG | Audio signal, method and apparatus for encoding or transmitting the same and method and apparatus for processing the same |
US8065143B2 (en) | 2008-02-22 | 2011-11-22 | Apple Inc. | Providing text input using speech data and non-speech data |
US8015144B2 (en) | 2008-02-26 | 2011-09-06 | Microsoft Corporation | Learning transportation modes from raw GPS data |
US20090228273A1 (en) | 2008-03-05 | 2009-09-10 | Microsoft Corporation | Handwriting-based user interface for correction of speech recognition errors |
US8255224B2 (en) | 2008-03-07 | 2012-08-28 | Google Inc. | Voice recognition grammar selection based on context |
US20090234655A1 (en) | 2008-03-13 | 2009-09-17 | Jason Kwon | Mobile electronic device with active speech recognition |
US20090239552A1 (en) | 2008-03-24 | 2009-09-24 | Yahoo! Inc. | Location-based opportunistic recommendations |
US7472061B1 (en) | 2008-03-31 | 2008-12-30 | International Business Machines Corporation | Systems and methods for building a native language phoneme lexicon having native pronunciations of non-native words derived from non-native pronunciations |
US20090249198A1 (en) | 2008-04-01 | 2009-10-01 | Yahoo! Inc. | Techniques for input recognition and completion |
US8417298B2 (en) | 2008-04-01 | 2013-04-09 | Apple Inc. | Mounting structures for portable electronic devices |
US20090253457A1 (en) | 2008-04-04 | 2009-10-08 | Apple Inc. | Audio signal processing for certification enhancement in a handheld wireless communications device |
US8996376B2 (en) | 2008-04-05 | 2015-03-31 | Apple Inc. | Intelligent text-to-speech conversion |
US8958848B2 (en) | 2008-04-08 | 2015-02-17 | Lg Electronics Inc. | Mobile terminal and menu control method thereof |
KR20090107365A (en) | 2008-04-08 | 2009-10-13 | 엘지전자 주식회사 | Mobile terminal and its menu control method |
WO2009129315A1 (en) | 2008-04-15 | 2009-10-22 | Mobile Technologies, Llc | System and methods for maintaining speech-to-speech translation in the field |
US8666824B2 (en) | 2008-04-23 | 2014-03-04 | Dell Products L.P. | Digital media content location and purchasing system |
US8407049B2 (en) | 2008-04-23 | 2013-03-26 | Cogi, Inc. | Systems and methods for conversation enhancement |
US8594995B2 (en) | 2008-04-24 | 2013-11-26 | Nuance Communications, Inc. | Multilingual asynchronous communications of speech messages recorded in digital media files |
US8249857B2 (en) | 2008-04-24 | 2012-08-21 | International Business Machines Corporation | Multilingual administration of enterprise data with user selected target language translation |
US8249858B2 (en) | 2008-04-24 | 2012-08-21 | International Business Machines Corporation | Multilingual administration of enterprise data with default target languages |
US8693698B2 (en) | 2008-04-30 | 2014-04-08 | Qualcomm Incorporated | Method and apparatus to reduce non-linear distortion in mobile computing devices |
US8219115B1 (en) | 2008-05-12 | 2012-07-10 | Google Inc. | Location based reminders |
US10496753B2 (en) | 2010-01-18 | 2019-12-03 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US8285344B2 (en) | 2008-05-21 | 2012-10-09 | DP Technologies, Inc. | Method and apparatus for adjusting audio for a user environment |
US8589161B2 (en) | 2008-05-27 | 2013-11-19 | Voicebox Technologies, Inc. | System and method for an integrated, multi-modal, multi-device natural language voice services environment |
US8082498B2 (en) | 2008-05-27 | 2011-12-20 | Appfolio, Inc. | Systems and methods for automatic spell checking of dynamically generated web pages |
US20090326938A1 (en) | 2008-05-28 | 2009-12-31 | Nokia Corporation | Multiword text correction |
US8694355B2 (en) | 2008-05-30 | 2014-04-08 | Sri International | Method and apparatus for automated assistance with task management |
US8233366B2 (en) | 2008-06-02 | 2012-07-31 | Apple Inc. | Context-based error indication methods and apparatus |
JP5377889B2 (en) | 2008-06-05 | 2013-12-25 | 日本放送協会 | Language processing apparatus and program |
US8831948B2 (en) | 2008-06-06 | 2014-09-09 | At&T Intellectual Property I, L.P. | System and method for synthetically generated speech describing media content |
KR100988397B1 (en) | 2008-06-09 | 2010-10-19 | 엘지전자 주식회사 | Mobile terminal and text correcting method in the same |
US20090306967A1 (en) | 2008-06-09 | 2009-12-10 | J.D. Power And Associates | Automatic Sentiment Analysis of Surveys |
US8527876B2 (en) | 2008-06-12 | 2013-09-03 | Apple Inc. | System and methods for adjusting graphical representations of media files based on previous usage |
US20090313564A1 (en) | 2008-06-12 | 2009-12-17 | Apple Inc. | Systems and methods for adjusting playback of media files based on previous usage |
WO2009156438A1 (en) | 2008-06-24 | 2009-12-30 | Llinxx | Method and system for entering an expression |
US9081590B2 (en) | 2008-06-24 | 2015-07-14 | Microsoft Technology Licensing, Llc | Multimodal input using scratchpad graphical user interface to edit speech text input with keyboard input |
WO2009156978A1 (en) | 2008-06-26 | 2009-12-30 | Intuitive User Interfaces Ltd | System and method for intuitive user interaction |
US8423288B2 (en) | 2009-11-30 | 2013-04-16 | Apple Inc. | Dynamic alerts for calendar events |
US8781833B2 (en) * | 2008-07-17 | 2014-07-15 | Nuance Communications, Inc. | Speech recognition semantic classification training |
US8166019B1 (en) | 2008-07-21 | 2012-04-24 | Sprint Communications Company L.P. | Providing suggested actions in response to textual communications |
US8041848B2 (en) | 2008-08-04 | 2011-10-18 | Apple Inc. | Media processing method and device |
US8589149B2 (en) | 2008-08-05 | 2013-11-19 | Nuance Communications, Inc. | Probability-based approach to recognition of user-entered data |
US20110131038A1 (en) | 2008-08-11 | 2011-06-02 | Satoshi Oyaizu | Exception dictionary creating unit, exception dictionary creating method, and program therefor, as well as speech recognition unit and speech recognition method |
US8117136B2 (en) | 2008-08-29 | 2012-02-14 | Hewlett-Packard Development Company, L.P. | Relationship management on a mobile computing device |
US8442248B2 (en) | 2008-09-03 | 2013-05-14 | Starkey Laboratories, Inc. | Systems and methods for managing wireless communication links for hearing assistance devices |
WO2010028169A2 (en) | 2008-09-05 | 2010-03-11 | Fotonauts, Inc. | Reverse tagging of images in system for managing and sharing digital images |
US8380959B2 (en) | 2008-09-05 | 2013-02-19 | Apple Inc. | Memory management system and method |
US20100063825A1 (en) | 2008-09-05 | 2010-03-11 | Apple Inc. | Systems and Methods for Memory Management and Crossfading in an Electronic Device |
US8098262B2 (en) | 2008-09-05 | 2012-01-17 | Apple Inc. | Arbitrary fractional pixel movement |
US8756519B2 (en) | 2008-09-12 | 2014-06-17 | Google Inc. | Techniques for sharing content on a web page |
US8326622B2 (en) | 2008-09-23 | 2012-12-04 | International Business Machines Corporation | Dialog filtering for filling out a form |
US8352268B2 (en) | 2008-09-29 | 2013-01-08 | Apple Inc. | Systems and methods for selective rate of speech and speech preferences for text to speech synthesis |
US8396714B2 (en) | 2008-09-29 | 2013-03-12 | Apple Inc. | Systems and methods for concatenation of words in text to speech synthesis |
US8712776B2 (en) | 2008-09-29 | 2014-04-29 | Apple Inc. | Systems and methods for selective text to speech synthesis |
US8352272B2 (en) | 2008-09-29 | 2013-01-08 | Apple Inc. | Systems and methods for text to speech synthesis |
US20100082327A1 (en) | 2008-09-29 | 2010-04-01 | Apple Inc. | Systems and methods for mapping phonemes for text to speech synthesis |
US8355919B2 (en) | 2008-09-29 | 2013-01-15 | Apple Inc. | Systems and methods for text normalization for text to speech synthesis |
US8583418B2 (en) | 2008-09-29 | 2013-11-12 | Apple Inc. | Systems and methods of detecting language and natural language strings for text to speech synthesis |
US20100082328A1 (en) | 2008-09-29 | 2010-04-01 | Apple Inc. | Systems and methods for speech preprocessing in text to speech synthesis |
US8411953B2 (en) | 2008-09-30 | 2013-04-02 | International Business Machines Corporation | Tagging images by determining a set of similar pre-tagged images and extracting prominent tags from that set |
US9077526B2 (en) | 2008-09-30 | 2015-07-07 | Apple Inc. | Method and system for ensuring sequential playback of digital media |
US8401178B2 (en) | 2008-09-30 | 2013-03-19 | Apple Inc. | Multiple microphone switching and configuration |
US8676904B2 (en) | 2008-10-02 | 2014-03-18 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
US8285545B2 (en) | 2008-10-03 | 2012-10-09 | Volkswagen Ag | Voice command acquisition system and method |
US9200913B2 (en) | 2008-10-07 | 2015-12-01 | Telecommunication Systems, Inc. | User interface for predictive traffic |
US8364487B2 (en) | 2008-10-21 | 2013-01-29 | Microsoft Corporation | Speech recognition system with display information |
US8218397B2 (en) | 2008-10-24 | 2012-07-10 | Qualcomm Incorporated | Audio source proximity estimation using sensor array for noise reduction |
US8724829B2 (en) | 2008-10-24 | 2014-05-13 | Qualcomm Incorporated | Systems, methods, apparatus, and computer-readable media for coherence detection |
US8412529B2 (en) | 2008-10-29 | 2013-04-02 | Verizon Patent And Licensing Inc. | Method and system for enhancing verbal communication sessions |
US8122353B2 (en) | 2008-11-07 | 2012-02-21 | Yahoo! Inc. | Composing a message in an online textbox using a non-latin script |
US8386261B2 (en) | 2008-11-14 | 2013-02-26 | Vocollect Healthcare Systems, Inc. | Training/coaching system for a voice-enabled work environment |
US8832319B2 (en) | 2008-11-18 | 2014-09-09 | Amazon Technologies, Inc. | Synchronization of digital content |
US8442824B2 (en) | 2008-11-26 | 2013-05-14 | Nuance Communications, Inc. | Device, system, and method of liveness detection utilizing voice biometrics |
US8140328B2 (en) | 2008-12-01 | 2012-03-20 | At&T Intellectual Property I, L.P. | User intention based on N-best list of recognition hypotheses for utterances in a dialog |
US8489599B2 (en) | 2008-12-02 | 2013-07-16 | Palo Alto Research Center Incorporated | Context and activity-driven content delivery and interaction |
US8117036B2 (en) | 2008-12-03 | 2012-02-14 | At&T Intellectual Property I, L.P. | Non-disruptive side conversation information retrieval |
US8589157B2 (en) | 2008-12-05 | 2013-11-19 | Microsoft Corporation | Replying to text messages via automated voice search techniques |
JP5257311B2 (en) | 2008-12-05 | 2013-08-07 | ソニー株式会社 | Information processing apparatus and information processing method |
US8160881B2 (en) | 2008-12-15 | 2012-04-17 | Microsoft Corporation | Human-assisted pronunciation generation |
US8447588B2 (en) | 2008-12-18 | 2013-05-21 | Palo Alto Research Center Incorporated | Region-matching transducers for natural language processing |
AU2009330073B2 (en) | 2008-12-22 | 2012-11-15 | Google Llc | Asynchronous distributed de-duplication for replicated content addressable storage clusters |
WO2010075623A1 (en) | 2008-12-31 | 2010-07-08 | Bce Inc. | System and method for unlocking a device |
US8456420B2 (en) | 2008-12-31 | 2013-06-04 | Intel Corporation | Audible list traversal |
US8447609B2 (en) | 2008-12-31 | 2013-05-21 | Intel Corporation | Adjustment of temporal acoustical characteristics |
EP2205010A1 (en) | 2009-01-06 | 2010-07-07 | BRITISH TELECOMMUNICATIONS public limited company | Messaging |
US8498866B2 (en) | 2009-01-15 | 2013-07-30 | K-Nfb Reading Technology, Inc. | Systems and methods for multiple language document narration |
US8428758B2 (en) | 2009-02-16 | 2013-04-23 | Apple Inc. | Dynamic audio ducking |
US8326637B2 (en) | 2009-02-20 | 2012-12-04 | Voicebox Technologies, Inc. | System and method for processing multi-modal device interactions in a natural language voice services environment |
US8280434B2 (en) | 2009-02-27 | 2012-10-02 | Research In Motion Limited | Mobile wireless communications device for hearing and/or speech impaired user |
US8239333B2 (en) | 2009-03-03 | 2012-08-07 | Microsoft Corporation | Media tag recommendation technologies |
US8165321B2 (en) | 2009-03-10 | 2012-04-24 | Apple Inc. | Intelligent clip mixing |
US8417526B2 (en) | 2009-03-13 | 2013-04-09 | Adacel, Inc. | Speech recognition learning system and method |
KR101078864B1 (en) | 2009-03-26 | 2011-11-02 | 한국과학기술원 | The query/document topic category transition analysis system and method and the query expansion based information retrieval system and method |
US20100250599A1 (en) | 2009-03-30 | 2010-09-30 | Nokia Corporation | Method and apparatus for integration of community-provided place data |
US8805823B2 (en) | 2009-04-14 | 2014-08-12 | Sri International | Content processing systems and methods |
US9761219B2 (en) | 2009-04-21 | 2017-09-12 | Creative Technology Ltd | System and method for distributed text-to-speech synthesis and intelligibility |
US8660970B1 (en) | 2009-04-23 | 2014-02-25 | The Boeing Company | Passive learning and autonomously interactive system for leveraging user knowledge in networked environments |
JP5911796B2 (en) | 2009-04-30 | 2016-04-27 | サムスン エレクトロニクス カンパニー リミテッド | User intention inference apparatus and method using multimodal information |
KR101032792B1 (en) | 2009-04-30 | 2011-05-06 | 주식회사 코오롱 | Polyester fabric for airbag and manufacturing method thereof |
KR101581883B1 (en) | 2009-04-30 | 2016-01-11 | 삼성전자주식회사 | Apparatus for detecting voice using motion information and method thereof |
US8498857B2 (en) | 2009-05-19 | 2013-07-30 | Tata Consultancy Services Limited | System and method for rapid prototyping of existing speech recognition solutions in different languages |
US20100302056A1 (en) | 2009-05-27 | 2010-12-02 | Geodelic, Inc. | Location discovery system and method |
WO2010135837A1 (en) | 2009-05-28 | 2010-12-02 | Intelligent Mechatronic Systems Inc | Communication system with personal information management and remote vehicle monitoring and control features |
US20120310652A1 (en) | 2009-06-01 | 2012-12-06 | O'sullivan Daniel | Adaptive Human Computer Interface (AAHCI) |
EP2259252B1 (en) | 2009-06-02 | 2012-08-01 | Nuance Communications, Inc. | Speech recognition method for selecting a combination of list elements via a speech input |
US20120311585A1 (en) | 2011-06-03 | 2012-12-06 | Apple Inc. | Organizing task items that represent tasks to perform |
US10540976B2 (en) | 2009-06-05 | 2020-01-21 | Apple Inc. | Contextual voice commands |
US9858925B2 (en) | 2009-06-05 | 2018-01-02 | Apple Inc. | Using context information to facilitate processing of commands in a virtual assistant |
KR101562792B1 (en) | 2009-06-10 | 2015-10-23 | 삼성전자주식회사 | Apparatus and method for providing goal predictive interface |
US8290777B1 (en) | 2009-06-12 | 2012-10-16 | Amazon Technologies, Inc. | Synchronizing the playing and displaying of digital content |
US8306238B2 (en) | 2009-06-17 | 2012-11-06 | Sony Ericsson Mobile Communications Ab | Method and circuit for controlling an output of an audio signal of a battery-powered device |
US9215212B2 (en) | 2009-06-22 | 2015-12-15 | Citrix Systems, Inc. | Systems and methods for providing a visualizer for rules of an application firewall |
US9754224B2 (en) | 2009-06-26 | 2017-09-05 | International Business Machines Corporation | Action based to-do list |
US8527278B2 (en) | 2009-06-29 | 2013-09-03 | Abraham Ben David | Intelligent home automation |
US20100332224A1 (en) | 2009-06-30 | 2010-12-30 | Nokia Corporation | Method and apparatus for converting text to audio and tactile output |
US20110002487A1 (en) | 2009-07-06 | 2011-01-06 | Apple Inc. | Audio Channel Assignment for Audio Output in a Movable Device |
KR101083540B1 (en) | 2009-07-08 | 2011-11-14 | 엔에이치엔(주) | System and method for transforming vernacular pronunciation with respect to hanja using statistical method |
US7953679B2 (en) | 2009-07-22 | 2011-05-31 | Xerox Corporation | Scalable indexing for layout based document retrieval and ranking |
US8239129B2 (en) | 2009-07-27 | 2012-08-07 | Robert Bosch Gmbh | Method and system for improving speech recognition accuracy by use of geographic information |
US8340312B2 (en) | 2009-08-04 | 2012-12-25 | Apple Inc. | Differential mode noise cancellation with active real-time control for microphone-speaker combinations used in two way audio communications |
US20110047072A1 (en) | 2009-08-07 | 2011-02-24 | Visa U.S.A. Inc. | Systems and Methods for Propensity Analysis and Validation |
US8768313B2 (en) | 2009-08-17 | 2014-07-01 | Digimarc Corporation | Methods and systems for image or audio recognition processing |
CN101996631B (en) | 2009-08-28 | 2014-12-03 | 国际商业机器公司 | Method and device for aligning texts |
EP2473916A4 (en) | 2009-09-02 | 2013-07-10 | Stanford Res Inst Int | Method and apparatus for exploiting human feedback in an intelligent automated assistant |
US8560300B2 (en) | 2009-09-09 | 2013-10-15 | International Business Machines Corporation | Error correction using fact repositories |
US8321527B2 (en) | 2009-09-10 | 2012-11-27 | Tribal Brands | System and method for tracking user location and associated activity and responsively providing mobile device updates |
US8768308B2 (en) | 2009-09-29 | 2014-07-01 | Deutsche Telekom Ag | Apparatus and method for creating and managing personal schedules via context-sensing and actuation |
KR20110036385A (en) | 2009-10-01 | 2011-04-07 | 삼성전자주식회사 | Apparatus for analyzing intention of user and method thereof |
US20110083079A1 (en) | 2009-10-02 | 2011-04-07 | International Business Machines Corporation | Apparatus, system, and method for improved type-ahead functionality in a type-ahead field based on activity of a user within a user interface |
US8335689B2 (en) | 2009-10-14 | 2012-12-18 | Cogi, Inc. | Method and system for efficient management of speech transcribers |
US8510103B2 (en) | 2009-10-15 | 2013-08-13 | Paul Angott | System and method for voice recognition |
US8255217B2 (en) | 2009-10-16 | 2012-08-28 | At&T Intellectual Property I, Lp | Systems and methods for creating and using geo-centric language models |
US8451112B2 (en) | 2009-10-19 | 2013-05-28 | Qualcomm Incorporated | Methods and apparatus for estimating departure time based on known calendar events |
US8554537B2 (en) | 2009-10-23 | 2013-10-08 | Samsung Electronics Co., Ltd | Method and device for transliteration |
US8326624B2 (en) | 2009-10-26 | 2012-12-04 | International Business Machines Corporation | Detecting and communicating biometrics of recorded voice during transcription process |
US9197736B2 (en) | 2009-12-31 | 2015-11-24 | Digimarc Corporation | Intuitive computing methods and systems |
US20110099507A1 (en) | 2009-10-28 | 2011-04-28 | Google Inc. | Displaying a collection of interactive elements that trigger actions directed to an item |
US8386574B2 (en) | 2009-10-29 | 2013-02-26 | Xerox Corporation | Multi-modality classification for one-class classification in social networks |
US20120137367A1 (en) | 2009-11-06 | 2012-05-31 | Cataphora, Inc. | Continuous anomaly detection based on behavior modeling and heterogeneous information analysis |
US8358747B2 (en) | 2009-11-10 | 2013-01-22 | International Business Machines Corporation | Real time automatic caller speech profiling |
WO2011059997A1 (en) | 2009-11-10 | 2011-05-19 | Voicebox Technologies, Inc. | System and method for providing a natural language content dedication service |
US9171541B2 (en) | 2009-11-10 | 2015-10-27 | Voicebox Technologies Corporation | System and method for hybrid processing in a natural language voice services environment |
WO2011057346A1 (en) | 2009-11-12 | 2011-05-19 | Robert Henry Frater | Speakerphone and/or microphone arrays and methods and systems of using the same |
US8712759B2 (en) | 2009-11-13 | 2014-04-29 | Clausal Computing Oy | Specializing disambiguation of a natural language expression |
TWI391915B (en) | 2009-11-17 | 2013-04-01 | Inst Information Industry | Method and apparatus for building phonetic variation models and speech recognition |
KR101960835B1 (en) | 2009-11-24 | 2019-03-21 | 삼성전자주식회사 | Schedule Management System Using Interactive Robot and Method Thereof |
US8396888B2 (en) | 2009-12-04 | 2013-03-12 | Google Inc. | Location-based searching using a search area that corresponds to a geographical location of a computing device |
KR101622111B1 (en) | 2009-12-11 | 2016-05-18 | 삼성전자 주식회사 | Dialog system and conversational method thereof |
US8892443B2 (en) | 2009-12-15 | 2014-11-18 | At&T Intellectual Property I, L.P. | System and method for combining geographic metadata in automatic speech recognition language and acoustic models |
US20110161309A1 (en) | 2009-12-29 | 2011-06-30 | Lx1 Technology Limited | Method Of Sorting The Result Set Of A Search Engine |
US8494852B2 (en) | 2010-01-05 | 2013-07-23 | Google Inc. | Word-level correction of speech input |
US8381107B2 (en) | 2010-01-13 | 2013-02-19 | Apple Inc. | Adaptive audio feedback system and method |
US20110179372A1 (en) | 2010-01-15 | 2011-07-21 | Bradford Allen Moore | Automatic Keyboard Layout Determination |
US8334842B2 (en) | 2010-01-15 | 2012-12-18 | Microsoft Corporation | Recognizing user intent in motion capture system |
US20110179002A1 (en) | 2010-01-19 | 2011-07-21 | Dell Products L.P. | System and Method for a Vector-Space Search Engine |
US8626511B2 (en) | 2010-01-22 | 2014-01-07 | Google Inc. | Multi-dimensional disambiguation of voice commands |
US8600967B2 (en) | 2010-02-03 | 2013-12-03 | Apple Inc. | Automatic organization of browsing histories |
US8645287B2 (en) | 2010-02-04 | 2014-02-04 | Microsoft Corporation | Image tagging based upon cross domain context |
US8179370B1 (en) | 2010-02-09 | 2012-05-15 | Google Inc. | Proximity based keystroke resolution |
US9413869B2 (en) | 2010-02-10 | 2016-08-09 | Qualcomm Incorporated | Mobile device having plurality of input modes |
US8782556B2 (en) | 2010-02-12 | 2014-07-15 | Microsoft Corporation | User-centric soft keyboard predictive technologies |
US20110218855A1 (en) | 2010-03-03 | 2011-09-08 | Platformation, Inc. | Offering Promotions Based on Query Analysis |
US8521513B2 (en) | 2010-03-12 | 2013-08-27 | Microsoft Corporation | Localization for interactive voice response systems |
EP2559030B1 (en) | 2010-03-19 | 2017-06-21 | Digimarc Corporation | Intuitive computing methods and systems |
US9323756B2 (en) | 2010-03-22 | 2016-04-26 | Lenovo (Singapore) Pte. Ltd. | Audio book and e-book synchronization |
US9378202B2 (en) | 2010-03-26 | 2016-06-28 | Virtuoz Sa | Semantic clustering |
KR101369810B1 (en) | 2010-04-09 | 2014-03-05 | 이초강 | Empirical Context Aware Computing Method For Robot |
US8140567B2 (en) | 2010-04-13 | 2012-03-20 | Microsoft Corporation | Measuring entity extraction complexity |
US8265928B2 (en) | 2010-04-14 | 2012-09-11 | Google Inc. | Geotagged environmental audio for enhanced speech recognition accuracy |
US20130238647A1 (en) | 2010-04-21 | 2013-09-12 | Proteus Digital Health, Inc. | Diagnostic System and Method |
US8452037B2 (en) | 2010-05-05 | 2013-05-28 | Apple Inc. | Speaker clip |
US8380504B1 (en) | 2010-05-06 | 2013-02-19 | Sprint Communications Company L.P. | Generation of voice profiles |
US8938436B2 (en) | 2010-05-10 | 2015-01-20 | Verizon Patent And Licensing Inc. | System for and method of providing reusable software service information based on natural language queries |
US20110279368A1 (en) | 2010-05-12 | 2011-11-17 | Microsoft Corporation | Inferring user intent to engage a motion capture system |
US8392186B2 (en) | 2010-05-18 | 2013-03-05 | K-Nfb Reading Technology, Inc. | Audio synchronization for document narration with user-selected playback |
US8694313B2 (en) | 2010-05-19 | 2014-04-08 | Google Inc. | Disambiguation of contact information using historical data |
US8522283B2 (en) | 2010-05-20 | 2013-08-27 | Google Inc. | Television remote control data transfer |
US8468012B2 (en) | 2010-05-26 | 2013-06-18 | Google Inc. | Acoustic model adaptation using geographic information |
WO2011150730A1 (en) | 2010-05-31 | 2011-12-08 | 百度在线网络技术(北京)有限公司 | Method and device for mixed input in English and another kind of language |
EP2397972B1 (en) | 2010-06-08 | 2015-01-07 | Vodafone Holding GmbH | Smart card with microphone |
US20110306426A1 (en) | 2010-06-10 | 2011-12-15 | Microsoft Corporation | Activity Participation Based On User Intent |
US8234111B2 (en) | 2010-06-14 | 2012-07-31 | Google Inc. | Speech and noise models for speech recognition |
US8504404B2 (en) | 2010-06-17 | 2013-08-06 | Google Inc. | Distance and location-aware scheduling assistance in a calendar system with notification of potential conflicts |
EP2400373A1 (en) | 2010-06-22 | 2011-12-28 | Vodafone Holding GmbH | Inputting symbols into an electronic device having a touch-screen |
US8375320B2 (en) | 2010-06-22 | 2013-02-12 | Microsoft Corporation | Context-based task generation |
US8581844B2 (en) | 2010-06-23 | 2013-11-12 | Google Inc. | Switching between a first operational mode and a second operational mode using a natural motion gesture |
US8411874B2 (en) | 2010-06-30 | 2013-04-02 | Google Inc. | Removing noise from audio |
BRPI1004128A2 (en) | 2010-08-04 | 2012-04-10 | Magneti Marelli Sist S Automotivos Ind E Com Ltda | Setting Top Level Key Parameters for Biodiesel Logic Sensor |
US8775156B2 (en) | 2010-08-05 | 2014-07-08 | Google Inc. | Translating languages in response to device motion |
US8359020B2 (en) | 2010-08-06 | 2013-01-22 | Google Inc. | Automatically monitoring for voice input based on context |
US8473289B2 (en) | 2010-08-06 | 2013-06-25 | Google Inc. | Disambiguating input based on context |
WO2012030838A1 (en) | 2010-08-30 | 2012-03-08 | Honda Motor Co., Ltd. | Belief tracking and action selection in spoken dialog systems |
US20120068937A1 (en) | 2010-09-16 | 2012-03-22 | Sony Ericsson Mobile Communications Ab | Quick input language/virtual keyboard/language dictionary change on a touch screen device |
US8719014B2 (en) | 2010-09-27 | 2014-05-06 | Apple Inc. | Electronic device with text error correction based on voice recognition data |
US8644519B2 (en) | 2010-09-30 | 2014-02-04 | Apple Inc. | Electronic devices with improved audio |
US8812321B2 (en) | 2010-09-30 | 2014-08-19 | AT&T Intellectual Property I, L.P. | System and method for combining speech recognition outputs from a plurality of domain-specific speech recognizers via machine learning |
US20120108221A1 (en) | 2010-10-28 | 2012-05-03 | Microsoft Corporation | Augmenting communication sessions with applications |
US20120158422A1 (en) | 2010-12-21 | 2012-06-21 | General Electric Company | Methods and systems for scheduling appointments in healthcare systems |
US20120158293A1 (en) | 2010-12-21 | 2012-06-21 | General Electric Company | Methods and systems for dynamically providing users with appointment reminders |
US8666726B2 (en) * | 2010-12-21 | 2014-03-04 | Nuance Communications, Inc. | Sample clustering to reduce manual transcriptions in speech recognition system |
CN102651217A (en) | 2011-02-25 | 2012-08-29 | 株式会社东芝 | Method and equipment for voice synthesis and method for training acoustic model used in voice synthesis |
US8862255B2 (en) | 2011-03-23 | 2014-10-14 | Audible, Inc. | Managing playback of synchronized content |
JP2014520297A (en) | 2011-04-25 | 2014-08-21 | ベベオ,インク. | System and method for advanced personal timetable assistant |
US10057736B2 (en) | 2011-06-03 | 2018-08-21 | Apple Inc. | Active transport based notifications |
US20120310642A1 (en) | 2011-06-03 | 2012-12-06 | Apple Inc. | Automatically creating a mapping between text data and audio data |
US8768707B2 (en) | 2011-09-27 | 2014-07-01 | Sensory Incorporated | Background speech recognition assistant using speaker verification |
EP2761860B1 (en) | 2011-09-30 | 2019-10-23 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US9042867B2 (en) | 2012-02-24 | 2015-05-26 | Agnitio S.L. | System and method for speaker recognition on mobile devices |
US8515750B1 (en) * | 2012-06-05 | 2013-08-20 | Google Inc. | Realtime acoustic adaptation using stability measures |
US9819786B2 (en) | 2012-12-05 | 2017-11-14 | Facebook, Inc. | Systems and methods for a symbol-adaptable keyboard |
- 2014
- 2014-03-14 WO PCT/US2014/029050 patent/WO2014144579A1/en active Application Filing
- 2014-04-28 US US14/263,869 patent/US9697822B1/en active Active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7272224B1 (en) | 2003-03-03 | 2007-09-18 | Apple Inc. | Echo cancellation |
US20130006633A1 (en) * | 2011-07-01 | 2013-01-03 | Qualcomm Incorporated | Learning speech models for mobile device users |
Cited By (246)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10318871B2 (en) | 2005-09-08 | 2019-06-11 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US11928604B2 (en) | 2005-09-08 | 2024-03-12 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US11671920B2 (en) | 2007-04-03 | 2023-06-06 | Apple Inc. | Method and system for operating a multifunction portable electronic device using voice-activation |
US11023513B2 (en) | 2007-12-20 | 2021-06-01 | Apple Inc. | Method and apparatus for searching using an active ontology |
US10381016B2 (en) | 2008-01-03 | 2019-08-13 | Apple Inc. | Methods and apparatus for altering audio output signals |
US9865248B2 (en) | 2008-04-05 | 2018-01-09 | Apple Inc. | Intelligent text-to-speech conversion |
US10108612B2 (en) | 2008-07-31 | 2018-10-23 | Apple Inc. | Mobile device having human language translation capability with positional feedback |
US10643611B2 (en) | 2008-10-02 | 2020-05-05 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
US11900936B2 (en) | 2008-10-02 | 2024-02-13 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
US11348582B2 (en) | 2008-10-02 | 2022-05-31 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
US10795541B2 (en) | 2009-06-05 | 2020-10-06 | Apple Inc. | Intelligent organization of tasks items |
US11080012B2 (en) | 2009-06-05 | 2021-08-03 | Apple Inc. | Interface for a virtual digital assistant |
US10706841B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Task flow identification based on user intent |
US10741185B2 (en) | 2010-01-18 | 2020-08-11 | Apple Inc. | Intelligent automated assistant |
US11423886B2 (en) | 2010-01-18 | 2022-08-23 | Apple Inc. | Task flow identification based on user intent |
US10049675B2 (en) | 2010-02-25 | 2018-08-14 | Apple Inc. | User profiling for voice input processing |
US10692504B2 (en) | 2010-02-25 | 2020-06-23 | Apple Inc. | User profiling for voice input processing |
US10417405B2 (en) | 2011-03-21 | 2019-09-17 | Apple Inc. | Device access using voice authentication |
US11120372B2 (en) | 2011-06-03 | 2021-09-14 | Apple Inc. | Performing actions associated with task items that represent tasks to perform |
US11350253B2 (en) | 2011-06-03 | 2022-05-31 | Apple Inc. | Active transport based notifications |
US11069336B2 (en) | 2012-03-02 | 2021-07-20 | Apple Inc. | Systems and methods for name pronunciation |
US11321116B2 (en) | 2012-05-15 | 2022-05-03 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
US11269678B2 (en) | 2012-05-15 | 2022-03-08 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
US10079014B2 (en) | 2012-06-08 | 2018-09-18 | Apple Inc. | Name recognition system |
US9971774B2 (en) | 2012-09-19 | 2018-05-15 | Apple Inc. | Voice-based media searching |
US11557310B2 (en) | 2013-02-07 | 2023-01-17 | Apple Inc. | Voice trigger for a digital assistant |
US11862186B2 (en) | 2013-02-07 | 2024-01-02 | Apple Inc. | Voice trigger for a digital assistant |
US10714117B2 (en) | 2013-02-07 | 2020-07-14 | Apple Inc. | Voice trigger for a digital assistant |
US11636869B2 (en) | 2013-02-07 | 2023-04-25 | Apple Inc. | Voice trigger for a digital assistant |
US10978090B2 (en) | 2013-02-07 | 2021-04-13 | Apple Inc. | Voice trigger for a digital assistant |
US11388291B2 (en) | 2013-03-14 | 2022-07-12 | Apple Inc. | System and method for processing voicemail |
US11798547B2 (en) | 2013-03-15 | 2023-10-24 | Apple Inc. | Voice activated device for use with a voice-based digital assistant |
US9966060B2 (en) | 2013-06-07 | 2018-05-08 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
US10657961B2 (en) | 2013-06-08 | 2020-05-19 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
US10769385B2 (en) | 2013-06-09 | 2020-09-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
US11727219B2 (en) | 2013-06-09 | 2023-08-15 | Apple Inc. | System and method for inferring user intent from speech inputs |
US11048473B2 (en) | 2013-06-09 | 2021-06-29 | Apple Inc. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
US11314370B2 (en) | 2013-12-06 | 2022-04-26 | Apple Inc. | Method for extracting salient dialog usage from live data |
US11133008B2 (en) | 2014-05-30 | 2021-09-28 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US11257504B2 (en) | 2014-05-30 | 2022-02-22 | Apple Inc. | Intelligent assistant for home automation |
US10878809B2 (en) | 2014-05-30 | 2020-12-29 | Apple Inc. | Multi-command single utterance input method |
US10714095B2 (en) | 2014-05-30 | 2020-07-14 | Apple Inc. | Intelligent assistant for home automation |
US10417344B2 (en) | 2014-05-30 | 2019-09-17 | Apple Inc. | Exemplar-based natural language processing |
US10699717B2 (en) | 2014-05-30 | 2020-06-30 | Apple Inc. | Intelligent assistant for home automation |
US10497365B2 (en) | 2014-05-30 | 2019-12-03 | Apple Inc. | Multi-command single utterance input method |
US10083690B2 (en) | 2014-05-30 | 2018-09-25 | Apple Inc. | Better resolution when referencing to concepts |
US10657966B2 (en) | 2014-05-30 | 2020-05-19 | Apple Inc. | Better resolution when referencing to concepts |
US11699448B2 (en) | 2014-05-30 | 2023-07-11 | Apple Inc. | Intelligent assistant for home automation |
US11670289B2 (en) | 2014-05-30 | 2023-06-06 | Apple Inc. | Multi-command single utterance input method |
US11810562B2 (en) | 2014-05-30 | 2023-11-07 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US11516537B2 (en) | 2014-06-30 | 2022-11-29 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US10904611B2 (en) | 2014-06-30 | 2021-01-26 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US9668024B2 (en) | 2014-06-30 | 2017-05-30 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US11838579B2 (en) | 2014-06-30 | 2023-12-05 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US10431204B2 (en) | 2014-09-11 | 2019-10-01 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
US9986419B2 (en) | 2014-09-30 | 2018-05-29 | Apple Inc. | Social reminders |
US10438595B2 (en) | 2014-09-30 | 2019-10-08 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
US10453443B2 (en) | 2014-09-30 | 2019-10-22 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US10390213B2 (en) | 2014-09-30 | 2019-08-20 | Apple Inc. | Social reminders |
US11231904B2 (en) | 2015-03-06 | 2022-01-25 | Apple Inc. | Reducing response latency of intelligent automated assistants |
US10529332B2 (en) | 2015-03-08 | 2020-01-07 | Apple Inc. | Virtual assistant activation |
US11087759B2 (en) | 2015-03-08 | 2021-08-10 | Apple Inc. | Virtual assistant activation |
US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
US10930282B2 (en) | 2015-03-08 | 2021-02-23 | Apple Inc. | Competing devices responding to voice triggers |
US10311871B2 (en) | 2015-03-08 | 2019-06-04 | Apple Inc. | Competing devices responding to voice triggers |
US11842734B2 (en) | 2015-03-08 | 2023-12-12 | Apple Inc. | Virtual assistant activation |
US11468282B2 (en) | 2015-05-15 | 2022-10-11 | Apple Inc. | Virtual assistant in a communication session |
US10482883B2 (en) | 2015-05-27 | 2019-11-19 | Google Llc | Context-sensitive dynamic update of voice to text model in a voice-enabled electronic device |
US11087762B2 (en) | 2015-05-27 | 2021-08-10 | Google Llc | Context-sensitive dynamic update of voice to text model in a voice-enabled electronic device |
US10334080B2 (en) | 2015-05-27 | 2019-06-25 | Google Llc | Local persisting of data for selectively offline capable voice action in a voice-enabled electronic device |
US10083697B2 (en) | 2015-05-27 | 2018-09-25 | Google Llc | Local persisting of data for selectively offline capable voice action in a voice-enabled electronic device |
US10986214B2 (en) | 2015-05-27 | 2021-04-20 | Google Llc | Local persisting of data for selectively offline capable voice action in a voice-enabled electronic device |
WO2016191318A1 (en) * | 2015-05-27 | 2016-12-01 | Google Inc. | Context-sensitive dynamic update of voice to text model in a voice-enabled electronic device |
US11070949B2 (en) | 2015-05-27 | 2021-07-20 | Apple Inc. | Systems and methods for proactively identifying and surfacing relevant content on an electronic device with a touch-sensitive display |
CN107430855A (en) * | 2015-05-27 | 2017-12-01 | 谷歌公司 | Context-sensitive dynamic update of a speech-to-text model in a speech-enabled electronic device |
US9870196B2 (en) | 2015-05-27 | 2018-01-16 | Google Llc | Selective aborting of online processing of voice inputs in a voice-enabled electronic device |
US11127397B2 (en) | 2015-05-27 | 2021-09-21 | Apple Inc. | Device voice control |
US11676606B2 (en) | 2015-05-27 | 2023-06-13 | Google Llc | Context-sensitive dynamic update of voice to text model in a voice-enabled electronic device |
US9966073B2 (en) | 2015-05-27 | 2018-05-08 | Google Llc | Context-sensitive dynamic update of voice to text model in a voice-enabled electronic device |
CN107430855B (en) * | 2015-05-27 | 2020-11-24 | 谷歌有限责任公司 | Context sensitive dynamic update of a speech to text model in a speech enabled electronic device |
US10356243B2 (en) | 2015-06-05 | 2019-07-16 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US10681212B2 (en) | 2015-06-05 | 2020-06-09 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
US11947873B2 (en) | 2015-06-29 | 2024-04-02 | Apple Inc. | Virtual assistant for media playback |
US11010127B2 (en) | 2015-06-29 | 2021-05-18 | Apple Inc. | Virtual assistant for media playback |
CN107949823A (en) * | 2015-09-08 | 2018-04-20 | 苹果公司 | Zero-lag digital assistants |
US11126400B2 (en) | 2015-09-08 | 2021-09-21 | Apple Inc. | Zero latency digital assistant |
WO2017044160A1 (en) * | 2015-09-08 | 2017-03-16 | Apple Inc. | Zero latency digital assistant |
US11550542B2 (en) | 2015-09-08 | 2023-01-10 | Apple Inc. | Zero latency digital assistant |
US11809483B2 (en) | 2015-09-08 | 2023-11-07 | Apple Inc. | Intelligent automated assistant for media search and playback |
US11954405B2 (en) | 2015-09-08 | 2024-04-09 | Apple Inc. | Zero latency digital assistant |
US11853536B2 (en) | 2015-09-08 | 2023-12-26 | Apple Inc. | Intelligent automated assistant in a media environment |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US11500672B2 (en) | 2015-09-08 | 2022-11-15 | Apple Inc. | Distributed personal assistant |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
US10366158B2 (en) | 2015-09-29 | 2019-07-30 | Apple Inc. | Efficient word encoding for recurrent neural network language models |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US11809886B2 (en) | 2015-11-06 | 2023-11-07 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US11526368B2 (en) | 2015-11-06 | 2022-12-13 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US11886805B2 (en) | 2015-11-09 | 2024-01-30 | Apple Inc. | Unconventional virtual assistant interactions |
US10354652B2 (en) | 2015-12-02 | 2019-07-16 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US10942703B2 (en) | 2015-12-23 | 2021-03-09 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US10446143B2 (en) | 2016-03-14 | 2019-10-15 | Apple Inc. | Identification of voice inputs providing credentials |
US9934775B2 (en) | 2016-05-26 | 2018-04-03 | Apple Inc. | Unit-selection text-to-speech synthesis based on predicted concatenation parameters |
ES2644887A1 (en) * | 2016-05-31 | 2017-11-30 | Xesol I Mas D Mas I, S.L. | Method of voice interaction for communication while driving vehicles, and device that implements it (Machine-translation by Google Translate, not legally binding) |
US9972304B2 (en) | 2016-06-03 | 2018-05-15 | Apple Inc. | Privacy preserving distributed evaluation framework for embedded personalized systems |
US11227589B2 (en) | 2016-06-06 | 2022-01-18 | Apple Inc. | Intelligent list reading |
US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple Inc. | Intelligent automated assistant for media exploration |
US11069347B2 (en) | 2016-06-08 | 2021-07-20 | Apple Inc. | Intelligent automated assistant for media exploration |
US10354011B2 (en) | 2016-06-09 | 2019-07-16 | Apple Inc. | Intelligent automated assistant in a home environment |
US10509862B2 (en) | 2016-06-10 | 2019-12-17 | Apple Inc. | Dynamic phrase expansion of language input |
US10733993B2 (en) | 2016-06-10 | 2020-08-04 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10490187B2 (en) | 2016-06-10 | 2019-11-26 | Apple Inc. | Digital assistant providing automated status report |
US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
US11037565B2 (en) | 2016-06-10 | 2021-06-15 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US11657820B2 (en) | 2016-06-10 | 2023-05-23 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10192552B2 (en) | 2016-06-10 | 2019-01-29 | Apple Inc. | Digital assistant providing whispered speech |
US11809783B2 (en) | 2016-06-11 | 2023-11-07 | Apple Inc. | Intelligent device arbitration and control |
US10580409B2 (en) | 2016-06-11 | 2020-03-03 | Apple Inc. | Application integration with a digital assistant |
US10942702B2 (en) | 2016-06-11 | 2021-03-09 | Apple Inc. | Intelligent device arbitration and control |
US11749275B2 (en) | 2016-06-11 | 2023-09-05 | Apple Inc. | Application integration with a digital assistant |
US11152002B2 (en) | 2016-06-11 | 2021-10-19 | Apple Inc. | Application integration with a digital assistant |
US10297253B2 (en) | 2016-06-11 | 2019-05-21 | Apple Inc. | Application integration with a digital assistant |
US10089072B2 (en) | 2016-06-11 | 2018-10-02 | Apple Inc. | Intelligent device arbitration and control |
US10269345B2 (en) | 2016-06-11 | 2019-04-23 | Apple Inc. | Intelligent task discovery |
US10521466B2 (en) | 2016-06-11 | 2019-12-31 | Apple Inc. | Data driven natural language event detection and classification |
US10474753B2 (en) | 2016-09-07 | 2019-11-12 | Apple Inc. | Language identification using recurrent neural networks |
US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
US10553215B2 (en) | 2016-09-23 | 2020-02-04 | Apple Inc. | Intelligent automated assistant |
KR20180040426A (en) * | 2016-10-12 | 2018-04-20 | 삼성전자주식회사 | Electronic apparatus and Method for controlling electronic apparatus thereof |
US10418027B2 (en) | 2016-10-12 | 2019-09-17 | Samsung Electronics Co., Ltd. | Electronic device and method for controlling the same |
WO2018070780A1 (en) * | 2016-10-12 | 2018-04-19 | Samsung Electronics Co., Ltd. | Electronic device and method for controlling the same |
KR102623272B1 (en) | 2016-10-12 | 2024-01-11 | 삼성전자주식회사 | Electronic apparatus and Method for controlling electronic apparatus thereof |
US20180102125A1 (en) * | 2016-10-12 | 2018-04-12 | Samsung Electronics Co., Ltd. | Electronic device and method for controlling the same |
US11281993B2 (en) | 2016-12-05 | 2022-03-22 | Apple Inc. | Model and ensemble compression for metric learning |
US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
US11656884B2 (en) | 2017-01-09 | 2023-05-23 | Apple Inc. | Application integration with a digital assistant |
US11204787B2 (en) | 2017-01-09 | 2021-12-21 | Apple Inc. | Application integration with a digital assistant |
US10741181B2 (en) | 2017-05-09 | 2020-08-11 | Apple Inc. | User interface for correcting recognition errors |
US10332518B2 (en) | 2017-05-09 | 2019-06-25 | Apple Inc. | User interface for correcting recognition errors |
US10417266B2 (en) | 2017-05-09 | 2019-09-17 | Apple Inc. | Context-aware ranking of intelligent response suggestions |
US10847142B2 (en) | 2017-05-11 | 2020-11-24 | Apple Inc. | Maintaining privacy of personal information |
US10755703B2 (en) | 2017-05-11 | 2020-08-25 | Apple Inc. | Offline personal assistant |
US10726832B2 (en) | 2017-05-11 | 2020-07-28 | Apple Inc. | Maintaining privacy of personal information |
US11467802B2 (en) | 2017-05-11 | 2022-10-11 | Apple Inc. | Maintaining privacy of personal information |
US10395654B2 (en) | 2017-05-11 | 2019-08-27 | Apple Inc. | Text normalization based on a data-driven learning network |
US11599331B2 (en) | 2017-05-11 | 2023-03-07 | Apple Inc. | Maintaining privacy of personal information |
US11837237B2 (en) | 2017-05-12 | 2023-12-05 | Apple Inc. | User-specific acoustic models |
US11580990B2 (en) | 2017-05-12 | 2023-02-14 | Apple Inc. | User-specific acoustic models |
US11405466B2 (en) | 2017-05-12 | 2022-08-02 | Apple Inc. | Synchronization and task delegation of a digital assistant |
US10791176B2 (en) | 2017-05-12 | 2020-09-29 | Apple Inc. | Synchronization and task delegation of a digital assistant |
US10789945B2 (en) | 2017-05-12 | 2020-09-29 | Apple Inc. | Low-latency intelligent automated assistant |
US10410637B2 (en) | 2017-05-12 | 2019-09-10 | Apple Inc. | User-specific acoustic models |
US11380310B2 (en) | 2017-05-12 | 2022-07-05 | Apple Inc. | Low-latency intelligent automated assistant |
US11538469B2 (en) | 2017-05-12 | 2022-12-27 | Apple Inc. | Low-latency intelligent automated assistant |
US11862151B2 (en) | 2017-05-12 | 2024-01-02 | Apple Inc. | Low-latency intelligent automated assistant |
US11301477B2 (en) | 2017-05-12 | 2022-04-12 | Apple Inc. | Feedback analysis of a digital assistant |
US10810274B2 (en) | 2017-05-15 | 2020-10-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
US10482874B2 (en) | 2017-05-15 | 2019-11-19 | Apple Inc. | Hierarchical belief states for digital assistants |
US10303715B2 (en) | 2017-05-16 | 2019-05-28 | Apple Inc. | Intelligent automated assistant for media exploration |
US10403278B2 (en) | 2017-05-16 | 2019-09-03 | Apple Inc. | Methods and systems for phonetic matching in digital assistant services |
US10311144B2 (en) | 2017-05-16 | 2019-06-04 | Apple Inc. | Emoji word sense disambiguation |
US11675829B2 (en) | 2017-05-16 | 2023-06-13 | Apple Inc. | Intelligent automated assistant for media exploration |
US10748546B2 (en) | 2017-05-16 | 2020-08-18 | Apple Inc. | Digital assistant services based on device capabilities |
US10909171B2 (en) | 2017-05-16 | 2021-02-02 | Apple Inc. | Intelligent automated assistant for media exploration |
US11532306B2 (en) | 2017-05-16 | 2022-12-20 | Apple Inc. | Detecting a trigger of a digital assistant |
US11217255B2 (en) | 2017-05-16 | 2022-01-04 | Apple Inc. | Far-field extension for digital assistant services |
US10657328B2 (en) | 2017-06-02 | 2020-05-19 | Apple Inc. | Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling |
US10445429B2 (en) | 2017-09-21 | 2019-10-15 | Apple Inc. | Natural language understanding using vocabularies with compressed serialized tries |
US10755051B2 (en) | 2017-09-29 | 2020-08-25 | Apple Inc. | Rule-based natural language processing |
US10636424B2 (en) | 2017-11-30 | 2020-04-28 | Apple Inc. | Multi-turn canned dialog |
US10733982B2 (en) | 2018-01-08 | 2020-08-04 | Apple Inc. | Multi-directional dialog |
US10733375B2 (en) | 2018-01-31 | 2020-08-04 | Apple Inc. | Knowledge-based framework for improving natural language understanding |
US10789959B2 (en) | 2018-03-02 | 2020-09-29 | Apple Inc. | Training speaker recognition models for digital assistants |
US10592604B2 (en) | 2018-03-12 | 2020-03-17 | Apple Inc. | Inverse text normalization for automatic speech recognition |
US11264049B2 (en) | 2018-03-12 | 2022-03-01 | Cypress Semiconductor Corporation | Systems and methods for capturing noise for pattern recognition processing |
US10332543B1 (en) | 2018-03-12 | 2019-06-25 | Cypress Semiconductor Corporation | Systems and methods for capturing noise for pattern recognition processing |
WO2019177699A1 (en) * | 2018-03-12 | 2019-09-19 | Cypress Semiconductor Corporation | Systems and methods for capturing noise for pattern recognition processing |
US11710482B2 (en) | 2018-03-26 | 2023-07-25 | Apple Inc. | Natural assistant interaction |
US10818288B2 (en) | 2018-03-26 | 2020-10-27 | Apple Inc. | Natural assistant interaction |
US10909331B2 (en) | 2018-03-30 | 2021-02-02 | Apple Inc. | Implicit identification of translation payload with neural machine translation |
US11487364B2 (en) | 2018-05-07 | 2022-11-01 | Apple Inc. | Raise to speak |
US11169616B2 (en) | 2018-05-07 | 2021-11-09 | Apple Inc. | Raise to speak |
US11145294B2 (en) | 2018-05-07 | 2021-10-12 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
US11854539B2 (en) | 2018-05-07 | 2023-12-26 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
US11900923B2 (en) | 2018-05-07 | 2024-02-13 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
US11907436B2 (en) | 2018-05-07 | 2024-02-20 | Apple Inc. | Raise to speak |
EP3779966A4 (en) * | 2018-05-10 | 2021-11-17 | Llsollu Co., Ltd. | Artificial intelligence service method and device therefor |
JP2021529978A (en) * | 2018-05-10 | 2021-11-04 | エル ソルー カンパニー, リミテッド (Llsollu Co., Ltd.) | Artificial intelligence service method and equipment for it |
US10984780B2 (en) | 2018-05-21 | 2021-04-20 | Apple Inc. | Global semantic word embeddings using bi-directional recurrent neural networks |
US11630525B2 (en) | 2018-06-01 | 2023-04-18 | Apple Inc. | Attention aware virtual assistant dismissal |
US11386266B2 (en) | 2018-06-01 | 2022-07-12 | Apple Inc. | Text correction |
US11360577B2 (en) | 2018-06-01 | 2022-06-14 | Apple Inc. | Attention aware virtual assistant dismissal |
US11009970B2 (en) | 2018-06-01 | 2021-05-18 | Apple Inc. | Attention aware virtual assistant dismissal |
US11495218B2 (en) | 2018-06-01 | 2022-11-08 | Apple Inc. | Virtual assistant operation in multi-device environments |
US11431642B2 (en) | 2018-06-01 | 2022-08-30 | Apple Inc. | Variable latency device coordination |
US10684703B2 (en) | 2018-06-01 | 2020-06-16 | Apple Inc. | Attention aware virtual assistant dismissal |
US10892996B2 (en) | 2018-06-01 | 2021-01-12 | Apple Inc. | Variable latency device coordination |
US10720160B2 (en) | 2018-06-01 | 2020-07-21 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
US10403283B1 (en) | 2018-06-01 | 2019-09-03 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
US10984798B2 (en) | 2018-06-01 | 2021-04-20 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
US10496705B1 (en) | 2018-06-03 | 2019-12-03 | Apple Inc. | Accelerated task performance |
US10504518B1 (en) | 2018-06-03 | 2019-12-10 | Apple Inc. | Accelerated task performance |
US10944859B2 (en) | 2018-06-03 | 2021-03-09 | Apple Inc. | Accelerated task performance |
CN109302528A (en) * | 2018-08-21 | 2019-02-01 | 努比亚技术有限公司 | Photographing method, mobile terminal and computer-readable storage medium |
US11010561B2 (en) | 2018-09-27 | 2021-05-18 | Apple Inc. | Sentiment prediction from textual data |
US10839159B2 (en) | 2018-09-28 | 2020-11-17 | Apple Inc. | Named entity normalization in a spoken dialog system |
US11893992B2 (en) | 2018-09-28 | 2024-02-06 | Apple Inc. | Multi-modal inputs for voice commands |
US11462215B2 (en) | 2018-09-28 | 2022-10-04 | Apple Inc. | Multi-modal inputs for voice commands |
US11170166B2 (en) | 2018-09-28 | 2021-11-09 | Apple Inc. | Neural typographical error modeling via generative adversarial networks |
US11475898B2 (en) | 2018-10-26 | 2022-10-18 | Apple Inc. | Low-latency multi-speaker speech recognition |
CN109412900A (en) * | 2018-12-04 | 2019-03-01 | 腾讯科技(深圳)有限公司 | Network state recognition method, and model training method and device |
CN109412900B (en) * | 2018-12-04 | 2020-08-21 | 腾讯科技(深圳)有限公司 | Network state recognition method, model training method and model training device |
US11638059B2 (en) | 2019-01-04 | 2023-04-25 | Apple Inc. | Content playback on multiple devices |
US11783815B2 (en) | 2019-03-18 | 2023-10-10 | Apple Inc. | Multimodality in digital assistant systems |
US11348573B2 (en) | 2019-03-18 | 2022-05-31 | Apple Inc. | Multimodality in digital assistant systems |
US11423908B2 (en) | 2019-05-06 | 2022-08-23 | Apple Inc. | Interpreting spoken requests |
US11475884B2 (en) | 2019-05-06 | 2022-10-18 | Apple Inc. | Reducing digital assistant latency when a language is incorrectly determined |
US11675491B2 (en) | 2019-05-06 | 2023-06-13 | Apple Inc. | User configurable task triggers |
US11307752B2 (en) | 2019-05-06 | 2022-04-19 | Apple Inc. | User configurable task triggers |
US11705130B2 (en) | 2019-05-06 | 2023-07-18 | Apple Inc. | Spoken notifications |
US11217251B2 (en) | 2019-05-06 | 2022-01-04 | Apple Inc. | Spoken notifications |
US11140099B2 (en) | 2019-05-21 | 2021-10-05 | Apple Inc. | Providing message response suggestions |
US11888791B2 (en) | 2019-05-21 | 2024-01-30 | Apple Inc. | Providing message response suggestions |
US11289073B2 (en) | 2019-05-31 | 2022-03-29 | Apple Inc. | Device text to speech |
US11496600B2 (en) | 2019-05-31 | 2022-11-08 | Apple Inc. | Remote execution of machine-learned models |
US11237797B2 (en) | 2019-05-31 | 2022-02-01 | Apple Inc. | User activity shortcut suggestions |
US11360739B2 (en) | 2019-05-31 | 2022-06-14 | Apple Inc. | User activity shortcut suggestions |
US11657813B2 (en) | 2019-05-31 | 2023-05-23 | Apple Inc. | Voice identification in digital assistant systems |
US11790914B2 (en) | 2019-06-01 | 2023-10-17 | Apple Inc. | Methods and user interfaces for voice-based control of electronic devices |
US11360641B2 (en) | 2019-06-01 | 2022-06-14 | Apple Inc. | Increasing the relevance of new available information |
US11488406B2 (en) | 2019-09-25 | 2022-11-01 | Apple Inc. | Text detection using global geometry estimators |
CN113261056A (en) * | 2019-12-04 | 2021-08-13 | 谷歌有限责任公司 | Speaker perception using speaker-dependent speech models |
US11914848B2 (en) | 2020-05-11 | 2024-02-27 | Apple Inc. | Providing relevant data items based on context |
US11765209B2 (en) | 2020-05-11 | 2023-09-19 | Apple Inc. | Digital assistant hardware abstraction |
US11924254B2 (en) | 2020-05-11 | 2024-03-05 | Apple Inc. | Digital assistant hardware abstraction |
US11755276B2 (en) | 2020-05-12 | 2023-09-12 | Apple Inc. | Reducing description length based on confidence |
US11838734B2 (en) | 2020-07-20 | 2023-12-05 | Apple Inc. | Multi-device audio adjustment coordination |
US11696060B2 (en) | 2020-07-21 | 2023-07-04 | Apple Inc. | User identification using headphones |
US11750962B2 (en) | 2020-07-21 | 2023-09-05 | Apple Inc. | User identification using headphones |
EP4123639A3 (en) * | 2021-11-08 | 2023-02-22 | Apollo Intelligent Connectivity (Beijing) Technology Co., Ltd. | Wake-up control for a speech controlled device |
Also Published As
Publication number | Publication date |
---|---|
US9697822B1 (en) | 2017-07-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11557310B2 (en) | | Voice trigger for a digital assistant |
US9697822B1 (en) | | System and method for updating an adaptive speech recognition model |
US20200365155A1 (en) | | Voice activated device for use with a voice-based digital assistant |
AU2015101078B4 (en) | | Voice trigger for a digital assistant |
AU2022203177B2 (en) | | Voice trigger for a digital assistant |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 14726454; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 14726454; Country of ref document: EP; Kind code of ref document: A1 |