US20090172546A1 - Search-based dynamic voice activation - Google Patents

Search-based dynamic voice activation

Info

Publication number
US20090172546A1
US20090172546A1
Authority
US
United States
Prior art keywords
gui
voice
items
user
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US12/126,077
Inventor
Yan Ming CHANG
Changxue Ma
Ted Mazurkiewicz
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google Technology Holdings LLC
Original Assignee
Motorola Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Motorola Inc filed Critical Motorola Inc
Priority to US12/126,077
Assigned to MOTOROLA, INC. (assignment of assignors interest; see document for details). Assignors: MAZURKIEWICZ, TED; CHANG, YAN MING; MA, CHANGXUE
Publication of US20090172546A1
Assigned to Motorola Mobility, Inc. (assignment of assignors interest; see document for details). Assignors: MOTOROLA, INC.
Assigned to MOTOROLA MOBILITY LLC (assignment of assignors interest; see document for details). Assignors: MOTOROLA MOBILITY, INC.
Priority to US14/464,016 (US10664229B2)
Assigned to Google Technology Holdings LLC (assignment of assignors interest; see document for details). Assignors: MOTOROLA MOBILITY LLC

Classifications

    • G – PHYSICS
    • G06 – COMPUTING; CALCULATING OR COUNTING
    • G06F – ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 – Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 – Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03 – Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033 – Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/038 – Control and interface arrangements therefor, e.g. drivers or device-embedded control circuitry
    • G – PHYSICS
    • G06 – COMPUTING; CALCULATING OR COUNTING
    • G06F – ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 – Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 – Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 – Querying
    • G06F16/245 – Query processing
    • G – PHYSICS
    • G06 – COMPUTING; CALCULATING OR COUNTING
    • G06F – ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 – Pattern recognition
    • G – PHYSICS
    • G06 – COMPUTING; CALCULATING OR COUNTING
    • G06F – ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 – Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 – Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 – Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481 – Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G – PHYSICS
    • G06 – COMPUTING; CALCULATING OR COUNTING
    • G06F – ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 – Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 – Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 – Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481 – Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G06F3/04817 – Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance using icons
    • G – PHYSICS
    • G06 – COMPUTING; CALCULATING OR COUNTING
    • G06F – ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 – Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 – Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 – Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484 – Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04842 – Selection of displayed objects or displayed text elements
    • G – PHYSICS
    • G06 – COMPUTING; CALCULATING OR COUNTING
    • G06F – ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 – Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 – Sound input; Sound output
    • G06F3/167 – Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G – PHYSICS
    • G06 – COMPUTING; CALCULATING OR COUNTING
    • G06F – ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 – Handling natural language data
    • G06F40/40 – Processing or translation of natural language
    • G – PHYSICS
    • G09 – EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B – EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00 – Electrically-operated educational appliances
    • G09B5/04 – Electrically-operated educational appliances with audible presentation of the material to be studied
    • G – PHYSICS
    • G10 – MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L – SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 – Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/0018 – Speech coding using phonetic or linguistical decoding of the source; Reconstruction using text-to-speech synthesis
    • G – PHYSICS
    • G06 – COMPUTING; CALCULATING OR COUNTING
    • G06F – ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 – Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 – Sound input; Sound output
    • G – PHYSICS
    • G10 – MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L – SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 – Speech recognition
    • G10L15/22 – Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/226 – Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
    • G10L2015/228 – Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context

Definitions

  • Client software and databases may be accessed by the controller or processor 104 from the memory, and may include, for example, database applications, word processing applications, video processing applications as well as components that embody the decision support functionality of the present invention.
  • the user access data may be stored in either a database accessible through a database interface or in the memory.
  • a graphical user interface may allow the user to interact with a series of data objects stored in a computer or on the internet.
  • a data object may be a file, webpage, an application, or other coherent set of computer data.
  • the term “computer data” may refer to data found on the internet.
  • the GUI may represent each data object with a GUI item, such as a hyperlink, soft-button, image, icon, or other representation of the data object. The GUI need not distinguish between data objects stored on the computer and those retrieved from the internet.
  • FIG. 2 illustrates in a block diagram one embodiment of a GUI.
  • the user interface 110 of the computing device 100 may be a display 202 .
  • the computing device 100 may interact with the user using a graphical user interface 204 .
  • a standard GUI 204 may present to a user one or more GUI items, such as icons 206 representing one or more data file objects on the display 202 .
  • a GUI item may be any representation shown in a GUI that acts as an input signal to open some type of data object.
  • the GUI may be a browser 208 to present a webpage to a user.
  • the webpage may have images 210 that link to other web pages.
  • the web pages may have an icon or button 212 to activate a web application.
  • the webpage may have hyperlinks 214 linking to other web pages buried within the set of text 216 presented on the webpage.
  • GUIs with a large number of GUI items may be impractical for prompted voice navigation.
  • the GUI items from a view of a GUI may be harvested and dynamically translated into voice search indices.
  • a voice user interface (VUI) may use the search indices to form a view-specific searchable database.
  • the view of the display 202 may be voice-enabled just in time.
  • FIG. 3 illustrates in a block diagram one embodiment of an invisible verbal user interface program 300 .
  • the display 202 may show a GUI 302 to the user.
  • a GUI items harvester module 304 may search the GUI 302 for GUI items.
  • GUI items may include hyperlinks 214 , images 210 , application icons 206 , and other graphic images that lead to a data object.
  • a data object may be a file, webpage, an application, or other coherent set of computer data.
  • the GUI items harvester module 304 may collect all the GUI items in the GUI 302 , as well as any contextual data associated with the GUI items.
  • a parser 306 , such as a text normalization module or a grapheme-to-phoneme module, may convert each GUI item in the GUI 302 into a searchable index in the form of a linguistic document.
  • the parser 306 may take into account linguistic surface form, surrounding texts, hyperlinked webpage titles, metadata, and other data associated with the GUI item.
  • a database of GUI item indices 308 may organize the linguistic documents into a searchable database to facilitate searching.
  • the VUI may convert a verbal input into a phoneme lattice to match against the searchable indices from the view-specific searchable database.
  • a voice input mechanism 310 may receive a verbal input from a user.
  • a phoneme decoder 312 , or other voice recognition technology, may convert the verbal input into a phoneme lattice.
  • a search term generator 314 may extract linguistic search terms from the phoneme lattice, such as a phoneme, syllable, or word string.
  • a GUI items search engine 316 may take the linguistic search term and search the GUI item indices 308 . The GUI items search engine 316 may select the matching GUI item and perform the navigation action associated with that item in the GUI 302 .
  • FIG. 4 illustrates in a block diagram one embodiment of voice searchable indices 400 .
  • the voice searchable indices 400 may be initially sorted by number of words (WRD) 402 .
  • the voice searchable indices 400 may be further sorted by phonemes (PHO) 404 , the phonemes arranged in spoken order.
  • the voice searchable indices 400 may include a GUI item type 406 , such as image, hyperlink, application icon, or other GUI item type.
  • the voice searchable indices 400 may also include an associated grapheme or commonly used name of the GUI item (GRPH) 408 , such as picture, button, arrow, or other names.
  • the voice searchable indices 400 may have a set of alternate linguistic labels (ALT) 410 to identify the GUI item, especially if the GUI item is an image or other GUI item that may be thought to have multiple label names by the user.
  • the voice searchable indices 400 may include a link to the computer object (OBJ) 412 represented by the GUI item.
  • the VUI 300 may create a just-in-time, voice-enabled searchable database from a view of the GUI.
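The index fields described above can be pictured as one record per GUI item. This sketch borrows the FIG. 4 abbreviations (WRD, PHO, GRPH, ALT, OBJ) as field names, but the record layout and sort helper are otherwise assumptions:

```python
# Hypothetical layout for one voice-searchable index entry (FIG. 4).

from dataclasses import dataclass, field

@dataclass
class IndexEntry:
    wrd: int            # number of words (initial sort key)
    pho: list[str]      # phonemes in spoken order (secondary sort key)
    item_type: str      # "image", "hyperlink", "application icon", ...
    grph: str           # grapheme / commonly used name
    alt: list[str] = field(default_factory=list)  # alternate linguistic labels
    obj: str = ""       # link to the represented computer object

def sort_indices(entries: list[IndexEntry]) -> list[IndexEntry]:
    """Sort first by word count, then by the phoneme sequence."""
    return sorted(entries, key=lambda e: (e.wrd, e.pho))
```

Sorting by word count first, then by phonemes in spoken order, mirrors the two-level ordering the text attributes to the indices 400.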
  • FIG. 5 illustrates in a flowchart one embodiment of a method 500 for developing a voice searchable indices 400 .
  • the computing device 100 may display a GUI to the user (Block 502 ).
  • the computing device 100 may identify a GUI item (GUII) of the GUI (Block 504 ). If the GUII is a non-textual GUII, such as an image or unlabeled icon (Block 506 ), the computing device 100 may develop alternate linguistic labels for the GUII (Block 508 ).
  • the computing device 100 may create a textual description for a GUII based on metadata, commonly depicted names, surrounding text, labels, graphemes, and other data.
  • the computing device 100 may convert the GUII to a linguistic document (LD) (Block 510 ).
  • the computing device 100 may organize the LDs into a searchable database of GUII indices (Block 512 ).
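The flow of method 500 (identify each item, label non-textual items, convert to a linguistic document, organize into a database) might be sketched as below; the helper names and the token-set form of a linguistic document are assumptions:

```python
# Sketch of method 500 (FIG. 5). The non-textual branch mirrors
# Blocks 506-508; everything else is a hypothetical stand-in.

def describe_non_textual(item: dict) -> list[str]:
    """Block 508: derive alternate labels from metadata and
    surrounding text for an image or unlabeled icon."""
    return [item.get("metadata", ""), item.get("surrounding_text", "")]

def build_gui_item_indices(gui_items: list[dict]) -> dict[str, dict]:
    database = {}
    for item in gui_items:                       # Block 504: identify each GUII
        labels = [item.get("label", "")]
        if not item.get("label"):                # Block 506: non-textual GUII?
            labels = describe_non_textual(item)  # Block 508: alternate labels
        # Block 510: convert the GUII to a linguistic document (token set)
        doc = {tok.lower() for text in labels for tok in text.split()}
        database[item["id"]] = {"doc": doc, "item": item}  # Block 512
    return database
```

An unlabeled logo image would thus be indexed under words drawn from its metadata and surrounding text rather than a (missing) visible label.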
  • FIG. 6 illustrates in a flowchart one embodiment of a method 600 for invisible voice navigation.
  • the VUI 300 may receive a verbal input from the user (Block 602 ).
  • the VUI 300 may identify a set of possible matching GUIIs (Block 604 ).
  • the VUI 300 may designate a primary matching GUII, or closest verbal match, and a set of one or more alternate GUIIs from the set of possible matching GUIIs (Block 606 ).
  • the VUI 300 may identify a primary matching GUII and one or more alternate GUIIs (Block 608 ).
  • the VUI 300 may present a computer object (CO) associated with the primary matching GUII (Block 610 ).
  • the VUI 300 may present the alternate GUIIs to the user (Block 612 ).
  • the VUI 300 may present an approximation of the computer objects associated with the alternate GUIIs. If the user selects one of the alternate GUIIs (Block 614 ), the VUI 300 may present the computer object associated with the selected alternate GUII (Block 616 ).
  • the VUI 300 may keep a history of various users in order to determine which GUII to present as the primary matching GUII and which GUIIs to present as the alternates during repeated uses of the VUI 300 .
  • the VUI 300 may track if a specific verbal input is repeatedly used when referring to a specific GUII of a specific GUI.
  • the VUI 300 may then present that GUII as the primary matching GUII.
  • the VUI 300 may use the histories of other users to determine a primary matching GUII when multiple GUIIs have a similar linguistic document.
  • the VUI 300 may briefly present the alternate GUI items to the user in a pop-up window.
  • the pop-up window may be removed if no item is selected after a set period of time. If one of the alternate GUI items is selected, the VUI 300 may execute the navigation action associated with the selected GUI item and override the initially presented view.
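The primary/alternate selection of method 600, biased by a per-utterance usage history as described above, might look like this sketch; the scoring inputs, data shapes, and tie-breaking rule are assumptions:

```python
# Sketch of primary/alternate designation in method 600 (FIG. 6) with a
# usage history; not the patent's implementation.

from collections import defaultdict

# (utterance, item) -> how often the user chose that item for that utterance
history: dict[tuple[str, str], int] = defaultdict(int)

def rank_matches(utterance: str, scored: dict[str, float], n_alt: int = 2):
    """Return (primary, alternates): highest match score wins, with the
    history breaking ties in favor of repeatedly chosen items."""
    ranked = sorted(
        scored,
        key=lambda item: (scored[item], history[(utterance, item)]),
        reverse=True,
    )
    return ranked[0], ranked[1:1 + n_alt]

def record_choice(utterance: str, item: str) -> None:
    """Blocks 614-616: remember which alternate the user selected,
    so it can be promoted to primary on repeated use."""
    history[(utterance, item)] += 1
```

After the user overrides the initial view once, the selected item outranks an equally scored competitor the next time the same utterance is heard.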
  • FIG. 7 illustrates in a block diagram one embodiment of a graphical voice navigation response 702 with alternate computer objects.
  • the browser 208 may present a computer object 702 associated with the matching GUII.
  • the browser 208 may also present approximate representations of the computer objects 704 associated with the next closest matches to the linguistic search term. If the user does not select one of the alternates after a set period of time, the alternate computer objects may be removed from the browser 208 .
  • Embodiments may also be practiced in distributed computing environments where tasks are performed by local and remote processing devices that are linked (either by hardwired links, wireless links, or by a combination thereof) through a communications network.
  • Embodiments within the scope of the present invention may also include computer-readable media for carrying or having computer-executable instructions or data structures stored thereon.
  • Such computer-readable media can be any available media that can be accessed by a general purpose or special purpose computer.
  • Such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store desired program code means in the form of computer-executable instructions or data structures.
  • when information is transferred or provided over a network or another communications connection (either hardwired, wireless, or a combination thereof) to a computer, the computer properly views the connection as a computer-readable medium.
  • any such connection is properly termed a computer-readable medium. Combinations of the above should also be included within the scope of the computer-readable media.
  • Computer-executable instructions include, for example, instructions and data which cause a general purpose computer, special purpose computer, or special purpose processing device to perform a certain function or group of functions.
  • Computer-executable instructions also include program modules that are executed by computers in stand-alone or network environments.
  • generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
  • Computer-executable instructions, associated data structures, and program modules represent examples of the program code means for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps.

Abstract

A method, apparatus, and electronic device for voice navigation are disclosed. A voice input mechanism 310 may receive a verbal input from a user to a voice user interface program invisible to the user. A processor 104 may identify in a graphical user interface (GUI) a set of GUI items. The processor 104 may convert the set of GUI items to a set of voice searchable indices 400. The processor 104 may correlate a matching GUI item of the set of GUI items to a phonemic representation of the verbal input.

Description

    1. FIELD OF THE INVENTION
  • The present invention relates to a method and system for voice navigation. The present invention further relates to voice navigation as relating to graphical user interface items.
  • 2. INTRODUCTION
  • Voice recognition software has historically performed less than ideally. Most software programs that perform voice recognition based navigation have previously done so by constructing a voice dialogue application statically for each view of a graphical user interface (GUI). To do this, for each view of a GUI, a dialogue application has to anticipate every grammar and vocabulary choice of the user. This process may significantly impede browsing and navigation.
  • Web content providers may currently use VoiceXML® for voice navigation or browsing by voice-enabling web pages. VoiceXML® uses a static voice navigation system, which does not allow for much flexibility. VoiceXML® coverage may not extend to the entire webpage.
  • SUMMARY OF THE INVENTION
  • A method, apparatus, and electronic device for voice navigation are disclosed. A voice input mechanism may receive a verbal input from a user to a voice user interface program invisible to the user. A processor may identify in a graphical user interface (GUI) a set of GUI items. The processor may convert the set of GUI items to a set of voice searchable indices. The processor may correlate a matching GUI item of the set of GUI items to a phonemic representation of the verbal input.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • Understanding that these drawings depict only typical embodiments of the invention and are not therefore to be considered to be limiting of its scope, the invention will be described and explained with additional specificity and detail through the use of the accompanying drawings in which:
  • FIG. 1 illustrates in a block diagram one embodiment of a computing device that may be used to implement the communication protocol management method.
  • FIG. 2 illustrates in a block diagram one embodiment of a graphical user interface.
  • FIG. 3 illustrates in a block diagram one embodiment of verbal user interface software application.
  • FIG. 4 illustrates in a block diagram one embodiment of voice searchable indices.
  • FIG. 5 illustrates in a flowchart one embodiment of a method for developing voice searchable indices.
  • FIG. 6 illustrates in a flowchart one embodiment of a method for invisible voice navigation.
  • FIG. 7 illustrates in a block diagram one embodiment of a graphical voice navigation response.
  • DETAILED DESCRIPTION OF THE INVENTION
  • Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The features and advantages of the invention may be realized and obtained by means of the instruments and combinations particularly pointed out in the appended claims. These and other features of the present invention will become more fully apparent from the following description and appended claims, or may be learned by the practice of the invention as set forth herein.
  • Various embodiments of the invention are discussed in detail below. While specific implementations are discussed, it should be understood that this is done for illustration purposes only. A person skilled in the relevant art will recognize that other components and configurations may be used without parting from the spirit and scope of the invention.
  • The present invention comprises a variety of embodiments, such as a method, an apparatus, and an electronic device, and other embodiments that relate to the basic concepts of the invention. The electronic device may be any manner of computer, mobile device, or wireless communication device.
  • A method, apparatus, and electronic device for voice navigation are disclosed. A voice input mechanism may receive a verbal input from a user to a voice user interface program invisible to the user. A processor may identify in a graphical user interface (GUI) a set of GUI items. The processor may convert the set of GUI items to a set of voice searchable indices. The processor may correlate a matching GUI item of the set of GUI items to a phonemic representation of the verbal input.
  • FIG. 1 illustrates in a block diagram one embodiment of a computing device 100 that may be used to implement a voice navigation method. Any computing device, such as a desktop computer, handheld device, or a server, may implement the voice navigation method. The computing device 100 may access the information or data stored in a network. The computing device 100 may support one or more applications for performing various communications with the network. The computing device 100 may implement any operating system, such as Windows or UNIX, for example. Client and server software may be written in any programming language, such as C, C++, Java or Visual Basic, for example. The computing device 100 may be a mobile phone, a laptop, a personal digital assistant (PDA), or other portable device. For some embodiments of the present invention, the computing device 100 may be a WiFi capable device, which may be used to access the network for data or by voice using voice over internet protocol (VoIP). The computing device 100 may include a network interface 102, such as a transceiver, to send and receive data over the network.
  • The computing device 100 may include a controller or processor 104 that executes stored programs. The controller or processor 104 may be any programmed processor known to one of skill in the art. However, the voice navigation method may also be implemented on a general-purpose or a special-purpose computer, a programmed microprocessor or microcontroller, peripheral integrated circuit elements, an application-specific integrated circuit or other integrated circuits, hardware/electronic logic circuits such as a discrete element circuit, or a programmable logic device such as a programmable logic array or field programmable gate array. In general, any device or devices capable of implementing the voice navigation method as described herein can be used to implement the voice navigation functions of this invention.
  • The computing device 100 may also include a volatile memory 106 and a non-volatile memory 108 to be used by the processor 104. The volatile memory 106 and non-volatile memory 108 may include one or more electrical, magnetic, or optical memories, such as a random access memory (RAM), cache, hard drive, or other memory device. The memory may have a cache to speed access to specific data. The memory may also be connected to a compact disc-read only memory (CD-ROM) drive, digital video disc-read only memory (DVD-ROM) drive, DVD read/write drive, tape drive, or other removable memory device that allows media content to be directly uploaded into the system.
  • The computing device 100 may include a user input interface 110 that may comprise elements such as a keypad, display, touch screen, or any other device that accepts input. The computing device 100 may also include a user output device that may comprise a display screen and an audio interface 112 that may comprise elements such as a microphone, earphone, and speaker. The computing device 100 also may include a component interface 114 to which additional elements may be attached, for example, a universal serial bus (USB) interface or an audio-video capture mechanism. Finally, the computing device 100 may include a power supply 116.
  • Client software and databases may be accessed by the controller or processor 104 from the memory, and may include, for example, database applications, word processing applications, and video processing applications, as well as components that embody the voice navigation functionality of the present invention. The user access data may be stored either in a database accessible through a database interface or in the memory.
  • A graphical user interface (GUI) may allow the user to interact with a series of data objects stored in a computer or on the internet. A data object may be a file, a webpage, an application, or other coherent set of computer data. The term “computer data” may also refer to data found on the internet. The GUI may represent each data object with a GUI item, such as a hyperlink, soft-button, image, icon, or other representation of the data object. The GUI need not distinguish between data objects stored on the computer and data objects from the internet. FIG. 2 illustrates in a block diagram one embodiment of a GUI. The user interface 110 of the computing device 100 may be a display 202. The computing device 100 may interact with the user using a graphical user interface 204. A standard GUI 204 may present to a user one or more GUI items, such as icons 206 representing one or more data file objects on the display 202. A GUI item may be any representation shown in a GUI that acts as an input signal to open some type of data object. For a computing device 100 connected to a network, such as the internet, the GUI may be a browser 208 that presents a webpage to a user. The webpage may have images 210 that link to other web pages. The web pages may have an icon or button 212 to activate a web application. Further, the webpage may have hyperlinks 214 linking to other web pages buried within the set of text 216 presented on the webpage. For items such as these browsers 208, where a number of new hyperlinks 214 are presented each time the browser is reloaded, voice recognition software that requires the construction of grammars to reflect the various ways users pronounce the hyperlinks may be impractical. GUIs with a large number of GUI items may likewise be impractical for prompted voice navigation.
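The harvesting of GUI items from a browser view can be sketched in a short example. The `GUIItemHarvester` class, the `(item_type, label, target)` tuple layout, and the choice of Python's standard-library `html.parser` are illustrative assumptions, not part of the disclosure:

```python
from html.parser import HTMLParser

class GUIItemHarvester(HTMLParser):
    """Collects GUI items (hyperlinks and images) from one view of a webpage."""

    def __init__(self):
        super().__init__()
        self.items = []          # (item_type, label, target) tuples
        self._in_anchor = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            # A hyperlink's visible text serves as its primary linguistic label.
            self._in_anchor, self._href, self._text = True, attrs["href"], []
        elif tag == "img":
            # An image's alt text stands in for its label.
            self.items.append(("image", attrs.get("alt", ""), attrs.get("src", "")))

    def handle_data(self, data):
        if self._in_anchor:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_anchor:
            self.items.append(("hyperlink", "".join(self._text).strip(), self._href))
            self._in_anchor = False
```

A production harvester would also walk application icons and soft-buttons exposed by the windowing toolkit, not only webpage markup.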
  • The GUI items from a view of a GUI may be harvested and dynamically translated into voice search indices. A voice user interface (VUI) may use the search indices to form a view-specific searchable database. The view of the display 202 may be voice-enabled just in time. FIG. 3 illustrates in a block diagram one embodiment of an invisible verbal user interface program 300. The display 202 may show a GUI 302 to the user. A GUI items harvester module 304 may search the GUI 302 for GUI items. GUI items may include hyperlinks 214, images 210, application icons 206, and other graphic images that lead to a data object. A data object may be a file, webpage, an application, or other coherent set of computer data. The GUI items harvester module 304 may collect all the GUI items in the GUI 302, as well as any contextual data associated with the GUI items. A parser 306, such as a text normalization module or a grapheme to phoneme module, may convert each GUI item in the GUI 302 into a searchable index in the form of a linguistic document. The parser 306 may take into account linguistic surface form, surrounding texts, hyperlinked webpage titles, metadata, and other data associated with the GUI item. A database of GUI item indices 308 may organize the linguistic documents into a searchable database to facilitate searching.
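A minimal sketch of the parser 306 and the database of GUI item indices 308, using simple word-level tokens in place of a full grapheme-to-phoneme conversion; the function names and the inverted-index layout are assumptions for illustration:

```python
import re
from collections import defaultdict

def normalize(text):
    """Text normalization: lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(gui_items):
    """Convert each harvested GUI item into a linguistic document and organize
    the documents into an inverted index mapping token -> item positions."""
    index = defaultdict(set)
    for pos, (item_type, label, context) in enumerate(gui_items):
        # The linguistic document combines the item's surface form with
        # surrounding context (titles, metadata, nearby text).
        document = normalize(label) + normalize(context)
        for token in document:
            index[token].add(pos)
    return index
```

Because the index is rebuilt from whatever the current view contains, the view is voice-enabled just in time, with no precompiled grammar.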
  • The VUI may convert a verbal input into a phoneme lattice to match against the searchable indices from the view-specific searchable database. A voice input mechanism 310 may receive a verbal input from a user. A phoneme decoder 312, or other voice recognition technology, may convert the verbal input into a phoneme lattice. A search term generator 314 may extract linguistic search terms from the phoneme lattice, such as a phoneme, syllable, or word string. A GUI items search engine 316 may take the linguistic search term and search the GUI items index 308. The GUI items search engine 316 may select a GUI item and may perform a navigation action associated with the matching GUI item to the GUI 302.
  • FIG. 4 illustrates in a block diagram one embodiment of voice searchable indices 400. The voice searchable indices 400 may be initially sorted by number of words (WRD) 402. The voice searchable indices 400 may be further sorted by phonemes (PHO) 404, the phonemes arranged in spoken order. The voice searchable indices 400 may include a GUI item type 406, such as image, hyperlink, application icon, or other GUI item type. The voice searchable indices 400 may also include an associated grapheme or commonly used name of the GUI item (GRPH) 408, such as picture, button, arrow, or other names. The voice searchable indices 400 may have a set of alternate linguistic labels (ALT) 410 to identify the GUI item, especially if the GUI item is an image or other GUI item that may be thought to have multiple label names by the user. The voice searchable indices 400 may include a link to the computer object (OBJ) 412 represented by the GUI item.
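One possible in-memory layout for a row of the voice searchable indices 400, assuming Python dataclass ordering to realize the sort by word count and then by phonemes in spoken order; the field names mirror the reference numerals above but are otherwise hypothetical:

```python
from dataclasses import dataclass, field

@dataclass(order=True)
class IndexEntry:
    """One row of the voice searchable indices 400. Ordering compares the
    word count first, then the phoneme string in spoken order."""
    wrd: int                                        # number of words (402)
    pho: str                                        # phonemes, spoken order (404)
    item_type: str = field(compare=False)           # image, hyperlink, icon (406)
    grph: str = field(compare=False)                # grapheme / common name (408)
    alt: tuple = field(compare=False, default=())   # alternate linguistic labels (410)
    obj: str = field(compare=False, default="")     # link to the computer object (412)
```

Keeping `alt` as a tuple lets an image carry several plausible names ("picture", "photo") without duplicating the rest of the row.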
  • The VUI 300 may create a just-in-time, voice-enabled searchable database from a view of the GUI. FIG. 5 illustrates in a flowchart one embodiment of a method 500 for developing the voice searchable indices 400. The computing device 100 may display a GUI to the user (Block 502). The computing device 100 may identify a GUI item (GUII) of the GUI (Block 504). If the GUII is a non-textual GUII (Block 506), such as an image or unlabeled icon, the computing device 100 may develop alternate linguistic labels for the GUII (Block 508). The computing device 100 may create a textual description for a GUII based on metadata, commonly depicted names, surrounding text, labels, graphemes, and other data. The computing device 100 may convert the GUII to a linguistic document (LD) (Block 510). The computing device 100 may organize the LDs into a searchable database of GUII indices (Block 512).
  • Upon receiving a verbal input from the user, the VUI 300 may use the GUI item index 400 to select the GUI item best matched to the verbal input. The VUI 300 may also select and present to the user a set of alternate GUI items that are next best matched to the verbal input. FIG. 6 illustrates in a flowchart one embodiment of a method 600 for invisible voice navigation. The VUI 300 may receive a verbal input from the user (Block 602). The VUI 300 may identify a set of possible matching GUIIs (Block 604). The VUI 300 may designate a primary matching GUII, or closest verbal match, and a set of one or more alternate GUIIs from the set of possible matching GUIIs (Block 606). The VUI 300 may identify a primary matching GUII and one or more alternate GUIIs (Block 608). The VUI 300 may present a computer object (CO) associated with the primary matching GUII (Block 610). The VUI 300 may present the alternate GUIIs to the user (Block 612). Alternatively, the VUI 300 may present an approximation of the computer objects associated with the alternate GUIIs. If the user selects one of the alternate GUIIs (Block 614), the VUI 300 may present the computer object associated with the selected alternate GUII (Block 616).
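The designation of a primary match and a set of alternates can be sketched as a simple ranking over scored GUI items; the function name, the score dictionary, and the default number of alternates are assumptions:

```python
def rank_matches(scores, n_alternates=3):
    """Split scored GUI items into the primary matching item (the closest
    verbal match) and up to n_alternates next-best items to offer."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered[0], ordered[1:1 + n_alternates]
```

The primary item's computer object is opened immediately, while the alternates are held back for the user to override the choice.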
  • To account for the verbal tics of a user, the VUI 300 may keep a history of various users in order to determine which GUII to present as the primary matching GUII and which GUIIs to present as the alternates during repeated uses of the VUI 300. The VUI 300 may track whether a specific verbal input is repeatedly used when referring to a specific GUII of a specific GUI. The VUI 300 may then present that GUII as the primary matching GUII. Further, for an initial use of a GUI by a user, the VUI 300 may use the histories of other users to determine a primary matching GUII when multiple GUIIs have similar linguistic documents.
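The per-user history described above can be sketched as a selection counter that biases future matches toward items the user actually picked; the class, the additive `boost` scheme, and its value are illustrative assumptions:

```python
from collections import Counter

class SelectionHistory:
    """Remembers which GUI item each verbal input resolved to, so that a
    repeated phrase is biased toward what the user selected before."""

    def __init__(self):
        self.counts = Counter()   # (utterance, item) -> times selected

    def record(self, utterance, item):
        self.counts[(utterance, item)] += 1

    def pick_primary(self, utterance, scored_items, boost=0.1):
        # Add a small bonus per past selection, then choose the primary match.
        adjusted = {item: score + boost * self.counts[(utterance, item)]
                    for item, score in scored_items.items()}
        return max(adjusted, key=adjusted.get)
```

Pooling such counters across users would likewise let a first-time user of a GUI inherit the crowd's preferred resolution of an ambiguous phrase.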
  • The VUI 300 may briefly present the alternate GUI items to the user in a pop-up window. The pop-up window may be removed if no item is selected after a set period of time. If one of the alternate GUI items is selected, the VUI 300 may execute the navigation action associated with the selected GUI item and override the initially presented view. FIG. 7 illustrates in a block diagram one embodiment of a graphical voice navigation response with alternate computer objects. The browser 208 may present a computer object 702 associated with the matching GUII. The browser 208 may also present approximate representations of the computer objects 704 associated with the next closest matches to the linguistic search term. If the user does not select one of the alternates after a set period of time, the alternate computer objects may be removed from the browser 208.
  • Embodiments may also be practiced in distributed computing environments where tasks are performed by local and remote processing devices that are linked (either by hardwired links, wireless links, or by a combination thereof) through a communications network.
  • Embodiments within the scope of the present invention may also include computer-readable media for carrying or having computer-executable instructions or data structures stored thereon. Such computer-readable media can be any available media that can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store desired program code means in the form of computer-executable instructions or data structures. When information is transferred or provided over a network or another communications connection (either hardwired, wireless, or a combination thereof) to a computer, the computer properly views the connection as a computer-readable medium. Thus, any such connection is properly termed a computer-readable medium. Combinations of the above should also be included within the scope of the computer-readable media.
  • Computer-executable instructions include, for example, instructions and data which cause a general purpose computer, special purpose computer, or special purpose processing device to perform a certain function or group of functions. Computer-executable instructions also include program modules that are executed by computers in stand-alone or network environments. Generally, program modules include routines, programs, objects, components, and data structures, etc. that perform particular tasks or implement particular abstract data types. Computer-executable instructions, associated data structures, and program modules represent examples of the program code means for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps.
  • Although the above description may contain specific details, they should not be construed as limiting the claims in any way. Other configurations of the described embodiments of the invention are part of the scope of this invention. For example, the principles of the invention may be applied to each individual user, where each user may individually deploy such a system. This enables each user to utilize the benefits of the invention even if any one of the large number of possible applications does not need the functionality described herein. In other words, there may be multiple instances of the electronic devices, each processing the content in various possible ways. There does not necessarily need to be one system used by all end users. Accordingly, the appended claims and their legal equivalents, rather than any specific examples given, should define the invention.

Claims (20)

1. A method for voice navigation, comprising:
identifying in a view of a graphical user interface (GUI) a set of GUI items;
converting the set of GUI items to a set of voice searchable indices;
creating at least one phonemic representation of a verbal input via a search-based voice user interface program invisible to the user; and
identifying a matching GUI item of the set of GUI items to the phonemic representation.
2. The method of claim 1, further comprising:
presenting a matching computer object associated with the matching GUI item to the user via the graphical user interface.
3. The method of claim 2, further comprising:
identifying an alternate matching GUI item of the set of GUI items;
presenting the alternate matching GUI item to the user; and
receiving a user input.
4. The method of claim 1, further comprising:
identifying a non-textual GUI item in the set of GUI items; and
developing an alternate linguistic label for the non-textual GUI item.
5. The method of claim 1, wherein the set of GUI items includes at least one of a hyperlink, application icon, file name, or image.
6. The method of claim 1, wherein the phonemic representation is a linguistic search term.
7. The method of claim 1, further comprising:
converting a GUI item of the set of GUI items to a corresponding linguistic document; and
organizing each corresponding linguistic document into the set of voice searchable indices.
8. A telecommunications apparatus for voice navigation, comprising:
a voice input mechanism that receives a verbal input from a user to a voice user interface program invisible to the user; and
a processor that identifies in a graphical user interface (GUI) a set of GUI items, converts the set of GUI items to a set of voice searchable indices, and correlates a matching GUI item of the set of GUI items to at least one phonemic representation of the verbal input.
9. The telecommunications apparatus of claim 8, further comprising:
a display that presents a matching computer object associated with the matching GUI item to the user via the graphical user interface.
10. The telecommunications apparatus of claim 9, wherein:
the processor identifies an alternate matching GUI item of the set of GUI items;
the display presents the alternate matching GUI item to the user; and
the voice input mechanism receives a user input.
11. The telecommunications apparatus of claim 8, wherein the processor identifies a non-textual GUI item in the set of GUI items and develops an alternate linguistic label for the non-textual icon.
12. The telecommunications apparatus of claim 8, wherein the set of GUI items includes at least one of a hyperlink, application icon, file name, or image.
13. The telecommunications apparatus of claim 8, wherein the phonemic representation is a linguistic search term.
14. The telecommunications apparatus of claim 8, wherein the processor converts a GUI item of the set of GUI items to a corresponding linguistic document and organizes each corresponding linguistic document into the set of voice searchable indices.
15. An electronic device for voice navigation, comprising:
a voice input mechanism that receives a verbal input from a user to a voice user interface program invisible to the user; and
a processor that identifies in a graphical user interface (GUI) a set of GUI items, converts the set of GUI items to a set of voice searchable indices, and correlates a matching GUI item of the set of GUI items to at least one phonemic representation of the verbal input.
16. The electronic device of claim 15, further comprising:
a display that presents a matching computer object associated with the matching GUI item to the user via the graphical user interface.
17. The electronic device of claim 16, wherein:
the processor identifies an alternate matching GUI item of the set of GUI items;
the display presents the alternate matching GUI item to the user; and
the voice input mechanism receives a user input.
18. The electronic device of claim 15, wherein the processor identifies a non-textual GUI item in the set of GUI items and develops an alternate linguistic label for the non-textual icon.
19. The electronic device of claim 15, wherein the set of GUI items includes at least one of a hyperlink, application icon, file name, or image.
20. The electronic device of claim 15, wherein the processor converts a GUI item of the set of GUI items to a corresponding linguistic document and organizes each corresponding linguistic document into the set of voice searchable indices.
US12/126,077 2007-12-31 2008-05-23 Search-based dynamic voice activation Abandoned US20090172546A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US12/126,077 US20090172546A1 (en) 2007-12-31 2008-05-23 Search-based dynamic voice activation
US14/464,016 US10664229B2 (en) 2007-12-31 2014-08-20 Search-based dynamic voice activation

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US1811207P 2007-12-31 2007-12-31
US12/126,077 US20090172546A1 (en) 2007-12-31 2008-05-23 Search-based dynamic voice activation

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/464,016 Continuation US10664229B2 (en) 2007-12-31 2014-08-20 Search-based dynamic voice activation

Publications (1)

Publication Number Publication Date
US20090172546A1 true US20090172546A1 (en) 2009-07-02

Family

ID=40800198

Family Applications (2)

Application Number Title Priority Date Filing Date
US12/126,077 Abandoned US20090172546A1 (en) 2007-12-31 2008-05-23 Search-based dynamic voice activation
US14/464,016 Active 2031-05-03 US10664229B2 (en) 2007-12-31 2014-08-20 Search-based dynamic voice activation

Family Applications After (1)

Application Number Title Priority Date Filing Date
US14/464,016 Active 2031-05-03 US10664229B2 (en) 2007-12-31 2014-08-20 Search-based dynamic voice activation

Country Status (1)

Country Link
US (2) US20090172546A1 (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110138286A1 (en) * 2009-08-07 2011-06-09 Viktor Kaptelinin Voice assisted visual search
US20110158605A1 (en) * 2009-12-18 2011-06-30 Bliss John Stuart Method and system for associating an object to a moment in time in a digital video
US20110167354A1 (en) * 2008-06-06 2011-07-07 Google Inc. Rich media notice board
US20110176788A1 (en) * 2009-12-18 2011-07-21 Bliss John Stuart Method and System for Associating an Object to a Moment in Time in a Digital Video
US20120215543A1 (en) * 2011-02-18 2012-08-23 Nuance Communications, Inc. Adding Speech Capabilities to Existing Computer Applications with Complex Graphical User Interfaces
WO2012142323A1 (en) * 2011-04-12 2012-10-18 Captimo, Inc. Method and system for gesture based searching
US20130041666A1 (en) * 2011-08-08 2013-02-14 Samsung Electronics Co., Ltd. Voice recognition apparatus, voice recognition server, voice recognition system and voice recognition method
CN103828379A (en) * 2011-09-12 2014-05-28 英特尔公司 Using gestures to capture multimedia clips
US20140270258A1 (en) * 2013-03-15 2014-09-18 Pantech Co., Ltd. Apparatus and method for executing object using voice command
US11614917B1 (en) * 2021-04-06 2023-03-28 Suki AI, Inc. Systems and methods to implement commands based on selection sequences to a user interface
US11934741B2 (en) * 2019-10-10 2024-03-19 T-Mobile Usa, Inc. Enhanced voice user interface experience via preview services on an external assistance channel

Families Citing this family (4)

Publication number Priority date Publication date Assignee Title
USD762229S1 (en) * 2013-09-13 2016-07-26 Axis Ab Display screen or portion thereof including graphical user interface for access control
USD768141S1 (en) 2013-09-16 2016-10-04 Airbus Operations (S.A.S.) Display screen with a graphical user interface
USD766278S1 (en) * 2013-09-16 2016-09-13 Airbus Operations (S.A.S.) Display screen with a graphical user interface
IT202000005716A1 (en) * 2020-03-18 2021-09-18 Mediavoice S R L A method of navigating a resource using voice interaction

Citations (10)

Publication number Priority date Publication date Assignee Title
US5884266A (en) * 1997-04-02 1999-03-16 Motorola, Inc. Audio interface for document based information resource navigation and method therefor
US20040006478A1 (en) * 2000-03-24 2004-01-08 Ahmet Alpdemir Voice-interactive marketplace providing promotion and promotion tracking, loyalty reward and redemption, and other features
US20040102973A1 (en) * 2002-11-21 2004-05-27 Lott Christopher B. Process, apparatus, and system for phonetic dictation and instruction
US20040215456A1 (en) * 2000-07-31 2004-10-28 Taylor George W. Two-way speech recognition and dialect system
US7020841B2 (en) * 2001-06-07 2006-03-28 International Business Machines Corporation System and method for generating and presenting multi-modal applications from intent-based markup scripts
US20070061132A1 (en) * 2005-09-14 2007-03-15 Bodin William K Dynamically generating a voice navigable menu for synthesized data
US20080005127A1 (en) * 2002-01-05 2008-01-03 Eric Schneider Sitemap Access Method, Product, And Apparatus
US20090013255A1 (en) * 2006-12-30 2009-01-08 Matthew John Yuschik Method and System for Supporting Graphical User Interfaces
US20090106228A1 (en) * 2007-10-23 2009-04-23 Weinman Jr Joseph B Method and apparatus for providing a user traffic weighted search
US20090119587A1 (en) * 2007-11-02 2009-05-07 Allen James F Interactive complex task teaching system

Family Cites Families (16)

Publication number Priority date Publication date Assignee Title
US7181692B2 (en) * 1994-07-22 2007-02-20 Siegel Steven H Method for the auditory navigation of text
US5878421A (en) * 1995-07-17 1999-03-02 Microsoft Corporation Information map
US6230173B1 (en) * 1995-07-17 2001-05-08 Microsoft Corporation Method for creating structured documents in a publishing system
US6101510A (en) * 1997-01-29 2000-08-08 Microsoft Corporation Web browser control for incorporating web browser functionality into application programs
US6357010B1 (en) * 1998-02-17 2002-03-12 Secure Computing Corporation System and method for controlling access to documents stored on an internal network
US6587822B2 (en) * 1998-10-06 2003-07-01 Lucent Technologies Inc. Web-based platform for interactive voice response (IVR)
US6667751B1 (en) * 2000-07-13 2003-12-23 International Business Machines Corporation Linear web browser history viewer
US7970437B2 (en) * 2000-11-29 2011-06-28 Nokia Corporation Wireless terminal device with user interaction system
US8702504B1 (en) * 2001-11-05 2014-04-22 Rovi Technologies Corporation Fantasy sports contest highlight segments systems and methods
US7493259B2 (en) * 2002-01-04 2009-02-17 Siebel Systems, Inc. Method for accessing data via voice
DE60231844D1 (en) * 2002-12-20 2009-05-14 Nokia Corp NEW RELEASE INFORMATION WITH META INFORMATION
US7280967B2 (en) * 2003-07-30 2007-10-09 International Business Machines Corporation Method for detecting misaligned phonetic units for a concatenative text-to-speech voice
US7412726B1 (en) * 2003-12-08 2008-08-12 Advanced Micro Devices, Inc. Method and apparatus for out of order writing of status fields for receive IPsec processing
US9300790B2 (en) * 2005-06-24 2016-03-29 Securus Technologies, Inc. Multi-party conversation analyzer and logger
TWI305345B (en) * 2006-04-13 2009-01-11 Delta Electronics Inc System and method of the user interface for text-to-phone conversion
KR100801895B1 (en) * 2006-08-08 2008-02-11 삼성전자주식회사 Web service providing system and method for providing web service to digital broadcasting receiving terminal


Cited By (16)

Publication number Priority date Publication date Assignee Title
US20110167354A1 (en) * 2008-06-06 2011-07-07 Google Inc. Rich media notice board
US9870539B2 (en) * 2008-06-06 2018-01-16 Google Llc Establishing communication in a rich media notice board
US20110138286A1 (en) * 2009-08-07 2011-06-09 Viktor Kaptelinin Voice assisted visual search
US8724963B2 (en) 2009-12-18 2014-05-13 Captimo, Inc. Method and system for gesture based searching
US20110158605A1 (en) * 2009-12-18 2011-06-30 Bliss John Stuart Method and system for associating an object to a moment in time in a digital video
US20110176788A1 (en) * 2009-12-18 2011-07-21 Bliss John Stuart Method and System for Associating an Object to a Moment in Time in a Digital Video
US9449107B2 (en) 2009-12-18 2016-09-20 Captimo, Inc. Method and system for gesture based searching
US20120215543A1 (en) * 2011-02-18 2012-08-23 Nuance Communications, Inc. Adding Speech Capabilities to Existing Computer Applications with Complex Graphical User Interfaces
US9081550B2 (en) * 2011-02-18 2015-07-14 Nuance Communications, Inc. Adding speech capabilities to existing computer applications with complex graphical user interfaces
WO2012142323A1 (en) * 2011-04-12 2012-10-18 Captimo, Inc. Method and system for gesture based searching
US20130041666A1 (en) * 2011-08-08 2013-02-14 Samsung Electronics Co., Ltd. Voice recognition apparatus, voice recognition server, voice recognition system and voice recognition method
CN103828379A (en) * 2011-09-12 2014-05-28 英特尔公司 Using gestures to capture multimedia clips
US20140270258A1 (en) * 2013-03-15 2014-09-18 Pantech Co., Ltd. Apparatus and method for executing object using voice command
US11934741B2 (en) * 2019-10-10 2024-03-19 T-Mobile Usa, Inc. Enhanced voice user interface experience via preview services on an external assistance channel
US11614917B1 (en) * 2021-04-06 2023-03-28 Suki AI, Inc. Systems and methods to implement commands based on selection sequences to a user interface
US11853652B2 (en) * 2021-04-06 2023-12-26 Suki AI, Inc. Systems and methods to implement commands based on selection sequences to a user interface

Also Published As

Publication number Publication date
US10664229B2 (en) 2020-05-26
US20140358903A1 (en) 2014-12-04

Similar Documents

Publication Publication Date Title
US10664229B2 (en) Search-based dynamic voice activation
KR101359715B1 (en) Method and apparatus for providing mobile voice web
US20080256033A1 (en) Method and apparatus for distributed voice searching
KR100661687B1 (en) Web-based platform for interactive voice response (IVR)
Reddy et al. Speech to text conversion using android platform
US7650284B2 (en) Enabling voice click in a multimodal page
US8949132B2 (en) System and method of providing a spoken dialog interface to a website
US8650031B1 (en) Accuracy improvement of spoken queries transcription using co-occurrence information
US9684741B2 (en) Presenting search results according to query domains
US8725492B2 (en) Recognizing multiple semantic items from single utterance
US20080162472A1 (en) Method and apparatus for voice searching in a mobile communication device
US20070208561A1 (en) Method and apparatus for searching multimedia data using speech recognition in mobile device
US20020062216A1 (en) Method and system for gathering information by voice input
WO2022105861A1 (en) Method and apparatus for recognizing voice, electronic device and medium
US20110145214A1 (en) Voice web search
US8805871B2 (en) Cross-lingual audio search
Koumpis et al. Content-based access to spoken audio
Thennattil et al. Phonetic engine for continuous speech in Malayalam
Di Fabbrizio et al. AT&T help desk.
US20060149545A1 (en) Method and apparatus of speech template selection for speech recognition
JP2007213554A (en) Method for rendering rank-ordered result set for probabilistic query, executed by computer
US20080133240A1 (en) Spoken dialog system, terminal device, speech information management device and recording medium with program recorded thereon
Leavitt Two technologies vie for recognition in speech market
JP7257010B2 (en) SEARCH SUPPORT SERVER, SEARCH SUPPORT METHOD, AND COMPUTER PROGRAM
Jeevitha et al. A study on innovative trends in multimedia library using speech enabled softwares

Legal Events

Date Code Title Description
AS Assignment

Owner name: MOTOROLA, INC., ILLINOIS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:CHANG, YAN MING;MA, CHANGXUE;MAZURKIEWICZ, TED;REEL/FRAME:020990/0331;SIGNING DATES FROM 20080507 TO 20080509

AS Assignment

Owner name: MOTOROLA MOBILITY, INC, ILLINOIS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MOTOROLA, INC;REEL/FRAME:025673/0558

Effective date: 20100731

AS Assignment

Owner name: MOTOROLA MOBILITY LLC, ILLINOIS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MOTOROLA MOBILITY, INC.;REEL/FRAME:028829/0856

Effective date: 20120622

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE

AS Assignment

Owner name: GOOGLE TECHNOLOGY HOLDINGS LLC, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MOTOROLA MOBILITY LLC;REEL/FRAME:034244/0014

Effective date: 20141028