WO2002011120A1 - System and method for voice-activated web content navigation - Google Patents

System and method for voice-activated web content navigation

Info

Publication number
WO2002011120A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
content
navigational
objects
natural language
Prior art date
Application number
PCT/US2001/024486
Other languages
French (fr)
Inventor
Brian Ty Graham
Original Assignee
Speaklink, Inc.
Priority date
Filing date
Publication date
Application filed by Speaklink, Inc.
Priority to AU2001284713A
Publication of WO2002011120A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/30 Definitions, standards or architectural aspects of layered protocol stacks
    • H04L69/32 Architecture of open systems interconnection [OSI] 7-layer type protocol stacks, e.g. the interfaces between the data link level and the physical level
    • H04L69/322 Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions
    • H04L69/329 Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer [OSI layer 7]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/487 Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
    • H04M3/4938 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals comprising a voice browser which renders and interprets, e.g. VoiceXML
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output

Definitions

  • The present invention pertains to voice access and interaction with web-based services and content, and, more particularly, to a system and method, including software and software architecture, that enables verbal interaction with computers, mobile devices, voice sites, voice content, and data and web-based services, all via a local and worldwide computer network, such as the Internet.
  • Telephonic communication is convenient, rapid, and provides accessibility to many private and commercial services.
  • Mobile and cellular telephones provide increased access to telephonic communication and, hence have resulted in increased use and reliance on telephonic communications.
  • An area where voice communication has been underutilized is in accessing remote devices, such as computers, electronic messaging, and information and service providers doing business over public computer networks, such as the Internet.
  • the present invention provides a system and method for enabling verbal access to and interaction with computers, mobile devices, and voice and data communication services via a computer network, such as the Internet.
  • users access the system via a telephonic link, such as by dialing a telephone number to access a voice portal.
  • the user may access voice messages, electronic mail (e-mail), and computer networks, such as the Internet.
  • Access to web sites via existing bookmarks or keywords, e-mail addresses for messaging, as well as utility applications, such as reconfiguring a user's personal parameters in the system is also provided.
  • a method for navigating retrieved web-based content includes translating the web-based content into a natural language voice content for voice applications; dividing the natural language voice content into navigational objects; and associating each navigational object with pre-built navigation options.
  • the method can further include selecting any one of the navigation options by voice to perform the navigation option associated with the navigational object.
  • the method can further include sequentially playing the voice content of the navigational object resulting from the selected navigation option and, at the choice of the user, playing the voice content associated with the navigational object next in sequence or selecting another navigational object.
  • the navigation options can include at least one application option such as e-mail, facsimile, printing, and telephone notification.
  • a method for enabling a caller to transfer to a third-party telephone number after hearing voice advertisements includes providing the voice advertisement for an advertiser; providing a keyword associated with an action of connecting to the advertiser's voice site; providing a keyword to bookmark a link to the advertiser's voice site; carrying out an action associated with the keyword spoken by the user to connect to the advertiser's voice site or to bookmark a link to the advertiser's voice site; and returning to the advertisement.
  • a method for jumping to a link inside a voice portal includes receiving a voice command containing at least two natural language phrases; dividing the at least two natural language phrases into objects that represent an action and an item; translating the objects into corresponding text objects; concatenating the text objects into a hyperlink to a voice site; and executing the hyperlink to the identified link in the voice site.
  • the voice command contains at least three natural language phrases that are divided into three objects, an action object, an item object, and a supplier object.
  • a method of interacting with a voice home page via telephone for providing voice portal access via subscriber voice-mail numbers includes storing user and advertiser information in a voice-content memory; receiving a telephone call from a user; providing the user with options to leave a message or to access voice-based links to a voice website; and responding to the selection by executing the link associated with a command spoken by the user.
  • the method includes capturing the DID number to retrieve user information and to provide the same information to the recipient of the call.
  • a system in accordance with another aspect of the foregoing method, includes a web-based server with a voice browser configured to provide interaction with a voice homepage and to jump to a link in accordance with the foregoing methods.
  • Figure 1 is a diagram of the system architecture formed in accordance with one embodiment of the present invention
  • Figure 2 is a diagram of a voice portal architecture formed in accordance with the present invention
  • Figure 3 is a diagram illustrating the telephonic connection between a caller and a hosting company
  • Figure 4 is a diagram illustrating a representative example of user login to a system formed in accordance with the present invention
  • Figures 5A-5F are diagrams illustrating the DriveIt! application for obtaining driving directions
  • Figure 6 is a diagram illustrating user access to travel-related sites
  • Figure 7 is a diagram illustrating user access to electronic mail and messaging
  • Figure 8 is a diagram illustrating user access to electronic purchasing
  • Figure 9 is a diagram illustrating the AdvertiseIt! application
  • Figures 10A-10C are diagrams illustrating the PurchaseIt! application.
  • Figures 11A-11L are diagrams illustrating the My.Speaklink application.
  • the system 10 of the present invention includes in one embodiment a voice portal architecture 12 that provides interconnectivity with web-based providers and others via a voice portal application 14, a web portal application 16, and with specialized applications that include the AdvertiseIt! application 18, the PurchaseIt! application 20, and the DriveIt! application 22.
  • the My.Speaklink application 24 and the Traffic Speak application 26 provide user customization and subscriber information sharing. It is to be understood that for purposes of this description, the terms "caller,” “user,” and “subscriber” are used interchangeably unless otherwise indicated or as understood from the context.
  • a user can connect to voice site menus 28 via call management software 30 and to various applications, including, as one example, the DriveIt! application 22, as well as connecting to website menus 32 via the Web parser application 34 and the Language application 36.
  • Information and data, such as subscriber and caller information, that are obtained through the Register/Login application 40, the PurchaseIt! application 20, and the Traffic Speak application 26 are stored in the database 44.
  • An overview of a telephonic connection between a user telephone 46 and a host site 56 is shown in Figure 3.
  • the telephone 46, which can be a land line or a wireless device, is initially connected to a local area network (LAN) 48 through a telephony server/voice browser 50.
  • From the LAN 48, a connection is made via a T1 line 42 to the Internet 52 and thence through a hosting company backbone 54 to a hosting company's website 56. This connection is essentially transparent to the caller.
  • a user logs in at the site.
  • One example of login is shown in Figure 4.
  • the user will hear an initial welcome ad 58, which can play for approximately 15 seconds.
  • the user speaks a keyword, which may be a word as indicated in the welcome ad 58, at which point the user is transferred to a site that provides more information about the welcome ad 58.
  • This site can be the host site for the vendor of the ad, where the user can obtain more information or order services or products.
  • the user then speaks another keyword, such as "Return" or "Back," to return to the previous point in the login process, which in this case would be the welcome ad 58.
  • the user is then connected to the front page or main menu 60 of the host site 56.
  • a submenu ad 62 is played when the user selects a submenu 64 from the front page 60.
  • the user may also hear a targeted "Keyword Ad" 66.
  • additional information is obtained by speaking the keyword, at which time the user is redirected to the keyword's voice site 68.
  • Return to the main menu 60 is effectuated by speaking the appropriate command, e.g., "Return" or "Back."
  • the Web parser application 34 is configured to gather information from the web site and transform it into voice XML.
  • the voice XML can then be played to the user by telephone.
  • the language used i.e., English, French, German, etc., is based on a user's preference. The language selection can be made by the user dynamically or at the time of login.
  • the register/login application 40 identifies a caller using their Caller ID number, which is stored in the database 44.
  • the user registers by entering an address and profile of him/herself.
  • a "voice print" may be recorded of their voice for later voice verification as part of the security procedure.
  • the user can then send feedback to the service.
  • the PurchaseIt! application 20 is configured to allow a user to request a product from a vendor's voice site.
  • the user provides necessary information in the fields requested by the voice site in order to complete the purchase.
  • the voice site is required by the system 10 to submit an authorization code in order to complete the transaction.
  • the host site 56 identified as "Speaklink data center" utilizes the stored subscriber profile 70, tracked subscriber purchases 72, and stored subscriber authorizations 74 to provide quick, secure access and seamless transactions.
  • the PurchaseIt! architecture is shown more clearly in Figure 10A, where the PurchaseIt! application 20 resides at the host site data center 56, which includes the stored subscriber profile 70, tracked subscriber purchases 72, and stored subscriber authorizations 74.
  • a partner voice site 76 promotes the product 78 that the subscriber desires to purchase.
  • Authorization for product purchase occurs when the partner voice site 76 sends the partner request code 80 that requests authorization from the host site data center 56.
  • each partner voice site 76 that desires to use the PurchaseIt! application 20 service will need to have an authorization code.
  • the code authorizes the transaction.
  • the host site 56 will send the partner voice site 76 the purchasing and profiling information required to make the purchase from the stored subscriber authorizations 74.
  • the subscriber may verify the information using speaker verification.
  • Referring to Figures 5A-5F, shown therein is a graphic representation of the voice portal architecture for the DriveIt! application 22.
  • the user initially accesses the service by speaking a keyword, such as "drive it."
  • the DriveIt! application 22 begins the process by asking the user for a starting block number 82.
  • the user then states the numeric block number 82 that they are either located at or are starting from.
  • the application 22 then asks the user for the geographical direction or the front direction 84 of the street if there is a direction preceding the street name.
  • the user responds by speaking the direction, e.g., North, South, East, West, Southeast, Southwest, Northwest, or Northeast.
  • the user is then asked to verify whether the street has a numerical or alphabetical street name 86.
  • If the user says "alphabetical," then the application 22 asks for the spelling of the street name or for the user to specify if the street name is already on file. Street names are stored in and retrieved from the database 44.
  • the user is asked for the street suffix 88, to which the user responds by stating the appropriate suffix, such as street, court, road, avenue, etc.
  • the user is asked to provide the geographical direction following the street suffix, if there is one, to which the user responds appropriately, e.g., North, South, East, West, Southeast, Southwest, Northwest, or Northeast.
  • the address is then compiled into a complete address by concatenating the block number 82, the front direction 84, street name 86, street suffix 88, and the back direction 90.
  • the end result at this point is a complete USA street address 92, such as 123 North Main Street.
  • the user is then asked for the spelling of the city they are starting from or for the user to say the city name 94 if it is already stored in the database 44. If a new name is spoken, it is stored in the database 44 for future reference.
  • the user is asked to identify the state 96 they are starting from, and after the response is received from the user, the process is complete for obtaining the starting from information.
  • the application 22 then repeats this process to obtain the destination location.
  • the starting and destination addresses are then provided to a submit engine 98, which retrieves returned directions from a map provider, such as MapQuest.com via a convert engine 100 and converge engine 102.
  • the full address 104 is submitted to the map provider via a licensed MapQuest CGI-BIN application.
  • the primary focus of the CGI-BIN application that MapQuest sells via their website is to request the driving directions from the MapQuest.com webservers.
  • the CGI-BIN application 106 returns turn-by-turn driving directions back in raw code 108 form, such as "Turn RT on Main ST N" as shown in Figure 10.
  • When the converge engine 102 receives the raw code 108, the code is compared against a list of converters that, for example, make "RT" become "Right," and so forth.
  • the converge engine 102 contains a master list used to translate the raw code 108 on the fly into plain English for the text-to-speech engine 110, which includes a verbal translation parser 112 that returns the plain English code as shown in Figure 5D.
  • the returned plain English code 114 from the verbal translation parser 112 is then placed in an HTML page Tripplus.htm 116.
  • the user retrieves the turn-by-turn directions via a sequencer 118 that enables the user to say "play,” “rewind,” “forward,” “fast forward,” “skip,” or “pause” to control the playback of the directions.
  • the converge engine 102 is configured to make necessary changes on each turn-by-turn direction to sequence the animation of the directions.
  • the converge engine 102 is thus configured to primarily put in the Tripplus.htm 116 the numbers that correlate to the turns, such as "turn 1,” “turn 2,” “turn 3,” etc.
  • the code found in Tripplus.htm 116 will control the playback to go to "turn 1," “turn 2,” “turn 3,” etc.
  • the CGI-BIN application 106 returns the raw code 108 to the Tripplus.htm 116 HTML page, and it is the function of all three engines - the submit engine 98, the convert engine 100, and the converge engine 102 - to function together to bring the translation into plain English and provide the VCR-like capabilities when piped through a text-to-speech engine 110.
  • the user can direct the output via the DriveIt! engine 120 to external devices such as a phone 122, e-mail 124, fax 126, or a pager 128.
  • When the user specifies the device, such as by saying "e-mail," the user is taken to another page that asks for the name of the user, the username of the user, and the domain name of the user.
  • the directions are then sent to the specified device, such as e-mail 124 via the e-mail address just collected.
  • the user can return to where they left off in the hearing of the directions by speaking a keyword.
  • Alternatively, the user can say "fax," and they are taken to another page that asks for the fax number and the username.
  • the application 22 then sends the facsimile to the specified number using an outside facsimile-over-internet vendor. If the user says, "pager,” the user is then sent to another page to provide the pager number, PIN, and carrier, and then the application 22 sends an e-mail to the pager provider or dials the pager number as necessary.
  • the system 10 provides a number of functional features that enable users to access virtually any website by voice and hear content or transact business.
  • Figures 6 through 8 are graphical representations of three examples, specifically travel, mail, and shopping, respectively.
  • users are able to plan travel and make travel arrangements for flights, buses, trains, ferries, hotels, vacation, car rental, and obtain directions from the websites of the providers, as indicated.
  • users can enter content information by voice to read and reply to e-mail POP accounts, to read and reply to voicemail accounts, and, for example, to send a voice greeting via telephone, e-mail, or voicemail.
  • Shopping can be accomplished, as shown in Figure 8, for a variety of goods by accessing the websites of the appropriate vendors.
  • the call flow is a representative example of a user's movement through the system 10 by voice. Voice commands are shown within quotes.
  • the system 10 also provides global keywords or variables that, during any prompt, a subscriber or user can say to get additional information, such as "help,” “menu,” and “options,” as well as those described in more detail below under heading number 4.
  • Speaklink application design
      a. The following is a list of representative specifications for voice portal applications.
          i. VoxSurf/POP email application
              1. If new messages:
                  a. You have [number of new messages]. Here is the first message
                      i. [READ E-MAIL]
  • the system 10 also enables users to directly contact advertisers through the AdvertiseIt! application 18. As shown in Figure 9, after the user listens to an advertisement through a voice portal, the user says a keyword, such as "Speaklink," to voice click the ad. At this time the user's responses are tracked by the system 10 as the user elects to either directly connect to the advertiser's voice site or to the advertiser's telephone, or in the alternative, to leave a message on the advertiser's telephone or voice site requesting a call back. In this way, instant connections can be made between users and advertisers with simple spoken commands.
  • This type of advertising through welcome ads, keyword ads, and submenu ads as described above provides high click-throughs, brand exposure, market exposure, instant connection, demographic marketing, and marketing to targeted subjects.
  • the My.Speaklink application 24 contains a number of features to enhance user voice navigation and transactions. For example, a user can have all of their messages, such as e-mail and voicemail, as well as their faxes, music, and document files, stored in a universal message box 122. As shown in Figure 11A, this message box 122 can then be accessed by the user through the voice portal application 14 or the web portal application 16. Long-distance messages can be sent via the Internet through VoIP to nonsubscribers, as shown in Figure 11B. Home automation is also possible through microphones placed within a user's home and configured to recognize the user's speech. This application can also be used in a user's vehicle. Through the Web parser application 34 and the bookmarking feature, a user can consolidate their personal, work, and other calendars as well as aggregate material provided by outside calendar content vendors 132, as shown in Figure 11D.
  • Figures 11E and 11F illustrate the user's ability to customize their greetings by speaking a verbal title and sending the greeting to a recipient's voicemail or telephone number.
  • the user may choose background sounds from a music element library 134, such as a famous star singing Happy Birthday or something similar. A user can also provide a voice attachment to play a personalized message.
  • Buddy lists, shown in Figure 11F, allow a user to send greetings and participate in file sharing with selected friends via a buddy list connector 136.
  • users may bookmark voice sites, as shown in
  • Figure 11I illustrates the feature of extended mail accounts, which enables a user to send e-mails to multiple POP e-mail accounts 148.
  • Figure 11J illustrates another important feature of the present invention, the voice homepage for a user.
  • a consumer calls a subscriber, such as a business, and the telephone is answered by the host site 56 through the host site voice portal 150.
  • the voice portal 150 replaces the traditional voicemail.
  • the consumer can then leave a message, enter into a business transaction, or retrieve information from the subscriber site.
  • the voice homepage checks to see if the caller is a user of the host site voice portal 150 and, if so, they do not have to log in to become part of the same network. This facilitates faster connections and transactions among subscribers.
  • the Traffic Speak application 26 is shown in Figure 11L, where the voice portal 14 and web portal 16 cooperate to track the consumer and build demographics about the consumer, as described above.
  • the system of the present invention also provides a method for navigating retrieved web-based content.
  • the method includes steps of translating the web-based content into a natural language voice content for voice applications; dividing the natural language voice content into navigational objects; associating each navigational object with the pre-built navigation options; and selecting any one of the navigation options by voice to perform the navigation option associated with the navigational objects.
  • each phrase of a natural language voice content would comprise a navigational object.
  • Each object would then be associated with certain options available to the user for either reviewing the content of that object or moving to other objects within the natural language voice content. These could include VCR-like capabilities, such as start, pause, fast forward, reverse, fast reverse, stop, return to the beginning, and fast forward to the end.
  • the user may also bookmark the object for later retrieval.
  • the navigational objects can include applications, such as voice-mail, e-mail, facsimile, telephone, and the like.
  • the system is configured to return a user to the calling application once a called application is completed. For instance, after a navigational object is sent to e-mail, the system automatically returns the user to the navigational object, and additional applications or navigational objects can be chosen, such as sending the navigational object or the related voice content, as appropriate, to a facsimile or a printer or even to a telephone.
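The DriveIt! translation step described in the list above, in which the converge engine 102 compares the raw code 108 against a list of converters and numbers each turn for playback, can be illustrated with a minimal Python sketch. The converter table below is an assumption: the source gives only the "RT" to "Right" example, and the function names are hypothetical.

```python
# Hypothetical converter list; the source confirms only "RT" -> "Right".
CONVERTERS = {
    "RT": "Right", "LT": "Left", "ST": "Street", "AVE": "Avenue",
    "N": "North", "S": "South", "E": "East", "W": "West",
}


def to_plain_english(raw_line):
    """Expand every whitespace-separated token found in the converter list.

    This is a naive token match; it does not resolve ambiguous
    abbreviations (for example "ST" as "Saint").
    """
    words = [CONVERTERS.get(token.upper(), token) for token in raw_line.split()]
    return " ".join(words)


def number_turns(raw_lines):
    """Label each translated direction "Turn 1", "Turn 2", ... for sequencing."""
    return [
        f"Turn {index}: {to_plain_english(line)}"
        for index, line in enumerate(raw_lines, start=1)
    ]


if __name__ == "__main__":
    for direction in number_turns(["Turn RT on Main ST N", "Turn LT on Oak AVE"]):
        print(direction)
```

Numbering the turns in the translated output is what would let a sequencer jump to a given turn on commands such as "skip" or "rewind."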

Abstract

A system and method for enabling verbal access (30) to an interaction (22,30,34,36) with computers, mobile devices, voice sites (28), voice content, data (44) and web-based services (32). The system includes a navigation engine to enable users to review (38) web-based content with VCR-like capabilities, including stop, pause, fast forward, and reverse; also included are bookmarking and quick jumping to links within other sites. Subscribers, such as customers and vendors, can quickly and efficiently transact business (20,26,40) through the system.

Description

SYSTEM AND METHOD FOR VOICE-ACTIVATED WEB CONTENT NAVIGATION
BACKGROUND OF THE INVENTION
Technical Field
The present invention pertains to voice access and interaction with web-based services and content, and, more particularly, to a system and method, including software and software architecture, that enables verbal interaction with computers, mobile devices, voice sites, voice content, and data and web-based services, all via a local and worldwide computer network, such as the Internet.
Description of the Related Art
Telephonic communication is convenient, rapid, and provides accessibility to many private and commercial services. Mobile and cellular telephones provide increased access to telephonic communication and, hence, have resulted in increased use of and reliance on telephonic communications.
An area where voice communication has been underutilized is in accessing remote devices, such as computers, electronic messaging, and information and service providers doing business over public computer networks, such as the Internet.
BRIEF SUMMARY OF THE INVENTION
The present invention provides a system and method for enabling verbal access to and interaction with computers, mobile devices, and voice and data communication services via a computer network, such as the Internet. In accordance with a method of the present invention, users access the system via a telephonic link, such as by dialing a telephone number to access a voice portal. Using verbal commands, such as keywords, the user may access voice messages, electronic mail (e-mail), and computer networks, such as the Internet. Access to web sites via existing bookmarks or keywords, e-mail addresses for messaging, as well as utility applications, such as reconfiguring a user's personal parameters in the system, is also provided.
In accordance with a system of the present invention, users can access web-based content and services with spoken natural language commands. In accordance with one embodiment of the present invention, a method for navigating retrieved web-based content is provided that includes translating the web-based content into a natural language voice content for voice applications; dividing the natural language voice content into navigational objects; and associating each navigational object with pre-built navigation options. The method can further include selecting any one of the navigation options by voice to perform the navigation option associated with the navigational object. Moreover, the method can further include sequentially playing the voice content of the navigational object resulting from the selected navigation option and, at the choice of the user, playing the voice content associated with the navigational object next in sequence or selecting another navigational object.
In accordance with another aspect of the foregoing method, the navigation options can include at least one application option such as e-mail, facsimile, printing, and telephone notification.
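The navigation method above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's implementation: splitting on periods stands in for "dividing into navigational objects," and the option names, `NavigationalObject`, and `Navigator` are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical option names; the source does not define identifiers.
NAVIGATION_OPTIONS = ("play", "next", "previous", "first", "last", "bookmark")
APPLICATION_OPTIONS = ("e-mail", "facsimile", "print", "telephone")


@dataclass
class NavigationalObject:
    """One phrase of voice content plus the options associated with it."""
    text: str
    options: tuple = NAVIGATION_OPTIONS + APPLICATION_OPTIONS


def divide_into_objects(voice_content):
    """Assumption: one navigational object per period-delimited phrase."""
    phrases = [p.strip() for p in voice_content.split(".") if p.strip()]
    return [NavigationalObject(p) for p in phrases]


@dataclass
class Navigator:
    objects: list
    position: int = 0
    bookmarks: list = field(default_factory=list)

    def command(self, spoken):
        """Apply a recognized voice command and return the text to play."""
        current = self.objects[self.position]
        if spoken not in current.options:
            return f"Sorry, '{spoken}' is not available here."
        if spoken == "bookmark":
            self.bookmarks.append(self.position)
            return "Bookmarked."
        if spoken in APPLICATION_OPTIONS:
            # A real system would hand the object to the application,
            # then return the user to the same navigational object.
            return f"Sent to {spoken}: {current.text}"
        if spoken == "next":
            self.position = min(self.position + 1, len(self.objects) - 1)
        elif spoken == "previous":
            self.position = max(self.position - 1, 0)
        elif spoken == "first":
            self.position = 0
        elif spoken == "last":
            self.position = len(self.objects) - 1
        return self.objects[self.position].text
```

Keeping the position unchanged after an application option mirrors the idea that the user resumes at the same navigational object once the called application finishes.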
In accordance with another embodiment of the invention, a method for enabling a caller to transfer to a third-party telephone number after hearing voice advertisements is provided. The method includes providing the voice advertisement for an advertiser; providing a keyword associated with an action of connecting to the advertiser's voice site; providing a keyword to bookmark a link to the advertiser's voice site; carrying out an action associated with the keyword spoken by the user to connect to the advertiser's voice site or to bookmark a link to the advertiser's voice site; and returning to the advertisement.
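As a rough illustration of the advertisement method above, the Python sketch below dispatches on a spoken keyword and always ends by returning to the advertisement. The `VoiceAd` fields, the event tuples, and the example URL are assumptions made for the example; the source does not specify data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceAd:
    """Hypothetical record for one voice advertisement."""
    advertiser: str
    audio_text: str
    voice_site: str        # link to the advertiser's voice site
    connect_keyword: str   # spoken to connect to the voice site
    bookmark_keyword: str  # spoken to bookmark the link


def handle_keyword(ad, spoken, bookmarks):
    """Carry out the action tied to a spoken keyword, then return to the ad.

    Returns the ordered list of (event, payload) pairs a voice browser
    would act on; unrecognized words simply replay the advertisement.
    """
    events = []
    word = spoken.strip().lower()
    if word == ad.connect_keyword:
        events.append(("connect", ad.voice_site))
    elif word == ad.bookmark_keyword:
        bookmarks.append(ad.voice_site)
        events.append(("bookmark", ad.voice_site))
    events.append(("play_ad", ad.audio_text))
    return events
```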
In accordance with another embodiment of the invention, a method for jumping to a link inside a voice portal is provided. The method includes receiving a voice command containing at least two natural language phrases; dividing the at least two natural language phrases into objects that represent an action and an item; translating the objects into corresponding text objects; concatenating the text objects into a hyperlink to a voice site; and executing the hyperlink to the identified link in the voice site. In accordance with another aspect of the foregoing method, the voice command contains at least three natural language phrases that are divided into three objects, an action object, an item object, and a supplier object.
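A minimal Python sketch of the jump-to-link method follows. It assumes the speech recognizer has already separated the command into phrases; the vocabularies, the base URL, and the supplier/action/item path order are all hypothetical, since the source says only that the text objects are concatenated into a hyperlink.

```python
from urllib.parse import quote

# Hypothetical vocabularies mapping recognized phrases to text objects.
ACTIONS = {"buy": "purchase", "find": "search"}
ITEMS = {"running shoes": "running-shoes", "plane tickets": "plane-tickets"}
SUPPLIERS = {"acme": "acme"}


def to_text_objects(phrases):
    """Classify each phrase as an action, item, or supplier text object."""
    objects = {}
    for phrase in phrases:
        key = phrase.strip().lower()
        if key in ACTIONS:
            objects["action"] = ACTIONS[key]
        elif key in ITEMS:
            objects["item"] = ITEMS[key]
        elif key in SUPPLIERS:
            objects["supplier"] = SUPPLIERS[key]
        else:
            raise ValueError(f"unrecognized phrase: {phrase!r}")
    if "action" not in objects or "item" not in objects:
        raise ValueError("a command needs at least an action and an item")
    return objects


def build_hyperlink(phrases, base="http://voice.example.com"):
    """Concatenate the text objects into a hyperlink to a voice site."""
    objects = to_text_objects(phrases)
    parts = [objects.get("supplier"), objects["action"], objects["item"]]
    path = "/".join(quote(part) for part in parts if part)
    return f"{base}/{path}"
```

With two phrases the link carries an action and an item; a third phrase adds the supplier object described in the last sentence above.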
In accordance with another embodiment of the invention, a method of interacting with a voice home page via telephone for providing voice portal access via subscriber voice-mail numbers is provided. The method includes storing user and advertiser information in a voice-content memory; receiving a telephone call from a user; providing the user with options to leave a message or to access voice-based links to a voice website; and responding to the selection by executing the link associated with a command spoken by the user. Ideally, the method includes capturing the DID number to retrieve user information and to provide the same information to the recipient of the call.
In accordance with another aspect of the foregoing method, a system is provided that includes a web-based server with a voice browser configured to provide interaction with a voice homepage and to jump to a link in accordance with the foregoing methods.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
The foregoing features and advantages of the present invention will be more readily appreciated as the same become better understood from the following detailed description when taken in conjunction with the accompanying drawings, wherein:
Figure 1 is a diagram of the system architecture formed in accordance with one embodiment of the present invention;
Figure 2 is a diagram of a voice portal architecture formed in accordance with the present invention;
Figure 3 is a diagram illustrating the telephonic connection between a caller and a hosting company;
Figure 4 is a diagram illustrating a representative example of user login to a system formed in accordance with the present invention;
Figures 5A-5F are diagrams illustrating the DriveIt! application for obtaining driving directions;
Figure 6 is a diagram illustrating user access to travel-related sites;
Figure 7 is a diagram illustrating user access to electronic mail and messaging;
Figure 8 is a diagram illustrating user access to electronic purchasing;
Figure 9 is a diagram illustrating the AdvertiseIt! application;
Figures 10A-10C are diagrams illustrating the PurchaseIt! application; and
Figures 11A-11L are diagrams illustrating the My.Speaklink application.
DETAILED DESCRIPTION OF THE INVENTION
Referring initially to Figures 1 and 2, the system 10 of the present invention includes in one embodiment a voice portal architecture 12 that provides interconnectivity with web-based providers and others via a voice portal application 14, a web portal application 16, and with specialized applications that include the AdvertiseIt! application 18, the PurchaseIt! application 20, and the DriveIt! application 22. The My.Speaklink application 24 and the Traffic Speak application 26 provide user customization and subscriber information sharing. It is to be understood that for purposes of this description, the terms "caller," "user," and "subscriber" are used interchangeably unless otherwise indicated or as understood from the context.
As shown in Figure 2, a user can connect to voice site menus 28 via call management software 30 and to various applications, including, as one example, the DriveIt! application 22, as well as connecting to website menus 32 via the Web parser application 34 and the Language application 36. Information and data, such as subscriber and caller information, obtained through the Register/Login application 40, the PurchaseIt! application 20, and the Traffic Speak application 26 are stored in the database 44.
An overview of a telephonic connection between a user telephone 46 and a host site 56 is shown in Figure 3. The telephone 46, which can be a land line or a wireless device, is initially connected to a local area network LAN 48 through a telephony server/voice browser 50. Using as an example a T1 line 42 from the LAN 48, connection is made to the Internet 52 and thence through a hosting company backbone 54 to a hosting company's website 56. This connection is essentially transparent to the caller.
Once connected to the host site 56, a user logs in at the site. One example of login is shown in Figure 4. The user will hear an initial welcome ad 58, which can play for approximately 15 seconds. If the user desires more information, the user speaks a keyword, which may be a word as indicated in the welcome ad 58, at which point the user is transferred to a site that provides more information about the welcome ad 58. Such site can be the host site for the vendor of the ad where the user can obtain more information or order services or product. The user then speaks another keyword, such as "Return" or "Back" to return to the previous point in the login process, which in this case would be the welcome ad 58.
After the welcome ad 58 concludes, the user is then connected to the front page or main menu 60 of the host site 56. At this point a submenu ad 62 is played when the user selects a submenu 64 from the front page 60. While at the front page or main menu 60, the user may also hear a targeted "Keyword Ad" 66. If the user desires, additional information is obtained by speaking the keyword, at which time the user is redirected to the keyword's voice site 68. Return to the main menu 60 is effectuated by speaking the appropriate command, e.g., "Return" or "Back." The Web parser application 34 is configured to gather information from the web site and transform it into voice XML. The voice XML can then be played to the user by telephone. The language used, i.e., English, French, German, etc., is based on a user's preference. The language selection can be made by the user dynamically or at the time of login.
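The transformation performed by the Web parser application 34 might be sketched as follows. This is only an illustrative skeleton, not the parser of the invention: the function name `to_voicexml`, the choice of VoiceXML version 2.0, and the one-prompt-per-paragraph layout are assumptions introduced here, and the namespace declaration a complete VoiceXML document would carry is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def to_voicexml(paragraphs, language="en-US"):
    """Wrap already-extracted web text in a minimal VoiceXML skeleton.

    Hypothetical helper: each paragraph becomes one <prompt> inside a
    single <form>/<block>. The language tag reflects the user's preference.
    """
    vxml = ET.Element("vxml", {"version": "2.0", "xml:lang": language})
    form = ET.SubElement(vxml, "form")
    block = ET.SubElement(form, "block")
    for text in paragraphs:
        prompt = ET.SubElement(block, "prompt")
        prompt.text = text  # ElementTree escapes &, <, > on output
    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(to_voicexml(["Top story of the hour.", "Weather follows."], "en-US"))
```

Changing the `language` argument is one way the user's language preference could be carried into the generated markup; how the text itself is translated is outside this sketch.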
At the time of login, the register/login application 40 identifies a caller using their Caller ID number, which is stored in the database 44. At the initial step the user registers by entering an address and profile of him/herself. A "voice print" may be recorded of their voice for later voice verification as part of the security procedure. After the user registers, the user can then send feedback to the service.
The PurchaseIt! application 20 is configured to allow a user to request a product from a vendor's voice site. The user provides necessary information in the fields requested by the voice site in order to complete the purchase. The voice site is required by the system 10 to submit an authorization code in order to complete the transaction. This is shown graphically in Figure 10B where the host site 56, identified as "Speaklink data center," utilizes the stored subscriber profile 70, tracked subscriber purchases 72, and stored subscriber authorizations 74 to provide quick, secure access and seamless transactions.
The PurchaseIt! architecture is shown more clearly in Figure 10A where the PurchaseIt! application 20 resides at the host site data center 56, which includes the stored subscriber profile 70, tracked subscriber purchases 72, and stored subscriber authorizations 74. A partner voice site 76 promotes the product 78 that the subscriber desires to purchase. Authorization for product purchase occurs when the partner voice site 76 sends the partner request code 80 that requests authorization from the host site data center 56.
As shown in Figure 10C, each partner voice site 76 that desires to use the PurchaseIt! application 20 service will need to have an authorization code. The code authorizes the transaction. The host site 56 will send the partner voice site 76 the purchasing and profiling information required to make the purchase from the stored subscriber authorizations 74. The subscriber may verify the information using speaker verification.
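The authorization exchange between a partner voice site 76 and the host site data center 56 could be modeled roughly as below. This is a sketch under stated assumptions: the class and field names (`HostDataCenter`, `SubscriberProfile`, `payment_token`) and the in-memory dictionaries are hypothetical stand-ins for the stored subscriber profile 70, tracked purchases 72, and stored authorizations 74, and speaker verification is not modeled.

```python
from dataclasses import dataclass


@dataclass
class SubscriberProfile:
    # Hypothetical fields; the description does not enumerate them.
    name: str
    shipping_address: str
    payment_token: str


class HostDataCenter:
    """Toy stand-in for the host site data center."""

    def __init__(self):
        self.partner_codes = {}   # partner id -> authorization code
        self.profiles = {}        # subscriber id -> SubscriberProfile
        self.purchases = []       # tracked (subscriber, partner, product)

    def authorize(self, partner_id, authorization_code, subscriber_id, product):
        """Release purchasing information only to an authorized partner."""
        expected = self.partner_codes.get(partner_id)
        if expected is None or expected != authorization_code:
            raise PermissionError("partner voice site is not authorized")
        profile = self.profiles[subscriber_id]
        self.purchases.append((subscriber_id, partner_id, product))
        return profile


if __name__ == "__main__":
    host = HostDataCenter()
    host.partner_codes["partner-76"] = "code-80"
    host.profiles["sub-1"] = SubscriberProfile("A. Caller", "123 North Main Street", "tok-1")
    print(host.authorize("partner-76", "code-80", "sub-1", "product 78"))
```

Keeping the purchase log inside `authorize` mirrors the idea that the host site, rather than the partner, tracks subscriber purchases.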
Referring next to Figures 5A-5F, shown therein is a graphic representation of the voice portal architecture for the DriveIt! application 22. The user initially accesses the service by speaking a key word, such as "drive it." The DriveIt! application 22 begins the process by asking the user for a starting block number 82. The user then states the numeric block number 82 that they are either located at or are starting from. The application 22 then asks the user for the geographical direction or the front direction 84 of the street if there is a direction preceding the street name. The user responds by speaking the direction, e.g., North, South, East, West, Southeast, Southwest, Northwest, or Northeast. The user is then asked to verify whether the street has a numerical or alphabetical street name 86.
In identifying the street name 86, if the user says "alphabetical," then the application 22 asks for the spelling of the street name or for the user to specify if the street name is already on file. Street names are stored and retrieved from the database 44.
Next, the user is asked for the street suffix 88, to which the user responds by stating the appropriate suffix, such as street, court, road, avenue, etc. Finally, the user is asked to provide the geographical direction following the street suffix, if there is one, to which the user responds appropriately, e.g., North, South, East, West, Southeast, Southwest, Northwest, or Northeast. Once this back direction 90 is received, the address is then compiled into a complete address by concatenating the block number 82, the front direction 84, street name 86, street suffix 88, and the back direction 90. The end result at this point is a complete USA street address 92, such as 123 North Main Street.
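The concatenation step just described can be illustrated with a short sketch. The function name `build_street_address` and its parameter names are assumptions made for this example; the ordering of the five parts follows the description above, with both directions treated as optional.

```python
from typing import Optional


def build_street_address(
    block_number: str,
    street_name: str,
    street_suffix: str,
    front_direction: Optional[str] = None,
    back_direction: Optional[str] = None,
) -> str:
    """Join the spoken address parts in the order: block number,
    front direction, street name, street suffix, back direction.
    Missing or empty parts are skipped."""
    parts = [block_number, front_direction, street_name, street_suffix, back_direction]
    return " ".join(p.strip() for p in parts if p and p.strip())


if __name__ == "__main__":
    # Prints: 123 North Main Street
    print(build_street_address("123", "Main", "Street", front_direction="North"))
```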
Referring next to Figure 5B, after the street address 92 is collected through the user's responses, the user is then asked for the spelling of the city they are starting from or for the user to say the city name 94 if it is already stored in the database 44. If a new name is spoken, it is stored in the database 44 for future reference. Next, the user is asked to identify the state 96 they are starting from, and after the response is received from the user, the process is complete for obtaining the starting from information. The application 22 then repeats this process to obtain the destination location. The starting and destination addresses are then provided to a submit engine 98, which retrieves returned directions from a map provider, such as MapQuest.com via a convert engine 100 and converge engine 102.
As shown more clearly in Figure 5C, the full address 104 is submitted to the map provider via a licensed MapQuest CGI-BIN application. The primary purpose of the CGI-BIN application, which MapQuest sells via its website, is to request the driving directions from the MapQuest.com web servers. The CGI-BIN application 106 returns turn-by-turn driving directions in raw code 108 form, such as "Turn RT on Main ST N" as shown in Figure 10. When the converge engine 102 receives the raw code 108, the code is compared against a list of converters so that, for example, "RT" becomes "Right," and so forth. The converge engine 102 contains a master list used to translate the raw code 108 on the fly into plain English for the text-to-speech engine 110, which includes a verbal translation parser 112 that returns the plain English code as shown in Figure 5D.
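A minimal sketch of the converter lookup is given below. Only the "RT" to "Right" mapping comes from the description; the remaining entries of `CONVERTERS`, the function name `to_plain_english`, and the whole-word, case-sensitive matching rule are assumptions chosen for illustration.

```python
import re

# Illustrative master list. Only "RT" -> "Right" is taken from the text;
# the other entries are assumed for the sake of the example.
CONVERTERS = {
    "RT": "Right",
    "LT": "Left",
    "ST": "Street",
    "AVE": "Avenue",
    "N": "North",
    "S": "South",
    "E": "East",
    "W": "West",
}


def to_plain_english(raw_code: str) -> str:
    """Replace each whole word found in the master list with its
    plain-English form; all other words pass through unchanged."""
    def expand(match):
        word = match.group(0)
        return CONVERTERS.get(word, word)

    return re.sub(r"[A-Za-z]+", expand, raw_code)


if __name__ == "__main__":
    # Prints: Turn Right on Main Street North
    print(to_plain_english("Turn RT on Main ST N"))
```

Matching whole words only keeps "ST" from being expanded inside longer words such as "EASTERN"; a production list would also need to disambiguate abbreviations like "ST" (Street versus Saint), which this sketch does not attempt.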
The returned plain English code 114 from the verbal translation parser 112 is then placed in an HTML page Tripplus.htm 116. The user then retrieves the turn-by-turn directions via a sequencer 118 that enables the user to say "play," "rewind," "forward," "fast forward," "skip," or "pause" to control the playback of the directions. In order for this to occur, the converge engine 102 is configured to make necessary changes on each turn-by-turn direction to sequence the animation of the directions. The converge engine 102 is thus configured to primarily put in the Tripplus.htm 116 the numbers that correlate to the turns, such as "turn 1," "turn 2," "turn 3," etc. Also, when the user says "rewind," the code found in Tripplus.htm 116 will control the playback to go to "turn 1," "turn 2," "turn 3," etc. The CGI-BIN application 106 returns the raw code 108 to the Tripplus.htm 116 HTML page, and it is the function of all three engines - the submit engine 98, the convert engine 100, and the converge engine 102 - to function together to bring the translation into plain English and provide the VCR-like capabilities when piped through a text-to-speech engine 110. After processing the directions, the user can direct the output via the DriveIt! engine 120 to external devices such as a phone 122, e-mail 124, fax 126, or a pager 128. Once the user specifies the device, such as by saying "e-mail," the user is then taken to another page that asks for the name of the user, the username of the user, and the domain name of the user. The directions are then sent to the specified device, such as e-mail 124 via the e-mail address just collected.
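The numbered-turn playback controlled by the sequencer 118 might look like the following sketch. The class name `TurnSequencer` is hypothetical, and the exact effect of each spoken command (for example, that "rewind" steps back one turn and "fast forward" jumps to the last turn) is an assumption, since the description names the commands without defining their step sizes.

```python
class TurnSequencer:
    """Toy playback controller over numbered turn-by-turn directions."""

    def __init__(self, directions):
        if not directions:
            raise ValueError("at least one direction is required")
        # Number each step so playback can jump to "turn 1", "turn 2", ...
        self.turns = [f"Turn {i}: {text}" for i, text in enumerate(directions, start=1)]
        self.position = 0
        self.paused = False

    def current(self):
        return self.turns[self.position]

    def command(self, word):
        """Apply a spoken command and return the turn now selected."""
        word = word.strip().lower()
        last = len(self.turns) - 1
        if word == "play":
            self.paused = False
        elif word == "pause":
            self.paused = True
        elif word == "rewind":
            self.position = max(self.position - 1, 0)
        elif word in ("forward", "skip"):
            self.position = min(self.position + 1, last)
        elif word == "fast forward":
            self.position = last
        else:
            raise ValueError(f"unrecognized command: {word!r}")
        return self.current()


if __name__ == "__main__":
    seq = TurnSequencer(["Turn Right on Main Street North", "Turn Left on Elm Avenue"])
    print(seq.command("play"))
    print(seq.command("forward"))
    print(seq.command("rewind"))
```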
After sending the e-mail, the user can return to where they left off in the hearing of the directions by speaking a keyword. Should the user desire to send the output to another device, such as a facsimile machine, the user will say "fax," and they are taken to another page that asks for the fax number and the username. The application 22 then sends the facsimile to the specified number using an outside facsimile-over-internet vendor. If the user says, "pager," the user is then sent to another page to provide the pager number, PIN, and carrier, and then the application 22 sends an e-mail to the pager provider or dials the pager number as necessary.
The system 10 provides a number of functional features that enable users to access virtually any website by voice and hear content or transact business. Figures 6 through 8 are graphical representations of three examples, specifically travel, mail, and shopping, respectively. As shown in Figure 6, users are able to plan travel and make travel arrangements for flights, buses, trains, ferries, hotels, vacations, and car rentals, and to obtain directions from the websites of the providers, as indicated. Similarly, as shown in Figure 7, users can enter content information by voice to read and reply to POP e-mail accounts, to read and reply to voicemail accounts, and, for example, to send a voice greeting via telephone, e-mail, or voicemail. Shopping can be accomplished, as shown in Figure 8, for a variety of goods by accessing the websites of the appropriate vendors.
Set forth below in pseudocode format is the call flow for the voice portal system of the present invention. Here, the term "Speaklink" refers to the host site provider. The call flow is a representative example of a user's movement through the system 10 by voice. Voice commands are shown within quotes. The system 10 also provides global keywords or variables that, during any prompt, a subscriber or user can say to get additional information, such as "help," "menu," and "options," as well as those described in more detail below under heading number 4.
Call Flow for Speaklink Voice Portal
1. [Speaklink Audio Logo] Welcome to Speaklink, the loudest revolution on the planet.
2. [Welcome Advertisement Voice Ad]
3. Say a keyword, or say quick jump to begin. If you would like to hear the front page, say menu.
   a. Keyword
      i. "Voice Browser"
         1. Please spell the voice site's URL after the tone. [Speaklink Audio Logo]
      ii. "Mail"
         1. Would you like to check your e-mail, voice mail, or send a Speaklink greeting?
            a. {If E-mail}
               i. Would you like to check your Speaklink e-mail, or your other e-mail accounts?
                  1. {If Speaklink E-mail} | KWD: "My e-mail"
                     a. [POP email application]
                  2. {If Other Accounts}
                     a. [POP email external application]
            b. {If Voice-mail} | KWD: "My Voice mail"
               i. Would you like to check your Speaklink voice mail or your other voice mail accounts?
                  1. {If Speaklink Voice mail}
                     a. [Voice Mail Program]
                  2. {If Other Accounts}
                     a. [Voice Mail Dialer]
            c. {If My Speaklink {If Greetings}} | KWD: "Greetings"
               i. [Speaklink Greetings Program]
      iii. "News"
         1. [Menu Voice Ad: Did you know if you sign up for My Speaklink you can filter news only you want to hear, keyword, "My News".]
         2. {If Basic} Would you like to hear world, national, or local news?
            a. {If World news}
               i. [World news]
            b. {If National news}
               i. [National news]
            c. {If Local news}
               i. [Local news]
         3. {If My Speaklink} | KWD: "My news"
            a. Would you like to hear your top 5 headlines from your news box, or review all new stories?
               i. {If Top 5}
                  1. [Top 5]
               ii. {If review all new stories}
                  1. Say the category you wish to hear or for a listing say help
                     a. [Categories]
      iv. "Weather"
         1. [Menu Voice Ad: Did you know if you sign up for My Speaklink you can filter weather only you want to hear, keyword, "My Weather".]
         2. {If Basic} Would you like to hear world, national, or local weather?
            a. {If World weather}
               i. [World Weather]
            b. {If National weather}
               i. [National weather]
            c. {If Local weather}
               i. [Local weather]
         3. {If My Speaklink} | KWD: "My weather"
            a. [Current conditions]
            b. [5 day forecast]
            c. [2 Conditions in city of choice]
      v. "Sports"
         1. [Menu Voice Ad: Did you know if you sign up for My Speaklink you can filter sports news and scores only you want to hear, keyword, "My Sports".]
         2. {If Basic} Would you like to hear sports scores or sports news?
            a. {If Scores}
               i. Say the team or league?
                  a. [Scores]
            b. {If News}
               i. Say the team or league?
                  a. [News]
         3. {If My Speaklink} | KWD: "My Sports"
            a. [Team scores]
            b. [Team news]
      vi. "Stocks"
         1. [Menu Voice Ad: Did you know if you sign up for My Speaklink you can filter stock news and quotes only you want to hear, keyword, "My Sports".]
         2. {If Basic}
            a. Would you like to hear a stock quote or stock news?
               i. {If quote}
                  1. Say the company name or spell symbol
                     a. [Stock application]
               ii. {If news}
                  1. Say the company name or spell the symbol
                     a. [Stock news application]
         3. {If My Speaklink} | KWD: "MY Sports"
            a. Would you like to hear your stock wrap up or local stock information?
            b. {If stock wrap up}
               i. [Stock wrap up]
            c. {If local}
               i. [Local info]
      vii. "Health"
         1. [Menu Voice Ad: Did you know if you sign up for My Speaklink you can filter stock news and quotes only you want to hear, keyword, "My Health".]
         2. {If Basic}
            a. Say the health condition you are interested in or say help to hear the current health topics.
               i. [Health Application]
         3. {If My Speaklink} | KWD: "My health"
            a. [Current related health news]
      viii. "Shopping"
         1. [Menu Voice Ad: Did you know if you sign up for My Speaklink you can get specials deals and promotions delivered to your message box?]
         2. Say a category you would like to browse or say help to hear your options.
            a. [Shopping application]
      ix. "Traveling"
         1. [Menu Voice Ad: Did you know if you sign up for My Speaklink you can automate your home while you travel?]
         2. Say a category such as driving directions or say help to hear your options.
            a. {If Drive it!/Directions} | KWD: "Drive it!/Driving directions"
               i. [Drive it! application]
            b. {If Flights/airlines} | KWD: "Flights/Airlines"
               i. [Flights application]
            c. {If Cars} | KWD: "Cars"
               i. [Cars application]
            d. {If Bus} | KWD: "Bus"
               i. [Bus application]
            e. {If Trains} | KWD: "Trains"
               i. [Train application]
            f. {If Ferries} | KWD: "Ferries"
               i. [Ferries application]
            g. {If Hotels} | KWD: "Hotels"
               i. [Hotel application]
            h. {If vacations} | KWD: "Vacations"
               i. [Vacation application]
      x. {If My Speaklink {Calendar}} | KWD: "Calendar"
         1. [Menu Voice Ad: Did you know if you sign up for My Speaklink plus you can connect to more calendar providers?]
         2. [Calendar application]
      xi. {If My Speaklink {Buddy lists}} | KWD: "Buddy lists/Buddy list/My Buddy list"
         1. [Menu Voice Ad: Did you know if you sign up for My Speaklink plus you can add up to 50 people on your buddy list?]
         2. Say the buddy name
            i. [Buddy list application]
      xii. {If My Speaklink {Web Bookmarks}} | KWD: "Bookmarks"
         1. [Menu Voice Ad: Did you know if you sign up for My Speaklink plus you can add up to 25 web bookmarks?]
         2. Say the name of the bookmark to view
            i. [Bookmark application]
      xiii. {If My Speaklink {File storage}} | KWD: "My files"
         1. [Menu Voice Ad: Did you know if you sign up for My Speaklink plus you can add more room to your storage box?]
         2. Say help for a listing of files in your box, or say the file name
            a. [Message Box Application]
      xiv. {If My Speaklink {Dialer/Long distance}} | KWD: "Dialer/Long distance"
         1. [Menu Voice Ad: Did you know if you sign up for My Speaklink plus you can have more long distance minutes of air time?]
         2. Would you like to call a buddy?
            a. {If yes}
               i. [Buddy list application]
            b. {If no}
               i. Say the number you wish to call after the tone
                  1. [Dialer application]
      xv. {If My Speaklink {Home automation/My home}} | KWD: "My home/Home automation"
         1. [Menu Voice Ad: Did you know if you sign up for My Speaklink plus you can add up to 2 home settings?]
         2. What home setting would you like to use?
            a. [Homeproxy application]
   b. {If My Speaklink {Quick Jump}}
      i. [Menu Voice Ad: Did you know if you sign up for My Speaklink plus you can add more functionality to your existing my Speaklink settings?]
         1. Say the quick jump phrase after the tone.
            a. [Quick jump application]
Global keywords (variables)
4. During any prompt a subscriber can say the following keywords to get more information
   a. Help
      i. Disclose the help during any prompt and receive help related to that part of the portal you are in.
   b. Exit
      i. Exit the current prompt and go to the main menu
   c. Menu
      i. Exit the current prompt and go to the main menu
   d. Goodbye
      i. "Would you like to end your call?"
         1. {If yes}
            a. [Hang up application]
         2. {If no}
            a. [Return where they left off]
   e. Options
      i. Would you like to change your menu options?
         1. {If yes}
            a. [Options application]
         2. {If no}
            a. [Return where they left off]
   f. Language
      i. Would you like to change your language settings?
         1. {If yes}
            a. What language do you prefer to use, English, French, German, or Spanish?
               i. [Language application]
         2. {If no}
            a. [Return where they left off]
   g. Tip
      i. Disclose an example for how to use the current prompt
   h. Choices
      i. Disclose the possible choices for the current prompt
5. Speaklink application design
   a. The following is a list of representative specifications for voice portal applications.
      i. VoxSurf/POP email application
         1. {If new messages}
            a. You have [number of new messages]. Here is the first message
               i. [READ E-MAIL]
         2. {If no new messages}
            a. You have no new messages. Here is the first saved old message.
               i. [READ E-MAIL]
         3. {If Options}
            a. Would you like to read new e-mail, read old e-mail or send new e-mail?
               i. {If send new e-mail}
                  a. Say the buddy you want to email or say new to send an e-mail to another e-mail address
                     i. [SEND E-MAIL]
      ii. External POP email application
         1. Which account would you like to check, account 1 or 2?
            a. {Say account and do standard e-mail application}
      iii. Voice mail application
         1. Which account would you like to check, account 1 or 2?
            a. {Say account number and dial the number that it references}
      iv. Greetings application
         1. Is this greeting dedicated to a special event or are you sending an invitation?
            a. {If special event}
               i. Say the greeting category you wish to use or say help for the list of occasions to choose from.
               ii. {If help}
                  a. Is the occasion for birthday wishes, holidays, special days, weddings, a thank you, romantic greeting, funny greeting, entertainment, business greeting or gay & lesbian related?
               iii. {If [occasion]}
                  a. Say the name of the greeting you want to send or say help to hear your choices.
                     i. [Display Greeting application]
                     ii. [Buddy list application]
                     iii. [Messenger application]
            b. {If invitation}
               i. [Voice site client]
                  a. [Display Greeting application]
                  b. [Buddy list application]
                  c. [Messenger application]
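The interplay between menu keywords and the global keywords in the call flow above could be sketched as a simple router. The names `MAIN_MENU`, `GLOBAL_KEYWORDS`, and `route`, the returned tuples, and the rule that global keywords take priority over menu keywords are assumptions for illustration; only three menu entries are reproduced.

```python
# Prompts copied from the call flow above; only a subset is shown.
MAIN_MENU = {
    "mail": "Would you like to check your e-mail, voice mail, or send a Speaklink greeting?",
    "news": "Would you like to hear world, national, or local news?",
    "weather": "Would you like to hear world, national, or local weather?",
}

# Global keywords from heading number 4 of the call flow.
GLOBAL_KEYWORDS = {"help", "exit", "menu", "goodbye", "options", "language", "tip", "choices"}


def route(spoken, current_prompt):
    """Decide what happens after the caller speaks during a prompt.

    Returns a (kind, detail) tuple. Global keywords are checked first
    so that they work during any prompt (an assumed priority rule).
    """
    word = spoken.strip().lower()
    if word in ("exit", "menu"):
        return ("main menu", None)          # both leave the current prompt
    if word in GLOBAL_KEYWORDS:
        return ("global", word)             # e.g. help, tip, choices
    if word in MAIN_MENU:
        return ("prompt", MAIN_MENU[word])  # descend into the chosen menu
    return ("reprompt", current_prompt)     # unrecognized: repeat the prompt


if __name__ == "__main__":
    print(route("News", "Say a keyword, or say quick jump to begin."))
    print(route("help", MAIN_MENU["news"]))
```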
The system 10 also enables users to directly contact advertisers through the AdvertiseIt! application 18. As shown in Figure 9, after the user listens to an advertisement through a voice portal, the user says a keyword, such as "Speaklink," to voice click the ad. At this time the user's responses are tracked by the system 10 as the user elects to either directly connect to the advertiser's voice site or to the advertiser's telephone, or in the alternative, to leave a message on the advertiser's telephone or voice site requesting a call back. In this way, instant connects can be made between users and advertisers with simple spoken commands. This type of advertising through welcome ads, keyword ads, and submenu ads as described above, provides high click-throughs, brand exposure, market exposure, instant connection, demographic marketing, and marketing to targeted subjects. As a convenience to the user, the user may "bookmark" the ad or have it stored in their profile for later instant connection.
The My.Speaklink application 24 contains a number of features to enhance user voice navigation and transactions. For example, a user can have all of their messages such as e-mail and voicemail, as well as their faxes, music, and document files stored in a universal message box 122. As shown in Figure 11A, this message box 122 can then be accessed by the user through the voice portal application 14 or the web portal application 16. Long-distance messages can be sent via the Internet through VoIP to nonsubscribers, as shown in Figure 11B. Home automation is also possible through microphones placed within a user's home and configured to recognize the user's speech. This application can also be used in a user's vehicle. Through the Web parser application 34 and the bookmarking feature, a user can consolidate their personal, work, and other calendars as well as aggregate material provided by outside calendar content vendors 132 as shown in Figure 11D.
Figures 11E and 11F illustrate the user's ability to customize their greetings by speaking a verbal title and sending the greeting to a recipient's voicemail or telephone number. The user may choose from a music element library 134 to choose background sounds, such as a famous star singing Happy Birthday or something similar. The user can also provide a voice attachment to play a personalized message. Buddy lists, shown in Figure 11F, allow a user to send greetings and participate in file sharing with selected friends via a buddy list connector 136. As mentioned above, users may bookmark voice sites, as shown in Figure 11G, to enable quick access to favorite voice sites.
An important feature of the present invention enables users to quickly jump to a particular voice site. This is done by a concatenation of spoken keywords. An example of the quick-jump feature would be the sentence: "I want to bid on computers at e-Bay." The My.Speaklink application 24 is configured to break this sentence into phrases or action items such as the first phrase 138 "I want to," followed by a second action phrase 140 "bid," followed by an item phrase 142 "computers," and concluding with a supplier 142 "e-Bay." The voice browser 146 then takes the user directly to the site for bidding on computers. Figure 11I illustrates the feature of extended mail accounts, which enables a user to send e-mails to multiple POP e-mail accounts 148.
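The quick-jump decomposition described above, in which a spoken sentence is divided into action, item, and supplier objects that are then concatenated into a hyperlink, can be sketched as follows. The regular expression, the function name `quick_jump_link`, the placeholder base address `https://voice.example`, and the supplier/action/item path order are all assumptions made for this example; the description does not specify the hyperlink format.

```python
import re
from urllib.parse import quote

# Assumed grammar: optional "I want to", an action word, an optional
# "on"/"for", an item phrase, and an optional "at <supplier>".
PATTERN = re.compile(
    r"^(?:i want to\s+)?(?P<action>\w+)\s+(?:(?:on|for)\s+)?"
    r"(?P<item>.+?)(?:\s+at\s+(?P<supplier>.+))?$",
    re.IGNORECASE,
)


def quick_jump_link(utterance, base="https://voice.example"):
    """Divide the recognized text into objects and concatenate them
    into a hyperlink (hypothetical URL layout)."""
    text = utterance.strip().rstrip(".!?")
    match = PATTERN.match(text)
    if match is None:
        raise ValueError("could not parse quick-jump phrase")
    objects = [match.group("supplier"), match.group("action"), match.group("item")]
    segments = [quote(obj.strip().lower()) for obj in objects if obj]
    return base + "/" + "/".join(segments)


if __name__ == "__main__":
    # Prints: https://voice.example/e-bay/bid/computers
    print(quick_jump_link("I want to bid on computers at e-Bay."))
```

When no supplier is spoken, the same function yields a two-object link, which corresponds to the action-and-item form of the method.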
Figure 11J illustrates another important feature of the present invention, the voice homepage for a user. In this case a consumer calls a subscriber, such as a business, and the telephone is answered by the host site 56 through the host site voice portal 150. The voice portal 150 replaces the traditional voicemail. The consumer can then leave a message, enter into a business transaction, or retrieve information from the subscriber site. The voice homepage checks to see if the caller is a user of the host site voice portal 150 and, if so, they do not have to log in to become part of the same network. This facilitates faster connections and transactions among subscribers. There is no need to change the setup, targeted short ads can be provided, there is instant voice commerce, and it is controlled by the consumer as shown in Figure 11K.
The Traffic Speak application 26 is shown in Figure 11L where the voice portal 14 and web portal 16 cooperate to track the consumer and build demographics about the consumer, as described above.
The system of the present invention also provides a method for navigating retrieved web-based content. The method includes steps of translating the web-based content into a natural language voice content for voice applications; dividing the natural language voice content into navigational objects; associating each navigational object with the pre-built navigation options; and selecting any one of the navigation options by voice to perform the navigation option associated with the navigational objects. For example, each phrase of a natural language voice content would comprise a navigational object. Each object would then be associated with certain options available to the user for either reviewing the content of that object or moving to other objects within the natural language voice content. These could include VCR-like capabilities, such as start, pause, fast forward, reverse, fast reverse, stop, return to the beginning, and fast forward to the end. The user may also bookmark the object for later retrieval. The navigational objects can include applications, such as voice-mail, e-mail, facsimile, telephone, and the like. Ideally, the system is configured to return a user to the calling application once a called application is completed. For instance, after a navigational object is sent to e-mail, the system automatically returns the user to the navigational object, and additional applications or navigational objects can be chosen, such as sending the navigational object or the related voice content, as appropriate, to a facsimile or a printer or even to a telephone.
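As a rough illustration of the navigation method, the sketch below divides voice content into navigational objects and associates them with a small table of pre-built options. Splitting on sentence boundaries, the class name `VoiceContent`, and the particular option phrases in `NAVIGATION_OPTIONS` are assumptions; the description leaves the division rule and the exact option vocabulary open.

```python
import re
from dataclasses import dataclass, field


def divide(text):
    """Split voice content into navigational objects, one per sentence
    (an assumed division rule)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


@dataclass
class VoiceContent:
    objects: list
    position: int = 0
    bookmarks: list = field(default_factory=list)


# Pre-built options: each maps the content to a new position.
NAVIGATION_OPTIONS = {
    "forward": lambda c: min(c.position + 1, len(c.objects) - 1),
    "reverse": lambda c: max(c.position - 1, 0),
    "return to the beginning": lambda c: 0,
    "fast forward to the end": lambda c: len(c.objects) - 1,
}


def select(content, option):
    """Perform the spoken option and return the object now selected."""
    option = option.strip().lower()
    if option == "bookmark":
        content.bookmarks.append(content.position)  # store for later retrieval
    elif option in NAVIGATION_OPTIONS:
        content.position = NAVIGATION_OPTIONS[option](content)
    else:
        raise ValueError(f"unknown navigation option: {option!r}")
    return content.objects[content.position]


if __name__ == "__main__":
    content = VoiceContent(divide("Markets rose today. Rain is expected. Traffic is light."))
    print(select(content, "forward"))
    print(select(content, "bookmark"))
```

Because `select` always returns the object at the current position, a called application such as e-mail could hand control back to the same navigational object when it finishes, in line with the return behavior described above.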
From the foregoing it will be appreciated that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of the invention as set forth in the appended claims and the equivalents thereof.

Claims

1. A method for navigating retrieved web-based content, comprising: translating the web-based content into a natural language voice content for voice applications; dividing the natural language voice content into navigational objects; associating each navigational object with pre-built navigation options; and selecting any one of the navigation options by voice to perform the navigation option associated with the navigational objects.
2. The method of claim 1, comprising sequentially playing the voice content of the navigational object resulting from the selected navigation option and, at the choice of the user, playing the voice content associated with the navigational object next in sequence or selecting another navigational object.
3. The method of claim 1, comprising forming navigational objects to include at least one application option.
4. The method of claim 3, wherein at least one application option includes at least one from among an e-mail application, a facsimile application, a printing application, and a telephone notification application.
5. The method of claim 4, comprising returning to the navigational object from which the navigation option was selected at the conclusion of the selected navigation object.
6. The method of claim 5, comprising bookmarking a navigation object to store a link to the navigation object in a memory for later use.
7. The method of claim 6, wherein bookmarking comprises speaking a voice command.
8. A method for navigating retrieved web-based content by voice, comprising: accessing a web server that is configured to: translate web-based content into a natural language voice content for voice applications; divide the natural language voice content into navigational objects; associate each navigational object with pre-built voice navigation options; and accessing a voice browser and selecting a navigation option associated with a navigational object; and playing the voice content of the navigational object resulting from the selection of the associated navigation option.
9. A voice application server, comprising: a navigation engine configured to translate web-based content into a natural language voice content for voice applications; divide the natural language voice content into navigational objects; associate the navigational object with pre-built navigation options; enable selecting any one of the navigation objects by voice to perform the navigation option associated with the navigational objects; and provide reusable voice XML-based voice applications accessible by a user through the navigational engine.
10. A method for repurposing textual step-based driving directions into voice content for playback, comprising: retrieving the textual step-based driving direction content; translating the textual step-based driving direction content into natural language voice content by translating in the order received each step into natural language voice content, dividing the voice content into navigational objects, and associating navigational options with each navigational object for selection by the user during playback of each navigational object.
11. A method for enabling a caller to transfer to a third party telephone number after hearing voice advertisements, comprising: providing the voice advertisement; providing a key word associated with an action of connecting to an advertiser's voice site and providing a key word to bookmark a link to the advertiser's voice site; carrying out the action associated with the keyword spoken by the user; and returning to the advertisement.
12. A method for jumping to a link inside a voice portal by voice command, comprising: receiving a voice command containing at least two natural language phrases; dividing the at least two natural language phrases into objects that represent an action and an item; translating the objects into corresponding text objects; concatenating the corresponding text objects into a hyperlink to a voice site; and executing the hyperlink to the identified link in the voice site.
13. A method for jumping to a link inside a voice portal by voice command, comprising: receiving a voice command containing at least two natural language phrases; dividing the at least two natural language phrases into objects that represent an action and an item; translating the objects into corresponding text objects; and concatenating the corresponding text objects into a hyperlink to a voice site.
14. A method for jumping to a link inside a voice portal, comprising: receiving a voice command containing at least three natural language phrases; dividing the at least three natural language phrases into objects that represent an action, an item, and a supplier; translating the objects into corresponding text objects; concatenating the corresponding text objects into a hyperlink to a voice site; and executing the hyperlink to the identified link in the voice site.
15. A method of interacting with a voice home page via a telephone for providing voice portal access via a subscriber voice-mail number, comprising: storing the subscriber's advertising information in a voice content memory; receiving a telephone call from a caller; providing the caller with options to leave a message or to access a voice-based link to a voice website; responding to the caller's selection by executing the selected link; and capturing the DID number to retrieve subscriber information to provide to the recipient of the call.
16. A system for providing voice portal access via a subscriber voice-mail number, comprising: a web-based server having a voice browser configured to receive a call and to provide a caller with options to leave a message or to access voice-based links to a voice website, to respond to a caller's selection by executing the selected link, and provide the subscriber information to the recipient of the call.
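The link-construction steps recited in claims 12 to 14 (divide the command into action, item, and optional supplier objects; translate each into a text object; concatenate the text objects into a hyperlink) can be illustrated with a minimal Python sketch. It is not the applicant's implementation. It assumes the speech has already been recognized into text, and the vocabularies, the "from" supplier marker, the URL layout, and the `voice.example.com` host are all hypothetical.

```python
from urllib.parse import quote

# Hypothetical vocabularies: recognized phrase -> text object used in the link.
ACTIONS = {"buy": "buy", "get": "get", "listen to": "listen"}
ITEMS = {"movie tickets": "movie-tickets", "driving directions": "directions"}
SUPPLIERS = {"acme": "acme"}

BASE = "http://voice.example.com"  # hypothetical voice site root


def split_command(command: str) -> dict:
    """Divide a recognized command into action, item and optional supplier objects."""
    text = " ".join(command.lower().split())
    # Try longer action phrases first so "listen to" is matched as a whole.
    action = next(
        (p for p in sorted(ACTIONS, key=len, reverse=True)
         if text.startswith(p + " ")),
        None,
    )
    if action is None:
        raise ValueError(f"no known action in {command!r}")
    rest = text[len(action):].strip()
    supplier = None
    if " from " in rest:  # assumed marker for the supplier phrase
        rest, supplier_phrase = rest.rsplit(" from ", 1)
        if supplier_phrase not in SUPPLIERS:
            raise ValueError(f"unknown supplier {supplier_phrase!r}")
        supplier = SUPPLIERS[supplier_phrase]
    if rest not in ITEMS:
        raise ValueError(f"unknown item {rest!r}")
    return {"action": ACTIONS[action], "item": ITEMS[rest], "supplier": supplier}


def build_hyperlink(objects: dict) -> str:
    """Concatenate the text objects into a hyperlink to the voice site."""
    parts = [objects["action"], objects["item"]]
    if objects["supplier"]:
        parts.append(objects["supplier"])
    return BASE + "/" + "/".join(quote(p) for p in parts)


link_two = build_hyperlink(split_command("Buy movie tickets"))
link_three = build_hyperlink(split_command("Get driving directions from Acme"))
```

In this sketch a two-phrase command yields a two-segment path and a three-phrase command appends the supplier segment. Executing the link, the final step of claims 12 and 14, is left out.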
PCT/US2001/024486 2000-08-02 2001-08-02 System and method for voice-activated web content navigation WO2002011120A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2001284713A AU2001284713A1 (en) 2000-08-02 2001-08-02 System and method for voice-activated web content navigation

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US22269000P 2000-08-02 2000-08-02
US60/222,690 2000-08-02

Publications (1)

Publication Number Publication Date
WO2002011120A1 (en) 2002-02-07

Family

ID=22833277

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/024486 WO2002011120A1 (en) 2000-08-02 2001-08-02 System and method for voice-activated web content navigation

Country Status (2)

Country Link
AU (1) AU2001284713A1 (en)
WO (1) WO2002011120A1 (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5726984A (en) * 1989-01-31 1998-03-10 Norand Corporation Hierarchical data collection network supporting packetized voice communications among wireless terminals and telephones
GB2317070A (en) * 1996-09-07 1998-03-11 Ibm Voice processing/internet system
US5761312A (en) * 1995-06-07 1998-06-02 Zelikovitz, Deceased; Joseph Enhanced individual intelligent communication platform for subscribers on a telephone system
US5794193A (en) * 1995-09-15 1998-08-11 Lucent Technologies Inc. Automated phrase generation
US5870549A (en) * 1995-04-28 1999-02-09 Bobo, Ii; Charles R. Systems and methods for storing, delivering, and managing messages
US5915001A (en) * 1996-11-14 1999-06-22 Vois Corporation System and method for providing and using universally accessible voice and speech data files
US5996006A (en) * 1996-11-08 1999-11-30 Speicher; Gregory J. Internet-audiotext electronic advertising system with enhanced matching and notification
US6044403A (en) * 1997-12-31 2000-03-28 At&T Corp Network server platform for internet, JAVA server and video application server
US6078886A (en) * 1997-04-14 2000-06-20 At&T Corporation System and method for providing remote automatic speech recognition services via a packet network
US6081780A (en) * 1998-04-28 2000-06-27 International Business Machines Corporation TTS and prosody based authoring system
US6240391B1 (en) * 1999-05-25 2001-05-29 Lucent Technologies Inc. Method and apparatus for assembling and presenting structured voicemail messages

Cited By (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
USRE44326E1 (en) 2000-06-08 2013-06-25 Promptu Systems Corporation System and method of voice recognition near a wireline node of a network supporting cable television and/or video delivery
US7653748B2 (en) * 2000-08-10 2010-01-26 Simplexity, Llc Systems, methods and computer program products for integrating advertising within web content
US10257576B2 (en) 2001-10-03 2019-04-09 Promptu Systems Corporation Global speech user interface
US11070882B2 (en) 2001-10-03 2021-07-20 Promptu Systems Corporation Global speech user interface
US10932005B2 (en) 2001-10-03 2021-02-23 Promptu Systems Corporation Speech interface
US11172260B2 (en) 2001-10-03 2021-11-09 Promptu Systems Corporation Speech interface
US8983838B2 (en) 2001-10-03 2015-03-17 Promptu Systems Corporation Global speech user interface
US8005679B2 (en) 2001-10-03 2011-08-23 Promptu Systems Corporation Global speech user interface
US9848243B2 (en) 2001-10-03 2017-12-19 Promptu Systems Corporation Global speech user interface
US8407056B2 (en) 2001-10-03 2013-03-26 Promptu Systems Corporation Global speech user interface
US7324947B2 (en) 2001-10-03 2008-01-29 Promptu Systems Corporation Global speech user interface
US8818804B2 (en) 2001-10-03 2014-08-26 Promptu Systems Corporation Global speech user interface
US7289960B2 (en) 2001-10-24 2007-10-30 Agiletv Corporation System and method for speech activated internet browsing using open vocabulary enhancement
US9626965B2 (en) 2002-10-31 2017-04-18 Promptu Systems Corporation Efficient empirical computation and utilization of acoustic confusability
US10748527B2 (en) 2002-10-31 2020-08-18 Promptu Systems Corporation Efficient empirical determination, computation, and use of acoustic confusability measures
US8862596B2 (en) 2002-10-31 2014-10-14 Promptu Systems Corporation Method and apparatus for generation and augmentation of search terms from external and internal sources
US11587558B2 (en) 2002-10-31 2023-02-21 Promptu Systems Corporation Efficient empirical determination, computation, and use of acoustic confusability measures
US9305549B2 (en) 2002-10-31 2016-04-05 Promptu Systems Corporation Method and apparatus for generation and augmentation of search terms from external and internal sources
US8321427B2 (en) 2002-10-31 2012-11-27 Promptu Systems Corporation Method and apparatus for generation and augmentation of search terms from external and internal sources
US8959019B2 (en) 2002-10-31 2015-02-17 Promptu Systems Corporation Efficient empirical determination, computation, and use of acoustic confusability measures
US7519534B2 (en) 2002-10-31 2009-04-14 Agiletv Corporation Speech controlled access to content on a presentation medium
US10121469B2 (en) 2002-10-31 2018-11-06 Promptu Systems Corporation Efficient empirical determination, computation, and use of acoustic confusability measures
US8185390B2 (en) 2003-06-26 2012-05-22 Promptu Systems Corporation Zero-search, zero-memory vector quantization
US7729910B2 (en) 2003-06-26 2010-06-01 Agiletv Corporation Zero-search, zero-memory vector quantization
US7428273B2 (en) 2003-09-18 2008-09-23 Promptu Systems Corporation Method and apparatus for efficient preamble detection in digital data receivers
US9087507B2 (en) * 2006-09-15 2015-07-21 Yahoo! Inc. Aural skimming and scrolling
US20080086303A1 (en) * 2006-09-15 2008-04-10 Yahoo! Inc. Aural skimming and scrolling
US10152975B2 (en) 2013-05-02 2018-12-11 Xappmedia, Inc. Voice-based interactive content and user interface
US10157618B2 (en) * 2013-05-02 2018-12-18 Xappmedia, Inc. Device, system, method, and computer-readable medium for providing interactive advertising
US11373658B2 (en) 2013-05-02 2022-06-28 Xappmedia, Inc. Device, system, method, and computer-readable medium for providing interactive advertising
US10706849B2 (en) 2015-10-09 2020-07-07 Xappmedia, Inc. Event-based speech interactive media player
US9978366B2 (en) 2015-10-09 2018-05-22 Xappmedia, Inc. Event-based speech interactive media player
US10475453B2 (en) 2015-10-09 2019-11-12 Xappmedia, Inc. Event-based speech interactive media player
US11699436B2 (en) 2015-10-09 2023-07-11 Xappmedia, Inc. Event-based speech interactive media player
US10175060B2 (en) 2016-09-06 2019-01-08 Microsoft Technology Licensing, Llc Translation of verbal directions into a list of maneuvers
US10373614B2 (en) 2016-12-08 2019-08-06 Microsoft Technology Licensing, Llc Web portal declarations for smart assistants

Also Published As

Publication number Publication date
AU2001284713A1 (en) 2002-02-13

Similar Documents

Publication Publication Date Title
US8849659B2 (en) Spoken mobile engine for analyzing a multimedia data stream
US9477971B2 (en) Providing contextual information for spoken information
US6400806B1 (en) System and method for providing and using universally accessible voice and speech data files
US9185215B2 (en) Performing actions for users based on spoken information
US6658389B1 (en) System, method, and business model for speech-interactive information system having business self-promotion, audio coupon and rating features
US7457397B1 (en) Voice page directory system in a voice page creation and delivery system
US6895084B1 (en) System and method for generating voice pages with included audio files for use in a voice page delivery system
US9020107B2 (en) Performing actions for users based on spoken information
US7334050B2 (en) Voice applications and voice-based interface
US20170212960A1 (en) System and method for conducting a search using a wireless mobile device
KR100870798B1 (en) Natural language processing for a location-based services system
US7933389B2 (en) System and method generating voice sites
US20070203736A1 (en) Interactive 411 Directory Assistance
US20090172108A1 (en) Systems and methods for a telephone-accessible message communication system
CA2742308C (en) Location-based services
JP2003169147A (en) Client response system and method
US20070203735A1 (en) Transaction Enabled Information System
US20100076843A1 (en) Live-agent-enabled teis systems
AU2002256369A1 (en) Location-based services
US20150134340A1 (en) Voice internet system and method
US20050055310A1 (en) Method and system for accessing information within a database
WO2002011120A1 (en) System and method for voice-activated web content navigation
Kurkovsky et al. Mobile voice access in social networking systems
KR20110064843A (en) How to request and receive information using your mobile phone.
Duggan Revenue Opportunities in the Voice Enabled Web

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase

Ref country code: JP