WO2001075662A2 - Method and apparatus for providing multilingual translation over a network - Google Patents

Method and apparatus for providing multilingual translation over a network

Info

Publication number
WO2001075662A2
Authority
WO
WIPO (PCT)
Prior art keywords
language
translation
text
electronic
user
Prior art date
Application number
PCT/US2001/010628
Other languages
French (fr)
Other versions
WO2001075662A8 (en
Inventor
Thomas Ritter
Jeffrey Chin
Raymond Flournoy
Pria Hidisyan
Rina Horiuchi
Yannick Kassum
Kevin Lee
Nicholas Lee
David Lowsky
David Weinstein
Christopher Callison-Burch
Original Assignee
Amikai, Inc.
Priority date
Filing date
Publication date
Application filed by Amikai, Inc.
Priority to AU2001249777A
Priority to JP2001573273A
Publication of WO2001075662A2
Publication of WO2001075662A8

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/232 Orthographic correction, e.g. spell checking or vowelisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Definitions

  • This invention relates generally to translation methods and apparatus, and more particularly to translation methods and apparatus over a network.
  • Input or output information is typically composed of an arrangement of system-provided words or phrases followed by user or system-supplied data fields displayed in a predetermined pattern on the screen.
  • an application executing on the host utilizes only a limited number of screen patterns or formats so that only a standard set of screen images corresponding to the formats may be called into view by the user for input or invoked by the host processing for output.
  • the definition of each screen format is typically deeply embedded in the source code for the application. There is essentially no flexibility provided to the user to allow for the creation of customized formats and, correspondingly, their screen images.
  • major modifications to the source code of the system have conventionally been required, such as by rewriting significant portions of the code implementing input/output (I/O) interface functions.
  • the foremost modification in the above special situation is that of translating the descriptive words or phrases from the original language (e.g., English) to a different language (e.g., Spanish). If there are affiliates from numerous foreign countries then, besides the effort of rewriting the source code, multiple copies of modified source code require storing, tracking and updating. Such a task becomes unwieldy, burdensome and costly. Thus, for example, if a source code module uses or produces user-viewable information, then there must be a different copy of the module for each language executable by that software. Besides the actual system copies of the code, support software is required to inform the system developer of the status of the multiple copies. Moreover, additional storage devices are needed to store all the additional versions of the software. For a large scale system involving millions of lines of code and thousands of modules, the storage requirements may become enormous.
  • a translation environment which serves to buffer the host system to each of the access devices is disclosed in U.S. Patent No. 4,870,610.
  • the translation environment includes an autonomous processor interposed between the host system and each access device.
  • Information transmitted in either direction between the host and access device is diverted to the processor for intermediate processing.
  • the diverted information contains detailed character data either appearing on the input request screen originated at the access device or on the output response screen destined for the access device, depending upon the direction of original information transmission.
  • the character data is of two types, namely, system-supplied field identifiers and user-provided data entries associated with the identifiers.
  • Identifiers are expressed in a first user language (e.g., English).
  • the screen displays and, most particularly, the identifiers are first translated to the second language via a format create process.
  • the output of this create process is a translation file which stores the mapping relationship between the first language screen and its second language counterpart.
  • the translation file is invoked by a translation execution process whenever the second language user accesses the host system.
  • the contents of this file are used to translate from the second-to-first language upon a host request and from the first-to-second language upon a host response.
  • a feature of this arrangement is that both the format create process and the translation execution process operate in the translation environment which is transparent to the host system. With the translation environment, the user may customize screen displays to maximize system utilization.
  • U.S. Patent No. 5,966,685 is directed to a system of parallel discussion groups operated in conjunction with a message collection/posting software program, data filter software program, and a machine translation software program.
  • a structure and process is created to enable discussion group users, of different languages, to communicate with one another.
  • An automatic batch process is utilized that executes at a remote site. No human intervention is required for the pre-processing, translation, or post-processing functions. Additionally, users simply specify a language preference to realize the benefits and advantages of the present invention.
  • a number of discussion groups run in "parallel"; one group for each language being used in the discussion groups.
  • the individual discussion groups all contain the same information, in the same order; the only difference being that each parallel discussion group is written in a different language.
  • Once a user logs onto a particular parallel discussion group he or she may then choose his or her language preference. If the user's language preference is set to French, the French version of the discussion group will be accessed.
  • Messages posted to a discussion group will be periodically collected, translated to the other languages, and then posted to those respective target language discussion groups. The collection and posting of the messages will be accomplished by the Message Collection/Posting Software.
  • the new messages which are collected on a periodic basis are sent to a commercially available Machine Translation (MT) software for translation. Messages are batch processed automatically at the network site and without human intervention. The translation takes place at a remote site so user actions are minimized.
  • the input text is passed through a filter software program which preprocesses the data before it is submitted to the MT software.
  • the filter identifies and marks strings which are best left untranslated by the MT software, such as personal names, company product names, file and path names, commands, samples of source code, and the like. By marking these strings, the filter notifies the MT software to leave those strings untranslated. These strings are then linked to a preceding "hookword". Hookwords are automatically inserted then deleted in post-processing and are contained in dictionaries with a part-of-speech and other grammatical features to effect rearrangement of the word in the target language.
  • U.S. Patent No. 5,960,382 discloses a method and apparatus for translating a native-language message into a corresponding foreign-language message. Translation of an initially-unknown message is effected using native-language and foreign-language prototype messages that are independent of message variables, whereby a prototype message represents all messages of an individual type. An individual message is identified to belong to a particular type by using the native-language prototype message, and an equivalent foreign-language message is then generated by inserting variable values from the individual message into the foreign-language prototype message that represents the particular message type.
  • the native-language message, which includes a value of a variable, is matched against a plurality of native-language prototype messages to identify a corresponding native-language prototype message, which includes the variable.
  • the plurality of native-language prototype messages preferably represent all native-language messages that require translation.
  • the identification of the prototype native-language message is used to obtain (e.g., retrieve) a corresponding foreign-language prototype message, which also includes the variable.
  • the value of the variable, obtained from the native-language message that is being translated, is then substituted for the variable in the obtained foreign-language prototype message to yield a foreign-language message which corresponds to (i.e., which is a translation of) the native-language message.
  • the native-language message includes values of a plurality of variables.
  • the identified native-language prototype message and the corresponding foreign-language prototype message each includes the plurality of variables.
  • the plurality of the variables have a first ordering in the identified native-language prototype message and a second ordering in the corresponding foreign-language prototype message, and the two orderings are generally different.
  • the substitution step then involves using the first ordering and the second ordering to determine a placement of the values of the variables into the obtained foreign-language prototype message.
  • the matching step involves the use of a multi-tiered multi-node tree constructed from the native-language prototype messages, and matching strings (e.g., words and numerals) which make up the native-language message in their order against the nodes of corresponding tiers in the tree to reach a node which represents the last string in the message and contains the message identifier of the corresponding prototype message. This identifier is then used to obtain the corresponding foreign-language prototype message, which has the same identifier.
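The prototype-matching scheme described for U.S. Patent No. 5,960,382 can be sketched roughly as follows. This is an illustrative reconstruction, not the patented implementation: the `{name}` placeholder syntax, the `Node` class, and the sample messages are all assumptions made for the example. Each whitespace-separated string of the message is matched against one tier of the tree, and the variable names carry the values into the foreign-language prototype even when the ordering differs.

```python
VAR = object()  # sentinel key for a variable slot in the tree (assumption)


class Node:
    """One node of the multi-tier tree; one tier per string in the message."""

    def __init__(self):
        self.children = {}
        self.message_id = None  # set only on the node for the last string
        self.var_names = []     # variable names in native-language order


def is_var(token):
    return token.startswith("{") and token.endswith("}")


def build_tree(prototypes):
    """Build the tree from {message_id: native-language prototype}."""
    root = Node()
    for message_id, prototype in prototypes.items():
        tokens = prototype.split()
        node = root
        for token in tokens:
            key = VAR if is_var(token) else token
            node = node.children.setdefault(key, Node())
        node.message_id = message_id
        node.var_names = [t[1:-1] for t in tokens if is_var(t)]
    return root


def match(node, tokens, values=()):
    """Return (message_id, {variable: value}) or None if nothing matches."""
    if not tokens:
        if node.message_id is None:
            return None
        return node.message_id, dict(zip(node.var_names, values))
    head, rest = tokens[0], tokens[1:]
    if head in node.children:  # prefer a literal match on this tier
        found = match(node.children[head], rest, values)
        if found is not None:
            return found
    if VAR in node.children:   # otherwise treat the string as a variable value
        return match(node.children[VAR], rest, values + (head,))
    return None


def translate(message, tree, foreign_prototypes):
    found = match(tree, message.split())
    if found is None:
        return None
    message_id, values = found
    # Named placeholders let the second ordering differ from the first.
    return foreign_prototypes[message_id].format(**values)


native = {"deleted": "User {user} deleted file {file}"}
foreign = {"deleted": "Die Datei {file} wurde von {user} gelöscht"}
tree = build_tree(native)
print(translate("User alice deleted file report.txt", tree, foreign))
```

In this sketch the shared message identifier (`"deleted"`) links the two prototypes, mirroring the identifier lookup described above.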
  • an object of the present invention is to provide methods and apparatus that improve the quality and usability of the translations.
  • Another object of the present invention is to provide language translation methods and apparatus suitable for the internet and other distributed networks.
  • Yet another object of the present invention is to provide language translation methods and apparatus that provide user feedback.
  • a further object of the present invention is to provide language translation methods and apparatus for the internet that limit subject matter and language usage domains.
  • Another object of the present invention is to provide language translation methods and apparatus for networks that take advantage of application-specific characteristic repetitions in language.
  • Another object of the present invention is to provide language translation methods and apparatus for the internet that provide user feedback for determining whether inputs are translated correctly.
  • Another object of the present invention is to provide language translation methods and apparatus that actively educate users on how to use translation engines.
  • Another object of the present invention is to provide language translation methods and apparatus with user defined dictionaries.
  • Another object of the present invention is to provide language translation methods and apparatus that provide a static translation cache of frequently encountered phrases.
  • Another object of the present invention is to provide language translation methods and apparatus that provide a key form for storing cached phrases which removes extraneous information.
  • Another object of the present invention is to provide language translation methods and apparatus that provides for the most flexible and productive application of a phrase cache.
  • Another object of the present invention is to provide language translation methods and apparatus that provide typing shortcuts for languages and allow free deletion of selected clauses.
  • Another object of the present invention is to provide language translation methods and apparatus that is real-time and does not cause delay for the user.
  • Another object of the present invention is to provide language translation methods and apparatus that include multiple translation engines in a single system with a uniform API.
  • Another object of the present invention is to provide language translation methods and apparatus that provides a uniform API to numerous applications.
  • An electronic language translator is provided.
  • Source language text is received as an input to the electronic language translator.
  • the source language text is translated at the electronic language translator at the time of submission into one or more target language texts.
  • a user is then provided with an option of viewing one or more of the target language texts with or without the source language texts.
  • a method for electronically translating text provides an electronic language translator system that includes an electronic language translator and at least a first and a second dictionary.
  • the electronic language translator references the first dictionary and then the second dictionary in a process of translating source language text into one or more target language texts.
  • the dictionaries are maintained in an application or customer hierarchy.
  • Source language text is received at an input of the electronic language translator.
  • the source language text is translated at the electronic language translator into one or more target language texts.
  • An output is produced that includes the one or more target language texts.
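A minimal sketch of consulting a first dictionary and then a second one is shown below. The text does not say which level of the application or customer hierarchy takes precedence; this example assumes the more specific (customer) dictionary is consulted first, and the dictionary names and entries are invented for illustration.

```python
def lookup(term, dictionaries):
    """Return (translation, dictionary_name) from the first dictionary that has the term.

    `dictionaries` is an ordered list of (name, entries) pairs, most specific
    first. That ordering is an assumption of this sketch.
    """
    for name, entries in dictionaries:
        if term in entries:
            return entries[term], name
    return None, None


# Hypothetical English-to-French entries for an auction customer.
customer_dictionary = {"bid": "enchère"}
general_dictionary = {"bid": "offre", "price": "prix"}
hierarchy = [("customer", customer_dictionary), ("general", general_dictionary)]

print(lookup("bid", hierarchy))
print(lookup("price", hierarchy))
```

Keeping the hierarchy as an ordered list means a customer-specific term overrides the general entry without modifying the general dictionary.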
  • a method for electronic language translation provides one or more translation modules receiving source language text from an input interface.
  • One or more input interfaces and one or more output interfaces are provided.
  • a generic data format is included that is independent of the translation modules, input interfaces, and output interfaces.
  • the input source language text is converted from the format for a specific input interface to the generic format.
  • a determination is made of the one or more translation modules that provides an optimal translation.
  • the text is routed to the module that provides the optimal translation.
  • Text is converted from the generic data format to a specific input format of a translation module.
  • the specific output format from a translation module is converted to the generic data format.
  • Data is converted from the generic data format into an output format suitable for an output interface.
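The conversion and routing steps above might be sketched as follows. Everything named here is hypothetical: the `GenericText` record, the chat payload shape, the pipe-delimited engine format, and the per-language-pair quality scores used to pick the "optimal" module are assumptions chosen to keep the example self-contained.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GenericText:
    """Generic data format, independent of any interface or engine."""
    text: str
    source: str
    target: str


def from_chat(payload):
    """Input-interface format -> generic format."""
    return GenericText(payload["msg"], payload["from"], payload["to"])


def to_chat(generic):
    """Generic format -> output-interface format."""
    return {"msg": generic.text, "lang": generic.target}


class ToyEngine:
    """Stand-in translation module with its own 'src|tgt|text' wire format."""

    def __init__(self, name, table, scores):
        self.name, self.table, self.scores = name, table, scores

    def run(self, request):
        src, tgt, text = request.split("|", 2)
        return f"{tgt}|{self.table.get((src, tgt, text), text)}"


def to_engine(generic):
    """Generic format -> the engine's specific input format."""
    return f"{generic.source}|{generic.target}|{generic.text}"


def from_engine(raw, generic):
    """The engine's specific output format -> generic format."""
    _, text = raw.split("|", 1)
    return replace(generic, text=text)


def route(generic, engines):
    """Pick the module with the best (assumed) score for this language pair."""
    pair = (generic.source, generic.target)
    return max(engines, key=lambda engine: engine.scores.get(pair, 0.0))


def translate(payload, engines):
    generic = from_chat(payload)
    engine = route(generic, engines)
    raw = engine.run(to_engine(generic))
    return to_chat(from_engine(raw, generic))


engine_a = ToyEngine("a", {("en", "fr", "hello"): "bonjour"}, {("en", "fr"): 0.9})
engine_b = ToyEngine("b", {("en", "fr", "hello"): "salut"}, {("en", "fr"): 0.2})
print(translate({"msg": "hello", "from": "en", "to": "fr"}, [engine_a, engine_b]))
```

Because every adapter converts to or from `GenericText`, adding a new interface or engine in this sketch only requires one new pair of conversion functions.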
  • a method for electronically translating text provides an electronic language translator coupled to an interface.
  • Source language text is translated at the electronic language translator into one or more target language texts.
  • Translated text is output in one or more target languages to an output interface.
  • Controls are provided at an interface coupled to the electronic language translator to dynamically select which of the one or more target languages are output at the interface.
  • the interface representation of text is varied in the one or more target languages to allow a user to differentiate between the displayed languages. Controls are provided at an interface to create differentiation between one or more target languages.
  • a method for electronically translating text provides an electronic language translator coupled to an interface.
  • the source language text is translated at the electronic language translator into one or more target language texts.
  • the translated output is displayed to the original user. Feedback is provided to the original user about the quality of the translation.
  • a method for electronically translating text provides an electronic language translator coupled to an interface.
  • the source language text is translated at the electronic language translator into one or more target language texts.
  • At least two candidate translations are produced for each source language text.
  • the translated candidates are compared to one or more language models trained on data similar in style and subject matter to the text being translated.
  • the best quality translation is selected for the input from the multiple translation candidates according to which best matches the one or more language models. A desired best quality translation is then displayed.
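Selecting among candidate translations with a language model could look roughly like the sketch below. The text does not specify the kind of model, so an add-one-smoothed bigram model is used here purely as an assumption; the training sentences and candidates are invented.

```python
import math
from collections import Counter


class BigramModel:
    """Tiny add-one-smoothed bigram model (an assumption, not the patent's model)."""

    def __init__(self, sentences):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for sentence in sentences:
            tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
            self.unigrams.update(tokens[:-1])
            self.bigrams.update(zip(tokens, tokens[1:]))
        self.vocab = len(set(self.unigrams) | {"</s>"})

    def score(self, sentence):
        """Length-normalised log-probability of the sentence under the model."""
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        log_prob = 0.0
        for prev, word in zip(tokens, tokens[1:]):
            numerator = self.bigrams[(prev, word)] + 1
            denominator = self.unigrams[prev] + self.vocab
            log_prob += math.log(numerator / denominator)
        return log_prob / (len(tokens) - 1)


def pick_best(candidates, model):
    """Return the candidate translation that best matches the language model."""
    return max(candidates, key=model.score)


# Training data similar in style to the expected chat text (made up).
model = BigramModel(["how are you", "how are you doing", "are you there"])
print(pick_best(["how you are", "how are you"], model))
```

With several models trained on different domains, the same `pick_best` call could be repeated per model and the scores combined; how to combine them is left open here because the source does not say.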
  • a system for electronically translating text includes an electronic language translator that receives source language text input and produces translated target language text.
  • An interface is coupled to the electronic language translator and configured to provide a user with an option of viewing one or more target language texts with or without source language text.
  • Figure 1 describes a screenshot of a chat application of the present invention.
  • Figures 2-3 describe embodiments of the real-time multilingual communication application of the present invention.
  • Figure 4 is a high level overview illustrating operation of the System of the present invention.
  • Figure 5 is a high level overview illustrating operation of the System of the present invention.
  • Figure 6 is a high level overview illustrating operation of the System of the present invention.
  • Figure 7 illustrates data flow between the client wireless device, the wireless network provider, the data provider, and the translation services of the present invention.
  • Figure 8 illustrates a generic peer-to-peer data exchange.
  • Figure 9 illustrates a generic client-server data exchange, where the translation service of the present invention acts as the intermediary in the data exchange.
  • Figures 10(a)-10(b) are directed to an auction tool embodiment of the present invention which combines features of the multi-lingual search engine and the browsing tool of the present invention.
  • Figure 11 shows the dynamic translation cache of the present invention.
  • Figure 12 illustrates the framework of the present invention.
  • Figure 13 illustrates the five steps performed at the translation
  • Figure 14 illustrates the procedures that each step of translation undergoes in the present invention.
  • Figure 15(a) illustrates the full linguistic processing occurring in the chat application of the present invention.
  • Figure 15(b) is similar to Figure 15(a) with the same data path, for non-interactive applications of the present invention.
  • Figure 16 illustrates input aids incorporated into the System of the present invention.
  • Figure 17 illustrates a common phrase table with the static translation cache of the present invention.
  • Figure 18 illustrates language post-processing of the present invention.
  • Figure 19 illustrates text post-processing of the present invention.
  • Figure 20 illustrates text pre-processing of the present invention.
  • Figure 21 is a continuation of Figure 8 illustrating language preprocessing of the present invention.
  • Figure 22 illustrates the translation engines with dictionaries of different types of the present invention.
  • Figure 23 shows a topic specific dictionary of the present invention.
  • Figure 24 illustrates the four types of feedback produced and utilized by the System of the present invention.
  • Figure 25 shows the levels of feedback which are incorporated in the translation system of the present invention.
  • Figure 26 illustrates the way the System of the present invention educates the user about the MT engine, as well as about his or her own language.
  • Figure 27 describes the browsing tool of the present invention on a high level as a three-step process:
  • Figure 28 is a high-level blowup of step 2 from Figure 27.
  • Figure 29 provides more detail of step 2 from Figure 27 for page retrieval and processing.
  • Figure 30 is a blowup of page retrieval as represented in Figure 29.
  • Figure 31 is a blowup of step 1 from Figure 30 describing how parameters are added to a URL before querying the source site.
  • Figure 32 is a blowup of step 2 from Figure 30.
  • Figure 33 is directed to page rewriting of the present invention.
  • Figure 34 illustrates page rewriting as a two-pass process of the present invention.
  • Figure 35 is a graphical illustration of the page rewriting process described in Figure 34.
  • Figure 36 is a blowup of pass 1 from Figure 37.
  • Figure 37-39 give examples of how certain elements are handled in pass 1.
  • Figure 40 describes instances in the page where the browsing tool rewrites
  • Figure 41 illustrates how text is translated as part of the browsing tool of the present invention.
  • Figure 42 is a blowup of the final stage from Figure 30 for handling incoming cookies of the present invention.
  • Figure 43 shows the Help button of the present invention.
  • Figure 44 illustrates the restatement window of the present invention.
  • Figure 45 shows an example of the interactive tutorial entry of the present invention.
  • Figure 46 is a blow-up of the different types of the tutorials of the present invention.
  • Figure 47 shows an example of warning lights of the present invention.
  • Figure 48 shows how users' expectations and knowledge about the System are influenced through actual use of the System of the present invention.
  • Figure 49 illustrates the tutorial daemon of the present invention.
  • Figure 50 depicts the input length meter of the present invention.
  • Figure 51 is an example of shortcuts of the present invention.
  • Figure 52 shows examples of emoticons of the present invention.
  • Figure 53 is an example of the translator of the present invention.
  • Figure 54 illustrates a number of personalization features of the present invention.
  • Figure 55 illustrates the splash page of the present invention.
  • Figure 56 illustrates the plurality of different features in the chat application of the present invention.
  • Figure 57 illustrates the different features in the chat room of the present invention.
  • Figure 58 illustrates the Help process of the present invention.
  • Figure 59 illustrates the "current member's box" of the present invention.
  • Figure 60 illustrates switching language zones of the present invention.
  • Figure 61 is an overview of the browsing tool of the present invention.
  • Figure 62 lists some of the browsing tool features of the present invention.
  • An interface is coupled to the electronic language translator and configured to provide a user with an option of viewing one or more target language texts with or without source language text.
  • the electronic language translator translates the source language text to at least one target language at the time of submission of the source language text.
  • An output interface outputs the target language text from the electronic language translator.
  • the output interface can vary an interface representation of text in the one or more target languages.
  • the electronic language translator can include at least first and second dictionaries.
  • the electronic language translator references the first dictionary and then the second dictionary in a process of translating source language text into one or more target language texts.
  • the dictionaries are maintained in an application or customer hierarchy.
  • a generic data format can be included that is independent of the translation engines, input interfaces and output interfaces.
  • a conversion module converts the input source language text from the format for a specific input interface to a generic format.
  • a routing module determines which translator provides an optimal translation and then routes the text to that translator.
  • a conversion module converts text from the generic data format to a specific input format.
  • a conversion module converts a specific output format from the translation engine to the generic data format.
  • a conversion module can be included to convert data from the generic data format into an output format suitable for an output interface.
  • the System of the present invention has a variety of different applications including but not limited to translation of text, real time translated chat, website content, e-mail, instant messaging, multi-lingual auctions and marketplaces, and the like.
  • the present invention allows multiple people to engage in an online translated text conversation. Users can define their input and view languages, and chat applications of the present invention translate input sentences from one user into the appropriate output languages defined by each of the other users.
  • a screenshot of a chat application of the present invention is illustrated in Figure 1.
  • the present invention is for casual chat between users on a portal or community site, intra-company communication on a corporate intranet, business-oriented chat on a business-to-business exchange, and real-time customer support solutions, among other uses.
  • Figures 2-5 illustrate one embodiment of the real-time multilingual communication methods and apparatus of the present invention.
  • Figure 2 illustrates how different users use the method and apparatus of the present invention. Illustrated is a two-person interaction model. In this model, two people communicate exclusively with each other, sending messages back and forth. Each message is sent to the chat server, translated (if the message is textual), and relayed to the other user.
  • the second diagram illustrates multiple-user communication where multiple people message each other in one room. In this model, every message that is sent by any one chat client is captured by the chat server, translated (if the message is textual), and rebroadcast to every chat client in the chat room.
  • Figure 3 illustrates the several types of messages that travel within a chat room.
  • One is a plain text message which is translated instantaneously into any of the chat room's supported languages.
  • the next type is iconic. These are significant because when dealing with universal language there is a need to transmit messages that are understood universally.
  • Another type of message is meta-transactional whose sole purpose is to facilitate the entire process of communication.
  • One example of a meta-transactional message is the "Help?" message, which one user may send to a second user alerting him that she did not understand his message, and requesting that he restate it in a way that may be translated more effectively and thereby more easily understood.
  • Figure 4 is a high level overview illustrating operation of the System of the present invention.
  • a user uses their browser to access the web server of the present invention.
  • the web server delivers a page which contains an applet.
  • the applet appears. From that point forward communication between the user and the System occurs exclusively through that applet.
  • the user inputs a message in the applet which is sent to the chat server.
  • the chat server sends the message to the translation System, which translates the message and sends it back to the chat server.
  • the message contains versions of the user-entered text in all the supported languages of the chat room.
  • the server then retransmits the message out to all users in the chat room.
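The chat flow above (client message, translation into every supported room language, rebroadcast to all users) can be sketched as follows. The function names, the lookup-table translator, and the idea that each client picks its own view language out of the bundle are assumptions for illustration only.

```python
def handle_message(text, source_lang, room_languages, translate):
    """Build the bundle holding the text in every supported room language."""
    return {
        lang: text if lang == source_lang else translate(text, source_lang, lang)
        for lang in room_languages
    }


def deliver(bundle, clients):
    """Every client receives the whole bundle; this returns what each one displays.

    `clients` maps a user name to that user's chosen view language.
    """
    return {name: bundle[view_lang] for name, view_lang in clients.items()}


# Toy stand-in for the translation System: a fixed lookup table.
TABLE = {("hello", "en", "fr"): "bonjour", ("hello", "en", "ja"): "こんにちは"}


def toy_translate(text, source_lang, target_lang):
    return TABLE.get((text, source_lang, target_lang), text)


bundle = handle_message("hello", "en", ["en", "fr", "ja"], toy_translate)
print(deliver(bundle, {"ann": "en", "luc": "fr"}))
```

Translating once per room language, rather than once per user, keeps the work proportional to the number of supported languages in this sketch.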
  • Figure 5 illustrates the type of feedback that occurs. Throughout the process a feedback loop runs continuously.
  • As messages are received by the translation System, there is interaction with 1) the database that stores information about the messages, and 2) the machine translation engines.
  • the database stores our static translation cache, which contains many text phrases pre-translated across multiple languages. These translations are performed by humans and are thereby guaranteed to have perfect quality.
  • the cache modifies itself over time by reacting to patterns it observes in the types of messages it receives. This results in higher translation quality. Text elements that are not handled by the cache are sent to the translation engines for translation.
  • Each box in the figure represents a single machine translation language direction, e.g. English to French.
  • the translation System utilizes multiple translation engine components. Different providers can be used for different language pairs. Some engines can support multiple directions.
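A rough sketch of the cache-then-engine lookup follows, with one engine registered per language direction. The `normalize` key form (lower-casing, collapsing whitespace, dropping trailing punctuation) is only a guess at what "removing extraneous information" might involve, and the class, cache entries, and engine callables are hypothetical. The self-modifying behaviour of the cache is not modelled.

```python
def normalize(text):
    """Reduce a phrase to a cache key form (assumed rules, not from the source)."""
    return " ".join(text.lower().split()).rstrip("!?. ")


class TranslationSystem:
    def __init__(self, cache, engines):
        self.cache = cache      # {(key, src, tgt): human translation}
        self.engines = engines  # {(src, tgt): callable}, one per direction

    def translate(self, text, src, tgt):
        """Return (translation, origin), trying the static cache first."""
        hit = self.cache.get((normalize(text), src, tgt))
        if hit is not None:
            return hit, "cache"
        engine = self.engines.get((src, tgt))
        if engine is None:
            raise LookupError(f"no engine for {src}->{tgt}")
        return engine(text), "engine"


system = TranslationSystem(
    cache={("how are you", "en", "fr"): "Comment allez-vous ?"},
    engines={("en", "fr"): lambda text: f"[MT fr] {text}"},
)
print(system.translate("How  are you?", "en", "fr"))
print(system.translate("Nice weather", "en", "fr"))
```

Keying engines by `(source, target)` lets different providers serve different language pairs, as the text describes.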
  • the present invention also provides a translated web browsing tool that provides machine translations of website content.
  • One example is translation of all text on a website into a language defined by a user.
  • the present invention also provides methods and apparatus that translate text embedded within graphics on a website.
  • the browsing tool of the present invention can be used both on an actual website and as a downloadable tool that plugs in directly to the user's browser.
  • the actual tool itself includes a toolbar that resides on the top or bottom of the user's browser screen and gives the user functional control over the language of translation, the URL of the website the user wishes to access, as well as a number of other features such as a way to submit the current site for human translation.
  • a screenshot of the browsing tool of the present invention is shown in Figure 6.
  • the browsing tool of the present invention is primarily a tool that provides individual users with access to Internet content that they would not be able to access without the tool. Examples of applications of the browsing tool include but are not limited to education, entertainment, research and the like.
  • Figure 7 illustrates data flow between the client wireless device, the wireless network provider, the data provider, and the translation services of the present invention.
  • the client wireless device is any personal mobile electronic device with a display/output apparatus, an input apparatus, and data transmission capability which is designed to serve as a mobile terminal for Internet and other network transactions. Examples include but are not limited to: cellular phones with data transmission and display capabilities, personal handyphone systems (PHS), personal digital assistants (PDAs), palmtop computers, and Internet/network capable appliances and devices.
  • the wireless network provider is the data transmission infrastructure which allows the client devices to exchange data with each other and with any other devices accessible over the network.
  • the data provider is any device which supplies either static or dynamic data to the client device over the data transmission infrastructure.
  • the present invention acts as an intermediary in this data exchange, translating the data from one language to another as it passes from client device to data provider, from data provider to client device, or from client device to client device.
  • the wireless translation applications of the present invention are substantially equivalent to the Internet translation applications. The differences between the two include the following: the data is encoded in WML (Wireless Markup Language), HDML (Handheld Device Markup Language), or some other standard for wireless data exchange, rather than HTML; the target end-user device is a data-capable cellular phone, personal computer, or other wireless data terminal rather than a desktop or laptop computer; and the data is transmitted over the data network of the cellular service provider instead of, or in addition to, being transmitted over networks such as the Internet.
  • a client wireless device may send a data transmission over the wireless data transmission infrastructure and network, where it is routed to another client wireless device.
  • Figure 8 illustrates a generic peer-to-peer data exchange where the translation service of the present invention acts as the intermediary in the data exchange.
  • the present invention is integrated with the wireless data infrastructure and network. As data is sent from a client wireless device to another client wireless device over the wireless network server, that data is passed to the present invention which translates/processes/transforms that data and returns it to the wireless network server to be routed to the destination wireless device. Examples of data transmissions which fit this peer-to-peer model include SMS (short messaging system) messages, alphanumeric pager messages and the like.
  • a client wireless device may send a request for data over the wireless data transmission infrastructure and network, where it is routed to a data provider server. That server replies with the requested data, which is returned to the client wireless device over the wireless data transmission network.
  • Figure 9 illustrates a generic client-server data exchange, where the translation service of the present invention acts as the intermediary in the data exchange.
  • a client wireless device formulates a request for data from a particular server; this request is then forwarded to the present invention.
  • the translation service accesses the wireless data and services specified in the client request and translates/processes/transforms that information before returning it to the requesting end-user. Examples of this client/server model include WAP data browsing, server push data, and the like.
  • the methods and apparatus of the present invention can provide draft-quality translations of text emails. Users simply type their email in their own language, and it is translated into the language of the person to whom the email is being sent. The translation can also take place on the side of the receiver, when someone receives an email in a language he or she may not be familiar with.
  • Instant messaging is designed as a communication platform for people who are accessing networks, including but not limited to the Internet, concurrently. When someone receives a message while offline, the message is stored for them to view the next time they log in to the Internet.
  • Translated instant messaging can be used in corporate communication, customer service, student interaction, and any other situation requiring instantaneous communication across a language barrier.
  • the System of the present invention can include a multilingual search engine that allows someone who speaks one language to search for information on the Internet or on a specific site that is in a different language.
  • a query can be entered in one language and the search engine of the System translates the query into the target language before searching for matching information.
  • the System also provides a mechanism that resolves ambiguous search queries by asking the user for more input in potentially ambiguous situations.
  • the multilingual search engine can be used to search for information on the Internet in general, or to search for a product or piece of information on a specific website or domain of information.
  • Examples of potential uses include but are not limited to searching for a certain type of business outside of a country on an informational website, searching for a certain type of product on a foreign ecommerce site, or searching the entire Internet for websites related to a certain topic that are written in a language not native to the user.
  • Methods and apparatus of the present invention can tie in directly with online auction and marketplace sites. This solution allows users of the marketplace or auction to post messages or product descriptions in such a way that they are easily viewed and translated into a number of different languages. Form fields and drop-down menus can be used that limit the number of choices a user has when describing a certain product. This allows for storage of the posted information in a format that can be easily transferred to any language.
  • Figures 10(a) and 10(b) are directed to an auction tool embodiment of the present invention which combines features of the multilingual search engine and the browsing tool. A user can enter a query in Language A, even though the site itself is in Language B.
  • a user could enter in Japanese "osara", which means plate. That gets translated into the proper language; in the case of an English-language site, "osara" is translated into "plate".
  • the regular query is run on the auction site's database. From the auction site's side, the interaction is the same as a regular, monolingual search; the auction site has not changed its processing at all.
  • the pages that are returned are translated by the System of the present invention and the links are shown along with the translated version of the links. When a user clicks on a link to see the actual auction, the auction page comes up and is completely translated.
  • the auction site does not have to change its database lookup or change the pages it pushes. All of the translation management, including preparing text for translation, executing translations, and displaying translated versions of pages, is handled by the System and is completely transparent to the auction site.
  • Figure 11 shows the dynamic translation cache, which records recently translated sentences and is dynamically updated with each translation call.
  • the dynamic translation cache is consulted first to see if the requested sentence was translated recently. If so, the recorded translation can be returned immediately, saving time and processing cycles on the translation engine.
  • This is significant for many applications of the translation System, but in particular for the auction tool.
  • In auction searches, users will often enter successive queries that are very similar to each other, varying the keywords only slightly. This causes the same auctions to be returned repeatedly.
  • the present invention capitalizes on this repetitive behavior with a dynamic cache that keeps a record of all the recent translations. This is done in a manner similar to the common phrase table, and similarly takes advantage of the characteristic repetitions in the application's language use.
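  • The dynamic translation cache described above can be sketched in a few lines. This is a minimal illustration, not the System's implementation: the class name, the `fake_engine` stand-in, the size limit, and the least-recently-used eviction policy are all assumptions, since the text only says that recent translations are recorded and consulted first.

```python
from collections import OrderedDict
from typing import Callable


class DynamicTranslationCache:
    """Records recent translations and consults them before calling an engine."""

    def __init__(self, translate: Callable[[str, str, str], str], max_entries: int = 1000):
        self._translate = translate          # hypothetical engine call
        self._max_entries = max_entries      # assumed bound on "recent"
        self._entries: OrderedDict = OrderedDict()
        self.engine_calls = 0

    def lookup(self, sentence: str, source: str, target: str) -> str:
        key = (sentence, source, target)
        if key in self._entries:
            self._entries.move_to_end(key)   # mark as recently used
            return self._entries[key]
        self.engine_calls += 1
        result = self._translate(sentence, source, target)
        self._entries[key] = result          # updated with each translation call
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the oldest entry (assumed policy)
        return result


def fake_engine(sentence: str, source: str, target: str) -> str:
    # Stand-in for a real translation engine, for illustration only.
    return f"[{source}->{target}] {sentence}"


cache = DynamicTranslationCache(fake_engine, max_entries=2)
cache.lookup("blue plate", "en", "ja")
cache.lookup("blue plate", "en", "ja")   # second call is served from the cache
```

  • With repeated auction queries, only the first occurrence of each sentence reaches the engine; later occurrences are answered from memory.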
  • the present invention integrates translated text and translated search into a product, allowing users of the marketplace or auction to search for goods and information in multiple languages. The results can then be displayed in their own language.
  • the present invention also provides users with custom dictionaries and common phrase lists tailored to their particular applications. This is especially effective where translations relate to a limited topic area, such as for a specialty goods auction site.
  • System Users can have direct access to the System, web-sites and interfaces of the present invention. Additionally, the System of the present invention permits users to act as hosts.
  • the framework of the present invention includes interface, distribution and translation layers.
  • a user uses the interface layer to construct an initial input and view the translated output as the user engages in multilingual communication.
  • Each application can have a unique interface that maximizes the effectiveness of translated communication in a particular domain.
  • the methods and apparatus of the present invention utilize Java-based input and output interfaces.
  • the present invention can also integrate with user interfaces of existing business applications, making it easy to empower existing applications with the capability of multilingual communication.
  • the present invention provides output interfaces that differ among applications, can be primarily Java-based, and handle output in all supported language pairs.
  • the output interface displays languages even to users who do not have an operating system that is native to the language of the output.
  • the System can utilize all Java Unicode character strings.
  • it may request that the user allow a short procedure that installs appropriate fonts and writes to certain configuration files on the user's system. This installation procedure enables the user's system to display fonts that are not native to their operating system.
  • the interface layer forwards the request to the distribution layer for further processing.
  • the distribution layer serves as a conduit between the user interface and the translation layer.
  • the distribution layer provides language-pair distribution and load balancing, and serves as a common interface to the translation layer.
  • the distribution layer ensures that the translation queries are passed on to the correct translation engines of the System based on the language-pair of the translation request.
  • the Distribution Layer utilizes a load-balancing system that manages the load of each instance of the translation engines.
  • the System of the present invention can create multiple instances of the translation engine.
  • the Distribution Layer ensures that the queries are distributed efficiently among the different instances of the engine.
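  • The two responsibilities of the distribution layer, routing by language pair and balancing load across engine instances, can be illustrated as follows. The names `EngineInstance` and `DistributionLayer` and the least-busy selection rule are assumptions made for this sketch; the text does not say which load-balancing strategy is used.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class EngineInstance:
    name: str
    active_requests: int = 0


class DistributionLayer:
    """Routes each request to an engine instance for its language pair."""

    def __init__(self) -> None:
        self._instances: Dict[Tuple[str, str], List[EngineInstance]] = defaultdict(list)

    def register(self, source: str, target: str, instance: EngineInstance) -> None:
        self._instances[(source, target)].append(instance)

    def route(self, source: str, target: str) -> EngineInstance:
        candidates = self._instances.get((source, target))
        if not candidates:
            raise LookupError(f"no engine registered for {source}->{target}")
        # Assumed policy: pick the instance with the fewest requests in flight.
        chosen = min(candidates, key=lambda inst: inst.active_requests)
        chosen.active_requests += 1
        return chosen

    def release(self, instance: EngineInstance) -> None:
        instance.active_requests -= 1


layer = DistributionLayer()
layer.register("en", "ja", EngineInstance("en-ja-1"))
layer.register("en", "ja", EngineInstance("en-ja-2"))
```

  • Adding capacity for a language pair then amounts to registering another instance for that pair.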
  • the interface layer will vary depending on the software application. Applications with widely differing user interfaces can all utilize the translation layer in the same manner.
  • Translation is performed at the translation layer which can include the five steps illustrated in Figure 13. Procedures that each step undergoes are more fully explained in Figure 14.
  • Figures 15(a) and 15(b) illustrate the full linguistic processing that occurs in the chat application specifically.
  • a series of input aids are available to help the user type in inputs. For example, typing in Japanese is very slow.
  • the method and apparatus of the present invention minimize the problems of monolingual input, which is usually of too poor a quality to translate well. This is achieved by providing a series of tools that strike a balance between the two extremes of fast input and strictly grammatical language. Further detail on input tools is given in Figure 16.
  • a static translation cache, also called a common phrase table, is provided. This includes phrases that are frequently repeated in chat applications (or whatever the specific application of the machine translation is); the phrases are stored with perfect translations. Items in the cache go directly to post-processing without going through the translation or other engines. Further detail on the static translation cache is given in Figure 17. Finally, there is some post-processing. Further detail on the post-processing is given in Figures 18 and 19.
  • In step 1, the text input is converted into a state that can be translated.
  • Text inputs differ among applications. It is important for every input to be distilled down to a form that can be synthesized by subsequent steps, which are application-independent. Therefore, step 1 is application-dependent.
  • In step 1, the following actions occur: unnecessary whitespace within the text is removed; improper capitalization is removed, to be restored later in step 5; excess punctuation is removed, again to be restored in step 5; the input is spell-checked; the input is grammar-checked; and certain contractions are removed from the input.
  • For HTML page translation, the pre-processing operates differently.
  • the pre-processing step handles the task of parsing the HTML, preparing the appropriate text for translation, and then reformatting the resulting translations while preserving the form of the original HTML page.
  • the text pre-processing step for HTML translation is described in more detail further below. As this pre-processing step distills the input, it retains information about the input's original state that is later restored to the translation in step 5, as the translated output is produced.
  • Figure 20 illustrates text pre-processing.
  • Text pre-processing can remove white space, remove and retain capitalization information, remove and retain punctuation information, and rewrite contractions.
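  • The idea of distilling the input while retaining enough information to restore it later can be sketched as below. This is a simplified illustration under stated assumptions: the `CONTRACTIONS` table is sample data, only all-capitals input and trailing punctuation are recorded, and the spell-check and grammar-check actions are omitted.

```python
import re
from dataclasses import dataclass
from typing import Tuple

# Illustrative sample only; a real table would be far larger.
CONTRACTIONS = {"i'm": "i am", "don't": "do not", "it's": "it is"}


@dataclass
class SurfaceInfo:
    """Information removed in pre-processing and restored after translation."""
    all_caps: bool
    trailing_punctuation: str


def preprocess_text(raw: str) -> Tuple[str, SurfaceInfo]:
    text = " ".join(raw.split())                      # remove unnecessary whitespace
    match = re.search(r"[.!?]+$", text)
    trailing = match.group(0) if match else ""
    if trailing:
        text = text[: -len(trailing)].rstrip()        # remove, but remember, punctuation
    letters = [c for c in text if c.isalpha()]
    all_caps = bool(letters) and all(c.isupper() for c in letters)
    words = [CONTRACTIONS.get(w, w) for w in text.lower().split()]  # rewrite contractions
    return " ".join(words), SurfaceInfo(all_caps, trailing)


def restore_text(translated: str, info: SurfaceInfo) -> str:
    text = translated.upper() if info.all_caps else translated
    return text + info.trailing_punctuation


distilled, info = preprocess_text("  I'M   FINE!!! ")
```

  • Here `distilled` is the application-independent form handed to later steps, and `info` travels alongside it until the text post-processing step.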
  • In step 2, the input is analyzed for special linguistic structures such as synonymous words, extraneous expressions and common phrases. This is also the level where the System will examine the input for any potential ambiguities that the user would need to resolve. In cases such as translated search, where the System employs a feedback-to-the-user mechanism, any such ambiguities would be resolved at this level before the input passes to the next step.
  • the language pre-processor determines what calls to make to step 4, as well as which of those calls to make to the static translation cache or to the translation engines, as described below.
  • Figure 21 is a continuation of Figure 20 illustrating the language pre-processing. This includes removing extraneous expressions (such as "well" or "so"), rewriting slang, rewriting abbreviations, and dissecting compound phrases into analyzable units.
  • The processing of Figures 20 and 21 includes (but is not limited to) the following specific steps: remove capitalization, preserving the information; and remove punctuation, preserving the information.
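  • A minimal sketch of the language pre-processing follows. The word lists are illustrative assumptions (only "well" and "so" come from the text), filler words are dropped only at the start of the input to avoid damaging sentences such as "i feel well", and the dissection of compound phrases is not attempted here.

```python
EXTRANEOUS = {"well", "so"}                                  # examples named in the text
SLANG = {"gonna": "going to", "wanna": "want to"}            # illustrative
ABBREVIATIONS = {"brb": "be right back", "thx": "thanks"}    # illustrative


def language_preprocess(text: str) -> str:
    """Rewrite an already lower-cased, punctuation-free input for translation."""
    out = []
    for position, word in enumerate(text.split()):
        if position == 0 and word in EXTRANEOUS:
            continue                      # drop a leading filler word only
        word = SLANG.get(word, word)      # rewrite slang
        word = ABBREVIATIONS.get(word, word)  # rewrite abbreviations
        out.append(word)
    return " ".join(out)
```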
  • Step 3 is where the lowest-level translation occurs. Depending upon the content of the text input, the translation step employs one of two subsystems:
  • a translation cache stores commonly submitted inputs and their translations for extremely fast lookup.
  • the motivations for the static translation cache are translation quality, speed and scalability.
  • the present invention is able to specify perfect translations for a large number of the most commonly submitted text inputs.
  • Such inputs include colloquialisms, slang and common phrases in each language, as well as specialized phrases that are common in specific client applications and industries.
  • the present invention increases the speed of common translations and thereby improves the user experience. By minimizing the number of calls that the System of the present invention makes to the translation engines, scalability and stability are improved.
  • the cache functions by grouping phrases that have similar meanings and then associating a single canonical phrase with each group. When performing a translation on any of those phrases, the cache returns the translation of that canonical phrase.
  • the cache includes a database table of canonical phrases across all supported languages and a series of hashtables for each supported language.
  • the database table holds canonical phrases that have a version in every supported language.
  • for example, the expression "Hello" is universal and has a version in all languages.
  • Each hashtable stores phrases that may not have exact equivalents in the other languages, but can be approximated to one of the canonical phrases in the first table.
  • the key of the hashtable is the common phrase, and the value is an index to the row in the first database table with the equivalent canonical phrase. Because the text has been pre-processed and distilled prior to handling by the cache, the lookup is not disturbed by minor textual differences in the input such as extra spaces or inadvertent punctuation.
  • In chat applications, a large number of common phrases, including but not limited to greetings, frequently repeated phrases, and chat lingo, are stored in a table that lists translations for each of the phrases. This ensures fast, completely accurate translations for the most common phrases which people use in the chat environment.
  • the phrases stored in this master table are called canonical forms.
  • the table also lists variants of each of these phrases in each language so that these will be recognized as well. These variants include contracted versions, versions with extraneous, non-content-bearing words ("Hello" vs. "Hello there"), and synonymous expressions ("I'm well" vs. "I'm fine").
  • Figure 17 illustrates the common phrase table with the static translation cache.
  • the processed input is received, and then converted to a key form by removing particulars of language usage. (This process includes many of the steps described in Figures 20 and 21.)
  • A lookup is then performed in the key table to see if the input is a common phrase.
  • the key table then gives a reference into the canonical phrase table, which gives the output translation of the input in the appropriate target language. Punctuation, capitalization information, and the like are then restored.
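  • The two-table structure of the static translation cache, one table of canonical phrases and one hashtable of variants per language, can be sketched as follows. The table contents and the function name `cached_translation` are illustrative assumptions; the input is assumed to have already been reduced to key form by the pre-processing steps.

```python
from typing import Dict, List, Optional

# Row index -> canonical phrase in every supported language (illustrative data).
CANONICAL_TABLE: List[Dict[str, str]] = [
    {"en": "hello", "ja": "こんにちは", "fr": "bonjour"},
    {"en": "i am fine", "ja": "元気です", "fr": "je vais bien"},
]

# Per-language hashtables: variant phrase -> row index in CANONICAL_TABLE.
KEY_TABLES: Dict[str, Dict[str, int]] = {
    "en": {"hello": 0, "hello there": 0, "i am fine": 1, "i am well": 1},
    "fr": {"bonjour": 0, "salut": 0, "je vais bien": 1},
}


def cached_translation(phrase: str, source: str, target: str) -> Optional[str]:
    """Return the stored translation, or None if the phrase is not a common phrase."""
    row = KEY_TABLES.get(source, {}).get(phrase)
    if row is None:
        return None   # caller falls back to a translation engine
    return CANONICAL_TABLE[row].get(target)
```

  • Because every variant maps to a row index rather than to a translation, adding a new target language only requires a new column in the canonical table.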
  • the System of the present invention sends the input to an appropriate third-party translation engine for processing.
  • the present invention utilizes several translation engines to ensure that the quality of translation is optimal for each supported language-pair and treats each third-party engine as a virtual black-box. Different engines have different capabilities.
  • a custom Java wrapper is written for each engine, which serves as a common API so that previous steps do not have to understand or interact with each engine's unique API.
  • Each engine instance handles a single language pair and produces for each text input one translated output.
  • Each third-party translation engine is treated as a distributed object and is communicated with using the RMI protocol.
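  • The wrapper idea can be illustrated with a small adapter. The text describes Java wrappers reached over RMI; the sketch below is written in Python purely for illustration and leaves out remote invocation. `VendorAEngine` and its `run_job` method are hypothetical stand-ins for a third-party engine's own API.

```python
from abc import ABC, abstractmethod


class TranslationEngine(ABC):
    """Common API: one instance handles one language pair."""

    def __init__(self, source: str, target: str) -> None:
        self.source = source
        self.target = target

    @abstractmethod
    def translate(self, text: str) -> str:
        """Produce exactly one translated output for one text input."""


class VendorAEngine:
    """Hypothetical third-party engine with its own calling convention."""

    def run_job(self, payload: dict) -> dict:
        # Placeholder behaviour so the sketch runs; a real engine would translate.
        return {"result": payload["text"][::-1]}


class VendorAWrapper(TranslationEngine):
    """Adapts the hypothetical vendor API to the common interface."""

    def __init__(self, source: str, target: str) -> None:
        super().__init__(source, target)
        self._engine = VendorAEngine()

    def translate(self, text: str) -> str:
        reply = self._engine.run_job({"text": text, "pair": f"{self.source}-{self.target}"})
        return reply["result"]


wrapper = VendorAWrapper("en", "ja")
```

  • Earlier steps call only `translate`, so an engine can be swapped without changing them.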
  • the System of the present invention utilizes multiple instances of each translation engine running on numerous machines to minimize dependence upon the stability of any one single engine instance or machine. Further, the System of the present invention can be scaled simply by adding additional machines and connecting them to the distribution step, described previously.
  • the quality of an MT engine's output is highly dependent upon the quality and relevance of the lexica it utilizes.
  • To improve the quality of output, lexica for each language are compiled for numerous topic areas, and the appropriate topical lexicon is applied to each communication domain.
  • for example, in a chat application's business rooms the translation engines employ business-related lexica, whereas its sports rooms use sports-related lexica.
  • Figure 22 illustrates the translation engines with proprietary dictionaries of different types. These include topic-specific dictionaries appropriate for the topic of the chat room, website, or other current application, and proper name and proper noun lists that are frequently updated to make the translations in each application as current as possible. Users are also able to make their own dictionaries.
  • In step 4, the various fragments of the original input are reassembled after having been split apart in step 2. Some parts of the input may have passed through a translation engine, while others were routed to the static translation cache.
  • Figure 18 illustrates the language post-processing (restoration) stages where separate units are constructed and text is reconstructed. This includes restoring certain abbreviations and reconstructing units that were separated during the language pre-processing in Figure 21.
  • Step 5 restores the textual changes that were made to the input in step 1.
  • text post-processing occurs with punctuation, contractions, and capitalization restored as appropriate. This step generally restores the information extracted in Figure 20. The text is then prepared for display and output.
  • the translation layer includes customized dictionaries.
  • One type of customized dictionary is topic specific where the topic is specified either automatically by the topic of the chat room, or manually by the user as he/she uses the browser or search engine as illustrated in Figure 23.
  • These topic-specific dictionaries include ones provided by the translation engine itself.
  • the dictionaries of the present invention improve the level of translation for specialized topics. Additionally, for general translation, the language and topics of the dictionaries are kept topical and current.
  • the present invention updates a dictionary of proper nouns with their correct translations or transliterations for all necessary language pairs.
  • the translation layer also includes user-specific dictionaries. Users are also encouraged to assemble personalized lexica. These allow users who use specialized language not handled well by general dictionaries to specify the desired translations. Users who have familiarity with a language other than their own can build this dictionary directly. If a user lacks this ability, the present invention provides a tool for speakers of two different languages to specify jointly the proper translation of a term for the dictionary.
  • a second feature allows a person to store a word in his/her personal lexicon, notify a professional translator, and have the correct translation of the expression added to his/her dictionary at some point in the next day or two.
  • the System of the present invention provides a filter that scans all input to the chat application for slang, idioms, chat lingo and problematic constructions.
  • the filter expands or rewrites these specialized phrases to expressions in a form that can be better translated by the translation engine.
  • the filters can be constantly updated to keep up with current slang and chat language.
  • the present invention provides feedback to enable the users of the System to judge and respond to the quality of the translated output.
  • Figure 25 shows the many levels of feedback which are incorporated in the translation System of the present invention.
  • the System incorporates different types of feedback to improve the quality and usability of the translations.
  • the "Help?" button provides a mechanism for other users to say whether an input was translated in an understandable fashion. This is especially important for monolingual users who otherwise have no way of knowing whether their inputs are being translated correctly.
  • System-User Feedback: The System incorporates warnings, suggestions, and both static and interactive tutorials to actively educate users to use the translation engines as productively as possible.
  • User-System Feedback, Direct: Users are able to direct the translations through a number of means, such as "do not translate" lists, "do not translate" markers in the input line, and user-defined dictionaries.
  • User-System Feedback, Indirect: User activity directs modifications that the developers make to the translation System. For example, users will report poorly translated words and phrases, and developers will also monitor user-defined dictionaries and "do not translate" lists to find items to add to the System dictionaries.
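  • One plausible way to honor a "do not translate" list is to swap the protected terms for placeholder tokens before translation and swap them back afterwards. The sketch below assumes, and this is an assumption rather than something the text states, that the engine passes tokens of the form `__DNT0__` through unchanged; the function and token names are hypothetical.

```python
import re
from typing import Callable, Iterable, List


def translate_with_protected_terms(
    text: str,
    protected: Iterable[str],
    translate: Callable[[str], str],
) -> str:
    slots: List[str] = []

    def stash(match: "re.Match[str]") -> str:
        slots.append(match.group(0))
        return f"__DNT{len(slots) - 1}__"

    # Longest terms first so that overlapping entries are matched greedily.
    terms = sorted(protected, key=len, reverse=True)
    if terms:
        pattern = re.compile("|".join(re.escape(term) for term in terms))
        text = pattern.sub(stash, text)
    translated = translate(text)   # assumed to leave placeholder tokens intact
    return re.sub(r"__DNT(\d+)__", lambda m: slots[int(m.group(1))], translated)
```

  • For example, with `str.upper` standing in for an engine, the protected name survives while the rest of the sentence is transformed.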
  • Figure 26 illustrates the way the System educates the user about the MT engine, as well as about his or her own language. People believe that they are experts on their native languages; however, most people have limited knowledge about their native language and how it works.
  • the tutorial informs users about the elements in their language which are likely to be ambiguous or difficult to translate, such as slang expressions, idioms, and certain words and constructions ("got", "se", etc.). Interacting with the MT engine itself also shows users who understand at least some of the target language the strengths and limitations of the System, and helps educate them about the most productive use of the translation engines.
  • User feedback to other users is from the recipient of the translated output to the original sender. For example, a user who receives an incomprehensible message can tell the original sender that he or she did not understand the message. This immediately prompts the sender to rephrase the message in a form that can be more easily translated. Similarly, a person receiving an instant message from a colleague may realize that part of the translated message is not very clear. The receiver can immediately prompt the sender to rephrase the difficult part of the original message.
  • User feedback to the translation system is feedback that the recipient of the translated output gives to the translation System. Over time, as large amounts of translation data are accumulated, the present invention can use this data to improve the quality of the translation System. This can occur manually or automatically with the System.
  • System feedback to the user occurs in a negotiated translation when the System and the user together attempt to resolve ambiguities in a translation.
  • System feedback is especially critical when text entries are short as in search queries.
  • the System can respond by prompting the user to select from a list of ambiguity-resolving options. Without this type of feedback, highly accurate translated search queries are not possible.
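  • The negotiated translation of a short query can be sketched as a lookup that returns the options to show the user, followed by a step that applies the user's selections. The `SENSES` table and both function names are assumptions for illustration; the text does not describe how ambiguous words are detected.

```python
from typing import Dict, List

# Illustrative sense inventory; a real one would come from the System's lexica.
SENSES: Dict[str, List[str]] = {
    "plate": ["dish for food", "sheet of metal", "vehicle license plate"],
}


def ambiguity_prompts(query: str) -> Dict[str, List[str]]:
    """Return, for each ambiguous word, the options to present to the user."""
    prompts: Dict[str, List[str]] = {}
    for word in query.lower().split():
        options = SENSES.get(word, [])
        if len(options) > 1:
            prompts[word] = options
    return prompts


def apply_choices(query: str, choices: Dict[str, str]) -> List[str]:
    """Replace each ambiguous word with the sense the user selected."""
    return [choices.get(word, word) for word in query.lower().split()]
```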
  • the text-processing step for HTML page translation differs from that for plain text translation.
  • the present invention parses HTML pages, and provides placement of translations in the HTML page.
  • There are two options for HTML page translation: show both the original and the translation, or show only the translation. When the original and translation are both shown, the translations are preferably inserted into the original page without disrupting the form of the page.
  • the System then parses the HTML page and finds key markers which delineate appropriate locations for inserting translations. When only the translation is shown, the original text is replaced entirely by the translations.
  • Figure 15(b) is similar to Figure 15(a) with the same data path, but is for non-interactive applications such as the browser or auction tool. Wireless communication would also fall under this same grouping.
  • Figure 27 describes the browsing tool on a high level as a three-step process: 1) the user makes a page request, 2) the request undergoes processing by our System, and 3) the System returns the response page to the user.
  • a user request consists of a URL and a language pair — source language and target language.
  • the source language is the original language of the page, and the target language is the language that the page is translated into.
  • the source language may become optional as a language identifier is incorporated into the browsing tool.
  • the request may also include cookies previously set by the web site associated with the page request and other parameters, including but not limited to form parameters which can be forwarded on with the request.
  • Figure 28 is a high-level blowup of step 2 from Figure 27. It describes the overall processing which occurs between the users' page request and the page response. Three steps are included: extract parameters from the user request, perform page retrieval and processing, and return the processed page within a dynamically-generated page.
  • Figure 29 provides more detail of step 2 from Figure 28 for page retrieval and processing.
  • There are three main types of page requests. The first is a user-specific page. These pages are never cached because it is assumed that their content changes too often for them to be effectively cached; they are always newly retrieved and rewritten on each new request. An example of such a page is a user profile page.
  • the second type is a non-user-specific page that has been cached. If a page is not user-specific and has already been cached, it is pulled from the cache and returned.
  • the third type is a non-user-specific page that has not been cached or whose cache entry is out-of-date. These pages are newly retrieved and rewritten. In addition, they are stored in the cache for future queries.
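  • The three cases can be expressed as one decision procedure. In this sketch the maximum age that makes a cache entry "out-of-date", the injected `clock`, and the `fetch_and_rewrite` callable are assumptions; the text does not define how staleness is determined or how a page is classified as user-specific.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

LanguagePair = Tuple[str, str]


@dataclass
class CachedPage:
    html: str
    stored_at: float


class PageCache:
    def __init__(
        self,
        fetch_and_rewrite: Callable[[str, LanguagePair], str],
        max_age_seconds: float = 3600.0,           # assumed staleness rule
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch_and_rewrite
        self._max_age = max_age_seconds
        self._clock = clock
        self._pages: Dict[Tuple[str, LanguagePair], CachedPage] = {}

    def get(self, url: str, pair: LanguagePair, user_specific: bool) -> str:
        if user_specific:
            return self._fetch(url, pair)           # type 1: never cached
        key = (url, pair)
        entry = self._pages.get(key)
        now = self._clock()
        if entry is not None and now - entry.stored_at <= self._max_age:
            return entry.html                       # type 2: fresh cache entry
        html = self._fetch(url, pair)               # type 3: missing or out-of-date
        self._pages[key] = CachedPage(html, now)
        return html
```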
  • Figure 30 is a blowup of page retrieval as represented in Figure 29.
  • the browsing tool of the present invention must first request the page from the source web site. In order to do this, it must first extract necessary information from the user's request, create a new second request, and then utilize this second request to query the source site.
  • the page retrieval process consists of five steps: 1) Add parameters to URL. Here, any parameters contained in the user's page request are added to the URL of the second request.
  • Figure 31 is a blowup of step 1 from Figure 30 describing how parameters are added to a URL before querying the source site.
  • this process is language-sensitive. Specifically, when a user is viewing pages with source language A in target language B, parameters which represent user inputs are translated from language B to language A before being added to the page request. This enables the users to actively interact with pages that are not in their own language. For example, if a user is viewing an auction site whose source is language A in language B, and the user wishes to enter a search query in language B for a particular object, that search query is translated into language A before being submitted to the page.
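  • The language-sensitive handling of parameters can be illustrated with the standard `urllib.parse` module. How the tool decides which parameters are user inputs is not stated in the text, so the `user_input_fields` set, the function name, and the example URL are assumptions.

```python
from typing import Callable, Dict, Set
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_source_request(
    url: str,
    params: Dict[str, str],
    user_input_fields: Set[str],
    translate_to_source: Callable[[str], str],
) -> str:
    """Add request parameters to the URL, translating user-typed values first."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    for name, value in params.items():
        if name in user_input_fields:
            value = translate_to_source(value)   # language B -> language A
        query.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Toy dictionary standing in for the translation call.
to_english = lambda text: {"osara": "plate"}.get(text, text)
request_url = build_source_request(
    "http://auction.example/search?cat=7",
    {"q": "osara", "page": "2"},
    {"q"},
    to_english,
)
```

  • Only the search term is translated; structural parameters such as the page number are forwarded untouched.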
  • Figure 32 is a blowup of step 2 from Figure 30. Cookies which are passed in as part of the user request are rewritten to drop the path prefix of the present invention, thereby restoring the original path of the cookie, and the browsing tool includes the cookie when querying the original source site.
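  • Restoring a cookie's original path amounts to stripping the prefix that the browsing tool added. The prefix format used here (`/browse/en-ja`) and the function name are hypothetical; the text does not specify what the prefix looks like.

```python
def restore_cookie_path(cookie_path: str, tool_prefix: str) -> str:
    """Drop the browsing tool's path prefix, leaving the source site's path."""
    prefix = tool_prefix.rstrip("/")
    if cookie_path == prefix:
        return "/"
    if cookie_path.startswith(prefix + "/"):
        return cookie_path[len(prefix):]
    return cookie_path   # not rewritten by the tool; leave unchanged
```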
  • Figure 33 is directed to page rewriting and illustrates why the browsing tool of the present invention is unique.
  • the browsing tool kit enables the user to insert the translations inline, in order to view both the original text and the translation simultaneously. Additionally, the browsing tool preserves the look and feel of the original page. This is accomplished by carefully positioning the insertion of translations at strategic locations within the page so that they do not significantly shift or displace original content. While the page is made inherently longer, its overall look and feel is not disrupted.
  • page rewriting is a two-pass process.
  • In the first pass, the page is traversed and translation placeholders are inserted in places where translations should later be added. Simultaneously, a list of text strings which must be translated is extracted.
  • In the second pass, the text strings, which have at this point been translated, are inserted into the page, replacing the placeholders.
  • Figure 35 is a graphical illustration of the page rewriting process described in Figure 34.
  • Figure 36 is a blowup of pass 1 from Figure 37.
  • In pass 1 the HTML page is traversed and HTML elements are encountered. Each HTML element is handled uniquely. Certain HTML elements represent textual elements which require translation, while other elements contain links which must be rewritten. Still other elements require other handling.
  • Figures 37-39 give examples of how certain elements are handled in pass 1.
  • Figure 37 illustrates handling of normal text. Normal text is defined as text positioned outside of any HTML tags. Normal text is handled in two ways: 1) It is copied to the rewritten page, and 2) It is added to a cumulative text buffer to be later translated and inserted into the rewritten page. Both steps are required because the browsing tool displays both the original text and the translated text in the page. So the first step preserves the content and location of the original text piece. The second step causes the text to be translated and inserted into the page.
  • the text piece is added to a buffer and later translated and inserted into the page, rather than immediately translated and transferred to the page.
  • In order for the translations to be inserted into the page in a way that does not disrupt the page's original look and feel, they must be strategically positioned. This requires that each text piece not be immediately translated and inserted following its original text counterpart, but rather that translations be grouped together and later inserted into the HTML page in an appropriate location. This results in a more coherent page and a better user experience.
  • Figure 38 illustrates handling of JavaScript.
  • When a JavaScript block is encountered: 1) it is scanned for text strings which require translation, and these strings are replaced with placeholders in the JavaScript; 2) these strings are translated; 3) the placeholders are replaced by the newly translated strings; and 4) the new JavaScript block is copied over to the rewritten page.
  • Unlike normal HTML where the original text and the translated text are conveyed, JavaScript blocks only convey translations, since most JavaScript text strings represent single text elements which can only have a single value.
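The four JavaScript steps can be sketched as below. This is only an illustration: it assumes every double-quoted literal needs translation (the source does not say how translatable strings are selected), the `__STR0__` placeholder format is invented, and `translate` is a hypothetical callable standing in for the translation engine.

```python
import re

# Double-quoted JavaScript string literal, allowing backslash escapes.
STRING_LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"')

def rewrite_javascript(js, translate):
    strings = []

    def to_placeholder(match):
        # Step 1: pull the string out and leave a numbered placeholder.
        strings.append(match.group(1))
        return '"__STR%d__"' % (len(strings) - 1)

    templated = STRING_LITERAL.sub(to_placeholder, js)
    translated = [translate(s) for s in strings]          # step 2
    for i, text in enumerate(translated):                 # step 3
        templated = templated.replace("__STR%d__" % i, text)
    return templated                                      # step 4: copy to page
```

Only the translation survives in the output, matching the point that a JavaScript string can hold a single value.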
  • Figure 39 illustrates handling of the translation identifier.
  • Translation identifiers are HTML tags that signify the end of a contiguous chunk of text, representing a position where a translation of previous text should be inserted.
  • Figure 40 describes instances in the page where the browsing tool rewrites URLs.
  • When URLs representing textual content are encountered, they are rewritten through the server of the present invention. URL rewriting ensures that as the user clicks through to subsequent pages, these pages continue to be translated as well. This provides a seamless user experience, allowing the user to browse and translate the web freely without any intermediate steps.
  • This figure denotes specific cases where URLs are written to pass through the current invention. It is important to note that URLs are only rewritten if they represent textual content. URLs which represent other types of content, such as binary objects (images), should not be rewritten and should reflect the original source location.
  • the content of a URL is rewritten through the servers of the present invention by changing the URL to pass through servers of the System of the present invention, and the original source location becomes a parameter which is passed to the servers. This parameter denotes the page which the user is requesting.
  • a relative URL is one in which the domain of the source location is not specified. With the present invention, the source's domain is added as a prefix to the URL, and then the URL is written as described in the preceding paragraph.
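A rough sketch of this URL rewriting follows. The endpoint `http://translator.example/browse`, the parameter name `url`, and the extension list used to recognize binary content are all assumptions for illustration; the source does not specify how textual content is distinguished from binary objects.

```python
import posixpath
from urllib.parse import urlencode, urljoin, urlparse

PROXY_ENDPOINT = "http://translator.example/browse"   # hypothetical server
BINARY_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png"}  # assumed heuristic

def rewrite_url(href, page_url):
    # Relative URLs first get the source's domain (and path) filled in.
    absolute = urljoin(page_url, href)
    extension = posixpath.splitext(urlparse(absolute).path)[1].lower()
    if extension in BINARY_EXTENSIONS:
        # Images and other binary objects keep their original location.
        return absolute
    # Textual content is routed through the proxy, with the original
    # source location passed along as a parameter.
    return PROXY_ENDPOINT + "?" + urlencode({"url": absolute})
```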
  • Figure 41 is a blowup from Figure 35 for text translation. Figure 41 illustrates how text is translated as part of the browsing tool. Multiple individual text strings which have been encountered during the page traversal process are concatenated. A single concatenated string is passed to the translation engine, which returns a single translated string. This single translation string is broken back up into multiple translations and returned. This approach enables all the text on a page to be translated using a single call to the translation layer.
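The concatenate, translate once, then split approach might look like the sketch below. The delimiter value is an assumption (the source does not name one), and `engine` is a hypothetical callable that is assumed to pass the delimiter through unchanged.

```python
DELIMITER = "\n@@@\n"  # assumed marker the engine leaves untouched

def translate_batch(strings, engine):
    if not strings:
        return []
    joined = DELIMITER.join(strings)      # one concatenated string
    translated = engine(joined)           # single call to the engine
    parts = translated.split(DELIMITER)   # break back into pieces
    if len(parts) != len(strings):
        raise ValueError("engine altered the delimiters")
    return parts
```

The length check guards against an engine that mangles the delimiter, in which case the pieces could no longer be matched to their originals.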
  • Figure 42 is a blowup of the final stage from Figure 30 for handling incoming cookies. This is the reverse of the process described in Figure 32. Cookies which are returned from the queried site are rewritten so that their path passes through the servers of the present invention and are then inserted into the page response and returned to the user. This ensures that the cookies will be resent to the site whenever the user utilizes the browsing tool to access the same site in the future.
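Both directions of cookie path rewriting can be sketched as below. The `/proxy/<host>` prefix layout is an assumption chosen for the example; the source only says that a path prefix is added to incoming cookies and dropped from outgoing ones.

```python
def proxy_prefix(site_host):
    # Hypothetical layout: each source site gets its own path on the proxy.
    return "/proxy/" + site_host

def rewrite_incoming_cookie_path(path, site_host):
    # Cookie returned by the source site: route its path through the proxy
    # so the browser sends it back on later proxied requests.
    return proxy_prefix(site_host) + path

def restore_outgoing_cookie_path(path, site_host):
    # Cookie sent by the browser: drop the proxy prefix to recover the
    # original path before querying the source site.
    prefix = proxy_prefix(site_host)
    if path.startswith(prefix):
        return path[len(prefix):] or "/"
    return path
```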
  • the HTML page translation process of the present invention performs the following steps: (i) Performs text pre-processing on the HTML page, parsing the HTML page and producing a collection of text strings that should be translated.
  • (ii) Performs language pre-processing on each of these text strings.
  • the language pre-processor determines what, if any, textual elements within each of these strings needs to be sent on for translation. For each of these to-be-translated strings, it separates them into two groups, "a" those that should be handled by the third-party translation engines, and "b" those which should be handled by the static translation cache. For those in group a, the language pre-processor concatenates all of these to-be-translated elements into a single demarcated string to the third-party engine in step 4. By concatenating all of these strings into one, it limits the number of calls to the third-party engine for each HTML translation.
  • the language pre-processor makes individual calls to the translation cache for each textual element.
  • Upon receiving all translations from step 4, language post-processing occurs where the proper outputs are reconstructed. Text post-processing in step 6 then reconstructs the HTML page, inserting translations in the appropriate locations and thereby preserving its original form.
  • the System can properly handle HTML pages with Javascript, Forms, and
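The routing between the static translation cache and the third-party engine can be sketched as follows. The rule used here (a string goes to the cache if the cache already holds it) is an assumption, since the source does not say how the two groups are chosen; `engine_batch` is a hypothetical callable that translates a list of strings in one call.

```python
def translate_strings(strings, cache, engine_batch):
    results = {}
    for_engine = []
    for text in strings:
        if text in cache:
            results[text] = cache[text]   # group b: individual cache lookup
        else:
            for_engine.append(text)       # group a: batched for the engine
    if for_engine:
        for text, translated in zip(for_engine, engine_batch(for_engine)):
            results[text] = translated
    # Post-processing: return translations in the original order.
    return [results[text] for text in strings]
```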
  • the resulting page then operates as the original with no change in functionality. Additionally, the System can use optical character recognition technology to recognize the textual content of images, and provide a translation for text embedded in images as well as pure HTML text.
  • the framework is written primarily in Java, making it compatible with existing software applications and legacy systems. Because the framework is Java-based, it can run on any platform.
  • the present invention can operate on Linux Pentium machines.
  • the third-party translation engine can operate on a Unix, Linux or Windows NT platform on distributed machines.
  • a "Help" button can also be provided.
  • the use of translation within a chat environment allows users to give feedback about the understandability of a statement's translation. This feedback takes the form of a button connected with each posted message, called the Help button, which other users can click to indicate an unclear translation; see Figure 43.
  • the user that made the original statement is notified of the bad translation and shown a screen with an editable copy of the statement as illustrated in Figure 44.
  • the mistranslated statement can be sent through a grammar checker which can scan the input for a number of possible problems, including unparsable grammar, misspelled words, difficult-to-translate words or constructions, ambiguous words and the like.
  • the Help button provides information about misunderstood sentences to the user and closes the feedback loop. This is done when another user does not understand the translation.
  • the System of the present invention removes the burden of detecting the need for clarification from the computer.
  • the System takes advantage of the communal nature of the chat room to allow users to help each other to find the best language for translation.
  • the Help button can refer to either the full user comment or to a phrase or word within the comment. In the case where just a single word or phrase is mistranslated, the other users can specify the specific part of the input which was confusing.
  • the translated browser of the present invention can automatically provide translation sites without requiring the user to specify the source language of each site. This is done by implementing a language identifier that scans a page, guesses the language and then executes the proper translation automatically. In the case where the identifier guesses wrong, or the page contains multiple languages, the user can override this feature. By removing any necessity for the user to worry about, or even be aware of, the source language of the materials he/she is looking at, the present invention makes the browsing experience as seamless as possible.
  • the present invention can also provide a translation helper for the user.
  • the translation helper is an interactive process with many functions, including instructing users on the proper use of the translation engine, helping users determine the best phrasing in order to achieve high-quality translation, and adjusting user expectations about the capabilities and limitations of machine translation. Users are trained to avoid problematic constructions. This is done through both a passive approach, which attempts to provide instruction and information to the user, and an active element that reacts to user input to guide the user to better phrasing for translation.
  • the present invention offers a number of formats for the information which allow the user to choose the level of detail and the conciseness of presentation which best suits his/her tastes.
  • the formats the user can choose from are:
  • a longer README-style file: this format gives an expanded form of the information with longer explanations and good and bad examples to illustrate each point.
  • the mascot is featured in amusing cartoons to make each point more memorable.
  • An interactive tutorial: in this version, the present invention guides the user interactively through a number of examples to communicate the information in a fun, memorable way as shown in Figure 45.
  • the interaction includes illustrations, small quizzes, good and bad examples, and areas to test examples with the translation engine.
  • Figure 46 is a blow-up of the different types of the tutorials, such as a quick, bulleted list of points, a more thorough tutorial with good and bad examples, illustrations, and explanations, or an interactive tutorial with quizzes, games, and translation test areas.
  • the more elaborate tutorials make the learning experience more memorable and fun.
  • the method and apparatus of the present invention provides a level of interactivity with the user to assist the learning process.
  • the present invention also provides tutorial daemons, which are programs that run in the background and monitor the users' inputs. By monitoring a user's typing before the sentence is sent to the translation engine, the present invention helps to guide the user toward sentences that are more easily translated and warns them of dangerous inputs. When a problem is detected, it is marked in the text within the input box and a "warning light" comes up in an area of the screen dedicated to tutorial messages as illustrated in Figure 47.
  • the tutorial daemon can include a spell checker, a grammar checker, a difficult phrase flagger and an input length meter.
  • Figure 48 shows how users' expectations and knowledge about the System are influenced through actual use of the System.
  • the tutorial daemon is a second-level tutorial which runs in the background and gives feedback about the user's input.
  • the tutorial daemon flags difficult words and phrases, troublesome constructions, spelling and punctuation errors, likely accent errors, troublesome zero-anaphora, unlikely part-of-speech sequences, and other possible sources of translation errors, in order to train the user. Further detail on the tutorial daemon is given in Figure 49.
  • the user receives feedback from seeing the translations that come through and other users give feedback in the form of the "Help?" button.
  • Figure 49 illustrates the tutorial daemon, which provides a number of checking stages to provide feedback to the user. Before the user hits the enter button the tutorial daemon provides a warning of things to watch out for.
  • the daemon includes (among other elements) a grammar checker, a spelling checker, a difficult-phrase detector, an input-length meter (to warn users about overly-long inputs), an ambiguity detector, and an ambiguity resolver, which uses local context to determine the meaning of ambiguous words and phrases.
  • the spell checker reports each word that does not appear in one of the active lexicons. It does this by checking the current input line at short intervals before the return key is hit, and marking, either by highlighting, underlining, or some other graphical notation, that an unknown word has been found. This allows users to filter out spelling errors, non-standard words and slang, as well as problematic proper nouns before they are sent to the translation engine. When a user right clicks on a questionable word, a list of suggested alternatives is presented to speed correction.
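The lexicon check and the list of suggested alternatives can be sketched as below. This is an illustration only: the tokenizer handles plain Latin-alphabet words, and the suggestions come from Python's standard `difflib.get_close_matches`, which is a swapped-in technique rather than anything the source describes.

```python
import re
from difflib import get_close_matches

def check_spelling(line, lexicons):
    """Map each word missing from every active lexicon to suggested fixes."""
    known = set().union(*lexicons)
    report = {}
    for word in re.findall(r"[A-Za-z']+", line):
        if word.lower() not in known:
            report[word] = get_close_matches(word.lower(), sorted(known), n=3)
    return report
```

A chat client could call this on the current input line at short intervals and highlight the keys of the returned mapping.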
  • the grammar checker checks grammar, such as how punctuation is used. It also attaches part-of-speech tags to each word and checks to see if any unlikely tag sequences are detected. A questionable sentence or phrase is highlighted to notify the user that the user should rephrase the input if possible. Right clicking on the questionable phrase brings up an explanation of the problem and a possible suggestion for a fix. Examples of what the grammar checker does include checking that every sentence in Japanese has a subject and verb, and that question words have the proper accent marks in Spanish.
  • a number of languages have words and phrases that are not grammatically incorrect but are difficult to translate. Examples of such difficulties include "no" and "suki" in Japanese; the impersonal passive with "se" in Spanish; "got" in English; and "marche" in French.
  • the difficult phrase flagger of the present invention highlights these to encourage the user to rephrase the sentence for better translation.
  • a right click on the problematic expression brings up an explanation and a list of preferable rewordings.
  • An input length meter is also provided with the present invention. Because translation quality declines with longer sentences it is important to keep input as short as possible. As a constant reminder of this, a small input-length meter is displayed next to the input text box in the chat application as illustrated in Figure 50. The input box is periodically checked to see how many words, or in Asian languages, how many characters, have been entered, and increases the meter reading accordingly. Certain words, such as conjunctions, push the meter's needle up even further. After a certain word count, the meter enters a red "Danger Zone" which warns the user that their input is much more likely to be mistranslated. The Danger Zone level depends on the language and engine being used.
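A minimal sketch of the meter reading follows. The conjunction list, the penalty of 2, and the default Danger Zone threshold of 15 are assumed values; the source says only that the threshold depends on the language and engine.

```python
CONJUNCTIONS = {"and", "but", "or", "because", "although"}  # assumed list

def meter_reading(text, danger_zone=15, conjunction_penalty=2,
                  count_characters=False):
    """Return (reading, in_danger_zone) for the current input box contents."""
    if count_characters:
        # Character-counted languages: count non-space characters.
        reading = len(text.replace(" ", ""))
    else:
        words = text.lower().split()
        extra = sum(w.strip(",.;") in CONJUNCTIONS for w in words)
        # Conjunctions push the needle up further than ordinary words.
        reading = len(words) + conjunction_penalty * extra
    return reading, reading >= danger_zone
```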
  • a daemon watches the queries a user inputs, and issues a suggestion if a number of one- or two-word queries are entered in succession.
  • a major obstacle is the difficulty in translating the exact meaning of the search terms.
  • the context of a number of search terms aids significantly in determining the exact meaning of the query words.
  • the average number of words per query as reported in most studies is usually around two, so without encouragement most users will tend to enter these short queries and will probably become discouraged by the poor search results.
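The short-query check performed by the daemon can be sketched as below. The run length of three consecutive queries is an assumed threshold; the source says only "a number of" short queries.

```python
def should_suggest_longer_queries(queries, max_words=2, run_length=3):
    """True if the last `run_length` queries were all one or two words long."""
    recent = queries[-run_length:]
    return (len(recent) == run_length
            and all(len(q.split()) <= max_words for q in recent))
```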
  • the present invention checks the input for potentially ambiguous words and phrases. These ambiguous expressions are highlighted to encourage the user to rephrase the input for a clearer translation. Without this feedback, the user will often have no idea why a sentence or query produced such a bad translation.
  • the ambiguous words can be detected by consulting a specialized word-translation dictionary which lists specific alternate translations for a word.
  • Ambiguous phrases can be detected either by scanning for specific phrases (such as a "yes" or "no" following a negative question) or by running part-of-speech tagging and seeing if there are multiple tag sequences judged likely.
  • an ambiguity resolution program can be triggered, either with a right click in chat, or automatically in search.
  • the resolver can either consult surrounding context or other search terms to determine the most likely sense of the ambiguous word, or it can spawn a dialogue box to ask the user for clarification directly.
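Detection and context-based resolution can be sketched as below. The `SENSES` table is a toy stand-in for the specialized word-translation dictionary, and the cue-word overlap scoring is an assumed heuristic; returning `None` represents the case where the resolver would ask the user directly.

```python
# Toy dictionary: word -> {alternate translation: context cue words}.
SENSES = {
    "bank": {
        "banque": {"money", "loan", "account"},
        "rive": {"river", "water", "fishing"},
    },
}

def find_ambiguous(words):
    """Words that the dictionary lists with more than one translation."""
    return [w for w in words if len(SENSES.get(w, {})) > 1]

def resolve(word, context):
    """Pick the sense whose cues best match the context, or None to ask the user."""
    scores = {translation: len(cues & set(context))
              for translation, cues in SENSES[word].items()}
    best = max(scores.values())
    winners = [t for t, score in scores.items() if score == best]
    if best > 0 and len(winners) == 1:
        return winners[0]
    return None
```

In search, `context` could be the other query terms; in chat, the surrounding words of the message.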
  • the method and apparatus of the present invention logs every word which passes through the translation engine untranslated and also logs every input which receives Help button feedback from another user. These logs permit immediate recognition of any patterns in mistranslation which occur, including words missing from the lexicon, constructions not covered in the translation System's grammar, and frequent grammatical and spelling errors.
  • the present invention provides a number of aids and shortcuts to help users enter their input quickly and correctly. These include an iconic entry which provides a shortcut for input in the chat application. A user clicks on a series of special icons which immediately insert certain set phrases into his/her text entry box. These icons take three different forms serving different purposes.
  • Figure 16 illustrates the wide variety of input aids incorporated into the System. These include typing shortcuts (either by keyboard or mousing on a separate menu), emoticons, a hyperlinked dictionary, buttons which introduce long phrases into the chat or other application, special characters which set apart text which is not to be translated, lists of words and phrases which are not to be translated, and automatic recognition of URLs in the text.
  • the present invention also provides typing shortcuts.
  • the chat environment requires fast input and quick reaction to maintain a fun and interesting level of interaction.
  • this pressure to increase input speed also encourages the user to cut corners which greatly harm the quality of translation.
  • These cut-corners include abbreviating frequently repeated words, using pronouns, and leaving out subjects or verbs entirely, especially in Japanese.
  • the present invention shows the user a small window with a number of phrases which can immediately be entered into the text input line with a single click of the mouse as illustrated in Figure 51.
  • the phrases can also be accessed with keyboard shortcuts to make input even faster and simpler.
  • the present invention also provides a number of emoticons and illustrations that users can include in their messages, such as a smiley face and a heart that is illustrated in Figure 52. These are transparent to the translation engine and thus will have no effect on the translation quality. However, they have a substantial effect on the user-friendliness of the System and the total ability of the users to communicate and connect with each other.
  • Action buttons are provided to enable a user to select from a menu of buttons which print out full sentences describing the user's attitude or actions. These range from the straightforward ("[User A] scratches his head.") to the cute ("[User A] blows [User B] a kiss!") to the silly ("[User A] dances the Macarena."). Each action phrase is stored in each translated form, and is displayed to each user in the appropriate language.
  • Special characters are designated which signal the translation engine of the System of the present invention not to translate part of the input.
  • the user simply surrounds the text not to be translated with these special characters and the translation System of the present invention ignores that section of the input and sends it through verbatim. These are important when entering names which are also common nouns (e.g. Nick, Young, the Giants, Los Angeles), when entering titles which the user does not want translated, and when users are discussing actual language use and language learning and need to mention specific examples.
  • users can construct a personal list of words and expressions which are not to be translated. With such a list, a user can record names and titles which he/she mentions frequently, removing the need to annotate them each time with the special characters.
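One way to picture the special characters and the personal list is the sketch below. The marker `~` and the `XPROT0X` placeholder format are invented for the example, and `translate` is a hypothetical callable assumed to pass placeholders through unchanged.

```python
import re

MARK = "~"  # hypothetical marker; the actual special character is not assumed

def translate_with_protection(text, translate, personal_list=()):
    protected = []

    def stash(fragment):
        # Hide the fragment behind a placeholder the engine will not alter.
        protected.append(fragment)
        return "XPROT%dX" % (len(protected) - 1)

    marked = re.compile(re.escape(MARK) + r"(.+?)" + re.escape(MARK))
    text = marked.sub(lambda m: stash(m.group(1)), text)
    for term in personal_list:
        # Terms on the personal list need no markers.
        if term in text:
            text = text.replace(term, stash(term))
    translated = translate(text)
    for i, fragment in enumerate(protected):
        translated = translated.replace("XPROT%dX" % i, fragment)
    return translated
```

The protected fragments come back verbatim while the rest of the message is translated normally.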
  • hyperlink dictionaries are provided and permit a user to immediately bring up the dictionary definition for any word by right clicking on that word. This is important for users because many of them will be language learners or people interested in other cultures and they will want the ability to see immediately the meaning of new words they encounter.
  • the user might feel some concern that the intended meaning of the sentence was preserved in the translation.
  • One way to reassure the user and give him/her the power to make sure the translation is correct is to make available the dictionary definitions of the translated words.
  • the dictionary definitions shown can either be the literal dictionary entries in their entirety, or a check of local context can be used to determine which particular sense of the word is correct for the sentence.
  • the My Translator area includes the following:
  • the present invention permits chatters to set their keyboard shortcuts to enter certain words and phrases automatically. Frequent chatters will appreciate being able to store these shortcuts from session to session.
  • a general-purpose space is provided to the user to jot down notes while using the web browser, search engine, or chat rooms.
  • a number of personalization features are unified and presented in one section for the user's convenience. Each user is able to have their own area where they keep their personalized dictionary, a personal "do not translate" list, personally chosen default lexicon selection, personally defined keyboard shortcuts, and a notepad.
  • the next page that they see after the Splash page is the Welcome page and then from that page, if they are a returning user they can log right in, go to the chat rooms page, choose a chat room and start chatting. If they are new users, there are three main options. Ideally they would go to the sign up form, fill all of that out, then go to the tutorial, learn how to use the chat and then go to the chat rooms page. If they are not convinced that they should sign up on the Welcome page, then they can go to the tour, find out more about it, and then go to the sign up form. There are also a lot of other pages on the site that anybody can access.
  • the web site of the present invention is built with different language zones. Initially a user comes in and selects a language.
  • Figure 56 illustrates that there are a plurality of different features in the chat application.
  • the user can have conversations with other users by exchanging translated messages in the chatroom.
  • the user can also open a private chat window in order to have a one-on-one conversation.
  • the user can switch to another chatroom.
  • the user can view profiles of other users and see their gender, location, age, occupation, fluent languages, country of origin, and personal message. They can edit their own personal profile, as well. And they can access the help section which includes a tutorial, translation tips, support form, and FAQ.
  • Figure 57 illustrates that there are different features in the chat room.
  • Examples include keyboard shortcuts for entering special characters, icon messages so users can send pictures (such as a smiley face) as a message.
  • Bilingual users can switch the enter language control and enter in different languages. When the user moves the mouse over components of the chatroom, a description of that component appears in the mouseover tip box. Moderators have extra features, such as silencing other users or even eliminating their accounts if they are too disruptive.
  • a "Do Not Translate" feature is also provided. This is utilized when the user is entering a phrase and wants to have a part of it not translated. For example, if they type in "Apple Computer" in the English-French chatroom, they do not want "Apple" to be translated into "pomme," the French word for "apple." Right now we have a feature where the user can place "o" characters around whatever they don't want to have translated. There are two ways to do this: they can use the Do Not Translate button or type in the characters themselves. The Do Not Translate button is on the chat, and when they hit that button, the "o" characters are automatically inserted around the cursor. So when they type, they are actually typing between the "o" characters already instead of having to go and put the special characters around it themselves. But once they learn that those characters keep the phrase from being translated, they can just type them in themselves instead of using the button.
  • the Help process can be used when somebody enters a message that a user doesn't understand. The user can let the other party know that the user doesn't understand. We will go into more detail about this in Figure 58.
  • the Help process is illustrated in Figure 58.
  • user A enters a message with a typo.
  • User B views the translation but doesn't understand it.
  • User B can click the Help button that is on User A's message and right away it will put up a message that says, "User A, I didn't understand your message. Please rephrase it.” Both that message and the message that was misunderstood become highlighted.
  • User A can re-enter the message so it can be translated again.
  • a keyboard shortcut is also provided: when the up arrow is pressed the previous messages appear in the text box where the user enters its message. Instead of retyping the entire message User A can hit the up arrow, fix the typo in the previous message, and send it again. This provides a fast way for people to be able to let each other know when there is any miscommunication.
  • the highlighting makes the process clearer and faster as well. By highlighting the message it becomes much easier to spot the misunderstood message.
  • Figure 59 illustrates the "current member's box".
  • In the upper right area of the chat room is a list of all of the members currently in the chat room.
  • a "personal information window” provides information on how to find a person.
  • "Private chat" brings up a new window where a user can chat one-on-one with that person, and the "ignore button" is used to ignore a user and stop seeing their messages. If none of the names are selected all of these buttons are disabled. If a user's own name is selected then the user can see their own profile and edit it. The other buttons are disabled.
  • the "personal information" button can be clicked so the user sees their own information, and the user can also edit it. Another button is provided on the "personal information" window which brings up another window where a user can edit their own profile information. If another member's name is selected, then all three of those features work. A user can see that member's profile, chat one-on-one with that person in another window, or gray them out and stop seeing their messages.
  • Switching language zones is illustrated in Figure 60. For example, if a user is viewing a website in French and decides to go to a chat room where another language is used, a window pops up that says "You are moving to a different language zone. Would you like to view it in English or Japanese?" The French user then selects the new language and from then on views the site or chat room in the new language.
  • Figure 61 is an overview of the browsing tool of the present invention.
  • the browsing tool is a frame and has various features, more fully described in Figure 62.
  • the browsing tool is utilized when a user on one website enters the URL of a website he or she would like to translate. The user then goes to that new website with the translations. At the bottom of the window the user clicks on a link on the page and goes to that new page which is also translated. A user can also enter a new URL into the browsing tool and goes to that site translated.
  • Figure 62 lists some of the browsing tool features.
  • the browsing tool permits a user to change what language the site is being translated to, including "none". Additionally, the user can customize it: keep their own favorite links, set up their own look and feel, and toggle between showing and hiding the original language.
  • a multi-lingual dictionary pop-up is also provided.

Abstract

A method for electronically translating text provides an electronic language translator. Source language text is received as an input to the electronic language translator. The source language text is translated at the electronic language translator at the time of submission into one or more target language texts. A user is then provided with an option of viewing one or more of the target language texts with or without the source language texts.

Description

METHOD AND APPARATUS FOR PROVIDING MULTILINGUAL TRANSLATION OVER A NETWORK
BACKGROUND OF THE INVENTION
Field of the Invention
This invention relates generally to translation methods and apparatus, and more particularly to translation methods and apparatus over a network.
Description of the Related Art:
In most computer systems involving a central host processor and numerous distributed access devices such as video display terminals, information is transferred between the host and each access device via a screen display formed as an integral part of the access device. The screen serves the two-fold purpose of displaying input information provided by a user as well as displaying user-readable output information, generated by host processing, to the user. The input information is generally provided by the user via entries on a keyboard also formed as an integral part of each access device. Input or output information is typically composed of an arrangement of system-provided words or phrases followed by user or system-supplied data fields displayed in a predetermined pattern on the screen. In conventional systems, an application executing on the host utilizes only a limited number of screen patterns or formats so that only a standard set of screen images corresponding to the formats may be called into view by the user for input or invoked by the host processing for output. The definition of each screen format is typically deeply embedded in the source code for the application. There is essentially no flexibility provided to the user to allow for the creation of customized formats and, correspondingly, their screen images. In order to expand the user base of a previously developed software application system, particularly to allow foreign affiliates of the system developer full utilization of system capabilities, major modifications to the source code of the system have conventionally been required, such as by rewriting significant portions of the code implementing input/output (I/O) interface functions.
The foremost modification in the above special situation is that of translating the descriptive words or phrases from the original language (e.g., English) to a different language (e.g., Spanish). If there are affiliates from numerous foreign countries then, besides the effort of rewriting the source code, multiple copies of modified source code require storing, tracking and updating. Such a task becomes unwieldy, burdensome and costly. Thus, for example, if a source code module uses or produces user-viewable information, then there must be a different copy of the module for each language executable by that software. Besides the actual system copies of the code, support software is required to inform the system developer of the status of the multiple copies. Moreover, additional storage devices are needed to store all the additional versions of the software. For a large scale system involving millions of lines of code and thousands of modules, the storage requirements may become enormous.
In addition to the problem of direct language translation, there are also problems as how to treat the data supplied to the data fields. It is usually required that certain data be converted, such as by converting from non-metric to metric, and other data be reordered, such as month/day/year versus day/month/year which is the convention of some foreign affiliates.
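The two kinds of data-field treatment mentioned above, reordering and unit conversion, can be illustrated with a minimal sketch. The function names and the choice of miles-to-kilometres as the non-metric example are assumptions made for illustration; the source does not name specific fields or units.

```python
from datetime import datetime

KM_PER_MILE = 1.609344  # exact, by the international definition of the mile


def reorder_date(value: str,
                 source_fmt: str = "%m/%d/%Y",
                 target_fmt: str = "%d/%m/%Y") -> str:
    """Re-express a date field in the ordering used by the target locale."""
    return datetime.strptime(value, source_fmt).strftime(target_fmt)


def miles_to_km(miles: float) -> float:
    """Convert a non-metric distance field to its metric equivalent."""
    return miles * KM_PER_MILE


if __name__ == "__main__":
    print(reorder_date("04/02/2001"))  # month/day/year in, day/month/year out
    print(miles_to_km(10.0))
```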
A translation environment which serves to buffer the host system to each of the access devices is disclosed in U.S. Patent No. 4,870,610. Broadly speaking, the translation environment includes an autonomous processor interposed between the host system and each access device. Information transmitted in either direction between the host and access device is diverted to the processor for intermediate processing. The diverted information contains detailed character data either appearing on the input request screen originated at the access device or on the output response screen destined for the access device, depending upon the direction of original information transmission. The character data is of two types, namely, system-supplied field identifiers and user-provided data entries associated with the identifiers. Identifiers are expressed in a first user language (e.g., English).
In order to offer access to the host system by a user of a second language (e.g., Spanish), the screen displays and, most particularly, the identifiers are first translated to the second language via a format create process. The output of this create process is a translation file which stores the mapping relationship between the first language screen and its second language counterpart. The translation file is invoked by a translation execution process whenever the second language user accesses the host system. The contents of this file are used to translate from the second-to-first language upon a host request and from the first-to-second language upon a host response. A feature of this arrangement is that both the format create process and the translation execution process operate in the translation environment which is transparent to the host system. With the translation environment, the user may customize screen displays to maximize system utilization.
U.S. Patent No. 5,966,685 is directed to a system of parallel discussion groups operated in conjunction with a message collection/posting software program, data filter software program, and a machine translation software program. A structure and process is created to enable discussion group users, of different languages, to communicate with one another. An automatic batch process is utilized that executes at a remote site. No human intervention is required for the pre-processing, translation, or post-processing functions. Additionally, users simply specify a language preference to realize the benefits and advantages of the present invention.
A number of discussion groups run in "parallel"; one group for each language being used in the discussion groups. The individual discussion groups all contain the same information, in the same order; the only difference being that each parallel discussion group is written in a different language. Once a user logs onto a particular parallel discussion group he or she may then choose his or her language preference. If the user's language preference is set to French, the French version of the discussion group will be accessed. Messages posted to a discussion group will be periodically collected, translated to the other languages, and then posted to those respective target language discussion groups. The collection and posting of the messages will be accomplished by the Message Collection/Posting Software. The new messages which are collected on a periodic basis are sent to commercially available Machine Translation (MT) software for translation. Messages are batch processed automatically at the network site and without human intervention. The translation takes place at a remote site so user actions are minimized.
Before the input text is actually submitted to the MT software, the input text is passed through a filter software program which preprocesses the data before it is submitted to the MT software. The filter identifies and marks strings which are best left untranslated by the MT software, such as personal names, company product names, file and path names, commands, samples of source code, and the like. By marking these strings, the filter notifies the MT software to leave those strings untranslated. These strings are then linked to a preceding "hookword". Hookwords are automatically inserted then deleted in post-processing and are contained in dictionaries with a part-of-speech and other grammatical features to effect rearrangement of the word in the target language. Once the translation process is complete, the translations are collected and posted, by the Message Collection/Posting Software, to the target language discussion groups at the same location within the message structure as the original version of the message. The pre-processing, translation, and post-processing functions are all performed automatically in accordance with a batch process that executes on a periodic basis at the network site.
U.S. Patent No. 5,960,382 discloses a method and apparatus for translating a native-language message into a corresponding foreign-language message. Translation of an initially-unknown message is effected using native-language and foreign-language prototype messages that are independent of message variables, whereby a prototype message represents all messages of an individual type. An individual message is identified to belong to a particular type by using the native-language prototype message, and an equivalent foreign-language message is then generated by inserting variable values from the individual message into the foreign-language prototype message that represents the particular message type.
The native-language message, which includes a value of a variable, is matched against a plurality of native-language prototype messages to identify a corresponding native-language prototype message, which includes the variable. The plurality of native-language prototype messages preferably represent all native-language messages that require translation. The identification of the prototype native-language message is used to obtain (e.g., retrieve) a corresponding foreign-language prototype message, which also includes the variable.
The value of the variable, obtained from the native-language message that is being translated, is then substituted for the variable in the obtained foreign-language prototype message to yield a foreign-language message which corresponds to (i.e., which is a translation of) the native-language message. If the native language message includes values of a plurality of variables, the identified native-language prototype message and the corresponding foreign-language prototype message each includes the plurality of variables. The plurality of the variables have a first ordering in the identified native-language prototype message and a second ordering in the corresponding foreign-language prototype message, and the two orderings are generally different.
The substitution step then involves using the first ordering and the second ordering to determine a placement of the values of the variables into the obtained foreign-language prototype message. Preferably, the matching step involves the use of a multi-tiered multi-node tree constructed from the native-language prototype messages, and matching strings (e.g., words and numerals) which make up the native-language message in their order against the nodes of corresponding tiers in the tree to reach a node which represents the last string in the message and contains the message identifier of the corresponding prototype message. This identifier is then used to obtain the corresponding foreign-language prototype message, which has the same identifier.
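The prototype-matching scheme described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the implementation of the cited patent: the `{name}` placeholder syntax, the single-token limit on variable values, and all function names are hypothetical. Each tier of the tree corresponds to one token position, a wildcard edge stands for a variable slot, and the final node carries the prototype identifier used to fetch the foreign-language prototype.

```python
WILDCARD = object()  # tree edge that matches any single token (a variable slot)
LEAF = object()      # key under which a node stores its prototype identifier


def is_variable(token):
    return token.startswith("{") and token.endswith("}")


def build_tree(native_prototypes):
    """Build a multi-tiered tree from {identifier: native prototype} pairs."""
    root = {}
    for msg_id, prototype in native_prototypes.items():
        tokens = prototype.split()
        node = root
        for token in tokens:
            node = node.setdefault(WILDCARD if is_variable(token) else token, {})
        # Remember the variable names in their native (first) ordering.
        node[LEAF] = (msg_id, [t[1:-1] for t in tokens if is_variable(t)])
    return root


def match(node, tokens, captured=()):
    """Walk the tree tier by tier; return (identifier, {variable: value}) or None."""
    if not tokens:
        if LEAF not in node:
            return None
        msg_id, names = node[LEAF]
        return msg_id, dict(zip(names, captured))
    head, rest = tokens[0], tokens[1:]
    if head in node:  # prefer a literal match over a variable slot
        found = match(node[head], rest, captured)
        if found is not None:
            return found
    if WILDCARD in node:
        return match(node[WILDCARD], rest, captured + (head,))
    return None


def translate(message, tree, foreign_prototypes):
    """Substitute captured values into the foreign prototype with the same identifier."""
    found = match(tree, message.split())
    if found is None:
        return None
    msg_id, values = found
    return foreign_prototypes[msg_id].format(**values)


NATIVE = {"E1": "file {name} has {count} errors"}
FOREIGN = {"E1": "hay {count} errores en el archivo {name}"}  # second ordering differs
TREE = build_tree(NATIVE)

if __name__ == "__main__":
    print(translate("file report.txt has 3 errors", TREE, FOREIGN))
```

Because the values are keyed by variable name, the differing orderings of the two prototypes are resolved at substitution time rather than during matching.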
There is a need for a language translation method and apparatus for networks that addresses issues associated with two-bit characters. There is a further need for a language translation method and apparatus for networks that provides for user feedback. There is a further need for a language translation method and apparatus for networks that limits subject matter and language usage domains. Yet there is another need for a language translation method and apparatus for networks that takes advantage of application-specific characteristic repetitions in language.
SUMMARY OF THE INVENTION
Accordingly, an object of the present invention is to provide methods and apparatus that improve the quality and usability of the translations.
Another object of the present invention is to provide language translation methods and apparatus suitable for the internet and other distributed networks.
Yet another object of the present invention is to provide language translation methods and apparatus that provide user feedback. A further object of the present invention is to provide language translation methods and apparatus for the internet that limits subject matter and language usage domains.
Another object of the present invention is to provide language translation methods and apparatus for networks that takes advantage of application-specific characteristic repetitions in language.
Another object of the present invention is to provide language translation methods and apparatus for the internet that provide user feedback for determining whether inputs are translated correctly.
Another object of the present invention is to provide language translation methods and apparatus that actively educate users on how to use translation engines.
Another object of the present invention is to provide language translation methods and apparatus with user defined dictionaries.
Another object of the present invention is to provide language translation methods and apparatus that provide user direction and customization of translation. Another object of the present invention is to provide language translation methods and apparatus that permit modification in order to better handle the characteristic language of different specific applications.
Another object of the present invention is to provide language translation methods and apparatus that provide a static translation cache of frequently encountered phrases.
Another object of the present invention is to provide language translation methods and apparatus that provide a key form for storing cached phrases which removes extraneous information.
Another object of the present invention is to provide language translation methods and apparatus that provide for the most flexible and productive application of a phrase cache.
Another object of the present invention is to provide language translation methods and apparatus that provide typing shortcuts for languages and allow free deletion of selected clauses.
Another object of the present invention is to provide language translation methods and apparatus that provide simultaneous display of original and translated content or messages from a network on a single screen without disrupting the look and feel. Another object of the present invention is to provide language translation methods and apparatus that provides for personalization.
Another object of the present invention is to provide language translation methods and apparatus that is real-time and does not cause delay for the user.
Another object of the present invention is to provide language translation methods and apparatus that include multiple translation engines in a single system with a uniform API.
Another object of the present invention is to provide language translation methods and apparatus that provides a uniform API to numerous applications.
These and other objects of the present invention are achieved in a method for electronically translating text. An electronic language translator is provided. Source language text is received as an input to the electronic language translator. The source language text is translated at the electronic language translator at the time of submission into one or more target language texts. A user is then provided with an option of viewing one or more of the target language texts with or without the source language texts.
In another embodiment of the present invention, a method for electronically translating text provides an electronic language translator system that includes an electronic language translator and at least a first and a second dictionary. The electronic language translator references the first dictionary and then the second dictionary in a process of translating source language text into one or more target language texts. The dictionaries are maintained in an application or customer hierarchy. Source language text is received at an input of the electronic language translator. The source language text is translated at the electronic language translator into one or more target language texts. An output is produced that includes the one or more target language texts.
In another embodiment of the present invention, a method for electronic language translation provides one or more translation modules receiving source language text from an input interface. One or more input interfaces and one or more output interfaces are provided. A generic data format is included that is independent of the translation modules, input interfaces, and output interfaces. The input source language text is converted from the format for a specific input interface to the generic format. A determination is made of the one or more translation modules that provides an optimal translation. The text is routed to the module that provides the optimal translation. Text is converted from the generic data format to a specific input format of a translation module. The specific output format from a translation module is converted to the generic data format. Data is converted from the generic data format into an output format suitable for an output interface.
In another embodiment of the present invention, a method for electronically translating text provides an electronic language translator coupled to an interface. Source language text is translated at the electronic language translator into one or more target language texts. Translated text is output in one or more target languages to an output interface. Controls are provided at an interface coupled to the electronic language translator to dynamically select which of the one or more target languages are output at the interface. The interface representation of text is varied in the one or more target languages to allow a user to differentiate between the displayed languages. Controls are provided at an interface to create differentiation between one or more target languages.
In another embodiment of the present invention, a method for electronically translating text provides an electronic language translator coupled to an interface. The source language text is translated at the electronic language translator into one or more target language texts. The translated output is displayed to the original user. Feedback is provided to the original user about the quality of the translation.
In another embodiment of the present invention, a method for electronically translating text provides an electronic language translator coupled to an interface. The source language text is translated at the electronic language translator into one or more target language texts. At least two candidate translations are produced for each source language text. The translated candidates are compared to one or more language models trained on data similar in style and subject matter to the text being translated. The best quality translation is selected for the input from the multiple translation candidates according to which best matches the one or more language models. A desired best quality translation is then displayed.
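A minimal sketch of candidate selection against a language model follows. The source does not specify the type of language model; an add-one-smoothed bigram model with a length-normalized log-probability score is assumed here purely for illustration, and the class and function names are hypothetical.

```python
import math
from collections import Counter


class BigramModel:
    """Toy bigram language model trained on in-domain sentences (an assumption)."""

    def __init__(self, sentences):
        self.context_counts = Counter()
        self.bigram_counts = Counter()
        vocab = {"<s>", "</s>"}
        for sentence in sentences:
            tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
            vocab.update(tokens)
            self.context_counts.update(tokens[:-1])
            self.bigram_counts.update(zip(tokens, tokens[1:]))
        self.vocab_size = len(vocab)

    def score(self, sentence):
        """Average add-one-smoothed log probability per bigram."""
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        pairs = list(zip(tokens, tokens[1:]))
        log_prob = sum(
            math.log((self.bigram_counts[pair] + 1)
                     / (self.context_counts[pair[0]] + self.vocab_size))
            for pair in pairs
        )
        return log_prob / len(pairs)


def select_best(candidates, model):
    """Pick the candidate translation that best matches the language model."""
    return max(candidates, key=model.score)


if __name__ == "__main__":
    model = BigramModel(["how are you", "i am fine", "see you later"])
    print(select_best(["fine i am", "i am fine"], model))
```

Normalizing by the number of bigrams keeps the comparison from simply favoring shorter candidates; other scoring choices are equally plausible.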
In another embodiment of the present invention, a system for electronically translating text includes an electronic language translator that receives source language text input and produces translated target language text. An interface is coupled to the electronic language translator and configured to provide a user with an option of viewing one or more target language texts with or without source language text.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1 describes a screenshot of a chat application of the present invention.
Figures 2-3 describe embodiments of the real-time multilingual communication application of the present invention.
Figure 4 is a high level overview illustrating operation of the System of the present invention. Figure 5 is a high level overview illustrating operation of the System of the present invention.
Figure 6 is a high level overview illustrating operation of the System of the present invention.
Figure 7 illustrates data flow between the client wireless device, the wireless network provider, the data provider, and the translation services of the present invention.
Figure 8 illustrates a generic peer-to-peer data exchange.
Figure 9 illustrates a generic client-server data exchange, where the translation service of the present invention acts as the intermediary in the data exchange.
Figures 10(a)-10(b) are directed to an auction tool embodiment of the present invention which combines features of the multi-lingual search engine and the browsing tool of the present invention.
Figure 11 shows the dynamic translation cache of the present invention. Figure 12 illustrates the framework of the present invention.
Figure 13 illustrates the five steps performed at the translation
Figure 14 illustrates the procedures that each step of translation of the present invention undergoes.
Figure 15(a) illustrates the full linguistic processing occurring in the chat application of the present invention.
Figure 15(b) is similar to Figure 15(a) with the same data path, for non-interactive applications of the present invention.
Figure 16 illustrates input aids incorporated into the System of the present invention. Figure 17 illustrates a common phrase table with the static translation cache of the present invention.
Figure 18 illustrates language post-processing of the present invention. Figure 19 illustrates text post-processing of the present invention.
Figure 20 illustrates text pre-processing of the present invention.
Figure 21 is a continuation of Figure 8 illustrating language preprocessing of the present invention. Figure 22 illustrates the translation engines with dictionaries of different types of the present invention.
Figure 23 shows a topic specific dictionary of the present invention.
Figure 24 illustrates the four types of feedback produced and utilized by the System of the present invention. Figure 25 shows the levels of feedback which are incorporated in the translation system of the present invention.
Figure 26 illustrates the way the System of the present invention educates the user about the MT engine, as well as about his or her own language.
Figure 27 describes the browsing tool of the present invention on a high level as a three-step process.
Figure 28 is a high-level blowup of step 2 from Figure 27.
Figure 29 provides more detail of step 2 from Figure 27 for page retrieval and processing.
Figure 30 is a blowup of page retrieval as represented in Figure 29. Figure 31 is a blowup of step 1 from Figure 30 describing how parameters are added to a URL before querying the source site.
Figure 32 is a blowup of step 2 from Figure 30.
Figure 33 is directed to page rewriting of the present invention.
Figure 34 illustrates page rewriting as a two-pass process of the present invention.
Figure 35 is a graphical illustration of the page rewriting process described in Figure 34.
Figure 36 is a blowup of pass 1 from Figure 37.
Figures 37-39 give examples of how certain elements are handled in pass 1.
Figure 40 describes instances in the page where the browsing tool rewrites URLs in the page of the present invention.
Figure 41 illustrates how text is translated as part of the browsing tool of the present invention.
Figure 42 is a blowup of the final stage from Figure 30 for handling incoming cookies of the present invention. Figure 43 shows the Help button of the present invention.
Figure 44 illustrates the restatement window of the present invention.
Figure 45 shows an example of the interactive tutorial entry of the present invention.
Figure 46 is a blow-up of the different types of the tutorials of the present invention.
Figure 47 shows an example of warning lights of the present invention.
Figure 48 shows how users' expectations and knowledge about the System are influenced through actual use of the System of the present invention.
Figure 49 illustrates the tutorial daemon of the present invention. Figure 50 depicts the input length meter of the present invention.
Figure 51 is an example of shortcuts of the present invention.
Figure 52 shows examples of emoticons of the present invention.
Figure 53 is an example of the translator of the present invention.
Figure 54 illustrates a number of personalization features of the present invention.
Figure 55 illustrates the splash page of the present invention.
Figure 56 illustrates the plurality of different features in the chat application of the present invention.
Figure 57 illustrates the different features in the chat room of the present invention.
Figure 58 illustrates the Help process of the present invention.
Figure 59 illustrates the "current member's box" of the present invention.
Figure 60 illustrates switching language zones of the present invention.
Figure 61 is an overview of the browsing tool of the present invention.
Figure 62 lists some of the browsing tool features of the present invention.
DETAILED DESCRIPTION
In one embodiment of the present invention, a system ("System") for electronically translating text includes an electronic language translator that receives source language text input and produces translated target language text. An interface is coupled to the electronic language translator and configured to provide a user with an option of viewing one or more target language texts with or without source language text. The electronic language translator translates the source language text to at least one target language at the time of submission of the source language text. An output interface outputs the target language text from the electronic language translator. The output interface can vary an interface representation of text in the one or more target languages.
The electronic language translator can include at least first and second dictionaries. The electronic language translator references the first dictionary and then the second dictionary in a process of translating source language text into one or more target language texts. The dictionaries are maintained in an application or customer hierarchy. A generic data format can be included that is independent of the translation engines, input interfaces and output interfaces.
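The ordered dictionary lookup described above can be sketched with Python's standard `collections.ChainMap`, which searches its mappings in sequence and returns the first hit. The three-level hierarchy (customer, application, general) and the sample entries are assumptions for illustration; the source only states that a first dictionary is consulted before a second.

```python
from collections import ChainMap

# Hypothetical English-to-French entries; the most specific dictionary comes first.
customer_dictionary = {"chip": "puce"}                        # e.g. a semiconductor customer
application_dictionary = {"chip": "jeton", "mouse": "souris"}
general_dictionary = {"chip": "frite", "house": "maison"}


def build_lookup(*dictionaries):
    """Return a view that consults each dictionary in the order given."""
    return ChainMap(*dictionaries)


lookup = build_lookup(customer_dictionary, application_dictionary, general_dictionary)

if __name__ == "__main__":
    print(lookup["chip"])      # resolved by the customer dictionary
    print(lookup["mouse"])     # falls through to the application dictionary
    print(lookup.get("tree"))  # not present at any level
```

Because the view holds references rather than copies, an update to any one dictionary is visible on the next lookup.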
In one embodiment, a conversion module converts the input source language text from the format for a specific input interface to a generic format. A routing module then determines which translator provides an optimal translation and then routes the text to that translator. A conversion module converts text from the generic data format to a specific input format. A conversion module converts a specific output format from the translation engine to the generic data format. A conversion module can be included to convert data from the generic data format into an output format suitable for an output interface.
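The conversion and routing chain described above can be sketched as follows. Everything named here is hypothetical: the `GenericText` record, the chat payload layout, the two toy engines with deliberately different native formats, and the per-language-pair quality scores used to decide which engine is "optimal". The source does not say how that determination is made.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class GenericText:
    """Generic data format, independent of engines and interfaces."""
    text: str
    source_lang: str
    target_lang: str


@dataclass(frozen=True)
class Engine:
    name: str
    quality: Dict[Tuple[str, str], float]   # assumed score per language pair
    encode: Callable[[GenericText], Any]    # generic -> engine input format
    run: Callable[[Any], Any]               # stand-in for the engine itself
    decode: Callable[[Any], str]            # engine output format -> text


def engine_a(request: str) -> str:
    header, text = request.split(":", 1)
    return f"[A {header}] {text}"


def engine_b(request: dict) -> dict:
    return {"result": f"[B {request['from']}>{request['to']}] {request['q']}"}


ENGINES = [
    Engine("A", {("en", "fr"): 0.6, ("en", "ja"): 0.7},
           encode=lambda m: f"{m.source_lang}>{m.target_lang}:{m.text}",
           run=engine_a, decode=lambda raw: raw),
    Engine("B", {("en", "fr"): 0.9},
           encode=lambda m: {"from": m.source_lang, "to": m.target_lang, "q": m.text},
           run=engine_b, decode=lambda raw: raw["result"]),
]


def from_chat(payload: dict) -> GenericText:
    return GenericText(payload["body"], payload["lang"], payload["want"])


def to_chat(message: GenericText) -> dict:
    return {"body": message.text, "lang": message.target_lang}


def route(message: GenericText) -> Engine:
    pair = (message.source_lang, message.target_lang)
    candidates = [e for e in ENGINES if pair in e.quality]
    if not candidates:
        raise LookupError(f"no engine supports {pair}")
    return max(candidates, key=lambda e: e.quality[pair])


def translate(payload: dict) -> dict:
    message = from_chat(payload)
    engine = route(message)
    translated = engine.decode(engine.run(engine.encode(message)))
    return to_chat(GenericText(translated, message.source_lang, message.target_lang))
```

With this layout, adding an engine or an interface means writing one pair of converters to and from `GenericText` rather than one converter per engine-interface combination.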
The System of the present invention has a variety of different applications including but not limited to translation of text, real time translated chat, website content, e-mail, instant messaging, multi-lingual auctions and marketplaces, and the like. The present invention allows multiple people to engage in an online translated text conversation. Users can define their input and view languages, and chat applications of the present invention translate input sentences from one user into the appropriate output languages defined by each of the other users. A screenshot of a chat application of the present invention is illustrated in Figure 1. In one embodiment, the present invention is used for casual chat between users on a portal or community site, intra-company communication on a corporate intranet, business-oriented chat on a business-to-business exchange, and real-time customer support solutions, among other uses.
Figures 2-5 illustrate one embodiment of the real-time multilingual communication methods and apparatus of the present invention. Figure 2 illustrates how different users use the method and apparatus of the present invention. Illustrated is a two-person interaction model. In this model, two people communicate exclusively with each other, sending messages back and forth. Each message is sent to the chat server, translated (if the message is textual), and relayed to the other user. The second diagram illustrates multiple-user communication where multiple people message each other in one room. In this model, every message that is sent by any one chat client is captured by the chat server, translated (if the message is textual), and rebroadcast to every chat client in the chat room.
Figure 3 illustrates the several types of messages that travel within a chat room. One is a plain text message which is translated instantaneously into any of the chat room's supported languages. The next type is iconic. These are significant because when dealing with universal language there is a need to transmit messages that are understood universally. Another type of message is meta-transactional whose sole purpose is to facilitate the entire process of communication. One example of a meta-transactional message is the "Help?" message, which one user may send to a second user alerting him that she did not understand his message, and requesting that he restate it in a way that may be translated more effectively and thereby more easily understood.
Figure 4 is a high level overview illustrating operation of the System of the present invention. A user uses their browser to access the web server of the present invention. The web server delivers a page which contains an applet. The applet appears. From that point forward communication between the user and the System occurs exclusively through that applet. The user inputs a message in the applet which is sent to the chat server. When the message is textual, the chat server sends the message to the translation System, which translates the message and sends it back to the chat server. At this point, the message contains versions of the user-entered text in all the supported languages of the chat room. The server then retransmits the message out to all users in the chat room.
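The server-side flow described above can be sketched as follows: a textual message is translated once into every language the room supports, and each member then receives the version in his or her view language. The `ChatRoom` class, its method names, and the stand-in `fake_translate` function are hypothetical; non-textual messages (for example, iconic ones) are relayed unchanged.

```python
class ChatRoom:
    def __init__(self, languages, translate):
        self.languages = list(languages)   # languages supported by this room
        self.translate = translate         # callable(text, source, target) -> str
        self.members = {}                  # user -> view language

    def join(self, user, view_language):
        self.members[user] = view_language

    def post(self, text, source_language, textual=True):
        """Return the text each member would see for one posted message."""
        versions = {source_language: text}
        if textual:
            for language in self.languages:
                if language != source_language:
                    versions[language] = self.translate(text, source_language, language)
        # Fall back to the original when no version exists for a view language.
        return {user: versions.get(view, text) for user, view in self.members.items()}


def fake_translate(text, source, target):
    """Stand-in for the translation System; it only tags the text."""
    return f"{text} ({source}->{target})"


if __name__ == "__main__":
    room = ChatRoom(["en", "fr", "ja"], fake_translate)
    room.join("ann", "en")
    room.join("luc", "fr")
    print(room.post("hello", "en"))
    print(room.post(":-)", "en", textual=False))
```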
Figure 5 illustrates the type of feedback that occurs. Throughout the process a feedback loop runs continuously. As messages are received by the translation System, there is interaction between 1) the database for storage of information of the messages, as well as 2) with the machine translation engines. The database stores our static translation cache, which contains many text phrases pre-translated across multiple languages. These translations are performed by humans and are thereby guaranteed to have perfect quality. In addition, the cache modifies itself over time by reacting to patterns it observes in the types of messages it receives. This results in higher translation quality. Text elements that are not handled by the cache, are sent to the translation engines for translation. Each box in the figure represents a single machine translation language direction, e.g. English to French. As a result, the translation System utilizes multiple translation engine components. Different providers can be used for different language pairs. Some engines can support multiple directions.
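The cache-then-engine path described above can be sketched as follows. The normalization rule in `cache_key` (lowercasing, stripping punctuation, collapsing whitespace), the one-engine-per-direction registry, and the miss counter are all assumptions for illustration. The counter is only one plausible way to surface frequently seen phrases for later human translation; the source does not describe how the cache adapts over time.

```python
import re
from collections import Counter


def cache_key(text):
    """Normalize a phrase so trivial variations share one cache entry."""
    return " ".join(re.sub(r"[^\w\s]", "", text.lower()).split())


class CachedTranslator:
    def __init__(self, static_cache, engines):
        self.static_cache = static_cache   # (key, source, target) -> human translation
        self.engines = engines             # (source, target) -> callable(text) -> str
        self.misses = Counter()            # phrases the cache could not serve

    def translate(self, text, source, target):
        key = (cache_key(text), source, target)
        if key in self.static_cache:
            return self.static_cache[key]
        self.misses[key] += 1
        return self.engines[(source, target)](text)


if __name__ == "__main__":
    translator = CachedTranslator(
        {("how are you", "en", "fr"): "Comment allez-vous ?"},
        {("en", "fr"): lambda text: f"<MT:{text}>"},
    )
    print(translator.translate("How are you?", "en", "fr"))
    print(translator.translate("Nice weather", "en", "fr"))
    print(translator.misses.most_common(1))
```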
The present invetnion also provides a translated web browsing tool that provides machine translations of website content. One example is translation of all text on a website into a language defined by a user. The present invention also provides methods and apparatus that translate text embedded within graphics on a website. The browsing tool of the present invention can be useed both on an actual website and as a downloadable tool that plugs in directly to the user's browser. The actual tool itself includes a toolbar that resides on the top or bottom of the user's browser screen and gives the user functional control over the language of translation, the URL of the website the user wishes to access, as well as a number of other features such as a way to submit the current site for human translation. A screenshot of the browsing tool of the present invention is shown in Figure 6. The browsing tool of the present invention is primarily a tool that provides individual users with access to Internet content that they would not be able to access with the tool. Examples of applications of the browsing tool include but are not limited to education, entertainment, research and the like. Figure 7 illustrates data flow between the client wireless device, the wireless network provider, the data provider, and the translation services of the present invention.
Generally, the client wireless device is any personal mobile electronic device with a display/output apparatus, an input apparatus, and data transmission capability which is designed to serve as a mobile terminal for Internet and other network transactions. Examples include but are not limited to: cellular phones with data transmission and display capabilities, personal handyphone systems (PHS), personal.digital assistants (PDAs), palmtop computers, and Internet/network capable appliances and devices. The wireless network provider is the data transmission infrastructure which allows the client devices to exchange data with each other and with any other devices accessible over the network. The data provider is any device which supplies either static or dynamic data to the client device over the data transmission infrastructure. The present invention acts as an intermediary in this data exchange, translating the data from one language to another as it passes from client device to data provider, from data provider to client device, or from client device to client device.
The wireless translation applications of the present invention are substantially equivalent to the internet translation applications. Some of the differences between the two include, the data is encoded in WML (wireless markup language), HDML, or some other standard for wireless data exchange, rather than HTML, the target end-user device is a data-capable cellular phone, personal computer, or other wireless data terminal rather than a desktop or laptop computer and the data is transmitted over the data network of the cellular service provider instead of/in addition to being transmitted over networks such as the Internet. In a typical untranslated peer-to-peer transaction, a client wireless device may send a data transmission over the wireless data transmission infrastructure and network, where it is routed to another client wireless device. Figure 8 illustrates a generic peer-to-peer data exchange where the translation service of the present invention acts as the intermediary in the data exchange. The present invention is integrated with the wireless data infrastructure and network. As data is sent from a client wireless device to another client wireless device over the wireless network server, that data is passed to the present invention which translates/processes/transforms that data and returns it to the wireless network server to be routed to the destination wireless device. Examples of data transmissions which fit this peer-to-peer model include SMS (short messaging system) messages, alphanumeric pager messages and the like.
In a typical untranslated client/server transaction, a client wireless device may send a request for data over the wireless data transmission infrastructure and network, where it is routed to a data provider server. That server replies with the requested data, which is returned to the client wireless device over the wireless data transmission network. Figure 9 illustrates a generic client-server data exchange, where the translation service of the present invention acts as the intermediary in the data exchange. A client wireless device formulates a request for data from a particular server; this request is then forwarded to the present invention. The translation service accesses the wireless data and services specified in the client request and translates/processes/transforms that information before returning it to the requesting end-user. Examples of this client/server model include WAP data browsing, server push data, and the like.
Further, the methods and apparatus of the present invention can provide draft quality translations of text emails. Users simply type their email in their own language and it is translated into the target language of the person the email is being sent to. The translation can also take place on the side of the receiver, when someone receives an email in a language he or she may not be familiar with.
Instant messaging is designed as a communication platform for people who are accessing networks, including but not limited to the Internet, concurrently. When someone receives a message while offline, the message is stored for them to view the next time they log in to the Internet. Translated instant messaging can be used in corporate communication, customer service, student interaction, and any other situation requiring instantaneous communication across a language barrier.
The System of the present invention can include a multilingual search engine that allows someone who speaks one language to search for information on the Internet or on a specific site that is in a different language. A query can be entered in one language and the search engine of the System translates the query into the target language before searching for matching information. In another embodiment, the System also provides a mechanism that can resolve ambiguous search queries by asking the user for more input in potentially ambiguous situations. The multilingual search engine can be used to search for information on the Internet in general, or to search for a product or piece of information on a specific website or domain of information. Examples of potential uses include but are not limited to searching for a certain type of business outside of a country on an informational website, searching for a certain type of product on a foreign ecommerce site, or searching the entire Internet for websites related to a certain topic that are written in a language not native to the user.
Methods and apparatus of the present invention can tie in directly with online auction and marketplace sites. This solution allows users of the marketplace or auction to post messages or product descriptions in such a way that they are easily viewed and translated into a number of different languages. Form fields and drop-down menus can be used that limit the number of choices a user has when describing a certain product. This allows for storage of the posted information in a format that can be easily transferred to any language.
Figures 10(a) and 10(b) are directed to an auction tool embodiment of the present invention which combines features of the multilingual search engine and the browsing tool. A user can enter a query in Language A, even though the site itself is in Language B.
For example, a user could enter in Japanese "osara" which means plate. That gets translated into the proper language; in the case of an English-language site, "osara" is translated into "plate." The regular query is run on the auction site's database. From the auction site's side the interaction is the same as a regular, monolingual search. They haven't changed their processing at all. The pages that are returned are translated by the System of the present invention and the links are shown along with the translated version of the links. When a user clicks on a link to see the actual auction, the auction page comes up and is completely translated.
The auction site does not have to change its database lookup or change the pages they push. All of the translation management, including preparing text for translation, executing translations, and displaying translated versions of pages is handled by the System, and is completely transparent to the auction site.
Figure 11 shows the dynamic translation cache, which records recently translated sentences and is dynamically updated with each translation call. When a translation is requested, the dynamic translation cache is consulted first to see if the requested sentence was translated recently. If so, the recorded translation can be returned immediately, saving time and processing cycles on the translation engine. This is significant for many applications of the translation System, but in particular for the auction tool. In auction searches, users will often enter successive queries that are very similar to each other, varying the keywords only slightly. This causes the same auctions to be returned repeatedly. The present invention capitalizes on this repetitive behavior with a dynamic cache that keeps a record of all the recent translations. This is done in a manner similar to the common phrase table, and similarly takes advantage of the characteristic repetitions in the application's language use. In another embodiment, the present invention integrates translated text and translated search into a product, allowing users of the marketplace or auction to search for goods and information in multiple languages. The results can then be displayed in their own language.
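The lookup-then-translate behavior of the dynamic translation cache can be illustrated with a minimal Python sketch. The class name, the size limit, the least-recently-used eviction rule, and the toy engine are all assumptions made for illustration; the description above only states that recent translations are recorded and consulted before the translation engine is called.

```python
from collections import OrderedDict
from typing import Callable


class DynamicTranslationCache:
    """Records recent translations; evicts the least recently used entry (assumed policy)."""

    def __init__(self, translate: Callable[[str, str, str], str], max_entries: int = 1000):
        self._translate = translate        # slow path: the translation engine
        self._max_entries = max_entries    # hypothetical size limit
        self._entries: OrderedDict = OrderedDict()
        self.engine_calls = 0              # counts how often the engine was really used

    def lookup(self, sentence: str, source: str, target: str) -> str:
        key = (source, target, sentence)
        if key in self._entries:
            self._entries.move_to_end(key)         # mark as recently used
            return self._entries[key]
        self.engine_calls += 1
        translation = self._translate(sentence, source, target)
        self._entries[key] = translation           # cache is updated on each translation call
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)      # drop the oldest entry
        return translation


def toy_engine(sentence: str, source: str, target: str) -> str:
    # Stand-in for a real engine; it only tags the input.
    return f"[{source}->{target}] {sentence}"


cache = DynamicTranslationCache(toy_engine, max_entries=2)
first = cache.lookup("blue ceramic plate", "en", "ja")
second = cache.lookup("blue ceramic plate", "en", "ja")   # served from the cache
```

In this sketch, repeated auction listings returned by similar successive queries would reach the engine only once while they remain in the cache.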
Along with the specific features needed to multilingualize an auction or market site, the present invention also provides users with custom dictionaries and common phrase lists tailored to their particular applications. This is especially effective where translations relate to a limited topic area, such as for a specialty goods auction site.
Users can have direct access to the System, web-sites and interfaces of the present invention. Additionally, the System of the present invention permits users to act as hosts.
Referring now to Figure 12, the framework of the present invention is illustrated and includes interface, distribution, and translation layers. A user uses the interface layer to construct an initial input and view the translated output as the user engages in multilingual communication. Each application can have a unique interface that maximizes the effectiveness of translated communication in a particular domain. In one embodiment, the methods and apparatus of the present invention utilize Java-based input and output interfaces. However, the present invention can also integrate with user interfaces of existing business applications, making it easy to empower existing applications with the capability of multilingual communication.
In one embodiment, the present invention provides output interfaces that differ among applications, can be primarily Java-based, and handle output in all supported language pairs. The output interface displays languages even to users who do not have an operating system that is native to the language of the output. To do this, the System can utilize all Java Unicode character strings. Depending upon the user's system, it may request that the user allow a short procedure that installs appropriate fonts and writes to certain configuration files on the user's system. This installation procedure enables the user's system to display fonts that are not native to their operating system.
After the input is received, the interface layer forwards the request to the distribution layer for further processing. The distribution layer serves as a conduit between the user interface and the translation layer. Specifically, the distribution layer provides language pair distribution and load balancing, and is a common interface to the translation layer. For language pair distribution, the distribution layer ensures that the translation queries are passed on to the correct translation engines of the System based on the language pair of the translation request. The distribution layer utilizes a load-balancing system that manages the load of each instance of the translation engines. For every language pair, the System of the present invention can create multiple instances of the translation engine. The distribution layer ensures that the queries are distributed efficiently among the different instances of the engine. The distribution layer also acts as a common interface to the translation layer. The interface layer will vary depending on the software application, but applications with widely differing user interfaces can all utilize the translation layer in the same manner.
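The two jobs of the distribution layer, routing by language pair and spreading queries over several engine instances, can be sketched as follows. This is a minimal Python illustration, not the described implementation: the class and method names are hypothetical, and simple round-robin rotation is assumed because the text does not say which load-balancing rule is used.

```python
from itertools import cycle
from typing import Callable, Dict, Iterator, List, Tuple

Engine = Callable[[str], str]        # one engine instance: text in, translation out
LanguagePair = Tuple[str, str]       # (source language, target language)


class DistributionLayer:
    def __init__(self) -> None:
        self._instances: Dict[LanguagePair, List[Engine]] = {}
        self._rotation: Dict[LanguagePair, Iterator[Engine]] = {}

    def register(self, pair: LanguagePair, engine: Engine) -> None:
        self._instances.setdefault(pair, []).append(engine)
        # Rebuild the rotation so that the new instance takes part in it.
        self._rotation[pair] = cycle(list(self._instances[pair]))

    def translate(self, text: str, source: str, target: str) -> str:
        pair = (source, target)
        if pair not in self._rotation:
            raise LookupError(f"no engine registered for {source}->{target}")
        engine = next(self._rotation[pair])   # round-robin choice (assumed policy)
        return engine(text)


layer = DistributionLayer()
layer.register(("en", "ja"), lambda text: f"ja-1:{text}")
layer.register(("en", "ja"), lambda text: f"ja-2:{text}")
layer.register(("en", "fr"), lambda text: f"fr-1:{text}")
results = [layer.translate("hello", "en", "ja") for _ in range(3)]
```

Because callers only see `translate(text, source, target)`, any application interface can use the layer in the same way, which is the "common interface" role described above.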
Translation is performed at the translation layer which can include the five steps illustrated in Figure 13. Procedures that each step undergoes are more fully explained in Figure 14.
Figures 15(a) and 15(b) illustrate the full linguistic processing that occurs in the chat application specifically. As the user types, a series of input aids is available to help with entering text. For example, typing in Japanese is very slow. There is tension between the desire to type and input as fast as possible, especially in the chat application, and the need to make the input language as clean and proper as possible for the translation engine. The method and apparatus of the present invention minimize the problems of monolingual input, which is usually of too poor a quality to translate well. This is achieved by providing a series of tools that strike a balance between the two extremes of fast input and strictly grammatical language. Further detail on input tools is given in Figure 16.
A static translation cache, also called a common phrase table, is provided. This includes phrases that are frequently repeated in chat applications (or whatever the specific application of the machine translation is); the phrases are stored with perfect translations. Items in the cache go directly to post-processing without going through the translation or other engines. Further detail on the static translation cache is given in Figure 17. Finally, there is some post-processing. Further detail on the post-processing is given in Figures 18 and 19.
In step 1 the text input is converted into a state that can be translated. Text inputs differ among applications. It is important for every input to be distilled down to a form that can be synthesized by subsequent steps, which are application-independent. Therefore, step 1 is application-dependent. In step 1, the following actions occur: unnecessary whitespace within the text is removed; improper capitalization is removed, to be restored later in step 5; excess punctuation is removed, again to be restored in step 5; the input is spell-checked; the input is grammar-checked; and certain contractions are removed from the input. In the case of translated HTML browsing for the translated browse feature of the present invention, the pre-processing operates differently. There, the pre-processing step handles the task of parsing the HTML, preparing the appropriate text for translation, and then reformatting the resulting translations while preserving the form of the original HTML page. The text pre-processing step for HTML translation is described in more detail further below. As this pre-processing step distills the input, it retains information about the input's original state that is later restored to the translation in step 5, as the translated output is produced.
Figure 20 illustrates text pre-processing. Text pre-processing can remove white space, remove and retain capitalization information, remove and retain punctuation information, and rewrite contractions.
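The idea of removing surface detail while retaining enough information to restore it later can be sketched in Python. The function and field names below are hypothetical, the contraction table is a tiny sample, and spell checking and grammar checking are left out; the sketch only covers whitespace, capitalization, trailing punctuation, and contractions.

```python
import re
from dataclasses import dataclass

CONTRACTIONS = {"i'm": "i am", "don't": "do not", "it's": "it is"}   # tiny sample table


@dataclass
class Distilled:
    text: str                    # cleaned text handed to the later steps
    capitalized: bool            # retained: did the input start with a capital?
    trailing_punctuation: str    # retained: punctuation run found at the end


def preprocess(raw: str) -> Distilled:
    text = re.sub(r"\s+", " ", raw).strip()        # standardize spacing
    match = re.search(r"[.!?]+$", text)            # trailing punctuation, possibly excessive
    trailing = match.group(0) if match else ""
    if trailing:
        text = text[: -len(trailing)].rstrip()
    capitalized = text[:1].isupper()
    words = [CONTRACTIONS.get(word, word) for word in text.lower().split(" ")]
    return Distilled(" ".join(words), capitalized, trailing)


def restore(translated: str, info: Distilled) -> str:
    # Text post-processing: put back capitalization and one punctuation mark.
    if info.capitalized:
        translated = translated[:1].upper() + translated[1:]
    return translated + info.trailing_punctuation[:1]


info = preprocess("  I'm   fine!!! ")
output = restore("je vais bien", info)
```

Here `info.text` is the distilled form passed on for translation, and `restore` plays the part of the later text post-processing step, reducing the excess punctuation to a single mark.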
In step 2 the input is analyzed for special linguistic structures such as synonymous words, extraneous expressions and common phrases. This is also the level where the System will examine the input for any potential ambiguities that the user would need to resolve. In cases such as translated search, where the System employs a feedback-to-the-user mechanism, any such ambiguities would be resolved at this level before the input passes to the next step. After performing the linguistic analysis and determining what phrases need to be translated, the language pre-processor determines what calls to make to step 3, as well as which of those calls to make to the static translation cache or to the translation engines, as described below.
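The routing decision made by the language pre-processor can be illustrated with a short Python sketch. It assumes the input has already been lowercased and stripped of end punctuation by step 1. The word lists are small samples, the set of common phrases stands in for the static translation cache, and the function name is hypothetical.

```python
import re
from typing import List, Tuple

FILLERS = {"well", "oh", "ah", "so"}           # sample extraneous expressions
ABBREVIATIONS = {"r": "are", "4": "for"}       # sample English chat abbreviations
COMMON_PHRASES = {"hello", "how are you"}      # stand-in for the static cache keys


def language_preprocess(text: str) -> List[Tuple[str, str]]:
    """Split the input into fragments and decide where each one is sent."""
    plan = []
    for fragment in re.split(r"[,;]", text):            # chop at commas and semicolons
        words = [ABBREVIATIONS.get(word, word) for word in fragment.split()]
        while words and words[0] in FILLERS:             # drop leading filler words
            words.pop(0)
        if not words:
            continue
        phrase = " ".join(words)
        route = "cache" if phrase in COMMON_PHRASES else "engine"
        plan.append((route, phrase))
    return plan


plan = language_preprocess("well hello, how r you; this gift is 4 my sister")
```

Each entry of `plan` pairs a destination with a fragment, so the translation step knows which fragments go to the static translation cache and which go to a translation engine. A real rewrite of abbreviations would need context, since a bare "4" is not always "for".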
Figure 21 is a continuation of Figure 20 illustrating the language pre-processing. This includes removing extraneous expressions (such as "well" or "so"), rewriting slang, rewriting abbreviations, and dissecting compound phrases into analyzable units.
The text pre-processing and language pre-processing described in Figures 20 and 21 include (but are not limited to) the following specific steps:
• Remove capitalization, preserving the information;
• Remove punctuation, preserving the information;
• Standardize spacing;
• Remove commas, hyphens, and other sentence-internal punctuation for cache matching;
• Parse out names of other users from beginning or end of sentences;
• Parse out connecting words and expressions such as "well," "oh," "ah," "well then" (English), "pues," "si" (Spanish), "eh bien" (French), " ," " " (Japanese), etc.;
• Attempt to chop sentences at commas and semicolons to match each half with cache;
• Attempt to correct common spelling errors for cache matching;
• Expand contractions, preserving data (English, German, French);
• Rewrite abbreviations for cache matching, e.g.
o "r" = "are", "4" = "for" (English)
o "2" = "de", "9" = "neuf" (French)
o "k" = "que", "t" = "te" (Spanish)
• Add dropped from , (Japanese);
• Chop off ending particles ("gobi") (Japanese);
• Expand to , (Japanese);
• Attempt to drop accents for cache matching (French, Spanish);
• Use alternate spelling for cache matching (German).
Step 3 is where the lowest-level translation occurs. Depending upon the content of the text input, the translation step employs one of two subsystems:
Static Translation Cache
In this embodiment, a translation cache stores commonly submitted inputs and their translations for extremely fast lookup. The motivations for the static translation cache are translation quality, speed, and scalability. By utilizing a cache, the present invention is able to specify perfect translations for a large number of the most commonly submitted text inputs. Such inputs include colloquialisms, slang and common phrases in each language, as well as specialized phrases that are common in specific client applications and industries. By accumulating commonly accessed inputs and their translations in a high-speed cache, the present invention increases the speed of common translations and thereby improves the user experience. By minimizing the number of calls that the System of the present invention makes to the translation engines, scalability and stability are improved.
The cache functions by grouping phrases that have similar meanings and then associating a single canonical phrase with each group. When performing a translation on any of those phrases, the cache returns the translation of that canonical phrase. The cache includes a database table of canonical phrases across all supported languages and a series of hashtables, one for each supported language.
The database table holds canonical phrases that have a version in every supported language. For example, the expression "Hello" is universal and has a version in all languages. Each hashtable stores phrases that may not have exact equivalents in the other languages, but can be approximated to one of the canonical phrases in the first table. The key of the hashtable is the common phrase, and the value is an index to the row in the first database table with the equivalent canonical phrase. Because the text has been pre-processed and distilled prior to handling by the cache, the lookup is not disturbed by minor textual differences in the input such as extra spaces or inadvertent punctuation.
In chat applications a large number of common phrases, including but not limited to greetings, frequently repeated phrases, and chat lingo, are stored in a table that lists translations for each of the phrases. This ensures fast, completely accurate translations for the most common phrases which people use in the chat environment. The phrases stored in this master table are called canonical forms. In addition, variants of each of these phrases in each language are stored so that these will be recognized as well. These variants include contracted versions, versions with extraneous, non-content-bearing words ("Hello" vs. "Hello there"), and synonymous expressions ("I'm well" vs. "I'm fine"). The variants are stored in a standardized format called a key, with all letters downcased, punctuation removed, and spelling standardized to ensure that the widest range of user input will be recognized.
Figure 17 illustrates the common phrase table with the static translation cache. The processed input is received, and then converted to a key form by removing particulars of language usage. (This process includes many of the steps described in Figures 20 and 21.) There is a lookup in the key table to see if the input is a common phrase. The key table then gives a reference into the canonical phrase table, which gives the output translation of the input in the appropriate target language. Punctuation, capitalization information, and the like are then restored.
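The two-table arrangement described above (a per-language key table pointing into a shared canonical phrase table) can be sketched in Python with plain dictionaries. The table contents are sample data chosen for illustration, and the names `CANONICAL`, `KEY_TABLES`, `to_key`, and `lookup` are hypothetical.

```python
import re
from typing import Dict, List, Optional

# Canonical phrase table: one row per canonical phrase, one column per language.
CANONICAL: List[Dict[str, str]] = [
    {"en": "Hello", "es": "Hola", "fr": "Bonjour"},
    {"en": "I'm fine", "es": "Estoy bien", "fr": "Je vais bien"},
]

# One key table per language: normalized variant -> row index in CANONICAL.
KEY_TABLES: Dict[str, Dict[str, int]] = {
    "en": {"hello": 0, "hello there": 0, "im fine": 1, "im well": 1},
    "es": {"hola": 0, "estoy bien": 1},
}


def to_key(text: str) -> str:
    """Reduce input to key form: lowercase, no punctuation, single spaces."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def lookup(text: str, source: str, target: str) -> Optional[str]:
    row = KEY_TABLES.get(source, {}).get(to_key(text))
    if row is None:
        return None          # not a common phrase; the caller would use an engine
    return CANONICAL[row].get(target)


greeting = lookup("Hello   there!", "en", "fr")
status = lookup("I'm well.", "en", "es")
miss = lookup("Where is the station?", "en", "es")
```

Both "Hello there" and "I'm well" resolve to the translation of their canonical phrase, which is how variants and synonymous expressions share one stored translation.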
Third Party Translation Engines with the System of the Present Invention
For text inputs that are not stored in the static cache, the System of the present invention sends the input to an appropriate third-party translation engine for processing. The present invention utilizes several translation engines to ensure that the quality of translation is optimal for each supported language pair and treats each third-party engine as a virtual black box. Different engines have different capabilities. A custom Java wrapper is written for each engine, which serves as a common API so that previous steps do not have to understand or interact with each engine's unique API. Each engine instance handles a single language pair and produces one translated output for each text input.
Each third-party translation engine is treated as a distributed object and is communicated with using the RMI protocol. The System of the present invention utilizes multiple instances of each translation engine running on numerous machines to minimize dependence upon the stability of any one single engine instance or machine. Further, the System of the present invention can be scaled simply by adding additional machines and connecting them to the distribution step, described previously.
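The wrapper idea, one common interface hiding each engine's own API, can be shown with a small adapter sketch. The text describes Java wrappers reached over RMI; the Python below only illustrates the adapter pattern in-process, and `VendorAEngine` with its `run_job` call is an invented stand-in, not any real product's API.

```python
from abc import ABC, abstractmethod


class TranslationEngine(ABC):
    """Common interface that the earlier steps program against."""

    def __init__(self, source: str, target: str) -> None:
        self.source = source     # each instance handles a single language pair
        self.target = target

    @abstractmethod
    def translate(self, text: str) -> str:
        ...


class VendorAEngine:
    """Hypothetical third-party engine with its own calling convention."""

    def run_job(self, payload: dict) -> dict:
        # Toy behavior: uppercase the text instead of translating it.
        return {"status": 0, "result": payload["body"].upper()}


class VendorAWrapper(TranslationEngine):
    def __init__(self, source: str, target: str, engine: VendorAEngine) -> None:
        super().__init__(source, target)
        self._engine = engine

    def translate(self, text: str) -> str:
        reply = self._engine.run_job(
            {"from": self.source, "to": self.target, "body": text}
        )
        if reply["status"] != 0:
            raise RuntimeError("engine reported a failure")
        return reply["result"]


wrapper = VendorAWrapper("en", "ja", VendorAEngine())
translated = wrapper.translate("plate")
```

Adding a second engine would mean writing one more wrapper class; callers of `translate` would not change.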
Because of the ambiguity of language, the quality of an MT engine's output is highly dependent upon the quality and relevance of the lexica it utilizes. To improve the quality of output, lexica for each language are compiled for numerous topic areas, and the appropriate topical lexicon is applied to each communication domain. For example, in business chat rooms the translation engines employ business-related lexica, whereas sports rooms use sports-related lexica. Figure 22 illustrates the translation engines with proprietary dictionaries of different types. These include topic-specific dictionaries appropriate for the topic of the chat room, website, or other current application, and proper name and proper noun lists that are frequently updated to make the translations in each application as current as possible. Users are also able to make their own dictionaries.
In step 4 the various fragments of the original input are reassembled after having been split apart in step 2. Some parts of the input may have passed through a translation engine, while others were routed to the static translation cache. Figure 18 illustrates the language post-processing (restoration) stages where separate units are constructed and text is reconstructed. This includes restoring certain abbreviations and reconstructing units that were separated during the language pre-processing in Figure 21.
Step 5 restores the textual changes that were made to the input in step 1. In Figure 19, text post-processing (restoration) occurs with punctuation, contractions, and capitalization restored as appropriate. This step generally restores the information extracted in Figure 20. The text is then prepared for display and output.
The translation layer includes customized dictionaries. One type of customized dictionary is topic specific where the topic is specified either automatically by the topic of the chat room, or manually by the user as he/she uses the browser or search engine as illustrated in Figure 23. These topic-specific dictionaries include ones provided by the translation engine itself.
The dictionaries of the present invention improve the level of translation for specialized topics. Additionally, for general translation, the language and topics of the dictionaries are kept current.
The present invention updates a dictionary of proper nouns with their correct translations or transliterations for all necessary language pairs. In the chat setting, this allows discussion of the most current topics. In the browser and search engines, users are able to keep up with the fast-moving nature of web sites on the Internet.
The translation layer also includes user-specific dictionaries. Users are encouraged to assemble personalized lexica. These allow users whose specialized language is not handled well by general dictionaries to specify the desired translations. Users that have familiarity with a language other than their own can build this dictionary directly. If a user lacks this ability, the present invention provides a tool for speakers of two different languages to specify jointly the proper translation of a term for the dictionary. In addition, a second feature allows a person to store a word in his/her personal lexicon, notify a professional translator, and have the correct translation of the expression added to his/her dictionary at some point in the next day or two.
The System of the present invention provides a filter that scans all input to the chat application for slang, idioms, chat lingo and problematic constructions. The filter expands or rewrites these specialized phrases into a form that can be better translated by the translation engine. The filters can be constantly updated to keep up with current slang and chat language.
The present invention provides feedback to enable the users of the System to judge and respond to the quality of the translated output. There are four types of feedback produced and utilized by the System. These are illustrated in Figure 24. Figure 25 shows the many levels of feedback which are incorporated in the translation System of the present invention. The System incorporates different types of feedback to improve the quality and usability of the translations.
User-User Feedback: The "Help?" button provides a mechanism for other users to say whether an input was translated in an understandable fashion. This is especially important for monolingual users who otherwise have no way of knowing whether their inputs are being translated correctly.
System-User Feedback: The System incorporates warnings, suggestions, and both static and interactive tutorials to actively educate users to use the translation engines as productively as possible.
User-System Feedback (direct): Users are able to direct the translations through a number of means, such as "do not translate" lists, "do not translate" markers in the input line, and user-defined dictionaries.
User-System Feedback (indirect): User activity directs modifications that the developers make to the translation System. For example, users will report poorly translated words and phrases, and developers will also monitor user-defined dictionaries and "do not translate" lists to find items to add to the System dictionaries.
This feedback is important because many people are unfamiliar with machine translation (MT), and so the different feedback cycles help to educate users about the strengths and limitations of MT. In addition, the feedback provides mechanisms for users to control and personalize the performance of the MT engines, so this gives them a greater sense of control and allows them to view MT as a useful tool instead of something mysterious.
Figure 26 illustrates the way the System educates the user about the MT engine, as well as about his or her own language. People believe that they are experts on their native languages; however, most people have limited knowledge about their native language and how it works. The tutorial informs users about the elements in their language which are likely to be ambiguous or difficult to translate, such as slang expressions, idioms, and certain words and constructions ("got", "se", etc.). Interacting with the MT engine itself also shows users who understand at least some of the target language the strengths and limitations of the System, and helps educate them about the most productive use of the translation engines.
User feedback to other users is from the recipient of the translated output to the original sender. For example, a user who receives an incomprehensible message can tell the original sender that he or she did not understand the message. This immediately prompts the sender to rephrase the message in a form that can be more easily translated. Similarly, a person receiving an instant message from a colleague may realize that part of the translated message is not very clear. The receiver can immediately prompt the sender to rephrase the difficult part of the original message.
User feedback to the translation system is feedback that the recipient of the translated output gives to the translation System. Over time, as large amounts of translation data are accumulated, the present invention can use this data to improve the quality of the translation System. This can occur manually or automatically with the System.
System feedback to the user occurs in a negotiated translation when the System and the user together attempt to resolve ambiguities in a translation. System feedback is especially critical when text entries are short, as in search queries. When a user enters a query in one language whose meaning is ambiguous, the System can respond by prompting the user to select from a list of ambiguity-resolving options. Without this type of feedback, highly accurate translated search queries are not possible.
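The negotiated translation of a short search query can be sketched as a lookup that asks the user only when a word has more than one sense. The sense table, its labels, and the `choose` callback are assumptions for illustration; the text does not specify how ambiguities are detected or presented.

```python
from typing import Callable, Dict, List, Optional

# Sample English-to-Spanish sense table: word -> {sense label: translation}.
SENSES: Dict[str, Dict[str, str]] = {
    "bat": {"animal": "murciélago", "sports equipment": "bate"},
    "plate": {"dish": "plato"},
}

Chooser = Callable[[str, List[str]], str]   # (word, options) -> chosen label


def translate_query(word: str, choose: Chooser) -> Optional[str]:
    senses = SENSES.get(word)
    if senses is None:
        return None                             # unknown word; no translation here
    if len(senses) == 1:
        return next(iter(senses.values()))      # unambiguous: no prompt needed
    label = choose(word, sorted(senses))        # System feedback to the user
    return senses[label]


prompts = []


def pick_first(word: str, options: List[str]) -> str:
    # Stand-in for a real prompt: record the question, choose the first option.
    prompts.append((word, options))
    return options[0]


ambiguous = translate_query("bat", pick_first)
clear = translate_query("plate", pick_first)
```

Only the ambiguous query triggers a prompt, so the user is interrupted no more often than the ambiguity requires.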
The text-processing step for HTML page translation is different from that for plain text translation. The present invention parses HTML pages, and provides placement of translations in the HTML page. There are two options for HTML page translation: show both the original and the translation, or show only the translation. When the original and translation are both shown, the translations are preferably inserted into the original page without disrupting the form of the page. The System parses the HTML page and finds key markers which delineate appropriate locations for inserting translations. When only the translation is shown, the original text is replaced entirely by the translations.
Figure 15(b) is similar to Figure 15(a) with the same data path, but is for non-interactive applications such as the browser or auction tool. Wireless communication would also fall under this same grouping.
Figure 27 describes the browsing tool on a high level as a three-step process: 1) the user makes a page request, 2) the request undergoes processing by our System, and 3) the System returns the response page to the user.
A user request consists of a URL and a language pair — source language and target language. The source language is the original language of the page, and the target language is the language that the page is translated into. The source language may become optional as a language identifier is incorporated into the browsing tool. The request may also include cookies previously set by the web site associated with the page request and other parameters, including but not limited to form parameters which can be forwarded on with the request.
Figure 28 is a high-level blowup of step 2 from Figure 27. It describes the overall processing which occurs between the users' page request and the page response. Three steps are included: extract parameters from the user request, perform page retrieval and processing, and return the processed page within a dynamically-generated page.
Figure 29 provides more detail of step 2 from Figure 28 for page retrieval and processing. There are three main types of page requests. The first is a user-specific page. These pages are never cached because it is assumed that their content is changing too often for them to be effectively cached. An example of such a page is a user profile page. They are always newly retrieved and rewritten on each new request. The second type is a non-user-specific page that has been cached. If a page is not user-specific and has already been cached, it is pulled from the cache and returned. The third type is a non-user-specific page that has not been cached or whose cache entry is out-of-date. These pages are newly retrieved and rewritten. In addition, they are stored in the cache for future queries.
Figure 30 is a blowup of page retrieval as represented in Figure 29. In order to fulfill a user's page request, the browsing tool of the present invention must first request the page from the source web site. In order to do this, it must first extract necessary information from the user's request, create a new second request, and then utilize this second request to query the source site. The page retrieval process consists of five steps:
1) Add parameters to URL: Here, any parameters contained in the user's page request are added to the URL of the second request.
2) Handle outgoing cookies: Cookies contained in the user's page request are forwarded to the second request.
3) Perform HTTP request on new URL: This is where the source site is queried.
4) Retrieve page: Using the appropriate character encoding for the page (based upon its language), the page is retrieved.
5) Handle incoming cookies
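Returning to the three page request types of Figure 29, the caching decision that precedes page retrieval can be sketched in Python. The class name, the maximum age, and the `fetch_and_rewrite` callback (which stands for the five retrieval steps plus page rewriting) are assumptions; the text does not say how an out-of-date cache entry is detected.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class CachedPage:
    html: str
    stored_at: float


class PageStore:
    def __init__(self, fetch_and_rewrite: Callable[[str, str, str], str],
                 max_age_seconds: float = 3600.0) -> None:
        self._fetch = fetch_and_rewrite       # retrieval + rewriting, supplied by caller
        self._max_age = max_age_seconds       # hypothetical freshness limit
        self._pages: Dict[Tuple[str, str, str], CachedPage] = {}

    def get(self, url: str, source: str, target: str,
            user_specific: bool, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        if user_specific:
            # Type 1: user-specific pages are always retrieved anew, never cached.
            return self._fetch(url, source, target)
        key = (url, source, target)
        entry = self._pages.get(key)
        if entry is not None and now - entry.stored_at <= self._max_age:
            return entry.html                 # Type 2: fresh cached page
        html = self._fetch(url, source, target)
        self._pages[key] = CachedPage(html, now)   # Type 3: retrieve, rewrite, store
        return html


calls = []


def fake_fetch(url: str, source: str, target: str) -> str:
    calls.append(url)
    return f"<html>{url} {source}->{target} v{len(calls)}</html>"


store = PageStore(fake_fetch, max_age_seconds=60)
a = store.get("http://example.com/items", "en", "ja", user_specific=False, now=0)
b = store.get("http://example.com/items", "en", "ja", user_specific=False, now=30)
c = store.get("http://example.com/items", "en", "ja", user_specific=False, now=100)
d = store.get("http://example.com/profile", "en", "ja", user_specific=True, now=100)
e = store.get("http://example.com/profile", "en", "ja", user_specific=True, now=101)
```

The cache key includes the language pair because the stored page is the rewritten, translated version rather than the original.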
Figure 31 is a blowup of step 1 from Figure 30 describing how parameters are added to a URL before querying the source site. An important thing to note is that this process is language-sensitive. Specifically, when a user is viewing pages with source language A in target language B, parameters which represent user inputs are translated from language B to language A before being added to the page request. This enables the users to actively interact with pages that are not in their own language. For example, if a user is viewing an auction site whose source is language A in language B, and the user wishes to enter a search query in language B for a particular object, that search query is translated into language A before being submitted to the page.
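The language-sensitive handling of parameters can be sketched with Python's standard `urllib.parse` module. How the tool knows which parameters carry user-typed text is not stated, so the `user_input_fields` set is an assumption, as are the function name and the one-word toy dictionary.

```python
from typing import Callable, Dict, Set
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_parameters(url: str, params: Dict[str, str], user_input_fields: Set[str],
                   translate_to_source: Callable[[str], str]) -> str:
    """Append request parameters to the URL, translating user-typed values first."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)   # keep existing parameters
    for name, value in params.items():
        if name in user_input_fields:
            value = translate_to_source(value)   # language B -> language A
        query.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


TOY_DICTIONARY = {"osara": "plate"}   # Japanese -> English, one sample entry

second_request_url = add_parameters(
    "http://auction.example/search?cat=7",
    {"q": "osara", "page": "2"},
    user_input_fields={"q"},
    translate_to_source=lambda text: TOY_DICTIONARY.get(text, text),
)
```

Only the search term is translated; structural parameters such as the page number pass through unchanged.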
Figure 32 is a blowup of step 2 from Figure 30. Cookies which are passed in as part of the user request are rewritten to drop the path prefix of the present invention, thereby restoring the original path of the cookie, and the browsing tool includes the cookie when querying the original source site.
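The outgoing cookie rewrite amounts to stripping a known prefix from the cookie path. The sketch below assumes a hypothetical prefix of the form `/browse/<site>`; the actual prefix used by the browsing tool is not given in the text.

```python
def restore_cookie_path(path: str, prefix: str) -> str:
    """Drop the browsing tool's path prefix to recover the cookie's original path."""
    if path == prefix:
        return "/"
    if path.startswith(prefix + "/"):
        return path[len(prefix):]
    return path        # no prefix present; leave the path unchanged


PREFIX = "/browse/auction.example"     # hypothetical prefix for one source site

original_path = restore_cookie_path("/browse/auction.example/items", PREFIX)
root_path = restore_cookie_path("/browse/auction.example", PREFIX)
untouched = restore_cookie_path("/other", PREFIX)
```

Matching on `prefix + "/"` rather than on the bare prefix avoids stripping an unrelated path that merely begins with the same characters.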
Figure 33 is directed to page rewriting and illustrates why the browsing tool of the present invention is unique. The browsing tool kit enables the user to insert the translations inline, in order to view both the original text and the translation simultaneously. Additionally, the browsing tool preserves the look and feel of the original page. This is accomplished by carefully positioning the insertion of translations at strategic locations within the page so that they do not significantly shift or displace original content. While the page is made inherently longer, its overall look and feel is not disrupted.
Referring now to Figure 34, page rewriting is a two-pass process. In the first pass, the page is traversed and translation placeholders are inserted in places where translations should later be added. Simultaneously, a list of text strings which must be translated is extracted. In the second pass, the text strings, which have at this point been translated, are inserted into the page, replacing the placeholders. Figure 35 is a graphical illustration of the page rewriting process described in Figure 34.
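A minimal sketch of the two-pass idea is given below. It is a simplification under stated assumptions: the placeholder format `[[Tn]]`, the regular-expression tag splitter, and the choice to place each placeholder directly after its text are all hypothetical, whereas the process described here positions translations at strategic locations.

```python
import re

# Splits a page into tags and the text between them (a rough approximation
# of HTML parsing, adequate only for illustration).
TAG_SPLIT = re.compile(r"(<[^>]+>)")


def pass_one(html):
    """Pass 1: copy the page, insert a placeholder after each text piece,
    and collect the text strings that must be translated."""
    out, strings = [], []
    for piece in TAG_SPLIT.split(html):
        out.append(piece)
        if piece and not piece.startswith("<") and piece.strip():
            out.append("[[T%d]]" % len(strings))
            strings.append(piece.strip())
    return "".join(out), strings


def pass_two(marked_html, translations):
    """Pass 2: replace each placeholder with its translation, inline."""
    for index, translated in enumerate(translations):
        marked_html = marked_html.replace("[[T%d]]" % index, " (" + translated + ")")
    return marked_html
```

Between the two passes, the collected strings would be handed to the translation engine; the original text stays in place, so both versions are visible in the rewritten page.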
Figure 36 is a blowup of pass 1 from Figure 37. In pass 1, the HTML page is traversed and HTML elements are encountered. Each HTML element is handled uniquely. Certain HTML elements represent textual elements which require translation, while other elements contain links which must be rewritten. Still other elements require other handling. Figures 37-39 give examples of how certain elements are handled in pass 1. Figure 37 illustrates handling of normal text. Normal text is defined as text positioned outside of any HTML tags. Normal text is handled in two ways: 1) it is copied to the rewritten page, and 2) it is added to a cumulative text buffer to be later translated and inserted into the rewritten page. Both steps are required because the browsing tool displays both the original text and the translated text in the page. The first step preserves the content and location of the original text piece. The second step causes the text to be translated and inserted into the page.
It is important to note that in the second handling step the text piece is added to a buffer and later translated and inserted into the page, rather than immediately translated and transferred to the page. In order for the translations to be inserted into the page in a way that does not disrupt the page's original look and feel, they must be strategically positioned. This requires that each text piece not be immediately translated and inserted following its original text counterpart, but rather that translations be grouped together and later inserted into the HTML page in an appropriate location. This results in a more coherent page and a better user experience.
Figure 38 illustrates handling of JavaScript. When a JavaScript block is encountered: 1) it is scanned for text strings which require translation, and these strings are replaced with placeholders in the JavaScript; 2) these strings are translated; 3) the placeholders are replaced by the newly translated strings; and 4) the new JavaScript block is copied over to the rewritten page. Unlike normal HTML where the original text and the translated text are conveyed, JavaScript blocks only convey translations, since most JavaScript text strings represent single text elements which can only have a single value. Figure 39 illustrates handling of the translation identifier. Translation identifiers are HTML tags that signify the end of a contiguous chunk of text, representing a position where a translation of previous text should be inserted. When a translation identifier is encountered, the contents of the text buffer (described above for Figure 37 and composed of text strings which are encountered within the page) are sent off for translation, a placeholder is added to the rewritten page and the text buffer is cleared. Figure 40 describes instances in the page where the browsing tool rewrites URLs in the page. When URLs representing textual content are encountered, they are rewritten through the server of the present invention. URL rewriting ensures that as the user clicks through to subsequent pages, these pages continue to be translated as well. This provides a seamless user experience, allowing the user to browse and translate the web freely without any intermediate steps. This figure denotes specific cases where URLs are written to pass through the current invention. It is important to note that URLs are only rewritten if they represent textual content. URLs which represent other types of content, such as binary objects (images), should not be rewritten and should reflect the original source location.
A URL is rewritten by changing it to pass through the servers of the System of the present invention; the original source location becomes a parameter which is passed to the servers. This parameter denotes the page which the user is requesting.
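The URL rewriting described above can be sketched as follows. The server address `PROXY`, the parameter name `url`, and the `is_textual` flag are hypothetical names chosen for the example; relative URLs are resolved against the source page before being rewritten.

```python
from urllib.parse import quote, urljoin

# Hypothetical address of the translating server.
PROXY = "http://translate.example/browse"


def rewrite_url(url, base_url, is_textual=True):
    """Route a textual URL through the translating server, passing the
    original source location as a parameter. Non-textual content (such as
    images) keeps its original source location."""
    absolute = urljoin(base_url, url)  # adds the source's domain if missing
    if not is_textual:
        return absolute
    return PROXY + "?url=" + quote(absolute, safe="")
```

Under these assumptions, a link to `list.html` on a source page is served through the translating server, while an image reference is only made absolute so that it still loads from the original site.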
A relative URL is one in which the domain of the source location is not specified. With the present invention, the source's domain is added as a prefix to the URL, and then the URL is written as described in the preceding paragraph. Figure 41 is a blowup from Figure 35 for text translation. Figure 41 illustrates how text is translated as part of the browsing tool. Multiple individual text strings which have been encountered during the page traversal process are concatenated. A single concatenated string is passed to the translation engine, which returns a single translated string. This single translation string is broken back up into multiple translations and returned. This approach enables all the text on a page to be translated using a single call to the translation layer.
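The single-call translation of concatenated strings can be illustrated with the sketch below. The delimiter value and the function names are assumptions, and the sketch further assumes that the translation engine passes the delimiter through unchanged, which a real engine may not guarantee.

```python
# Hypothetical demarcation string placed between individual text pieces.
DELIM = "\n<<<SEG>>>\n"


def translate_batch(strings, translate_once):
    """Concatenate all strings, make one call to the translation layer,
    and break the result back up into individual translations."""
    if not strings:
        return []
    joined = DELIM.join(strings)
    translated = translate_once(joined)  # the only engine call
    parts = translated.split(DELIM)
    if len(parts) != len(strings):
        # The engine altered or dropped a delimiter; the pieces cannot be
        # matched back to their originals.
        raise ValueError("translated segments do not match the inputs")
    return parts
```

The length check matters because a lost delimiter would otherwise silently shift every later translation onto the wrong text piece.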
Figure 42 is a blowup of the final stage from Figure 30 for handling incoming cookies. This is the reverse of the process described in Figure 32. Cookies which are returned from the queried site are rewritten so that their path passes through the servers of the present invention and are then inserted into the page response and returned to the user. This ensures that the cookies will be resent to the site whenever the user utilizes the browsing tool to access the same site in the future.
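The two cookie-path rewrites (incoming here, outgoing in Figure 32) are inverses of one another, as the following sketch shows. The prefix layout `/proxy/<source host>` is an assumption made for illustration; the specification does not state the actual form of the path prefix.

```python
def proxy_prefix(source_host):
    """Hypothetical path prefix under which a source site is served."""
    return "/proxy/" + source_host


def incoming_cookie_path(original_path, source_host):
    """Rewrite a cookie path returned by the source site so that the
    browser sends the cookie back through the translating server."""
    return proxy_prefix(source_host) + original_path


def outgoing_cookie_path(proxied_path, source_host):
    """Drop the prefix again before the cookie is forwarded to the
    source site, restoring the cookie's original path."""
    prefix = proxy_prefix(source_host)
    if proxied_path == prefix:
        return "/"
    if proxied_path.startswith(prefix + "/"):
        return proxied_path[len(prefix):]
    return proxied_path  # not one of our rewritten paths; leave untouched
```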
The HTML page translation process of the present invention performs the following steps: (i) Performs text pre-processing on the HTML page, parsing the HTML page and producing a collection of text strings that should be translated.
(ii) Performs language pre-processing on each of these text strings. The language pre-processor determines what, if any, textual elements within each of these strings need to be sent on for translation. It separates these to-be-translated strings into two groups: "a", those that should be handled by the third-party translation engines, and "b", those which should be handled by the static translation cache. For those in group a, the language pre-processor concatenates all of the to-be-translated elements into a single demarcated string, which is sent to the third-party engine in step 4. By concatenating all of these strings into one, it limits the number of calls to the third-party engine for each HTML translation.
For those in group b, the language pre-processor makes individual calls to the translation cache for each textual element.
Upon receiving all translations from step 4, language post-processing occurs where the proper outputs are reconstructed. Text post-processing in step 6 then reconstructs the HTML page, inserting translations in the appropriate locations and thereby preserving its original form.
The System can properly handle HTML pages with JavaScript, forms, and cookies. The resulting page then operates as the original with no change in functionality. Additionally, the System can use optical character recognition technology to recognize the textual content of images, and provide a translation for text embedded in images as well as pure HTML text.
In one embodiment, the framework is written primarily in Java, making it compatible with existing software applications and legacy systems. Because the framework is Java-based, it can run on any platform. The present invention can operate on Linux Pentium machines. The third-party translation engine can operate on a Unix, Linux or Windows NT platform on distributed machines.
A "Help" button can also be provided. With the present invention, the use of translation within a chat environment allows users to give feedback about the understandability of a statement's translation. This feedback takes the form of a button connected with each posted message, called the Help button, which other users can click to indicate an unclear translation; see Figure 43.
After the Help button has been clicked, the user that made the original statement is notified of the bad translation and shown a screen with an editable copy of the statement as illustrated in Figure 44. This gives the user an opportunity to modify the statement to something more understandable for the System. In addition, the mistranslated statement can be sent through a grammar checker which can scan the input for a number of possible problems, including unparsable grammar, misspelled words, difficult-to-translate words or constructions, ambiguous words and the like.
One of the problems with most machine translation systems is that there is no incorporation of feedback to guide the translation. At numerous stages in analysis there are decisions that must be made, often about the resolution of ambiguities. For example, there can be ambiguities that are lexical, syntactic, semantic and pragmatic. A user can also confuse a translation system by using (i) a word not in the lexicon, (ii) a construction that is not in the grammar and (iii) a known word in a novel way or with a novel meaning. Confusion can be unintentional and caused by misspelled words, poor grammar, incorrect punctuation and incorrect characters, particularly in Japanese and Chinese. For these ambiguities and confusing constructions, the Help button provides information about misunderstood sentences to the user and closes the feedback loop. This is done when another user does not understand the translation. Thus, the System of the present invention removes the burden of detecting the need for clarification from the computer. Furthermore, the System takes advantage of the communal nature of the chat room to allow users to help each other to find the best language for translation. The Help button can refer to either the full user comment or to a phrase or word within the comment. In the case where just a single word or phrase is mistranslated, the other users can specify the specific part of the input which was confusing.
As the user browses the web, the translated browser of the present invention can automatically provide translated sites without requiring the user to specify the source language of each site. This is done by implementing a language identifier that scans a page, guesses the language and then executes the proper translation automatically. In the case where the identifier guesses wrong, or the page contains multiple languages, the user can override this feature. By removing any necessity for the user to worry about, or even be aware of, the source language of the materials he/she is looking at, the present invention makes the browsing experience as seamless as possible.
The present invention can also provide a translation helper for the user. The translation helper is an interactive process with many functions, including instructing users on the proper use of the translation engine, helping users determine the best phrasing in order to achieve high-quality translation, and adjusting user expectations about the capabilities and limitations of machine translation. Users are trained to avoid problematic constructions. This is done through both a passive approach, which attempts to provide instruction and information to the user, and an active element that reacts to user input to guide the user to better phrasing for translation.
Each user is encouraged to read a list of suggestions and instructions for the best language and constructions to use to produce the best translations. There is a separate list for each of the languages. The present invention offers a number of formats for the information which allow the user to choose the level of detail and the conciseness of presentation which best suits his/her tastes. The formats the user can choose from are:
1) A quick, bulleted list of points. This gives the basic information in an easy-to-read quick-reference format.
2) A longer README-style file. This format gives an expanded form of the information with longer explanations and good and bad examples to illustrate each point. The mascot is featured in amusing cartoons to make each point more memorable.
3) An interactive tutorial. In this version, the present invention guides the user interactively through a number of examples to communicate the information in a fun, memorable way as shown in Figure 45. The interaction includes illustrations, small quizzes, good and bad examples, and areas to test examples with the translation engine.
Figure 46 is a blow-up of the different types of tutorials, such as a quick, bulleted list of points, a more thorough tutorial with good and bad examples, illustrations, and explanations, or an interactive tutorial with quizzes, games, and translation test areas. The more elaborate tutorials make the learning experience more memorable and fun. The method and apparatus of the present invention provide a level of interactivity with the user to assist the learning process.
The present invention also provides tutorial daemons, which are programs that run in the background and monitor the users' inputs. By monitoring a user's typing before the sentence is sent to the translation engine, the present invention helps to guide users toward sentences that are more easily translated and warns them of dangerous inputs. When a problem is detected, it is marked in the text within the input box and a "warning light" comes up in an area of the screen dedicated to tutorial messages as illustrated in Figure 47. The tutorial daemon can include a spell checker, a grammar checker, a difficult phrase flagger and an input length meter.
Figure 48 shows how users' expectations and knowledge about the System are influenced through actual use of the System. For the chat application, as the user provides input the input goes through a tutorial daemon. The tutorial daemon is a second-level tutorial which runs in the background and gives feedback about the user's input. The tutorial daemon flags difficult words and phrases, troublesome constructions, spelling and punctuation errors, likely accent errors, troublesome zero-anaphora, unlikely part-of-speech sequences, and other possible sources of translation errors, in order to train the user. Further detail on the tutorial daemon is given in Figure 49. In addition, the user receives feedback from seeing the translations that come through and other users give feedback in the form of the "Help?" button. Figure 49 illustrates the tutorial daemon, which provides a number of checking stages to provide feedback to the user. Before the user hits the enter button the tutorial daemon provides a warning of things to watch out for. The daemon includes (among other elements) a grammar checker, a spelling checker, a difficult-phrase detector, an input-length meter (to warn users about overly-long inputs), an ambiguity detector, and an ambiguity resolver, which uses local context to determine the meaning of ambiguous words and phrases.
The spell checker reports each word that does not appear in one of the active lexicons. It does this by checking the current input line at short intervals before the return key is hit, and marking, either by highlighting, underlining, or some other graphical notation, that an unknown word has been found. This allows users to filter out spelling errors, non-standard words and slang, as well as problematic proper nouns before they are sent to the translation engine. When a user right clicks on a questionable word, a list of suggested alternatives is presented to speed correction.
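A lexicon lookup of this kind might be sketched as follows. The function names and the tiny sample lexicon are hypothetical, and the use of Python's `difflib` for the suggested alternatives is an assumption of the example, not the method stated in the specification.

```python
import difflib
import re


def unknown_words(line, lexicon):
    """Return the words of the current input line that are absent from
    the active lexicon, in the order they were typed."""
    words = re.findall(r"[A-Za-z']+", line)
    return [word for word in words if word.lower() not in lexicon]


def suggestions(word, lexicon, limit=3):
    """Offer close lexicon entries as alternatives for a flagged word."""
    return difflib.get_close_matches(word.lower(), sorted(lexicon), n=limit)


# Hypothetical miniature lexicon for demonstration.
SAMPLE_LEXICON = {"i", "want", "to", "buy", "a", "bicycle"}
```

In a chat client, `unknown_words` would be called at short intervals on the input line, and `suggestions` when the user right clicks a flagged word.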
The grammar checker checks grammar, such as how punctuation is used. It also attaches part-of-speech tags to each word and checks to see if any unlikely tag sequences are detected. A questionable sentence or phrase is highlighted to notify the user that the user should rephrase the input if possible. Right clicking on the questionable phrase brings up an explanation of the problem and a possible suggestion for a fix. Examples of the grammar checker's functions include checking to make sure every sentence in Japanese has a subject and verb, and checking whether question words have the proper accent marks, as in Spanish.
A number of languages have words and phrases that are not grammatically incorrect but are difficult to translate. Examples of such difficulties include "no" and "suki" in Japanese, the impersonal passive with "se" in Spanish, "got" in English, and "marche" in French. The difficult phrase flagger of the present invention highlights these to encourage the user to rephrase the sentence for better translation. A right click on the problematic expression brings up an explanation and a list of preferable rewordings.
An input length meter is also provided with the present invention. Because translation quality declines with longer sentences, it is important to keep input as short as possible. As a constant reminder of this, a small input-length meter is displayed next to the input text box in the chat application as illustrated in Figure 50. The input box is periodically checked to see how many words, or in Asian languages, how many characters, have been entered, and the meter reading is increased accordingly. Certain words, such as conjunctions, push the meter's needle up even further. After a certain word count, the meter enters a red "Danger Zone" which warns the user that their input is much more likely to be mistranslated. The Danger Zone level depends on the language and engine being used.
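The meter logic can be sketched as below. The conjunction list, the penalty of two extra units per conjunction, and the threshold passed by the caller are all hypothetical values; as noted above, the actual Danger Zone level depends on the language and engine.

```python
# Hypothetical list of words that push the needle up further.
CONJUNCTIONS = {"and", "but", "or", "because", "although"}


def meter_reading(text, char_based=False, conjunction_penalty=2):
    """Count words (or non-space characters for character-based
    languages) and add a penalty for each conjunction."""
    if char_based:
        return sum(1 for ch in text if not ch.isspace())
    words = text.lower().split()
    extra = sum(
        conjunction_penalty
        for word in words
        if word.strip(".,;!?") in CONJUNCTIONS
    )
    return len(words) + extra


def in_danger_zone(text, threshold, **options):
    """True once the reading reaches the language-specific threshold."""
    return meter_reading(text, **options) >= threshold
```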
In contrast, in the context of the Search Application of the present invention, inputs which are too short are the problem. In this setting, a daemon watches the queries a user inputs, and issues a suggestion if a number of one- or two-word queries are entered in succession. In cross-language search, a major obstacle is the difficulty in translating the exact meaning of the search terms. The context of a number of search terms aids significantly in determining the exact meaning of the query words. The average number of words per query as reported in most studies is usually around two, so without encouragement most users will tend to enter these short queries and will probably become discouraged by the poor search results.
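A daemon of this kind could be sketched as follows. The class name, the limit of three consecutive short queries, and the wording of the suggestion are assumptions; the specification only says that a suggestion is issued after "a number of" short queries in succession.

```python
class ShortQueryDaemon:
    """Watches search queries and suggests adding context after several
    one- or two-word queries in a row."""

    def __init__(self, max_short_words=2, streak_limit=3):
        self.max_short_words = max_short_words
        self.streak_limit = streak_limit  # hypothetical value
        self.streak = 0

    def observe(self, query):
        """Record one query; return a suggestion string or None."""
        if len(query.split()) <= self.max_short_words:
            self.streak += 1
        else:
            self.streak = 0
        if self.streak >= self.streak_limit:
            self.streak = 0  # avoid repeating the tip on every query
            return "Tip: adding more search words helps the translation."
        return None
```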
In both the chat and search applications the present invention checks the input for potentially ambiguous words and phrases. These ambiguous expressions are highlighted to encourage the user to rephrase the input for a clearer translation. Without this feedback, the user will often have no idea why a sentence or query produced such a bad translation. The ambiguous words can be detected by consulting a specialized word-translation dictionary which lists specific alternate translations for a word. Ambiguous phrases can be detected either by scanning for specific phrases (such as a "yes" or "no" following a negative question) or by executing a part-of-speech tagging and seeing if there are multiple tag sequences judged likely.
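The dictionary-based detection of ambiguous words might look like the sketch below. The sample word-translation dictionary is a hypothetical English-to-French fragment made up for illustration, and the function name is an assumption.

```python
import re

# Hypothetical fragment of a word-translation dictionary listing the
# alternate translations of each word.
SAMPLE_DICTIONARY = {
    "bank": ["banque", "rive"],
    "bat": ["chauve-souris", "batte"],
    "tea": ["thé"],
}


def ambiguous_words(text, dictionary):
    """Return each word of the input that has more than one listed
    translation, in order of first appearance."""
    found = []
    for word in re.findall(r"[a-z']+", text.lower()):
        if len(dictionary.get(word, [])) > 1 and word not in found:
            found.append(word)
    return found
```

The returned words are the ones a client would highlight; detection of ambiguous phrases by part-of-speech tagging is not covered by this sketch.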
Once an ambiguity has been detected, an ambiguity resolution program can be triggered, either with a right click in chat, or automatically in search. The resolver can either consult surrounding context or other search terms to determine the most likely sense of the ambiguous word, or it can spawn a dialogue box to ask the user for clarification directly.
In order to track translation problems and provide feedback for refinements to the help files, lexicons, and tutorial daemons, the method and apparatus of the present invention logs every word which passes through the translation engine untranslated and also logs every input which receives Help button feedback from another user. These logs permit immediate recognition of any patterns in mistranslation which occur, including words missing from the lexicon, constructions not covered in the translation System's grammar, and frequent grammatical and spelling errors.
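The two logs can be sketched as a small data structure. The class and method names are hypothetical, and the test for an "untranslated" word (a source word that reappears verbatim in the output) is a simplifying assumption that would also catch legitimate cognates and names.

```python
from collections import Counter


class TranslationLog:
    """Keeps counts of words that passed through untranslated and a list
    of inputs that received Help button feedback."""

    def __init__(self):
        self.untranslated = Counter()
        self.help_flagged = []

    def record_translation(self, source_text, translated_text):
        # Assumption: a source word found unchanged in the output was not
        # translated by the engine.
        output_words = set(translated_text.lower().split())
        for word in source_text.lower().split():
            if word in output_words:
                self.untranslated[word] += 1

    def record_help(self, message):
        self.help_flagged.append(message)

    def most_common_untranslated(self, count=10):
        """Words most often left untranslated, e.g. lexicon gaps."""
        return self.untranslated.most_common(count)
```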
The present invention provides a number of aids and shortcuts to help users enter their input quickly and correctly. These include an iconic entry which provides a shortcut for input in the chat application. A user clicks on a series of special icons which immediately insert certain set phrases into his/her text entry box. These icons take three different forms serving different purposes.
Figure 16 illustrates the wide variety of input aids incorporated into the System. These include typing shortcuts (either by keyboard or mousing on a separate menu), emoticons, a hyperlinked dictionary, buttons which introduce long phrases into the chat or other application, special characters which set apart text which is not to be translated, lists of words and phrases which are not to be translated, and automatic recognition of URLs in the text.
The present invention also provides typing shortcuts. The chat environment requires fast input and quick reaction to maintain a fun and interesting level of interaction. However, this pressure to increase input speed also encourages the user to cut corners which greatly harm the quality of translation. These cut-corners include abbreviating frequently repeated words, using pronouns, and leaving out subjects or verbs entirely, especially in Japanese. In order to facilitate fast input while discouraging these bad habits, the present invention shows the user a small window with a number of phrases which can immediately be entered into the text input line with a single click of the mouse as illustrated in Figure 51. The phrases can also be accessed with keyboard shortcuts to make input even faster and simpler. The present invention also provides a number of emoticons and illustrations that users can include in their messages, such as a smiley face and a heart that is illustrated in Figure 52. These are transparent to the translation engine and thus will have no effect on the translation quality. However, they have a substantial effect on the user-friendliness of the System and the total ability of the users to communicate and connect with each other.
Action buttons are provided to enable a user to select from a menu of buttons which print out full sentences describing the user's attitude or actions. These range from the straightforward ("[User A] scratches his head.") to the cute ("[User A] blows [User B] a kiss!") to the silly ("[User A] dances the Macarena."). Each action phrase is stored in each translated form, and is displayed to each user in the appropriate language.
Special characters are designated which signal the translation engine of the System of the present invention not to translate part of the input. The user simply surrounds the text not to be translated with these special characters, and the System of the present invention ignores that section of the input and sends it through verbatim. These are important when entering names which are also common nouns (e.g. Nick, Young, the Giants, Los Angeles), when entering titles which the user does not want translated, and when users are discussing actual language use and language learning and need to mention specific examples.
In addition to the special "do not translate" characters, users can construct a personal list of words and expressions which are not to be translated. With such a list, a user can record names and titles which he/she mentions frequently, removing the need to annotate them each time with the special characters.
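One way to honor both the special characters and the personal list is sketched below. The marker character `~`, the `__DNTn__` token format, and the function name are hypothetical, and the sketch assumes the translation engine leaves such tokens unchanged.

```python
import re

MARK = "~"  # hypothetical "do not translate" character


def translate_with_protection(text, translate, personal_list=()):
    """Shield marked spans and personal-list entries from the engine,
    translate the rest, then restore the shielded text verbatim."""
    protected = []

    def stash(original):
        protected.append(original)
        return "__DNT%d__" % (len(protected) - 1)

    # 1) Spans the user surrounded with the special characters.
    marked = re.escape(MARK) + r"(.+?)" + re.escape(MARK)
    text = re.sub(marked, lambda m: stash(m.group(1)), text)
    # 2) Entries from the user's personal "do not translate" list.
    for phrase in personal_list:
        text = re.sub(re.escape(phrase), lambda m: stash(m.group(0)), text)

    translated = translate(text)
    for index, original in enumerate(protected):
        translated = translated.replace("__DNT%d__" % index, original)
    return translated
```

With the personal list in place, a frequently mentioned name no longer needs to be annotated with the special characters each time.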
With the present invention, hyperlink dictionaries are provided and permit a user to immediately bring up the dictionary definition for any word by right clicking on that word. This is important for users because many of them will be language learners or people interested in other cultures and they will want the ability to see immediately the meaning of new words they encounter. In addition, once a user has typed an input and seen it go out in its translated form to the chat room, the user might feel some concern about whether the intended meaning of the sentence was preserved in the translation. One way to reassure the user and give him/her the power to make sure the translation is correct is to make available the dictionary definitions of the translated words.
The dictionary definitions shown can either be the literal dictionary entries in their entirety, or a check of local context can be used to determine which particular sense of the word is correct for the sentence.
As a user enters a URL into the flow of chat, it is immediately recognized as such and is transformed into a hot link. This feature encourages users to trade links and information and will facilitate communication in the chat environment.
In order to help users produce the best translations possible for their particular interests, the present invention gathers a number of personalizing features into one area of the web site called "My Translator" and illustrated in Figure 53. Users are encouraged to customize these features to their own particular needs. Not only does this produce better translations and greater user satisfaction, it also encourages a sense of ownership in the translation technologies and will encourage repeated visits to the web site. The My Translator area includes the following:
1) User-Built Custom Dictionary: The user can collect and store words, phrases, and names which they frequently discuss, search for, or see web pages about.
2) User-Built "Do Not Translate" List: Names, titles, and phrases which the user usually does not want to be sent through the translation engine are collected in this list.
The present invention permits chatters to set their keyboard shortcuts to enter certain words and phrases automatically. Frequent chatters will appreciate being able to store these shortcuts from session to session.
If the user prefers to keep one of the subject-specific lexicons provided by the present invention as the default lexicon to use in web searches, web browsing, or chatting, this can be indicated in the "My Translator" area. Additionally, a general-purpose space is provided to the user to jot down notes while using the web browser, search engine, or chat rooms. In Figure 54, a number of personalization features are unified and presented in one section for the user's convenience. Each user is able to have their own area where they keep their personalized dictionary, a personal "do not translate" list, a personally chosen default lexicon selection, personally defined keyboard shortcuts, and a notepad. This affects how all the other parts of the System, such as the browser, the auction translator, the translator for wireless, translated chat, and all other translation-based applications, will work. It summarizes the information in one area, so a user can have control and can improve the quality of the translation engine performance for his or her own particular uses.
Referring now to Figure 55, the user starts on the splash page, which just has the company logo and a choice of languages. The user chooses a language in order to enter the website and have it written in the language that he or she speaks. From there on, the rest of the pages are translated into all of the languages that are offered. The next page that the user sees after the splash page is the Welcome page. From that page, a returning user can log right in, go to the chat rooms page, choose a chat room and start chatting. For new users, there are three main options. Ideally they would go to the sign up form, fill all of that out, then go to the tutorial, learn how to use the chat and then go to the chat rooms page. If they are not convinced on the Welcome page that they should sign up, they can go to the tour, find out more about it, and then go to the sign up form.
There are also a lot of other pages on the site that anybody can access. The web site of the present invention is built with different language zones. Initially a user comes in and selects a language. The user can change the viewing language of the site at any time.
Figure 56 illustrates that there are a plurality of different features in the chat application. The user can have conversations with other users by exchanging translated messages in the chatroom. The user can also open a private chat window in order to have a one-on-one conversation. The user can switch to another chatroom. The user can view profiles of other users and see their gender, location, age, occupation, fluent languages, country of origin, and personal message. Users can edit their own personal profile as well, and they can access the help section, which includes a tutorial, translation tips, support form, and FAQ.
Figure 57 illustrates that there are different features in the chat room. Examples include keyboard shortcuts for entering special characters, and icon messages so users can send pictures (such as a smiley face) as a message. Bilingual users can switch the enter language control and enter in different languages. When the user moves the mouse over components of the chatroom, a description of that component appears in the mouseover tip box. Moderators have extra features, such as silencing other users or even eliminating their accounts if they are too disruptive.
There is also a special interface for people who want to enter double-byte text but do not have a double-byte operating system. For example, usually when a user enters text in Japanese, they enter the text phonetically and then hit return to select the characters they want to represent the phonetics. In Internet Explorer, if a user who does not have a Japanese operating system wants to enter text in Japanese, a small HTML window will appear and when the user hits return to select the characters they want to represent the phonetics, those characters will be automatically sent to the chatroom as well. In Netscape, there needs to be an extra HTML window in addition to a window where the user selects the characters. This extra window is hidden, so when the user selects the characters, it will look like the text is being sent directly to the chat window instead of being sent to the intermediary HTML window and then to the chat window.
A "Do Not Translate" feature is also provided. This is utilized when the user is entering a phrase and wants to have a part of it not translated. For example, if a user types in "Apple Computer" in the English-French chatroom, the user does not want "Apple" to be translated into "pomme," the French word for "apple." A feature is provided whereby the user can place "o" characters around whatever should not be translated. There are two ways to do this: the user can use the Do Not Translate button or type in the characters directly. The Do Not Translate button is on the chat, and when the user hits that button, the "o" characters are automatically inserted around the cursor. Thus, when typing, the user is already typing between the "o" characters instead of having to put the special characters around the text manually. Once users learn that those characters keep the phrase from being translated, they can just type them in themselves instead of using the button.
The Help process can be used when somebody enters a message that a user doesn't understand. The user can let the other party know that the user doesn't understand. We will go into more detail about this in Figure 58.
The Help process is illustrated in Figure 58. For example, User A enters a message with a typo. User B views the translation but doesn't understand it. User B can click the Help button that is on User A's message and right away it will put up a message that says, "User A, I didn't understand your message. Please rephrase it." Both that message and the message that was misunderstood become highlighted. User A can re-enter the message so it can be translated again.
A keyboard shortcut is also provided: when the up arrow is pressed the previous messages appear in the text box where the user enters its message. Instead of retyping the entire message User A can hit the up arrow, fix the typo in the previous message, and send it again. This provides a fast way for people to be able to let each other know when there is any miscommunication. The highlighting makes the process clearer and faster as well. By highlighting the message it becomes much easier to spot the misunderstood message.
Figure 59 illustrates the "current member's box". In the upper right area of the chat room is a list of all of the members currently in the chat room.
Different actions can be taken with the different members in the chat room. For example, a "personal information window" provides information on how to find a person. "Private chat" brings up a new window where a user can chat one-on-one with that person, and the "ignore button" is used to ignore a user and stop seeing their messages. If none of the names is selected, all of these buttons are disabled. If a user's own name is selected, the user can see and edit their own profile, and the other buttons are disabled.
If a user's own name is selected, the "personal information" button can be clicked so the user can see and edit their own information. Another button on the "personal information" window brings up a further window where the user can edit their own profile information. If another member's name is selected, all three of those features work: the user can see that member's profile, chat one-on-one with that member in another window, or gray that member out and stop seeing their messages.
Switching language zones is illustrated in Figure 60. For example, if a user is viewing a website in French and decides to go to a chat room where another language is used, a window pops up that says "You are moving to a different language zone. Would you like to view it in English or Japanese?" The French user then selects the new language and from then on views the site or chat room in the new language.
Figure 61 is an overview of the browsing tool of the present invention. The browsing tool is a frame and has various features, more fully described in Figure 62. The browsing tool is utilized when a user on one website enters the URL of a website he or she would like to translate. The user then goes to that new website with the translations. At the bottom of the window the user clicks on a link on the page and goes to that new page, which is also translated. A user can also enter a new URL into the browsing tool and go to that site, translated. Figure 62 lists some of the browsing tool features. The browsing tool permits a user to change the language the site is being translated into, including "none". Additionally, the user can customize the tool: keep a list of favorite links, set up a look and feel, and toggle between showing and hiding the original language. A multi-lingual dictionary pop-up is also provided.
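For a followed link to arrive translated as well, the links in a translated page can be rewritten so that they pass back through the translator, as the claims recite for hotlinks. The following is a minimal sketch of that idea only. The endpoint `https://translator.example/browse` and its `url` and `lang` query parameters are hypothetical, and the regular expression handles only double-quoted `href` attributes on anchor tags; a fuller implementation would use an HTML parser.

```python
import html
import re
from urllib.parse import urlencode, urljoin

# Hypothetical address of the translating proxy; not taken from the source.
TRANSLATOR_ENDPOINT = "https://translator.example/browse"

# Matches the href value of an anchor tag (double-quoted values only).
ANCHOR_HREF = re.compile(r'(<a\b[^>]*?\bhref=")([^"]*)(")', re.IGNORECASE)


def rewrite_links(page_html, page_url, target_lang):
    """Point every anchor in the page at the translator instead of the raw URL."""

    def swap(match):
        # Resolve relative links against the address of the page being shown.
        absolute = urljoin(page_url, html.unescape(match.group(2)))
        query = urlencode({"url": absolute, "lang": target_lang})
        proxied = html.escape(f"{TRANSLATOR_ENDPOINT}?{query}", quote=True)
        return match.group(1) + proxied + match.group(3)

    return ANCHOR_HREF.sub(swap, page_html)
```

With this rewriting in place, clicking any link on a translated page issues a new request to the translator, which fetches and translates the linked page in the language the user selected.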
The foregoing description of a preferred embodiment of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Obviously, many modifications and variations will be apparent to practitioners skilled in this art. It is intended that the scope of the invention be defined by the following claims and their equivalents.

Claims

What is claimed is:
1. A method for electronically translating text, comprising providing an electronic language translator; receiving source language text as an input of the electronic language translator; translating the source language text at the electronic language translator into one or more target language texts; and providing a first user with an option of viewing one or more of the target language texts with or without the source language texts.
2. The method of claim 1, wherein the electronic language translator includes at least a first translation engine.
3. The method of claim 1, wherein the electronic language translator includes a translation cache.
4. The method of claim 3, wherein the translation cache includes a store of phrases and equivalents across multiple languages.
5. The method of claim 3, wherein the translation cache includes a store of source and one or more target language equivalencies that are dynamically updated.
6. The method of claim 3, wherein the translation cache includes heuristics to enable matching between inputs and cache entries which are not typographically identical.
7. The method of claim 6, wherein the flexible matching heuristics include ignoring differences in the capitalization scheme.
8. The method of claim 6, wherein the flexible matching heuristics include ignoring differences in the punctuation.
9. The method of claim 6, wherein the flexible matching heuristics include dividing the input at punctuation such as commas in order to match phrases at a sub-sentential level.
10. The method of claim 6, wherein the flexible matching heuristics eliminate appellatives at the beginning and end of phrases before attempting the match.
11. The method of claim 6, wherein the flexible matching heuristics include a glossary of abbreviations, slang forms, and other non-standard forms in order to recognize all variants of the cached phrases.
12. The method of claim 6, wherein the flexible matching heuristics include ignoring diacritics.
13. The method of claim 6, wherein the flexible matching heuristics include unifying hiragana and katakana in Japanese inputs.
14. The method of claim 6, wherein the flexible matching heuristics include unifying small and large kana in Japanese inputs.
15. The method of claim 6, wherein the flexible matching heuristics include ignoring sentence-final expressive particles (gobi) in Japanese inputs.
16. The method of claim 1, wherein the electronic language translator includes a plurality of translation engines.
17. The method of claim 16, wherein the electronic language translator includes a multiple engine comparison tool that receives translated target language outputs from multiple engines and selects a desired output.
18. The method of claim 1, wherein the electronic language translator includes a pre-processor that improves the translatability of the source language.
19. The method of claim 18, wherein the pre-processor corrects the source language inputs for improved translatability by application of language-specific heuristics.
20. The method of claim 18, wherein the pre-processor includes a spell-checker to correct spelling errors.
21. The method of claim 18, wherein the pre-processing expands acronyms and abbreviations that would otherwise not translate properly.
22. The method of claim 18, wherein the pre-processor includes an accent-restoration routine to correct deleted or incorrect accent marks.
23. The method of claim 18, wherein the pre-processor replaces slang with standard language equivalents which will translate better.
24. The method of claim 18, wherein the pre-processor replaces conversational constructions with language equivalents that translate better.
25. The method of claim 18, wherein the pre-processor eliminates difficult to translate sentence-final expressive particles.
26. The method of claim 25, wherein the pre-processor eliminates gobi from Japanese inputs.
27. The method of claim 1, wherein the electronic language translator includes a tutorial to instruct users on use of the translator.
28. The method of claim 1, wherein the electronic language translator includes a composition tool that interactively guides the user to use translation-friendly language.
29. The method of claim 28, wherein the composition tool includes a spell checker that provides a notification to a user when the input includes a lexical item not found in dictionaries used by the system.
30. The method of claim 28, wherein the composition tool scans the input for at least one of specific words, phrases, and expressions which do not translate properly.
31. The method of claim 28, wherein the composition tool checks for lexically ambiguous words which cause translation problems.
32. The method of claim 28, wherein the composition tool monitors a length of the input and reminds the user that shorter inputs may translate better.
33. The method of claim 32, wherein the input length monitor uses heuristics to increase the input length count for terms that increase translation complexity.
34. The method of claim 33, wherein the heuristics increase the input length count for conjunctions.
35. The method of claim 28, wherein the composition tool scans the input for syntactic constructions which are difficult to translate.
36. The method of claim 28, wherein the composition tool scans the input for syntactic constructions which are ambiguous.
37. The method of claim 28, wherein the composition tool warns the user about accent errors and suggests corrections.
38. The method of claim 28, wherein the composition tool passes the input through a language model and warns the user when the model does not recognize the input with a desired certain confidence level.
39. The method of claim 38, wherein the language model is selected from a trigram model, bigram model, unigram model, or a linear combination of trigram, bigram, and unigram models.
40. The method of claim 38, wherein the language model is a Hidden Markov Model.
41. The method of claim 28, wherein the composition tool executes a preliminary translation of the input, passes the input through a language model, and warns the user when the model does not recognize the translated output with a desired certain confidence level.
42. The method of claim 41, wherein the language model is selected from a trigram model, bigram model, unigram model, or a linear combination of trigram, bigram, and unigram models.
43. The method of claim 41, wherein the language model is a Hidden Markov Model.
44. The method of claim 1, wherein the electronic language translator provides the user an indicator to indicate those portions of the input that are not to be translated.
45. The method of claim 44, wherein the indicator includes special characters placed before and after the text not to be translated.
46. The method of claim 44, wherein the electronic language translator replaces text not to be translated with a lexical term that is not changed by the machine translation engine.
47. The method of claim 46, wherein the lexical term is a randomly generated, very large integer.
48. The method of claim 46, wherein the lexical term is a randomly generated, very large integer concatenated with a sequentially generated integer to ensure that the same lexical term is not generated twice in one translation.
49. The method of claim 46, wherein the lexical term is a randomly generated alpha-numeric string.
50. The method of claim 46, wherein the lexical term is a randomly generated alpha-numeric string, concatenated with a sequentially generated character, to ensure that the same lexical term is not generated twice in one translation.
51. The method of claim 46, wherein the lexical term is a randomly generated alpha-numeric string, concatenated with a sequentially generated integer, to ensure that the same lexical term is not generated twice in one translation.
52. The method of claim 1, wherein the electronic language translator uses specialized dictionaries to maximize the quality of the translation.
53. The method of claim 52, wherein the specialized dictionaries are selected from topic-specific, application-specific and user-specific dictionaries.
54. The method of claim 1, wherein the electronic language translator retains information about the capitalization scheme of the input, and restores this scheme in the output.
55. The method of claim 1, wherein the electronic language translator retains information about the punctuation of the input, and restores this punctuation in the output.
56. The method of claim 1, wherein the electronic language translator provides a mechanism for viewers of the translated output to indicate to the inputting user when the translation has not been understood.
57. A method for electronically translating text, comprising submitting source language text to an electronic language translator; executing translation from the source language text at the electronic language translator to at least one target language at the time of submission of the source language text; outputting the at least one target language text from the electronic language translator.
58. The method of claim 57, wherein the output includes at least one of the target language texts and includes at least a portion of the source language text.
59. The method of claim 57, wherein a first user submits the source language text and a second user receives the at least one target language text.
60. The method of claim 59, wherein the second user creates a reply in response to the at least one or more target language texts and possibly the source language.
61. The method of claim 60, wherein the reply is sent to the first user.
62. The method of claim 61, wherein the reply is sent to the first user in the form of the original source language.
63. The method of claim 62, wherein the original and reply texts are disseminated to multiple users.
64. The method of claim 63, wherein the multiple users are each able to reply to the messages and the replies are also disseminated to multiple users.
65. The method of claim 64, wherein two or more users are communicating in a chat environment using an electronic language translator.
66. The method of claim 64, wherein two or more users are communicating in an instant messaging environment using an electronic language translator.
67. The method of claim 64, wherein two or more users are communicating in a discussion boards environment using an electronic language translator.
68. The method of claim 64, wherein two or more users are communicating in an email environment using an electronic language translator.
69. The method of claim 64, wherein two or more users are communicating in an electronic customer service environment using an electronic language translator.
70. The method of claim 69, wherein two or more users communicating in an electronic customer service environment are communicating in a chat customer service environment using an electronic language translator.
71. The method of claim 69, wherein two or more users communicating in an electronic customer service environment are communicating in an email customer service environment using an electronic language translator.
72. The method of claim 71, wherein the input text from the first user is analyzed for meaning.
73. The method of claim 72, wherein the analysis is triggered upon receipt of the input text without explicit instructions from a human operator.
74. The method of claim 72, wherein the input text from the first user is analyzed for meaning, and based upon the meaning the reply is selected.
75. The method of claim 75, wherein the analysis is triggered upon receipt of the input text without explicit instructions from a human operator.
76. The method of claim 75, wherein the reply text is delivered to the first user in the same source language as the original text.
77. The method of claim 57, wherein the at least one target language text is posted to an electronic marketplace system.
78. The method of claim 77, wherein the at least one target language text is stored to a marketplace database.
79. The method of claim 77, wherein the at least one target language text is posted to the electronic marketplace system along with the source language text.
80. The method of claim 77, wherein the source language text is a description of an object in an electronic marketplace system and the one or more target language texts are a translation of the object description.
81. The method of claim 57, wherein the source language text represents a search query string, and the at least one translated text output is delivered as a search query string to an electronic search system.
82. The method of claim 81, wherein the electronic search system returns one or more search results, which are then translated by the electronic language translator and returned to the original user in the original user's source language.
83. The method of claim 57, wherein the source language text is a request for a document, which is submitted from the original user's hardware using a software client, transported over a network, and delivered to a server.
84. The method of claim 83, wherein the requested document is a document augmented with information in the form of a markup language.
85. The method of claim 84, wherein the textual components of the document are extracted and translated into at least one target language by the electronic language translator.
86. The method of claim 85, wherein the textual components of the document are chosen from text, mouseovers, meta-tags, and cookies.
87. The method of claim 85, wherein hotlinks within the document are rewritten as calls to the electronic language translator.
88. The method of claim 87, wherein hotlinks are rewritten as calls to the electronic language translator so that the linked documents are automatically submitted for translation.
89. The method of claim 87, wherein at least one target language output is returned to the original requesting user and reconstituted with non-textual portions of the original document according to the original markup language tags.
90. The method of claim 89, wherein the non-textual portions of the original document are chosen from graphics, pictures, formatting, backgrounds, frames, animations, sounds, and videos.
91. The method of claim 90, wherein the reconstituted document is returned to the original requesting user to preserve the original look and feel of the original requested document.
92. The method of claim 90, wherein the original user's hardware is a computer, the user's software client is a browser, the network is a network connection between computers, the server is another computer, and the markup language is HTML.
93. The method of claim 90, where the original user's hardware is a personal data assistant, the user's software client is a PDA browser, the network is a wireless internet, the server is a computer, and the markup language is WML or HDML.
94. The method of claim 90, where the original user's hardware is a phone, the user's software client is a WAP browser, the network is a WAP network, the server is a computer, and the markup language is WML or HDML.
95. The method of claim 90, where the original user's hardware is a phone, the user's software client is an iMode browser, the network is an iMode network, the server is a computer, and the markup language is WHTML.
96. A method for electronically translating text, comprising providing an electronic language translator system that includes an electronic language translator and at least a first and a second dictionary, wherein the electronic language translator references the first dictionary and then the second dictionary in a process of translating source language text into one or more target language texts and the dictionaries are maintained in an application or customer hierarchy; receiving source language text at an input of the electronic language translator; translating the source language text at the electronic language translator into one or more target language texts; producing an output that includes the one or more target language texts.
97. The method of claim 96, wherein the electronic dictionaries include one or more of subject-specific, application-specific, customer-specific, and user-specific dictionaries.
98. The method of claim 97, wherein the specialized dictionaries are selected for use by the electronic language translator dynamically at the time of translation.
99. The method of claim 97, wherein specialized dictionaries are created by users of the electronic translation system.
100. The method of claim 97, wherein the specialized dictionaries are maintained in a hierarchical organization.
101. The method of claim 100, wherein the dictionary hierarchy can be augmented by users with user-created dictionaries.
102. The method of claim 97, wherein the specialized dictionaries are created, stored, and modified in a format that is independent of a specific translation engine.
103. The method of claim 102, wherein the specialized dictionaries are mapped into engine-specific formats by engine specific routines.
104. The method of claim 103, wherein the specialized dictionaries are engine-independent and usable by any translation engine.
105. A method for electronic language translation, comprising: providing one or more translation modules; receiving source language text from an input interface; providing one or more input interfaces; providing one or more output interfaces; providing a generic data format which is independent of the translation modules, input interfaces, and output interfaces; converting the input source language text from the format for a specific input interface to the generic format; determining the one or more translation modules that provides an optimal translation; routing the text to the module that provides the optimal translation; converting text from the generic data format to a specific input format of a translation module; converting the specific output format from a translation module to the generic data format; and converting data from the generic data format into an output format suitable for an output interface.
106. The method of claim 105, wherein one or more translation modules is a translation engine.
107. The method of claim 106, wherein the one or more translation modules is coupled with a specialized dictionary with relevant vocabulary for a translation request.
108. The method of claim 107, wherein the specialized dictionary is chosen from subject-specific, application-specific, client-specific, and user-specific dictionaries.
109. The method of claim 105, wherein the one or more translation modules includes at least one static translation cache.
110. The method of claim 105, wherein the one or more translation modules include at least one dynamic translation cache.
111. The method of claim 105, wherein the one or more translation modules include at least one input pre-processing system.
112. The method of claim 105, wherein the one or more translation modules include at least one output post-processing system.
113. A method for electronically translating text, comprising: providing an electronic language translator coupled to an interface; translating source language text at the electronic language translator into one or more target language texts; outputting translated text in one or more target languages to an output interface; providing controls at an interface coupled to the electronic language translator to dynamically select which of the one or more target languages are output at the interface; varying the interface representation of text in the one or more target languages to allow a user to differentiate between the displayed languages; and providing controls at an interface to create differentiation between one or more target languages.
114. The method of claim 113, wherein the electronic language translator outputs the source language input text, in addition to the one or more target language texts.
115. The method of claim 114, wherein the electronic language translator includes controls at the interface coupled to dynamically select which of the source and target languages are output at the interface.
116. The method of claim 115, wherein the electronic language translator varies the interface representation of the text in the source and one or more target languages to allow the user to differentiate between the display languages.
117. The method of claim 116, wherein the electronic language translator provides controls at an interface to create differentiation between the source and one or more target languages.
118. The method of claim 113, wherein the variation of the representation of the output is chosen from varying typefaces, varying colors, varying spatial placement, and adding typographic symbols.
121. A method for electronically translating text, comprising: providing an electronic language translator coupled to an interface; translating the source language text at the electronic language translator into one or more target language texts; displaying the translated output to the original user; and providing feedback to the original user about the quality of the translation.
122. The method of claim 121, wherein the translator with feedback displays the original input text aligned with one or more output target languages.
123. The method of claim 121, wherein the translator with feedback provides an electronic dictionary attached to the translated text.
124. The method of claim 123, wherein the attached electronic dictionary is used by the user to translate words from the translated text back into the source language, in order to double-check the translation quality.
125. The method of claim 124, wherein the attached electronic dictionary is hyperlinked to the words in the translated text.
126. The method of claim 125, wherein the hyperlinked dictionary is activated by clicking on a word.
127. The method of claim 126, wherein clicking on a word retrieves its translation from the hyperlinked dictionary.
128. The method of claim 126, wherein clicking on a word retrieves its definition from the hyperlinked dictionary.
129. The method of claim 124, wherein the attached electronic dictionary is activated by mousing over words in the translated text.
130. The method of claim 129, wherein mousing over a word in the translated text retrieves its translation from the attached electronic dictionary.
131. The method of claim 130, wherein mousing over a word in the translated text retrieves its definition from the attached electronic dictionary.
132. The method of claim 121, wherein the translator with feedback passes the translated text through a language model and indicates when the translated output is not recognized by the model with a minimum confidence level.
133. The method of claim 132, wherein the language model is chosen from a trigram model, a bigram model, a unigram model, or a linear combination of a trigram, bigram, and unigram model.
134. The method of claim 132, wherein the language model is a Hidden Markov Model.
135. The method of claim 121, wherein the translator with feedback indicates to the user words that were not translated by the electronic language translator.
136. The method of claim 135, wherein the untranslated words are indicated in the output text through visual means.
137. The method of claim 136, wherein the visual means are chosen from highlighting, differently colored font, italics, bolding, underlining, and surrounding the untranslated words with special characters.
138. The method of claim 135, wherein the untranslated words are returned to the user in a list.
139. The method of claim 121, wherein the translator with feedback is used simultaneously across a network by more than one user at different interfaces.
140. The method of claim 139, wherein the multi-user translator accepts input text from any of the multiple users.
141. The method of claim 140, wherein the multi-user translator displays to all of the multiple users the input text translated into one or more output languages.
142. The method of claim 141, wherein the multi-user translation system with feedback includes an indicator for users to indicate that a translation of an input was not understandable.
143. The method of claim 142, wherein the poor-translation indicator redisplays to all users the input which was not understandable in translation, along with a request to rephrase the input.
144. The method of claim 143, wherein the poor-translation indicator warning serves as feedback to the user that originally entered the input which was not understandable in translation.
145. A method for electronically translating text, comprising: providing an electronic language translator coupled to an interface; translating the source language text at the electronic language translator into one or more target language texts; producing at least two candidate translations for each source language text; comparing the translated candidates to one or more language models trained on data similar in style and subject matter to the text being translated; selecting the best quality translation for the input from the multiple translation candidates, according to which best matches the one or more language models; and displaying a desired best quality translation.
146. The method of claim 145, wherein the multi-candidate electronic language translator includes two or more translation engines that each produce at least one candidate translation.
147. The method of claim 145, wherein the multi-candidate electronic language translator includes at least one translation engine which produces two or more candidate translations for each input.
148. The method of claim 145, wherein the one or more multi-candidate electronic language translator's language models are chosen from unigram models, bigram models, and trigram models, or a linear combination of unigram, bigram, and trigram models.
149. The method of claim 145, wherein the one or more multi-candidate electronic language translator's language models are Hidden Markov Models.
150. A system for electronically translating text, comprising an electronic language translator that receives source language text input and produces translated target language text; and an interface coupled to the electronic language translator and configured to provide a user with an option of viewing one or more target language texts with or without source language text.
151. The system of claim 150, wherein the electronic language translator includes at least one translation engine.
152. The system of claim 150, wherein the electronic language translator includes a translation cache.
153. The system of claim 152, wherein the translation cache includes a store of phrases and equivalents across multiple languages.
154. The system of claim 152, wherein the translation cache includes a store of source and one or more target language equivalents that are dynamically updated.
155. The system of claim 152, wherein the translation cache includes a processing unit for executing matching between inputs and cache entries which are not typographically identical.
156. The system of claim 155, wherein the flexible matching unit includes a routine for ignoring differences in the capitalization scheme.
157. The system of claim 155, wherein the flexible matching unit includes a routine for ignoring differences in the punctuation.
158. The system of claim 155, wherein the flexible matching unit includes a routine for dividing the input at punctuation.
159. The system of claim 155, wherein the flexible matching unit includes a routine for eliminating appellatives at the beginning and end of phrases before attempting the match.
160. The system of claim 155, wherein the flexible matching unit includes a glossary of abbreviations, slang forms, and other non-standard forms, plus a routine for substituting standard forms for the glossary entries.
161. The system of claim 155, wherein the flexible matching unit includes a diacritic removal routine.
162. The system of claim 155, wherein the flexible matching unit includes a hiragana and katakana unification routine for Japanese inputs.
163. The system of claim 155, wherein the flexible matching unit includes a small and large kana unification routine for Japanese inputs.
164. The system of claim 155, wherein the flexible matching unit includes a sentence-final expressive particles (gobi) elimination routine for Japanese inputs.
165. The system of claim 150, wherein the electronic language translator includes a plurality of translation engines.
166. The system of claim 165, wherein the electronic language translator includes a multiple engine comparison tool that receives translated target language outputs from multiple engines and selects a desired output.
167. The system of claim 150, wherein the electronic language translator includes a pre-processor that improves the translatability of the source language.
168. The system of claim 167, wherein the pre-processor includes a language-specific source language input corrector for improved translatability.
169. The system of claim 167, wherein the pre-processor includes a spell-checker unit.
170. The system of claim 167, wherein the pre-processor includes an acronyms and abbreviations expander.
171. The system of claim 167, wherein the pre-processor includes an accent-restoration unit.
172. The system of claim 167, wherein the pre-processor includes a slang replacement unit.
173. The system of claim 167, wherein the pre-processor includes a conversational constructions replacement routine.
174. The system of claim 167, wherein the pre-processor includes a sentence-final expressive particles elimination routine.
175. The system of claim 174, wherein the pre-processor includes a Japanese gobi elimination routine.
176. The system of claim 150, wherein the electronic language translator includes a translator training tutorial.
177. The system of claim 150, wherein the electronic language translator includes an input composition tool which interactively guides the user to use translation-friendly language.
178. The system of claim 177, wherein the composition tool includes a spell checker.
179. The system of claim 177, wherein the composition tool includes a difficult-to-translate phrase detection routine.
180. The system of claim 177, wherein the composition tool includes a lexically-ambiguous word detection routine.
181. The system of claim 177, wherein the composition tool includes an input-length monitor.
182. The system of claim 181, wherein the input length monitor includes a word demerit monitor.
183. The system of claim 182, wherein the word demerit monitor is a conjunction demerit monitor.
184. The system of claim 177, wherein the composition tool includes a difficult-to-translate syntax scanner.
185. The system of claim 177, wherein the composition tool includes an ambiguous construction scanner.
186. The system of claim 177, wherein the composition tool includes an accent corrector.
187. The system of claim 177, wherein the composition tool includes a language model.
188. The system of claim 187, wherein the language model is chosen from a trigram model, bigram model, unigram model, or a linear combination of trigram, bigram, and unigram models.
189. The system of claim 187, wherein the language model is a Hidden Markov Model.
190. The system of claim 177, wherein the composition tool includes a language model for preliminary translations.
191. The system of claim 190, wherein the language model is chosen from a trigram model, bigram model, unigram model, or a linear combination of trigram, bigram, and unigram models.
192. The system of claim 190, wherein the language model is a Hidden Markov Model.
193. The system of claim 150, wherein the electronic language translator includes a do-not-translate indicator.
194. The system of claim 193, wherein the do-not-translate indicator is a set of special characters placed before and after text that is not to be translated.
195. The system of claim 193, wherein the do-not-translate indicator includes a translation-neutral token substitution routine.
196. The system of claim 195, wherein the translation-neutral token is a randomly-generated very large integer.
197. The system of claim 195, wherein the translation-neutral token is a randomly-generated very large integer concatenated with a sequentially generated integer.
198. The system of claim 195, wherein the translation-neutral token is a randomly-generated alpha-numeric string.
199. The system of claim 195, wherein the translation-neutral token is a randomly-generated alpha-numeric string concatenated with a sequentially generated character.
200. The system of claim 195, wherein the translation-neutral token is a randomly-generated alpha-numeric string concatenated with a sequentially generated integer.
201. The system of claim 150, wherein the electronic language translator includes specialized dictionaries.
202. The system of claim 201, wherein the specialized dictionaries are chosen from topic-specific, application-specific, and user-specific dictionaries.
203. The system of claim 150, wherein the electronic language translator includes a capitalization recording and restoration unit.
204. The system of claim 150, wherein the electronic language translator includes a punctuation recording and restoration unit.
205. The system of claim 150, wherein the electronic language translator includes a poor-translation feedback mechanism for the input user.
206. A system for electronically translating text, comprising an input interface for submitting source language text to an electronic language translator; an electronic language translator for translating the source language text to at least one target language at the time of submission of the source language text; and an output interface for outputting the at least one target language text from the electronic language translator.
207. The system of claim 206, wherein the output interface produces as output at least one of the target language texts and at least a portion of the source language text.
208. The system of claim 206, wherein the input interface includes a text submission device and the output interface includes a translated text display device.
209. The system of claim 208, wherein the output interface includes a reply composition device.
210. The system of claim 209, wherein the output interface includes a reply submission device.
211. The system of claim 210, wherein the electronic language translator includes a component to translate the submitted replies into the original source language.
212. The system of claim 211, wherein the electronic language translator includes components to disseminate the original and reply texts to multiple users.
213. The system of claim 212, wherein the electronic language translator includes interfaces which allow the multiple users to reply to messages and have the replies disseminated to multiple users.
214. The system of claim 212, wherein the electronic language translator is within a chat system environment.
215. The system of claim 212, wherein the electronic language translator is within an instant messaging system environment.
216. The system of claim 212, wherein the electronic language translator is within a discussion board system environment.
217. The system of claim 212, wherein the electronic language translator is within an email system environment.
218. The system of claim 212, wherein the electronic language translator is within an electronic customer service system environment.
219. The system of claim 218, wherein the electronic language translator is within a chat system environment in an electronic customer service system environment.
220. The system of claim 218, wherein the electronic language translator is within an email system environment in an electronic customer service system environment.
221. The system of claim 220, wherein the email electronic customer service system includes a first-user input text meaning analyzer.
222. The system of claim 221, wherein the first-user input text meaning analyzer is triggered by receipt of the input text without explicit instructions from a human operator.
223. The system of claim 221, wherein the email electronic customer service system includes an automatic reply-generation component which generates a reply based on the analyzed meaning of the input text.
224. The system of claim 223, wherein the reply generation component is triggered by receipt of the input text without explicit instructions from a human operator.
225. The system of claim 223, wherein the reply generation component generates the reply to the first user in the first user's original source language.
226. The system of claim 206, wherein the electronic language translator includes a posting tool to post at least one target language to an electronic marketplace system.
227. The system of claim 226, wherein the electronic language translator includes a storage routine to store at least one target language text to a marketplace database.
228. The system of claim 226, wherein the electronic language translator includes a posting tool to post at least one target language text to the electronic marketplace system along with the source language text.
229. The system of claim 226, wherein the electronic language translator interprets the source language text as a description of an object in an electronic marketplace system and the one or more target language texts as translations of the object description.
230. The system of claim 206, wherein the electronic language translator interprets the source language text as a search query string, and includes an electronic search system configured to receive at least one translated text output as a search query string.
231. The system of claim 230, wherein the electronic search system's output interface translates the returned search results into the original user's source language using the electronic language translator.
232. The system of claim 206, wherein the electronic language translator's input interface accepts the source language text in the form of a request for a document, which is submitted from the original user's hardware using a software client, transported over a network, and delivered to a server.
233. The system of claim 232, wherein the electronic language translator includes a routine to interpret a markup language which augments the requested document.
234. The system of claim 233, wherein the electronic language translator includes a component to extract the textual components of the document and translate them into at least one target language.
235. The system of claim 233, wherein the textual components of the document are chosen from text, mouseovers, meta-tags, and cookies.
236. The system of claim 234, wherein the electronic language translator includes a component to rewrite the hotlinks within the document to be calls to the electronic language translator.
237. The system of claim 236, wherein the electronic language translator includes a component to reconstitute the at least one target language output with the non-textual portions of the original document according to the original markup language tags, and return the reconstituted document to the original requesting user.
238. The system of claim 237, wherein the non-textual portions of the original document are chosen from graphics, pictures, formatting, backgrounds, frames, animations, sounds, and videos.
239. The method of claim 237, wherein the reconstituted document is returned to the original requesting user to preserve the original look and feel of the original requested document.
240. The system of claim 237, wherein the original user's hardware is a computer, the user's software client is a browser, the network is a network connection between computers, the server is another computer, and the markup language is HTML.
241. The system of claim 237, where the original user's hardware is a personal data assistant, the user's software client is a PDA browser, the network is a wireless internet, the server is a computer, and the markup language is WML or HDML.
242. The system of claim 237, where the original user's hardware is a phone, the user's software client is a WAP browser, the network is a WAP network, the server is a computer, and the markup language is WML or HDML.
243. The system of claim 237, where the original user's hardware is a phone, the user's software client is an iMode browser, the network is an iMode network, the server is a computer, and the markup language is WHTML.
244. A system for electronically translating text, comprising: an electronic language translator system that includes an electronic language translator and at least a first and a second dictionary, wherein the electronic language translator references the first dictionary and then the second dictionary in a process of translating source language text into one or more target language texts and the dictionaries are maintained in an application or customer hierarchy; an interface for receiving input of the electronic language; and an interface for outputting the source language text translated into one or more target languages.
245. The system of claim 244, wherein the electronic dictionaries include one or more of subject-specific, application-specific, customer-specific, and user-specific dictionaries.
246. The system of claim 245, wherein the electronic language translator includes a component for selecting which specialized dictionaries are to be used for translation dynamically, at the time of translation.
247. The system of claim 245, wherein the electronic language translator includes a specialized dictionary creation component.
248. The system of claim 245, wherein the electronic language translator includes a specialized dictionary hierarchy maintenance routine.
249. The system of claim 248, wherein the dictionary hierarchy includes a hierarchy augmentation tool to allow users to augment the hierarchy with user-created dictionaries.
250. The system of claim 245, wherein the electronic language translator includes creation, storage, and modification routines for the specialized dictionaries, a dictionary format which is independent of any specific translation engine, and a dictionary mapping routine which maps the independent dictionary format into engine-specific formats by engine-specific routines.
251. A system for electronic language translation, comprising: one or more translation modules receiving source language text from an input interface; one or more input interfaces; one or more output interfaces; a generic data format that is independent of the translation modules, input interfaces and output interfaces; a conversion module configured to convert input source language text from a specific input interface to a generic format; a routing module configured to determine the one or more translation modules that provide an optimal translation and then route the text to the module that provides the optimal translation; a conversion module configured to convert text from the generic data format to a specific input format of a translation module; a conversion module configured to convert specific output format from a translation module to the generic data format; and a conversion module configured to convert data from the generic data format into an output format suitable for an output interface.
252. The system of claim 251, wherein one or more translation modules is a translation engine.
253. The system of claim 252, wherein the one or more translation modules is coupled with a specialized dictionary with relevant vocabulary for a translation request.
254. The system of claim 252, wherein the specialized dictionary is chosen from subject-specific, application-specific, client-specific, and user-specific dictionaries.
255. The system of claim 251, wherein the one or more translation modules includes at least one static translation cache.
256. The system of claim 251, wherein the one or more translation modules includes at least one dynamic translation cache as a module.
257. The system of claim 251, wherein the one or more translation modules includes at least one input pre-processing system as a module.
258. The system of claim 251, wherein the one or more translation modules includes at least one output post-processing system as a module.
259. A system for electronically translating text, comprising: an electronic language translator which translates the source language text into one or more target language texts; an output interface that displays one or more target languages; and an output interface configured to vary an interface representation of text in the one or more target languages.
260. The system of claim 259, further comprising: controls at the output interface that permit a user to customize differentiation between source and target languages.
261. The system of claim 260, wherein the controls permit a user to customize differentiation between source and multiple target languages.
262. A system for electronically translating text, comprising: an electronic language translator with feedback; an interface for receiving input of the electronic language; an interface for outputting the source language text translated into one or more target languages; and a component for providing feedback to the original user about the quality of the translation.
263. The system of claim 263, wherein the translator with feedback includes a component for displaying the original input text aligned with one or more output target languages.
264. The system of claim 263, wherein the translator with feedback includes an electronic dictionary coupled to a main text.
265. The system of claim 264, wherein a hyperlink component couples the dictionary to the main text.
266. The system of claim 264, wherein a mouse-over component couples the dictionary to the main text.
268. The system of claim 263, wherein the translator with feedback includes a component to indicate to the user words that were not translated by the electronic language translator.
269. The system of claim 263, wherein the translator with feedback includes a component to display translated output to one or more other users.
270. The system of claim 269, wherein the translator with feedback includes a component for third party users to indicate if translation of the input was understandable.
PCT/US2001/010628 2000-03-31 2001-04-02 Method and apparatus for providing multilingual translation over a network WO2001075662A2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
AU2001249777A AU2001249777A1 (en) 2000-03-31 2001-04-02 Method and apparatus for providing multilingual translation over a network
JP2001573273A JP2003529845A (en) 2000-03-31 2001-04-02 Method and apparatus for providing multilingual translation over a network

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US19393700P 2000-03-31 2000-03-31
US60/193,937 2000-03-31
US21255300P 2000-06-20 2000-06-20
US60/212,553 2000-06-20

Publications (2)

Publication Number Publication Date
WO2001075662A2 (en) 2001-10-11
WO2001075662A8 (en) 2002-02-14

Family

ID=26889521

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/010628 WO2001075662A2 (en) 2000-03-31 2001-04-02 Method and apparatus for providing multilingual translation over a network

Country Status (4)

Country Link
US (1) US20010029455A1 (en)
JP (1) JP2003529845A (en)
AU (1) AU2001249777A1 (en)
WO (1) WO2001075662A2 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008042845A1 (en) * 2006-10-02 2008-04-10 Google Inc. Displaying original text in a user interface with translated text
WO2015130984A3 (en) * 2014-02-28 2015-10-15 Ebay Inc. Improvement of automatic machine translation using user feedback
BE1021599B1 (en) * 2014-12-29 2015-12-17 Crosslang Nv MACHINE TRANSLATION SYSTEM FOR AUTOMATICALLY GENERATING A TEXT TRANSLATION
JP2016509312A (en) * 2013-02-08 2016-03-24 マシーン・ゾーン・インコーポレイテッドMachine Zone, Inc. System and method for multi-user multilingual communication
US9530161B2 (en) 2014-02-28 2016-12-27 Ebay Inc. Automatic extraction of multilingual dictionary items from non-parallel, multilingual, semi-structured data
US9798720B2 (en) 2008-10-24 2017-10-24 Ebay Inc. Hybrid machine translation
US9881006B2 (en) 2014-02-28 2018-01-30 Paypal, Inc. Methods for automatic generation of parallel corpora
US9940658B2 (en) 2014-02-28 2018-04-10 Paypal, Inc. Cross border transaction machine translation

Families Citing this family (475)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11143616A (en) * 1997-11-10 1999-05-28 Sega Enterp Ltd Character communication equipment
US8290809B1 (en) 2000-02-14 2012-10-16 Ebay Inc. Determining a community rating for a user using feedback ratings of related users in an electronic environment
US7428505B1 (en) 2000-02-29 2008-09-23 Ebay, Inc. Method and system for harvesting feedback and comments regarding multiple items from users of a network-based transaction facility
US9614934B2 (en) 2000-02-29 2017-04-04 Paypal, Inc. Methods and systems for harvesting comments regarding users on a network-based facility
US8645137B2 (en) 2000-03-16 2014-02-04 Apple Inc. Fast, language-independent method for user authentication by voice
US6604107B1 (en) 2000-04-24 2003-08-05 Ebay Inc. Generic attribute database system for storing items of different categories having shared attributes
US7437669B1 (en) * 2000-05-23 2008-10-14 International Business Machines Corporation Method and system for dynamic creation of mixed language hypertext markup language content through machine translation
US8396859B2 (en) 2000-06-26 2013-03-12 Oracle International Corporation Subject matter context search engine
US7865358B2 (en) * 2000-06-26 2011-01-04 Oracle International Corporation Multi-user functionality for converting data from a first form to a second form
US20020007382A1 (en) * 2000-07-06 2002-01-17 Shinichi Nojima Computer having character input function,method of carrying out process depending on input characters, and storage medium
US7052396B2 (en) * 2000-09-11 2006-05-30 Nintendo Co., Ltd. Communication system and method using pictorial characters
US6816885B1 (en) * 2000-09-21 2004-11-09 International Business Machines Corporation Method and system to handle large volume of E-mail received from a plurality of senders intelligently
US6832244B1 (en) * 2000-09-21 2004-12-14 International Business Machines Corporation Graphical e-mail content analyser and prioritizer including hierarchical email classification system in an email
US7660740B2 (en) 2000-10-16 2010-02-09 Ebay Inc. Method and system for listing items globally and regionally, and customized listing according to currency or shipping area
JP4135307B2 (en) * 2000-10-17 2008-08-20 株式会社日立製作所 Voice interpretation service method and voice interpretation server
US6859820B1 (en) * 2000-11-01 2005-02-22 Microsoft Corporation System and method for providing language localization for server-based applications
US20020078152A1 (en) 2000-12-19 2002-06-20 Barry Boone Method and apparatus for providing predefined feedback
US20020082825A1 (en) * 2000-12-22 2002-06-27 Ge Medical Systems Information Technologies, Inc. Method for organizing and using a statement library for generating clinical reports and retrospective queries
US7062482B1 (en) * 2001-02-22 2006-06-13 Drugstore. Com Techniques for phonetic searching
JP2002268665A (en) * 2001-03-13 2002-09-20 Oki Electric Ind Co Ltd Text voice synthesizer
US20060253784A1 (en) * 2001-05-03 2006-11-09 Bower James M Multi-tiered safety control system and methods for online communities
US20020169592A1 (en) * 2001-05-11 2002-11-14 Aityan Sergey Khachatur Open environment for real-time multilingual communication
US20020184340A1 (en) * 2001-05-31 2002-12-05 Alok Srivastava XML aware logical caching system
US8214196B2 (en) 2001-07-03 2012-07-03 University Of Southern California Syntax-based statistical translation model
US20030043186A1 (en) * 2001-08-30 2003-03-06 Marina Libman Method and apparatus for storing real-time text messages
US6658260B2 (en) 2001-09-05 2003-12-02 Telecommunication Systems, Inc. Inter-carrier short messaging service providing phone number only experience
JP2003091344A (en) * 2001-09-19 2003-03-28 Sony Corp Information processor, information processing method, recording medium, data structure and program
US8438004B2 (en) * 2001-10-03 2013-05-07 Hewlett-Packard Development Company L.P. System and methods for language translation printing
US7752266B2 (en) * 2001-10-11 2010-07-06 Ebay Inc. System and method to facilitate translation of communications between entities over a network
US7221933B2 (en) 2001-10-22 2007-05-22 Kyocera Wireless Corp. Messaging system for mobile communication
GB2382678A (en) * 2001-11-28 2003-06-04 Symbio Ip Ltd a knowledge database
US20030125927A1 (en) * 2001-12-28 2003-07-03 Microsoft Corporation Method and system for translating instant messages
US7272377B2 (en) * 2002-02-07 2007-09-18 At&T Corp. System and method of ubiquitous language translation for wireless devices
US20030191801A1 (en) * 2002-03-19 2003-10-09 Sanjoy Paul Method and apparatus for enabling services in a cache-based network
AU2003269808A1 (en) 2002-03-26 2004-01-06 University Of Southern California Constructing a translation lexicon from comparable, non-parallel corpora
EP1353280B1 (en) * 2002-04-12 2006-06-14 Targit A/S A method of processing multi-lingual queries
CN1452102A (en) * 2002-04-19 2003-10-29 英业达股份有限公司 Incomplete prompting sentence-making system and method
JP4558482B2 (en) * 2002-06-05 2010-10-06 ス、ロンビン National language character information optimization digital operation coding and input method and information processing system
US7941348B2 (en) * 2002-06-10 2011-05-10 Ebay Inc. Method and system for scheduling transaction listings at a network-based transaction facility
US8078505B2 (en) 2002-06-10 2011-12-13 Ebay Inc. Method and system for automatically updating a seller application utilized in a network-based transaction facility
US8719041B2 (en) 2002-06-10 2014-05-06 Ebay Inc. Method and system for customizing a network-based transaction facility seller application
US7110937B1 (en) * 2002-06-20 2006-09-19 Siebel Systems, Inc. Translation leveraging
US7236923B1 (en) 2002-08-07 2007-06-26 Itt Manufacturing Enterprises, Inc. Acronym extraction system and method of identifying acronyms and extracting corresponding expansions from text
US7113960B2 (en) 2002-08-22 2006-09-26 International Business Machines Corporation Search on and search for functions in applications with varying data types
US20040044517A1 (en) * 2002-08-30 2004-03-04 Robert Palmquist Translation system
US20060015923A1 (en) * 2002-09-03 2006-01-19 Mei Chuah Collaborative interactive services synchronized with real events
JP4664076B2 (en) * 2002-09-30 2011-04-06 キューナチュラリー システムズ インコーポレイテッド Blinking annotation callouts to highlight cross-language search results
US20040102201A1 (en) * 2002-11-22 2004-05-27 Levin Robert E. System and method for language translation via remote devices
US8392173B2 (en) * 2003-02-10 2013-03-05 At&T Intellectual Property I, L.P. Message translations
US7627479B2 (en) 2003-02-21 2009-12-01 Motionpoint Corporation Automation tool for web site content language translation
JP3944102B2 (en) * 2003-03-13 2007-07-11 株式会社日立製作所 Document retrieval system using semantic network
US8170863B2 (en) * 2003-04-01 2012-05-01 International Business Machines Corporation System, method and program product for portlet-based translation of web content
US9881308B2 (en) 2003-04-11 2018-01-30 Ebay Inc. Method and system to facilitate an online promotion relating to a network-based marketplace
US20040243531A1 (en) * 2003-04-28 2004-12-02 Dean Michael Anthony Methods and systems for representing, using and displaying time-varying information on the Semantic Web
US20040230898A1 (en) * 2003-05-13 2004-11-18 International Business Machines Corporation Identifying topics in structured documents for machine translation
JP3920812B2 (en) * 2003-05-27 2007-05-30 株式会社東芝 Communication support device, support method, and support program
US7742985B1 (en) 2003-06-26 2010-06-22 Paypal Inc. Multicurrency exchanges between participants of a network-based transaction facility
US7272406B2 (en) * 2003-06-30 2007-09-18 Sybase 365, Inc. System and method for in-transit SMS language translation
US7149971B2 (en) * 2003-06-30 2006-12-12 American Megatrends, Inc. Method, apparatus, and system for providing multi-language character strings within a computer
US8548794B2 (en) 2003-07-02 2013-10-01 University Of Southern California Statistical noun phrase translation
US7346487B2 (en) * 2003-07-23 2008-03-18 Microsoft Corporation Method and apparatus for identifying translations
US20050049997A1 (en) * 2003-08-27 2005-03-03 Microsoft Corporation Method for persisting a unicode compatible offline address
KR100565289B1 (en) * 2003-08-30 2006-03-30 엘지전자 주식회사 Data management method for mobile communication device using hyper link
US7607097B2 (en) * 2003-09-25 2009-10-20 International Business Machines Corporation Translating emotion to braille, emoticons and other special symbols
GB0322915D0 (en) * 2003-10-01 2003-11-05 Ibm System and method for application sharing
US8489769B2 (en) * 2003-10-02 2013-07-16 Accenture Global Services Limited Intelligent collaborative expression in support of socialization of devices
US20050131744A1 (en) * 2003-12-10 2005-06-16 International Business Machines Corporation Apparatus, system and method of automatically identifying participants at a videoconference who exhibit a particular expression
US20050131697A1 (en) * 2003-12-10 2005-06-16 International Business Machines Corporation Speech improving apparatus, system and method
MXPA06006205A (en) * 2003-12-17 2006-09-04 Speechgear Inc Translation tool.
FI115274B (en) * 2003-12-19 2005-03-31 Nokia Corp Electronic device e.g. palm computer selects language package for use in voice user interface used for controlling device functions
US8458277B2 (en) * 2004-01-22 2013-06-04 Verizon Business Global Llc Method and system for providing universal relay services
US7542971B2 (en) * 2004-02-02 2009-06-02 Fuji Xerox Co., Ltd. Systems and methods for collaborative note-taking
CN1950820A (en) * 2004-03-02 2007-04-18 梅林格有限公司 Embedded translation document method and system
US8296127B2 (en) 2004-03-23 2012-10-23 University Of Southern California Discovery of parallel text portions in comparable collections of corpora and training using comparable texts
US20050234700A1 (en) * 2004-04-15 2005-10-20 International Business Machines Corporation Autonomic method, system and program product for translating content
US8666725B2 (en) 2004-04-16 2014-03-04 University Of Southern California Selection and use of nonstatistical translation components in a statistical machine translation framework
US9189568B2 (en) 2004-04-23 2015-11-17 Ebay Inc. Method and system to display and search in a language independent manner
US7835982B2 (en) * 2004-07-02 2010-11-16 Manheim Investments, Inc. Computer-assisted method and apparatus for absentee sellers to participate in auctions and other sales
US7536634B2 (en) * 2005-06-13 2009-05-19 Silver Creek Systems, Inc. Frame-slot architecture for data conversion
US20060059424A1 (en) * 2004-09-15 2006-03-16 Petri Jonah W Real-time data localization
JP5452868B2 (en) 2004-10-12 2014-03-26 ユニヴァーシティー オブ サザン カリフォルニア Training for text-to-text applications that use string-to-tree conversion for training and decoding
US7499928B2 (en) * 2004-10-15 2009-03-03 Microsoft Corporation Obtaining and displaying information related to a selection within a hierarchical data structure
US20060085253A1 (en) * 2004-10-18 2006-04-20 Matthew Mengerink Method and system to utilize a user network within a network-based commerce platform
US7376648B2 (en) * 2004-10-20 2008-05-20 Oracle International Corporation Computer-implemented methods and systems for entering and searching for non-Roman-alphabet characters and related search systems
US7669198B2 (en) * 2004-11-18 2010-02-23 International Business Machines Corporation On-demand translator for localized operating systems
US7451188B2 (en) * 2005-01-07 2008-11-11 At&T Corp System and method for text translations and annotation in an instant messaging session
US7584209B2 (en) * 2005-02-04 2009-09-01 Microsoft Corporation Flexible file format for updating an address book
US7675874B2 (en) * 2005-02-24 2010-03-09 International Business Machines Corporation Peer-to-peer instant messaging and chat system
JP2006276915A (en) * 2005-03-25 2006-10-12 Fuji Xerox Co Ltd Translating processing method, document translating device and program
JP4050755B2 (en) * 2005-03-30 2008-02-20 株式会社東芝 Communication support device, communication support method, and communication support program
US7490079B2 (en) * 2005-04-14 2009-02-10 Microsoft Corporation Client side indexing of offline address book files
US20060235932A1 (en) * 2005-04-18 2006-10-19 International Business Machines Corporation Chat server mute capability
US7765098B2 (en) * 2005-04-26 2010-07-27 Content Analyst Company, Llc Machine translation using vector space representations
US7548849B2 (en) * 2005-04-29 2009-06-16 Research In Motion Limited Method for generating text that meets specified characteristics in a handheld electronic device and a handheld electronic device incorporating the same
US20060245005A1 (en) * 2005-04-29 2006-11-02 Hall John M System for language translation of documents, and methods
US7958446B2 (en) * 2005-05-17 2011-06-07 Yahoo! Inc. Systems and methods for language translation in network browsing applications
US9582602B2 (en) 2005-05-17 2017-02-28 Excalibur Ip, Llc Systems and methods for improving access to syndication feeds in network browsing applications
US20070174286A1 (en) * 2005-05-17 2007-07-26 Yahoo!, Inc. Systems and methods for providing features and user interface in network browsing applications
US7428537B2 (en) * 2005-05-23 2008-09-23 Tyloon, Inc Searching method and system for commercial information
US20060277189A1 (en) * 2005-06-02 2006-12-07 Microsoft Corporation Translation of search result display elements
US8676563B2 (en) 2009-10-01 2014-03-18 Language Weaver, Inc. Providing human-generated and machine-generated trusted translations
US8886517B2 (en) 2005-06-17 2014-11-11 Language Weaver, Inc. Trust scoring for language translation systems
US20070027670A1 (en) * 2005-07-13 2007-02-01 Siemens Medical Solutions Health Services Corporation User Interface Update System
US8677377B2 (en) 2005-09-08 2014-03-18 Apple Inc. Method and apparatus for building an intelligent automated assistant
US8275399B2 (en) 2005-09-21 2012-09-25 Buckyball Mobile Inc. Dynamic context-data tag cloud
US20160344581A9 (en) * 2005-09-21 2016-11-24 Amit Karmarkar Text message including a contextual attribute of a mobile device
US8515468B2 (en) * 2005-09-21 2013-08-20 Buckyball Mobile Inc Calculation of higher-order data from context data
US8498999B1 (en) * 2005-10-14 2013-07-30 Wal-Mart Stores, Inc. Topic relevant abbreviations
KR100643801B1 (en) * 2005-10-26 2006-11-10 엔에이치엔(주) System and method for providing automatically completed recommendation word by interworking a plurality of languages
US10319252B2 (en) 2005-11-09 2019-06-11 Sdl Inc. Language capability assessment and training apparatus and techniques
TW200719174A (en) * 2005-11-11 2007-05-16 Inventec Appliances Corp Translation system and method
CN101366023A (en) * 2005-11-14 2009-02-11 野田文隆 Multi language exchange system
US20070136068A1 (en) * 2005-12-09 2007-06-14 Microsoft Corporation Multimodal multilingual devices and applications for enhanced goal-interpretation and translation for service providers
US7849144B2 (en) * 2006-01-13 2010-12-07 Cisco Technology, Inc. Server-initiated language translation of an instant message based on identifying language attributes of sending and receiving users
US7555534B2 (en) * 2006-02-15 2009-06-30 Microsoft Corporation Phonetic name support in an electronic directory
KR101431194B1 (en) * 2006-02-17 2014-08-18 구글 인코포레이티드 Encoding and adaptive, scalable accessing of distributed models
US8660244B2 (en) 2006-02-17 2014-02-25 Microsoft Corporation Machine translation instant messaging applications
US20100280818A1 (en) * 2006-03-03 2010-11-04 Childers Stephen R Key Talk
KR100707970B1 (en) * 2006-03-10 2007-04-16 (주)인피니티 텔레콤 Method for translation service using the cellular phone
ITUD20060067A1 (en) * 2006-03-15 2007-09-16 D Agostini Organizzazione Srl Method and system of speed of automatic translation to the computer
US8706560B2 (en) 2011-07-27 2014-04-22 Ebay Inc. Community based network shopping
US8943080B2 (en) 2006-04-07 2015-01-27 University Of Southern California Systems and methods for identifying parallel documents and sentence fragments in multilingual document collections
US20070250711A1 (en) * 2006-04-25 2007-10-25 Phonified Llc System and method for presenting and inputting information on a mobile device
US9704174B1 (en) 2006-05-25 2017-07-11 Sean I. Mcghie Conversion of loyalty program points to commerce partner points per terms of a mutual agreement
US8684265B1 (en) 2006-05-25 2014-04-01 Sean I. Mcghie Rewards program website permitting conversion/transfer of non-negotiable credits to entity independent funds
US10062062B1 (en) 2006-05-25 2018-08-28 Jbshbm, Llc Automated teller machine (ATM) providing money for loyalty points
US7703673B2 (en) 2006-05-25 2010-04-27 Buchheit Brian K Web based conversion of non-negotiable credits associated with an entity to entity independent negotiable funds
US8668146B1 (en) 2006-05-25 2014-03-11 Sean I. Mcghie Rewards program with payment artifact permitting conversion/transfer of non-negotiable credits to entity independent funds
US20070282594A1 (en) * 2006-06-02 2007-12-06 Microsoft Corporation Machine translation in natural language application development
KR100810999B1 (en) * 2006-06-30 2008-03-11 엔에이치엔(주) On-line e mail service system, and service method thereof
US8886518B1 (en) * 2006-08-07 2014-11-11 Language Weaver, Inc. System and method for capitalizing machine translated text
US8639782B2 (en) 2006-08-23 2014-01-28 Ebay, Inc. Method and system for sharing metadata between interfaces
US20080065446A1 (en) * 2006-08-25 2008-03-13 Microsoft Corporation Web collaboration in multiple languages
US8626486B2 (en) * 2006-09-05 2014-01-07 Google Inc. Automatic spelling correction for machine translation
US9318108B2 (en) 2010-01-18 2016-04-19 Apple Inc. Intelligent automated assistant
US20080086310A1 (en) * 2006-10-09 2008-04-10 Kent Campbell Automated Contextually Specific Audio File Generator
US8195447B2 (en) 2006-10-10 2012-06-05 Abbyy Software Ltd. Translating sentences between languages using language-independent semantic structures and ratings of syntactic constructions
US9645993B2 (en) 2006-10-10 2017-05-09 Abbyy Infopoisk Llc Method and system for semantic searching
US9047275B2 (en) 2006-10-10 2015-06-02 Abbyy Infopoisk Llc Methods and systems for alignment of parallel text corpora
US8145473B2 (en) 2006-10-10 2012-03-27 Abbyy Software Ltd. Deep model statistics method for machine translation
US8214199B2 (en) * 2006-10-10 2012-07-03 Abbyy Software, Ltd. Systems for translating sentences between languages using language-independent semantic structures and ratings of syntactic constructions
US9984071B2 (en) 2006-10-10 2018-05-29 Abbyy Production Llc Language ambiguity detection of text
US8548795B2 (en) * 2006-10-10 2013-10-01 Abbyy Software Ltd. Method for translating documents from one language into another using a database of translations, a terminology dictionary, a translation dictionary, and a machine translation system
US20080086298A1 (en) * 2006-10-10 2008-04-10 Anisimovich Konstantin Method and system for translating sentences between languages
US9633005B2 (en) 2006-10-10 2017-04-25 Abbyy Infopoisk Llc Exhaustive automatic processing of textual information
US9235573B2 (en) 2006-10-10 2016-01-12 Abbyy Infopoisk Llc Universal difference measure
KR100834549B1 (en) * 2006-10-19 2008-06-02 한국전자통신연구원 System for language translation and method of providing language translation service
US8972268B2 (en) * 2008-04-15 2015-03-03 Facebook, Inc. Enhanced speech-to-speech translation system and methods for adding a new word
US9070363B2 (en) * 2007-10-26 2015-06-30 Facebook, Inc. Speech translation with back-channeling cues
US11222185B2 (en) 2006-10-26 2022-01-11 Meta Platforms, Inc. Lexicon development via shared translation database
US8433556B2 (en) 2006-11-02 2013-04-30 University Of Southern California Semi-supervised training for statistical word alignment
US8799218B2 (en) 2006-12-01 2014-08-05 Ebay Inc. Business channel synchronization
US9122674B1 (en) 2006-12-15 2015-09-01 Language Weaver, Inc. Use of annotations in statistical machine translation
JP4997966B2 (en) * 2006-12-28 2012-08-15 富士通株式会社 Parallel translation example sentence search program, parallel translation example sentence search device, and parallel translation example sentence search method
US8606606B2 (en) * 2007-01-03 2013-12-10 Vistaprint Schweiz Gmbh System and method for translation processing
US8131536B2 (en) * 2007-01-12 2012-03-06 Raytheon Bbn Technologies Corp. Extraction-empowered machine translation
US8468149B1 (en) * 2007-01-26 2013-06-18 Language Weaver, Inc. Multi-lingual online community
US7913178B2 (en) * 2007-01-31 2011-03-22 Ebay Inc. Method and system for collaborative and private sessions
US8768689B2 (en) * 2007-02-14 2014-07-01 Nuance Communications, Inc. Method and system for translation management of source language text phrases
US8112402B2 (en) * 2007-02-26 2012-02-07 Microsoft Corporation Automatic disambiguation based on a reference resource
US8615389B1 (en) 2007-03-16 2013-12-24 Language Weaver, Inc. Generation and exploitation of an approximate language model
US8959011B2 (en) 2007-03-22 2015-02-17 Abbyy Infopoisk Llc Indicating and correcting errors in machine translation systems
US8515728B2 (en) 2007-03-29 2013-08-20 Microsoft Corporation Language translation of visual and audio input
US8977255B2 (en) 2007-04-03 2015-03-10 Apple Inc. Method and system for operating a multi-function portable electronic device using voice-activation
US8831928B2 (en) 2007-04-04 2014-09-09 Language Weaver, Inc. Customizable machine translation service
US20080274756A1 (en) * 2007-05-02 2008-11-06 Research In Motion Limited Message handling based on receiver display size
US8799307B2 (en) * 2007-05-16 2014-08-05 Google Inc. Cross-language information retrieval
US8205151B2 (en) * 2007-05-31 2012-06-19 Red Hat, Inc. Syndication of documents in increments
US10296588B2 (en) * 2007-05-31 2019-05-21 Red Hat, Inc. Build of material production system
US9361294B2 (en) * 2007-05-31 2016-06-07 Red Hat, Inc. Publishing tool for translating documents
US8825466B1 (en) 2007-06-08 2014-09-02 Language Weaver, Inc. Modification of annotated bilingual segment pairs in syntax-based machine translation
KR101358224B1 (en) * 2007-06-19 2014-02-05 엘지전자 주식회사 Apparatus and method for supporting multi language
US8812296B2 (en) 2007-06-27 2014-08-19 Abbyy Infopoisk Llc Method and system for natural language dictionary generation
MY151645A (en) * 2007-06-27 2014-06-30 Mimos Berhad A system and method of language translation
US8051061B2 (en) 2007-07-20 2011-11-01 Microsoft Corporation Cross-lingual query suggestion
US9129031B2 (en) * 2007-08-29 2015-09-08 International Business Machines Corporation Dynamically configurable portlet
US7890539B2 (en) * 2007-10-10 2011-02-15 Raytheon Bbn Technologies Corp. Semantic matching using predicate-argument structure
US8275606B2 (en) 2007-10-25 2012-09-25 Disney Enterprises, Inc. System and method for localizing assets using flexible metadata
US20090132257A1 (en) * 2007-11-19 2009-05-21 Inventec Corporation System and method for inputting edited translation words or sentence
JP5340584B2 (en) * 2007-11-28 2013-11-13 インターナショナル・ビジネス・マシーンズ・コーポレーション Device and method for supporting reading of electronic message
US9418061B2 (en) * 2007-12-14 2016-08-16 International Business Machines Corporation Prioritized incremental asynchronous machine translation of structured documents
US9330720B2 (en) 2008-01-03 2016-05-03 Apple Inc. Methods and apparatus for altering audio output signals
US8117242B1 (en) 2008-01-18 2012-02-14 Boadin Technology, LLC System, method, and computer program product for performing a search in conjunction with use of an online application
US8117225B1 (en) * 2008-01-18 2012-02-14 Boadin Technology, LLC Drill-down system, method, and computer program product for focusing a search
US10320717B2 (en) 2008-01-24 2019-06-11 Ebay Inc. System and method of using conversational agent to collect information and trigger actions
US8175882B2 (en) * 2008-01-25 2012-05-08 International Business Machines Corporation Method and system for accent correction
US9817822B2 (en) 2008-02-07 2017-11-14 International Business Machines Corporation Managing white space in a portal web page
US8473276B2 (en) * 2008-02-19 2013-06-25 Google Inc. Universal language input
US8301705B2 (en) * 2008-02-29 2012-10-30 Sap Ag Subject line personalization
US7917488B2 (en) * 2008-03-03 2011-03-29 Microsoft Corporation Cross-lingual search re-ranking
US20090248392A1 (en) * 2008-03-25 2009-10-01 International Business Machines Corporation Facilitating language learning during instant messaging sessions through simultaneous presentation of an original instant message and a translated version
US7698688B2 (en) 2008-03-28 2010-04-13 International Business Machines Corporation Method for automating an internationalization test in a multilingual web application
US8996376B2 (en) 2008-04-05 2015-03-31 Apple Inc. Intelligent text-to-speech conversion
US8249858B2 (en) * 2008-04-24 2012-08-21 International Business Machines Corporation Multilingual administration of enterprise data with default target languages
US8249857B2 (en) * 2008-04-24 2012-08-21 International Business Machines Corporation Multilingual administration of enterprise data with user selected target language translation
US8594995B2 (en) * 2008-04-24 2013-11-26 Nuance Communications, Inc. Multilingual asynchronous communications of speech messages recorded in digital media files
US9483466B2 (en) * 2008-05-12 2016-11-01 Abbyy Development Llc Translation system and method
US10496753B2 (en) 2010-01-18 2019-12-03 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US20090287471A1 (en) * 2008-05-16 2009-11-19 Bennett James D Support for international search terms - translate as you search
US8312032B2 (en) 2008-07-10 2012-11-13 Google Inc. Dictionary suggestions for partial user entries
US20100030549A1 (en) 2008-07-31 2010-02-04 Lee Michael M Mobile device having human language translation capability with positional feedback
US9262409B2 (en) 2008-08-06 2016-02-16 Abbyy Infopoisk Llc Translation of a selected text fragment of a screen
US8078397B1 (en) 2008-08-22 2011-12-13 Boadin Technology, LLC System, method, and computer program product for social networking utilizing a vehicular assembly
US8131458B1 (en) 2008-08-22 2012-03-06 Boadin Technology, LLC System, method, and computer program product for instant messaging utilizing a vehicular assembly
US8073590B1 (en) 2008-08-22 2011-12-06 Boadin Technology, LLC System, method, and computer program product for utilizing a communication channel of a mobile device by a vehicular assembly
US8265862B1 (en) 2008-08-22 2012-09-11 Boadin Technology, LLC System, method, and computer program product for communicating location-related information
US8190692B1 (en) 2008-08-22 2012-05-29 Boadin Technology, LLC Location-based messaging system, method, and computer program product
KR20100036841A (en) * 2008-09-30 2010-04-08 삼성전자주식회사 Display apparatus and control method thereof
US8275600B2 (en) * 2008-10-10 2012-09-25 Google Inc. Machine learning for transliteration
US9959870B2 (en) 2008-12-11 2018-05-01 Apple Inc. Speech recognition involving a mobile device
US9323854B2 (en) * 2008-12-19 2016-04-26 Intel Corporation Method, apparatus and system for location assisted translation
KR101589433B1 (en) * 2009-03-11 2016-01-28 삼성전자주식회사 Simultaneous Interpretation System
US8577910B1 (en) * 2009-05-15 2013-11-05 Google Inc. Selecting relevant languages for query translation
US8572109B1 (en) * 2009-05-15 2013-10-29 Google Inc. Query translation quality confidence
US8577909B1 (en) * 2009-05-15 2013-11-05 Google Inc. Query translation using bilingual search refinements
US20100299134A1 (en) * 2009-05-22 2010-11-25 Microsoft Corporation Contextual commentary of textual images
US8538957B1 (en) 2009-06-03 2013-09-17 Google Inc. Validating translations using visual similarity between visual media search results
US10241644B2 (en) 2011-06-03 2019-03-26 Apple Inc. Actionable reminder entries
US9858925B2 (en) 2009-06-05 2018-01-02 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US20120311585A1 (en) 2011-06-03 2012-12-06 Apple Inc. Organizing task items that represent tasks to perform
US10241752B2 (en) 2011-09-30 2019-03-26 Apple Inc. Interface for a virtual digital assistant
US8840400B2 (en) * 2009-06-22 2014-09-23 Rosetta Stone, Ltd. Method and apparatus for improving language communication
US9431006B2 (en) 2009-07-02 2016-08-30 Apple Inc. Methods and apparatuses for automatic speech recognition
US8352244B2 (en) * 2009-07-21 2013-01-08 International Business Machines Corporation Active learning systems and methods for rapid porting of machine translation systems to new language pairs or new domains
US8990064B2 (en) 2009-07-28 2015-03-24 Language Weaver, Inc. Translating documents based on content
US8655644B2 (en) 2009-09-30 2014-02-18 International Business Machines Corporation Language translation in an environment associated with a virtual application
US8380486B2 (en) 2009-10-01 2013-02-19 Language Weaver, Inc. Providing machine-generated translations and corresponding trust levels
CN102063425A (en) * 2009-11-17 2011-05-18 阿里巴巴集团控股有限公司 Translation method and device
US8732577B2 (en) 2009-11-24 2014-05-20 Clear Channel Management Services, Inc. Contextual, focus-based translation for broadcast automation software
US10679605B2 (en) 2010-01-18 2020-06-09 Apple Inc. Hands-free list-reading by intelligent automated assistant
US10553209B2 (en) 2010-01-18 2020-02-04 Apple Inc. Systems and methods for hands-free notification summaries
US10276170B2 (en) 2010-01-18 2019-04-30 Apple Inc. Intelligent automated assistant
US10705794B2 (en) 2010-01-18 2020-07-07 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US8682667B2 (en) 2010-02-25 2014-03-25 Apple Inc. User profiling for selecting user specific voice input processing information
US20110219299A1 (en) * 2010-03-07 2011-09-08 DSNR Labs Ltd. Method and system of providing completion suggestion to a partial linguistic element
US10417646B2 (en) 2010-03-09 2019-09-17 Sdl Inc. Predicting the cost associated with translating textual content
WO2011136768A1 (en) * 2010-04-29 2011-11-03 Hewlett-Packard Development Company, L.P. Processing content in a plurality of languages
US9767095B2 (en) 2010-05-21 2017-09-19 Western Standard Publishing Company, Inc. Apparatus, system, and method for computer aided translation
US20120330643A1 (en) * 2010-06-04 2012-12-27 John Frei System and method for translation
US8380487B2 (en) * 2010-06-21 2013-02-19 International Business Machines Corporation Language translation of selected content in a web conference
WO2012009441A2 (en) 2010-07-13 2012-01-19 Motionpoint Corporation Dynamic language translation of web site content
US9509521B2 (en) * 2010-08-30 2016-11-29 Disney Enterprises, Inc. Contextual chat based on behavior and usage
US9713774B2 (en) 2010-08-30 2017-07-25 Disney Enterprises, Inc. Contextual chat message generation in online environments
US20120116751A1 (en) * 2010-11-09 2012-05-10 International Business Machines Corporation Providing message text translations
US8639701B1 (en) 2010-11-23 2014-01-28 Google Inc. Language selection for information retrieval
US10762293B2 (en) 2010-12-22 2020-09-01 Apple Inc. Using parts-of-speech tagging and named entity recognition for spelling correction
US9552353B2 (en) 2011-01-21 2017-01-24 Disney Enterprises, Inc. System and method for generating phrases
US20120209589A1 (en) * 2011-02-11 2012-08-16 Samsung Electronics Co. Ltd. Message handling method and system
US8527259B1 (en) * 2011-02-28 2013-09-03 Google Inc. Contextual translation of digital content
US10140320B2 (en) * 2011-02-28 2018-11-27 Sdl Inc. Systems, methods, and media for generating analytical data
US9262612B2 (en) 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
US20120253784A1 (en) * 2011-03-31 2012-10-04 International Business Machines Corporation Language translation based on nearby devices
US11003838B2 (en) 2011-04-18 2021-05-11 Sdl Inc. Systems and methods for monitoring post translation editing
CA2835110C (en) * 2011-05-05 2017-04-11 Ortsbo, Inc. Cross-language communication between proximate mobile devices
US10057736B2 (en) 2011-06-03 2018-08-21 Apple Inc. Active transport based notifications
US8694303B2 (en) 2011-06-15 2014-04-08 Language Weaver, Inc. Systems and methods for tuning parameters in statistical machine translation
US20140127653A1 (en) * 2011-07-11 2014-05-08 Moshe Link Language-learning system
KR20130015472A (en) * 2011-08-03 2013-02-14 삼성전자주식회사 Display apparatus, control method and server thereof
US9245253B2 (en) 2011-08-19 2016-01-26 Disney Enterprises, Inc. Soft-sending chat messages
US9176947B2 (en) 2011-08-19 2015-11-03 Disney Enterprises, Inc. Dynamically generated phrase-based assisted input
US9984054B2 (en) 2011-08-24 2018-05-29 Sdl Inc. Web interface including the review and manipulation of a web document and utilizing permission based control
US8994660B2 (en) 2011-08-29 2015-03-31 Apple Inc. Text correction processing
US8954315B2 (en) 2011-10-10 2015-02-10 Ca, Inc. System and method for mixed-language support for applications
US8886515B2 (en) 2011-10-19 2014-11-11 Language Weaver, Inc. Systems and methods for enhancing machine translation post edit review processes
US20130138421A1 (en) * 2011-11-28 2013-05-30 Micromass Uk Limited Automatic Human Language Translation
US8738364B2 (en) * 2011-12-14 2014-05-27 International Business Machines Corporation Adaptation of vocabulary levels for enhanced collaboration
WO2013102052A1 (en) * 2011-12-28 2013-07-04 Bloomberg Finance L.P. System and method for interactive automatic translation
US9652452B2 (en) 2012-01-06 2017-05-16 Yactraq Online Inc. Method and system for constructing a language model
US9268762B2 (en) * 2012-01-16 2016-02-23 Google Inc. Techniques for generating outgoing messages based on language, internationalization, and localization preferences of the recipient
US9330082B2 (en) * 2012-02-14 2016-05-03 Facebook, Inc. User experience with customized user dictionary
US9330083B2 (en) * 2012-02-14 2016-05-03 Facebook, Inc. Creating customized user dictionary
US10134385B2 (en) 2012-03-02 2018-11-20 Apple Inc. Systems and methods for name pronunciation
US9483461B2 (en) 2012-03-06 2016-11-01 Apple Inc. Handling speech synthesis of content for multiple languages
US9122673B2 (en) 2012-03-07 2015-09-01 International Business Machines Corporation Domain specific natural language normalization
US8942973B2 (en) 2012-03-09 2015-01-27 Language Weaver, Inc. Content page URL translation
US9141606B2 (en) 2012-03-29 2015-09-22 Lionbridge Technologies, Inc. Methods and systems for multi-engine machine translation
US8989485B2 (en) 2012-04-27 2015-03-24 Abbyy Development Llc Detecting a junction in a text line of CJK characters
US8971630B2 (en) 2012-04-27 2015-03-03 Abbyy Development Llc Fast CJK character recognition
US9280610B2 (en) 2012-05-14 2016-03-08 Apple Inc. Crowd sourcing information to fulfill user requests
US10261994B2 (en) 2012-05-25 2019-04-16 Sdl Inc. Method and system for automatic management of reputation of translators
US9721563B2 (en) 2012-06-08 2017-08-01 Apple Inc. Name recognition system
CN102760122A (en) * 2012-06-12 2012-10-31 上海量明科技发展有限公司 Method, client and system for translating interactive contents in common language
US9495129B2 (en) 2012-06-29 2016-11-15 Apple Inc. Device, method, and user interface for voice-activated navigation and browsing of a document
US8918308B2 (en) 2012-07-06 2014-12-23 International Business Machines Corporation Providing multi-lingual searching of mono-lingual content
US9116886B2 (en) * 2012-07-23 2015-08-25 Google Inc. Document translation including pre-defined term translator and translation model
US9304990B2 (en) * 2012-08-20 2016-04-05 International Business Machines Corporation Translation of text into multiple languages
US9576574B2 (en) 2012-09-10 2017-02-21 Apple Inc. Context-sensitive handling of interruptions by intelligent digital assistant
US9547647B2 (en) 2012-09-19 2017-01-17 Apple Inc. Voice-based media searching
US9165329B2 (en) 2012-10-19 2015-10-20 Disney Enterprises, Inc. Multi layer chat detection and classification
US9916306B2 (en) 2012-10-19 2018-03-13 Sdl Inc. Statistical linguistic analysis of source content
CA2851585C (en) 2012-11-06 2020-09-01 Lance Saleme Stack-based adaptive localization and internationalization of applications
US9152622B2 (en) 2012-11-26 2015-10-06 Language Weaver, Inc. Personalized machine translation via online adaptation
US9710463B2 (en) * 2012-12-06 2017-07-18 Raytheon Bbn Technologies Corp. Active error detection and resolution for linguistic translation
BR112015018905B1 (en) 2013-02-07 2022-02-22 Apple Inc Voice activation feature operation method, computer readable storage media and electronic device
US9031829B2 (en) 2013-02-08 2015-05-12 Machine Zone, Inc. Systems and methods for multi-user multi-lingual communications
US9231898B2 (en) 2013-02-08 2016-01-05 Machine Zone, Inc. Systems and methods for multi-user multi-lingual communications
US8990068B2 (en) 2013-02-08 2015-03-24 Machine Zone, Inc. Systems and methods for multi-user multi-lingual communications
US8996353B2 (en) 2013-02-08 2015-03-31 Machine Zone, Inc. Systems and methods for multi-user multi-lingual communications
US8996352B2 (en) 2013-02-08 2015-03-31 Machine Zone, Inc. Systems and methods for correcting translations in multi-user multi-lingual communications
US8996355B2 (en) 2013-02-08 2015-03-31 Machine Zone, Inc. Systems and methods for reviewing histories of text messages from multi-user multi-lingual communications
US10650103B2 (en) 2013-02-08 2020-05-12 Mz Ip Holdings, Llc Systems and methods for incentivizing user feedback for translation processing
US9298703B2 (en) 2013-02-08 2016-03-29 Machine Zone, Inc. Systems and methods for incentivizing user feedback for translation processing
US9368114B2 (en) 2013-03-14 2016-06-14 Apple Inc. Context-sensitive handling of interruptions
US10652394B2 (en) 2013-03-14 2020-05-12 Apple Inc. System and method for processing voicemail
WO2014144579A1 (en) 2013-03-15 2014-09-18 Apple Inc. System and method for updating an adaptive speech recognition model
KR101759009B1 (en) 2013-03-15 2017-07-17 애플 인크. Training an at least partial voice command system
US10742577B2 (en) 2013-03-15 2020-08-11 Disney Enterprises, Inc. Real-time search and validation of phrases using linguistic phrase components
US10303762B2 (en) 2013-03-15 2019-05-28 Disney Enterprises, Inc. Comprehensive safety schema for ensuring appropriateness of language in online chat
US20140272820A1 (en) * 2013-03-15 2014-09-18 Media Mouth Inc. Language learning environment
US9183198B2 (en) 2013-03-19 2015-11-10 International Business Machines Corporation Customizable and low-latency interactive computer-aided translation
KR20140120192A (en) * 2013-04-02 2014-10-13 삼성전자주식회사 Method for processing data and an electronic device thereof
US9292271B2 (en) 2013-05-24 2016-03-22 Medidata Solutions, Inc. Apparatus and method for managing software translation
KR101743686B1 (en) * 2013-06-03 2017-06-20 머신 존, 인크. Systems and methods for multi-user multi-lingual communications
WO2014197334A2 (en) 2013-06-07 2014-12-11 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
WO2014197336A1 (en) 2013-06-07 2014-12-11 Apple Inc. System and method for detecting errors in interactions with a voice-based digital assistant
WO2014197335A1 (en) 2013-06-08 2014-12-11 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US10176167B2 (en) 2013-06-09 2019-01-08 Apple Inc. System and method for inferring user intent from speech inputs
CN105264524B (en) 2013-06-09 2019-08-02 苹果公司 For realizing the equipment, method and graphic user interface of the session continuity of two or more examples across digital assistants
US9977684B2 (en) 2013-06-12 2018-05-22 Sap Se Self-learning localization service
CN105265005B (en) 2013-06-13 2019-09-17 苹果公司 System and method for the urgent call initiated by voice command
US9678952B2 (en) * 2013-06-17 2017-06-13 Ilya Ronin Cross-lingual E-commerce
US9262411B2 (en) * 2013-07-10 2016-02-16 International Business Machines Corporation Socially derived translation profiles to enhance translation quality of social content using a machine translation
JP6163266B2 (en) 2013-08-06 2017-07-12 アップル インコーポレイテッド Automatic activation of smart responses based on activation from remote devices
US9372672B1 (en) * 2013-09-04 2016-06-21 Tg, Llc Translation in visual context
US20150088485A1 (en) * 2013-09-24 2015-03-26 Moayad Alhabobi Computerized system for inter-language communication
US9213694B2 (en) 2013-10-10 2015-12-15 Language Weaver, Inc. Efficient online domain adaptation
US20150113072A1 (en) * 2013-10-17 2015-04-23 International Business Machines Corporation Messaging auto-correction using recipient feedback
US9870357B2 (en) * 2013-10-28 2018-01-16 Microsoft Technology Licensing, Llc Techniques for translating text via wearable computing device
US9424597B2 (en) * 2013-11-13 2016-08-23 Ebay Inc. Text translation using contextual information related to text objects in translated language
US9779087B2 (en) * 2013-12-13 2017-10-03 Google Inc. Cross-lingual discriminative learning of sequence models with posterior regularization
US9836530B2 (en) * 2013-12-16 2017-12-05 Entit Software Llc Determining preferred communication explanations using record-relevancy tiers
US20150169550A1 (en) * 2013-12-17 2015-06-18 Lenovo Enterprise Solutions (Singapore) Pte, Ltd. Translation Suggestion
RU2592395C2 (en) 2013-12-19 2016-07-20 Общество с ограниченной ответственностью "Аби ИнфоПоиск" Resolution semantic ambiguity by statistical analysis
RU2586577C2 (en) 2014-01-15 2016-06-10 Общество с ограниченной ответственностью "Аби ИнфоПоиск" Filtering arcs parser graph
JP2015138414A (en) * 2014-01-22 2015-07-30 富士通株式会社 Machine translation device, translation method, and program thereof
KR102054772B1 (en) * 2014-04-22 2019-12-13 한국전자통신연구원 Sentence hiding and displaying system and method
US9620105B2 (en) 2014-05-15 2017-04-11 Apple Inc. Analyzing audio input for efficient speech and music recognition
US10592095B2 (en) 2014-05-23 2020-03-17 Apple Inc. Instantaneous speaking of content on touch devices
US9502031B2 (en) 2014-05-27 2016-11-22 Apple Inc. Method for supporting dynamic grammars in WFST-based ASR
US10078631B2 (en) 2014-05-30 2018-09-18 Apple Inc. Entropy-guided text prediction using combined word and character n-gram language models
US10170123B2 (en) 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US10289433B2 (en) 2014-05-30 2019-05-14 Apple Inc. Domain specific language for encoding assistant dialog
US9785630B2 (en) 2014-05-30 2017-10-10 Apple Inc. Text prediction using combined word N-gram and unigram language models
US9760559B2 (en) 2014-05-30 2017-09-12 Apple Inc. Predictive text input
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US9633004B2 (en) 2014-05-30 2017-04-25 Apple Inc. Better resolution when referencing to concepts
EP3149728B1 (en) 2014-05-30 2019-01-16 Apple Inc. Multi-command single utterance input method
US9734193B2 (en) 2014-05-30 2017-08-15 Apple Inc. Determining domain salience ranking from ambiguous words in natural speech
US9842101B2 (en) 2014-05-30 2017-12-12 Apple Inc. Predictive conversion of language input
US9430463B2 (en) 2014-05-30 2016-08-30 Apple Inc. Exemplar-based natural language processing
US10659851B2 (en) 2014-06-30 2020-05-19 Apple Inc. Real-time digital assistant knowledge updates
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US9483768B2 (en) * 2014-08-11 2016-11-01 24/7 Customer, Inc. Methods and apparatuses for modeling customer interaction experiences
US10446141B2 (en) 2014-08-28 2019-10-15 Apple Inc. Automatic speech recognition based on user feedback
RU2596600C2 (en) 2014-09-02 2016-09-10 Общество с ограниченной ответственностью "Аби Девелопмент" Methods and systems for processing images of mathematical expressions
US9818400B2 (en) 2014-09-11 2017-11-14 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US10789041B2 (en) 2014-09-12 2020-09-29 Apple Inc. Dynamic thresholds for always listening speech trigger
US9668121B2 (en) 2014-09-30 2017-05-30 Apple Inc. Social reminders
US9646609B2 (en) 2014-09-30 2017-05-09 Apple Inc. Caching apparatus for serving phonetic pronunciations
US9886432B2 (en) 2014-09-30 2018-02-06 Apple Inc. Parsimonious handling of word inflection via categorical stem + suffix N-gram language models
US10074360B2 (en) 2014-09-30 2018-09-11 Apple Inc. Providing an indication of the suitability of speech recognition
US10127911B2 (en) 2014-09-30 2018-11-13 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US9372848B2 (en) 2014-10-17 2016-06-21 Machine Zone, Inc. Systems and methods for language detection
US10162811B2 (en) 2014-10-17 2018-12-25 Mz Ip Holdings, Llc Systems and methods for language detection
US9569430B2 (en) * 2014-10-24 2017-02-14 International Business Machines Corporation Language translation and work assignment optimization in a customer support environment
US10108712B2 (en) 2014-11-19 2018-10-23 Ebay Inc. Systems and methods for generating search query rewrites
US9727607B2 (en) * 2014-11-19 2017-08-08 Ebay Inc. Systems and methods for representing search query rewrites
US9626358B2 (en) 2014-11-26 2017-04-18 Abbyy Infopoisk Llc Creating ontologies by analyzing natural language texts
US10552013B2 (en) 2014-12-02 2020-02-04 Apple Inc. Data detection
US9711141B2 (en) 2014-12-09 2017-07-18 Apple Inc. Disambiguating heteronyms in speech synthesis
US9626430B2 (en) 2014-12-22 2017-04-18 Ebay Inc. Systems and methods for data mining and automated generation of search query rewrites
JP2016133861A (en) * 2015-01-16 2016-07-25 株式会社ぐるなび Information multilingual conversion system
CN105988990B (en) * 2015-02-26 2021-06-01 索尼公司 Chinese zero-reference resolution device and method, model training method and storage medium
KR20160105215A (en) * 2015-02-27 2016-09-06 삼성전자주식회사 Apparatus and method for processing text
US10380656B2 (en) 2015-02-27 2019-08-13 Ebay Inc. Dynamic predefined product reviews
US9865280B2 (en) 2015-03-06 2018-01-09 Apple Inc. Structured dictation using intelligent automated assistants
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US9721566B2 (en) 2015-03-08 2017-08-01 Apple Inc. Competing devices responding to voice triggers
US10567477B2 (en) 2015-03-08 2020-02-18 Apple Inc. Virtual assistant continuity
US9899019B2 (en) 2015-03-18 2018-02-20 Apple Inc. Systems and methods for structured stem and suffix language models
US9842105B2 (en) 2015-04-16 2017-12-12 Apple Inc. Parsimonious continuous-space phrase representations for natural language processing
US10229674B2 (en) * 2015-05-15 2019-03-12 Microsoft Technology Licensing, Llc Cross-language speech recognition and translation
US10083688B2 (en) 2015-05-27 2018-09-25 Apple Inc. Device voice control for selecting a displayed affordance
US10127220B2 (en) 2015-06-04 2018-11-13 Apple Inc. Language identification from short strings
US10101822B2 (en) 2015-06-05 2018-10-16 Apple Inc. Language input correction
US9578173B2 (en) 2015-06-05 2017-02-21 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US10255907B2 (en) 2015-06-07 2019-04-09 Apple Inc. Automatic accent detection using acoustic models
US11025565B2 (en) 2015-06-07 2021-06-01 Apple Inc. Personalized prediction of responses for instant messaging
US10186254B2 (en) 2015-06-07 2019-01-22 Apple Inc. Context-based endpoint detection
US10789080B2 (en) * 2015-07-17 2020-09-29 Microsoft Technology Licensing, Llc Multi-tier customizable portal deployment system
US10747498B2 (en) 2015-09-08 2020-08-18 Apple Inc. Zero latency digital assistant
US10671428B2 (en) 2015-09-08 2020-06-02 Apple Inc. Distributed personal assistant
US9697820B2 (en) 2015-09-24 2017-07-04 Apple Inc. Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks
US11010550B2 (en) 2015-09-29 2021-05-18 Apple Inc. Unified language modeling framework for word prediction, auto-completion and auto-correction
US10366158B2 (en) 2015-09-29 2019-07-30 Apple Inc. Efficient word encoding for recurrent neural network language models
US11587559B2 (en) 2015-09-30 2023-02-21 Apple Inc. Intelligent device identification
US9678954B1 (en) * 2015-10-29 2017-06-13 Google Inc. Techniques for providing lexicon data for translation of a single word speech input
US10691473B2 (en) 2015-11-06 2020-06-23 Apple Inc. Intelligent automated assistant in a messaging environment
US10049668B2 (en) 2015-12-02 2018-08-14 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
JP6070809B1 (en) * 2015-12-03 2017-02-01 国立大学法人静岡大学 Natural language processing apparatus and natural language processing method
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
US10765956B2 (en) 2016-01-07 2020-09-08 Machine Zone Inc. Named entity recognition on chat data
EP3203384A1 (en) * 2016-02-02 2017-08-09 Theo Hoffenberg Method, device, and computer program for providing a definition or a translation of a word belonging to a sentence as a function of neighbouring words and of databases
US10303799B2 (en) 2016-02-11 2019-05-28 International Business Machines Corporation Converging tool terminology
JP6766384B2 (en) * 2016-03-11 2020-10-14 富士ゼロックス株式会社 Information processing equipment and programs
US10446143B2 (en) 2016-03-14 2019-10-15 Apple Inc. Identification of voice inputs providing credentials
CN105843476A (en) * 2016-03-25 2016-08-10 海信集团有限公司 Man-machine interaction method and system
CN105868188A (en) * 2016-03-25 2016-08-17 海信集团有限公司 Man-machine interaction method and system
US9602450B1 (en) 2016-05-16 2017-03-21 Machine Zone, Inc. Maintaining persistence of a messaging system
US10268683B2 (en) 2016-05-17 2019-04-23 Google Llc Generating output for presentation in response to user interface input, where the input and/or the output include chatspeak
US9934775B2 (en) 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
US9972304B2 (en) 2016-06-03 2018-05-15 Apple Inc. Privacy preserving distributed evaluation framework for embedded personalized systems
US10249300B2 (en) 2016-06-06 2019-04-02 Apple Inc. Intelligent list reading
US10049663B2 (en) 2016-06-08 2018-08-14 Apple, Inc. Intelligent automated assistant for media exploration
DK179588B1 (en) 2016-06-09 2019-02-22 Apple Inc. Intelligent automated assistant in a home environment
US10067938B2 (en) 2016-06-10 2018-09-04 Apple Inc. Multilingual word prediction
US10509862B2 (en) 2016-06-10 2019-12-17 Apple Inc. Dynamic phrase expansion of language input
US10490187B2 (en) 2016-06-10 2019-11-26 Apple Inc. Digital assistant providing automated status report
US10192552B2 (en) 2016-06-10 2019-01-29 Apple Inc. Digital assistant providing whispered speech
US10586535B2 (en) 2016-06-10 2020-03-10 Apple Inc. Intelligent digital assistant in a multi-tasking environment
DK179049B1 (en) 2016-06-11 2017-09-18 Apple Inc Data driven natural language event detection and classification
DK201670540A1 (en) 2016-06-11 2018-01-08 Apple Inc Application integration with a digital assistant
DK179343B1 (en) 2016-06-11 2018-05-14 Apple Inc Intelligent task discovery
DK179415B1 (en) 2016-06-11 2018-06-14 Apple Inc Intelligent device arbitration and control
RU2718154C1 (en) * 2016-06-22 2020-03-30 Хуавэй Текнолоджиз Ко., Лтд. Method and device for displaying possible word and graphical user interface
CN106202060A (en) * 2016-06-24 2016-12-07 深圳信息职业技术学院 Character input method and device
US10437933B1 (en) * 2016-08-16 2019-10-08 Amazon Technologies, Inc. Multi-domain machine translation system with training data clustering and dynamic domain adaptation
US10579742B1 (en) * 2016-08-30 2020-03-03 United Services Automobile Association (Usaa) Biometric signal analysis for communication enhancement and transformation
US10043516B2 (en) 2016-09-23 2018-08-07 Apple Inc. Intelligent automated assistant
US10275459B1 (en) * 2016-09-28 2019-04-30 Amazon Technologies, Inc. Source language content scoring for localizability
US10235362B1 (en) 2016-09-28 2019-03-19 Amazon Technologies, Inc. Continuous translation refinement with automated delivery of re-translated content
US10223356B1 (en) 2016-09-28 2019-03-05 Amazon Technologies, Inc. Abstraction of syntax in localization through pre-rendering
US10229113B1 (en) 2016-09-28 2019-03-12 Amazon Technologies, Inc. Leveraging content dimensions during the translation of human-readable languages
US10261995B1 (en) * 2016-09-28 2019-04-16 Amazon Technologies, Inc. Semantic and natural language processing for content categorization and routing
US10380263B2 (en) * 2016-11-15 2019-08-13 International Business Machines Corporation Translation synthesizer for analysis, amplification and remediation of linguistic data across a translation supply chain
US10593346B2 (en) 2016-12-22 2020-03-17 Apple Inc. Rank-reduced token representation for automatic speech recognition
WO2018134878A1 (en) * 2017-01-17 2018-07-26 初実 田中 Multilingual communication system and multilingual communication provision method
US11057329B2 (en) 2017-01-25 2021-07-06 Huawei Technologies Co., Ltd. Message record combination and display method and terminal device
KR20180108973A (en) * 2017-03-24 2018-10-05 엔에이치엔엔터테인먼트 주식회사 Method and for providing automatic translation in user conversation using multiple languages
DK201770439A1 (en) 2017-05-11 2018-12-13 Apple Inc. Offline personal assistant
CN107193807B (en) * 2017-05-12 2021-05-28 北京百度网讯科技有限公司 Artificial intelligence-based language conversion processing method and device and terminal
DK179745B1 (en) 2017-05-12 2019-05-01 Apple Inc. Synchronization and task delegation of a digital assistant
DK179496B1 (en) 2017-05-12 2019-01-15 Apple Inc. User-specific acoustic models
DK201770431A1 (en) 2017-05-15 2018-12-20 Apple Inc. Optimizing dialogue policy decisions for digital assistants using implicit feedback
DK201770432A1 (en) 2017-05-15 2018-12-21 Apple Inc. Hierarchical belief states for digital assistants
DK179549B1 (en) 2017-05-16 2019-02-12 Apple Inc. Far-field extension for digital assistant services
US10498675B2 (en) * 2017-06-15 2019-12-03 GM Global Technology Operations LLC Enhanced electronic chat efficiency
GB2563648A (en) * 2017-06-22 2018-12-26 Lingo App Ltd Translation system
WO2019060353A1 (en) 2017-09-21 2019-03-28 Mz Ip Holdings, Llc System and method for translating chat messages
CN109598001A (en) 2017-09-30 2019-04-09 阿里巴巴集团控股有限公司 Information display method, device and equipment
US10558748B2 (en) * 2017-11-01 2020-02-11 International Business Machines Corporation Recognizing transliterated words using suffix and/or prefix outputs
US10992488B2 (en) * 2017-12-14 2021-04-27 Elizabeth K. Le System and method for an enhanced focus group platform for a plurality of user devices in an online communication environment
GB2569952A (en) * 2017-12-30 2019-07-10 Innoplexus Ag Method and system for identifying key terms in digital document
US10915183B2 (en) * 2018-03-30 2021-02-09 AVAST Software s.r.o. Automatic language selection in messaging application
JP6784718B2 (en) * 2018-04-13 2020-11-11 グリー株式会社 Game programs and game equipment
US10540452B1 (en) * 2018-06-21 2020-01-21 Amazon Technologies, Inc. Automated translation of applications
WO2020039807A1 (en) * 2018-08-24 2020-02-27 株式会社Nttドコモ Machine translation control device
US10795686B2 (en) * 2018-08-31 2020-10-06 International Business Machines Corporation Internationalization controller
KR101989052B1 (en) 2018-10-01 2019-06-13 주식회사 넥스트이노베이션 Braille editing method using the error output function and recording medium storing program for executing the same, and computer program stored in recording medium for executing the same
KR102498172B1 (en) * 2019-01-09 2023-02-09 이장호 Method of interactive foreign language learning by voice talking each other using voice recognition function and TTS function
CN109861904B (en) * 2019-02-19 2021-01-05 天津字节跳动科技有限公司 Name label display method and device
EP3899927A1 (en) 2019-05-02 2021-10-27 Google LLC Adapting automated assistants for use with multiple languages
US11521071B2 (en) * 2019-05-14 2022-12-06 Adobe Inc. Utilizing deep recurrent neural networks with layer-wise attention for punctuation restoration
US11397600B2 (en) * 2019-05-23 2022-07-26 HCL Technologies Italy S.p.A Dynamic catalog translation system
US10936813B1 (en) * 2019-05-31 2021-03-02 Amazon Technologies, Inc. Context-aware spell checker
US11153256B2 (en) * 2019-06-20 2021-10-19 Shopify Inc. Systems and methods for recommending merchant discussion groups based on settings in an e-commerce platform
JP7409064B2 (en) * 2019-12-18 2024-01-09 ブラザー工業株式会社 Control program, control system, and control method for information processing equipment
CN111179657A (en) * 2020-02-22 2020-05-19 李孝龙 Multi-language intelligent learning machine
US11687732B2 (en) * 2020-04-06 2023-06-27 Open Text Holdings, Inc. Content management systems for providing automated translation of content items
CN111814437A (en) * 2020-05-28 2020-10-23 杭州视氪科技有限公司 Method for converting braille into Chinese based on deep learning
US11664010B2 (en) 2020-11-03 2023-05-30 Florida Power & Light Company Natural language domain corpus data set creation based on enhanced root utterances
CN112487791A (en) * 2020-11-27 2021-03-12 江苏省舜禹信息技术有限公司 Multi-language hybrid intelligent translation method
CN112767924A (en) * 2021-02-26 2021-05-07 北京百度网讯科技有限公司 Voice recognition method and device, electronic equipment and storage medium
US11886446B2 (en) * 2021-04-05 2024-01-30 Baidu Usa Llc Cross-lingual language models and pretraining of cross-lingual language models

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4706212A (en) * 1971-08-31 1987-11-10 Toma Peter P Method using a programmed digital computer system for translation between natural languages
US5201042A (en) * 1986-04-30 1993-04-06 Hewlett-Packard Company Software process and tools for development of local language translations of text portions of computer source code
US4852003A (en) * 1987-11-18 1989-07-25 International Business Machines Corporation Method for removing enclitic endings from verbs in romance languages
US5077669A (en) * 1989-12-27 1991-12-31 International Business Machines Corporation Method for quasi-key search within a national language support (nls) data processing system
US5289375A (en) * 1990-01-22 1994-02-22 Sharp Kabushiki Kaisha Translation machine
JPH03222065A (en) * 1990-01-26 1991-10-01 Sharp Corp Machine translation device
JPH0594436A (en) * 1990-10-10 1993-04-16 Fuji Xerox Co Ltd Document processor
US5477451A (en) * 1991-07-25 1995-12-19 International Business Machines Corp. Method and system for natural language translation
JPH05298360A (en) * 1992-04-17 1993-11-12 Hitachi Ltd Method and device for evaluating translated sentence, machine translation system with translated sentence evaluating function and machine translation system evaluating device
US5510981A (en) * 1993-10-28 1996-04-23 International Business Machines Corporation Language translation apparatus and method using context-based translation models
DE69430421T2 (en) * 1994-01-14 2003-03-06 Sun Microsystems Inc Method and device for automating the environment adaptation of computer programs
US5659765A (en) * 1994-03-15 1997-08-19 Toppan Printing Co., Ltd. Machine translation system
US5584024A (en) * 1994-03-24 1996-12-10 Software Ag Interactive database query system and method for prohibiting the selection of semantically incorrect query parameters
JP3385146B2 (en) * 1995-06-13 2003-03-10 シャープ株式会社 Conversational sentence translator
US6993471B1 (en) * 1995-11-13 2006-01-31 America Online, Inc. Integrated multilingual browser
JPH09259126A (en) * 1996-03-21 1997-10-03 Sharp Corp Data processor
JP3121548B2 (en) * 1996-10-15 2001-01-09 インターナショナル・ビジネス・マシーンズ・コーポレ−ション Machine translation method and apparatus
US6085162A (en) * 1996-10-18 2000-07-04 Gedanken Corporation Translation system and method in which words are translated by a specialized dictionary and then a general dictionary
US5884246A (en) * 1996-12-04 1999-03-16 Transgate Intellectual Properties Ltd. System and method for transparent translation of electronically transmitted messages
WO1998054655A1 (en) * 1997-05-28 1998-12-03 Shinar Linguistic Technologies Inc. Translation system
US6285978B1 (en) * 1998-09-24 2001-09-04 International Business Machines Corporation System and method for estimating accuracy of an automatic natural language translation
US6453312B1 (en) * 1998-10-14 2002-09-17 Unisys Corporation System and method for developing a selectably-expandable concept-based search

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
No Search *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9547643B2 (en) 2006-10-02 2017-01-17 Google Inc. Displaying original text in a user interface with translated text
US7801721B2 (en) 2006-10-02 2010-09-21 Google Inc. Displaying original text in a user interface with translated text
US8095355B2 (en) 2006-10-02 2012-01-10 Google Inc. Displaying original text in a user interface with translated text
US8577668B2 (en) 2006-10-02 2013-11-05 Google Inc. Displaying original text in a user interface with translated text
US10114820B2 (en) 2006-10-02 2018-10-30 Google Llc Displaying original text in a user interface with translated text
WO2008042845A1 (en) * 2006-10-02 2008-04-10 Google Inc. Displaying original text in a user interface with translated text
US9798720B2 (en) 2008-10-24 2017-10-24 Ebay Inc. Hybrid machine translation
JP2018041474A (en) * 2013-02-08 2018-03-15 Machine Zone, Inc. System and method for multiuser multilingual communication
JP2016509312A (en) * 2013-02-08 2016-03-24 Machine Zone, Inc. System and method for multi-user multilingual communication
US9530161B2 (en) 2014-02-28 2016-12-27 Ebay Inc. Automatic extraction of multilingual dictionary items from non-parallel, multilingual, semi-structured data
US9569526B2 (en) 2014-02-28 2017-02-14 Ebay Inc. Automatic machine translation using user feedback
CN106233281A (en) * 2014-02-28 2016-12-14 电子湾有限公司 Improvement of automatic machine translation using user feedback
US9805031B2 (en) 2014-02-28 2017-10-31 Ebay Inc. Automatic extraction of multilingual dictionary items from non-parallel, multilingual, semi-structured data
US9881006B2 (en) 2014-02-28 2018-01-30 Paypal, Inc. Methods for automatic generation of parallel corpora
US9940658B2 (en) 2014-02-28 2018-04-10 Paypal, Inc. Cross border transaction machine translation
WO2015130984A3 (en) * 2014-02-28 2015-10-15 Ebay Inc. Improvement of automatic machine translation using user feedback
BE1021599B1 (en) * 2014-12-29 2015-12-17 Crosslang Nv MACHINE TRANSLATION SYSTEM FOR AUTOMATICALLY GENERATING A TEXT TRANSLATION

Also Published As

Publication number Publication date
JP2003529845A (en) 2003-10-07
US20010029455A1 (en) 2001-10-11
WO2001075662A8 (en) 2002-02-14
AU2001249777A1 (en) 2001-10-15

Similar Documents

Publication Publication Date Title
US20010029455A1 (en) Method and apparatus for providing multilingual translation over a network
US6396951B1 (en) Document-based query data for information retrieval
US6647364B1 (en) Hypertext markup language document translating machine
EP0968475B1 (en) Translation system
US5963205A (en) Automatic index creation for a word processor
US8346536B2 (en) System and method for multi-lingual information retrieval
US6393389B1 (en) Using ranked translation choices to obtain sequences indicating meaning of multi-token expressions
US5708825A (en) Automatic summary page creation and hyperlink generation
US6269189B1 (en) Finding selected character strings in text and providing information relating to the selected character strings
US20020193986A1 (en) Pre-translated multi-lingual email system, method, and computer program product
US20020123879A1 (en) Translation system & method
US7848916B2 (en) System, method and program product for bidirectional text translation
KR20070117554A (en) Embedded translation-enhanced search
CN1950820A (en) Embedded translation document method and system
US20080120087A1 (en) Translation Information Segment
US20070011160A1 (en) Literacy automation software
US6370497B1 (en) Natural language transformations for propagating hypertext label changes
AU743538B2 (en) Translation
US20060195313A1 (en) Method and system for selecting and conjugating a verb
Danielsson Simple Perl programming for corpus work
Deksne et al. The modern electronic dictionary that always provides an answer
Šostaka et al. The Semi-Algorithmic Approach to Formation of Latvian Information and Communication Technology Terms.
Vale et al. Building a large dictionary of abbreviations for named entity recognition in Portuguese historical corpora
Budin et al. Terminology resources on the internet
JPH09265469A (en) Translation method for hyper text type document and translation device for html document

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
AK Designated states

Kind code of ref document: C1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: C1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

D17 Declaration under Article 17(2)(a)
ENP Entry into the national phase

Ref country code: JP

Ref document number: 2001 573273

Kind code of ref document: A

Format of ref document f/p: F

122 EP: PCT application non-entry in European phase