CA2216387C - Integrated multilingual browser - Google Patents
- Publication number
- CA2216387C CA2216387C CA002216387A CA2216387A CA2216387C CA 2216387 C CA2216387 C CA 2216387C CA 002216387 A CA002216387 A CA 002216387A CA 2216387 A CA2216387 A CA 2216387A CA 2216387 C CA2216387 C CA 2216387C
- Authority
- CA
- Canada
- Prior art keywords
- language
- document
- browser
- documents
- html
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q99/00—Subject matter not provided for in other groups of this subclass
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/131—Fragmentation of text files, e.g. creating reusable text-blocks; Linking to fragments, e.g. using XInclude; Namespaces
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/14—Tree-structured documents
- G06F40/143—Markup, e.g. Standard Generalized Markup Language [SGML] or Document Type Definition [DTD]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/151—Transformation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/221—Parsing markup language streams
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/42—Data-driven translation
- G06F40/47—Machine-assisted translation, e.g. using translation memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
Abstract
The disclosed system translates into different languages HTML documents (16) available through the World Wide Web. HTML documents (16) are translated by machine translation software (10) bundled in a browser (12). Alternatively, documents are retrieved as needed, translated, and stored on a Web server so user requests are serviced with a document that has been translated from a different language. The disclosed invention expands usage of the Internet for non-English speakers.
Description
INTEGRATED MULTILINGUAL BROWSER
BACKGROUND AND SUMMARY OF THE INVENTION
The present invention relates generally to the field of electronic communication over a computer network. Particularly, the present invention relates to the expansion of multi-lingual electronic communication through translation services for documents and messages available through the Internet.
The recent surge in media attention to the Internet, and especially the World Wide Web, coupled with the continuing growth in home PC ownership, has resulted in a growing diversity of the Internet user population. No longer is the Internet the province of software experts; thousands of novice users have begun to come online each day.
Software like CompuServe's Web Browser lets users quickly connect to and find useful content online. This phenomenon is not restricted to the United States or to English-speaking countries. Online usage in Europe and Asia is growing even more quickly than in the U.S.
While interest in the online world is at a peak, a significant obstacle exists to broad usage of the Internet for non-English speakers. The vast majority of Internet content is in English, and is therefore inaccessible to users with other native languages.
Translation of Internet documents by a human translator is not a practical solution for two reasons. First, human translation is costly and slow. A translator can typically produce 300-400 words per hour at costs of 12¢ per word or more. Second, in order to have a translator convert Internet documents to the user's native language, the user would have to download every document he was interested in to provide it to the translator. This is a time-consuming process, and if the user knows no English, he will not even be able to assess the relevance of the document before downloading it. This would result in wasted time and translation costs since, inevitably, some of the documents selected will not prove to be worthwhile.
The present invention allows non-English speaking Internet users to access and understand information available from the World Wide Web and related sources.
Language translation software (known as machine translation, or MT) is combined with Internet software to allow non-English speaking Internet users to quickly generate translations of online text. The process is automated and therefore less costly and time-consuming than human translation.
Accordingly, the present invention provides a method for translating a plurality of documents comprising codes and data characters in a first language, comprising the steps of:
specifying a second language;
transmitting a request for a first document from a browser to a web server;
transmitting said first document to a preprocessor, said preprocessor adapted to insert markers around said codes in said first document;
inserting markers around said codes in said first document;
transmitting said first document to a language translator, said language translator adapted to leave said marked codes untranslated;
translating said data characters in said first language to data characters in said second language using said language translator;
transmitting said first translated document to a postprocessor, said postprocessor adapted to remove said markers from said first translated document;
removing said markers from said first translated document;
displaying said first translated document at said browser;
selecting in said first translated document a link to a second document in said first language;
transmitting a request for said second document from a browser to a web server;
transmitting said second document to said preprocessor;
inserting markers around codes in said second document;
transmitting said second document to said language translator;
translating said data characters in said first language to data characters in said second language using said language translator;
transmitting said second translated document to said postprocessor;
removing said markers from said second translated document; and
displaying said second translated document at said browser.
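The request, translate, display, and follow-a-link cycle recited above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the claimed implementation: `translate_document`, `browse`, the in-memory `SITE` dictionary standing in for a web server, and the use of `str.upper` as a stand-in for the language translator are all hypothetical, and a code is assumed to be any text between "<" and ">". The marker insertion and removal steps are folded into a single pass here for brevity.

```python
import re

CODE = re.compile(r"<[^>]*>")        # assumption: a code is any "<...>" span
HREF = re.compile(r'href="([^"]*)"')  # assumption: links use double-quoted href

def translate_document(html, translate_text):
    """Translate data characters only; codes pass through unchanged."""
    parts = re.split(r"(<[^>]*>)", html)
    return "".join(p if CODE.fullmatch(p) else translate_text(p) for p in parts)

def browse(first_url, fetch, translate_text, clicks):
    """Return the translated first page, then one translated page per selected link."""
    page = translate_document(fetch(first_url), translate_text)
    pages = [page]
    for link_index in clicks:
        # Links are codes, so they were left untranslated and still point
        # at documents in the first language.
        url = HREF.findall(page)[link_index]
        page = translate_document(fetch(url), translate_text)
        pages.append(page)
    return pages

# Hypothetical two-page site held in memory instead of a real web server.
SITE = {
    "/first": '<a href="/second">next page</a>',
    "/second": "<p>second page</p>",
}
pages = browse("/first", SITE.__getitem__, str.upper, clicks=[0])
```

Because the link target is left untouched, the second request still retrieves the original first-language document, which is then translated in the same way as the first.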
The present invention also provides a document translation system for translating documents comprising codes and data characters in a first language, said system comprising:
a browser for requesting said documents from a web server in accordance with links in said documents and for specifying a second language;
a preprocessor for marking said codes in said documents;
a language translator for translating into said second language said data characters in said preprocessed documents; and
a postprocessor for unmarking said codes in said translated documents.
The present invention also provides a method for translating documents, comprising the steps of:
defining in an original language a document containing display and reference codes and data characters exclusive of said display and reference codes;
storing said document at a server;
selecting a target language, said target language specified by a user of a browser;
requesting said document from said server, said document requested by said user in accordance with a link;
preprocessing said document to insert markers around said reference codes;
translating said data characters in said document from said original language to said target language;
postprocessing said document to remove said markers from around said reference codes; and
transmitting said document in said target language to a browser.
In a further aspect, the present invention provides a system for automated translation of HTML documents comprising:
a browser for requesting and displaying HTML documents;
a server for processing browser requests for HTML documents;
a connection between said browser to said server for transmitting requests from said browser to said server and HTML documents from said server to said browser;
a request from said browser for a HTML document in a source language, said HTML document requested in accordance with a link to said HTML document in said source language;
a target language specified in accordance with said browser; and
machine translation software for translating said HTML document in said source language to a HTML document in said target language.
The present invention also provides a method for providing HTML documents in a plurality of languages comprising the steps of:
retrieving from a server a HTML document in a source language, said HTML document selected in accordance with a link;
determining a target language, said target language specified by a user of a browser;
translating said HTML document in said source language to a HTML document in said target language; and
displaying said HTML document in said target language at said browser.
The present invention also provides a multilingual browser comprising:
a browser for requesting and displaying HTML documents, said browser adapted for specification of a target language; and
machine translation software integrated into said browser, said machine translation software adapted to translate a HTML document in a source language to a HTML document in said target language, in accordance with a user of said browser selecting a link to said HTML document in said source document.
The present invention also provides a system for automated translation of HTML documents, comprising:
a plurality of web servers adapted to store a plurality of HTML documents in a source language;
a personal computer equipped with a browser for accessing HTML documents;
a server for processing requests from said browser at said personal computer for access to said plurality of HTML documents;
a target language specified by a user of said personal computer;
a request for one of said plurality of HTML documents in said source language;
a language translator for translating said requested HTML document in said source language to a HTML document in said target language; and
a display at said personal computer for displaying said HTML document in said target language.
Advantages of the present invention are explained further in relation to the following detailed description of the invention, drawings, and claims.
Figures 1A and 1B comprise a screen shot of a World Wide Web page;
Figure 2 is an example of a hypertext document;
Figure 3 is an example of a hypertext document preprocessed according to the method of the present invention;
Figure 4 illustrates a system for performing machine translation;
Figures 5A and 5B comprise an example of a preprocessed hypertext document translated according to the method of the present invention;
Figure 6 is an example of a translated hypertext document postprocessed according to the method of the present invention;
Figures 7A and 7B comprise a screen shot of a World Wide Web page that has been translated according to the method of the present invention;
Figure 8 is a diagrammatic view of one embodiment of the present invention in which machine translation is integrated into a Web browser; and
Figure 9 is a diagrammatic view of one embodiment of the present invention in which pre-translated Web pages are accessible from a server.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENT(S)
Although the detailed description of a preferred embodiment focuses on automatic translation of World Wide Web pages, the concept is adaptable to documents obtained from other sources.
The World Wide Web (WWW or the Web) is a distributed information system that may be accessed through a number of sources. It is comprised of software and a set of protocols and conventions. Information on the Web may be accessed using a browser program such as CompuServe's Web Browser. Browsers allow users to read documents and to locate documents from other sources. They present an interface for interacting with the system and they process requests on behalf of the user.
Information providers on the WWW make their information available through programs that understand the HyperText Transfer Protocol (HTTP). Browsers assist users in "visiting" Web sites where information is stored. Information is displayed in pages of text and graphics called "Web Pages." An example of a Web page as viewed through CompuServe's Web Browser is provided in Figures 1A and 1B. The Web page shown in Figures 1A and 1B contains both text 14, 18 and graphics 10, 12, 16. The title bar 20, menu options 22, buttons 24, and document information 26 appearing at the top of the screen are part of the browser used to view the Web page.
In most cases, information providers make information available through a Web server.
The server responds to information requests by delivering the requested information to the user's browser for viewing. Some providers may make their information available through a
proxy server that converts information in one format to the format expected and understood by the browser.
WO 97/18516 PCT/US96/18102
Documents available on the WWW and displayed by browsers are hypertext documents. Hypertext is text that contains references (or "links," "hyperlinks," or "hot spots") to other documents. The reference is similar to a footnote except the referenced document may be accessed directly from the original document. The related document may be viewed by selecting or clicking the mouse on the reference. The process of selecting hyperlinks to view referenced documents may be referred to as "traversing the hyperlinks."
Unlike a footnote, references usually do not appear as shorthand descriptions of related documents. Instead, references may be indicated by a combination of graphics, different fonts, different colors for the text, underlining, the mouse pointer turning into a hand, etc. The referenced documents may reside on different computers at different Web sites.
Hypertext documents are written in a "markup language" called Hypertext Markup Language (HTML). HTML actually refers to both a document type and the markup language that represents instances of the document type. A hypertext document contains general semantics appropriate for representing display or presentation characteristics as well as information from a wide range of domains. A hypertext document consists of a sequence or stream of characters that comprise both data characters and markups. Markups are syntactically delimited characters (such as "<," ">," "/," etc.) added to the data characters to define the document's structure. Markups thus have special meanings and may represent such things as hypertext, news, mail, documentation, menus of options, and in-line graphics.
Markups may be combined with other characters or related values to create codes that also have special meaning. Data characters are those characters in the document that are not codes.
Figure 2 is the hypertext document that describes the Web page shown in Figures 1A and 1B. Figure 2 shows the markups and related words (that comprise codes) as well as data characters that may appear in a hypertext document. For example, the characters "<" and ">" appearing throughout the document are markups. The characters "<" and ">" combined with the word "head" ("<head>") 10 may be considered a code. Finally, the text "NLT Home" 10 that is not surrounded by markups or codes may be considered data characters.
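The distinction between codes and data characters can be illustrated with a short Python sketch. It is a simplification under stated assumptions, not the parser described in the patent: a code is taken to be any text between "<" and ">", the helper name `split_codes_and_data` is hypothetical, and the sample line (including its `<title>` code) is an assumed fragment in the spirit of Figure 2 rather than a copy of it.

```python
import re

# Assumption: a code is any run of characters delimited by "<" and ">".
CODE_PATTERN = re.compile(r"<[^>]*>")

def split_codes_and_data(document):
    """Return (kind, text) pairs in document order; kind is "code" or "data"."""
    pieces = []
    position = 0
    for match in CODE_PATTERN.finditer(document):
        if match.start() > position:
            # Characters between two codes are data characters.
            pieces.append(("data", document[position:match.start()]))
        pieces.append(("code", match.group()))
        position = match.end()
    if position < len(document):
        pieces.append(("data", document[position:]))
    return pieces

# Hypothetical sample line; only "NLT Home" should be classified as data.
sample = "<head><title>NLT Home</title></head>"
pieces = split_codes_and_data(sample)
```

In this sample only "NLT Home" is classified as data characters; the four delimited items are classified as codes.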
As indicated by the brief description, HTML documents have a well-defined and documented structure defined by a grammar. The codes in a HTML document convey important information regarding both the display or presentation of the document itself as well as related references and commands. Display and presentation information may include color information, information about graphics that appear on the page, information about text that appears on the page, etc. A HTML document is structured as a series of elements that are identified by the language markups and codes. A document includes a head (consisting of a title and other optional elements) and a body that is a text flow of paragraphs, lists, images, and other elements. The various parts of the document may be identified by looking at the markups or codes in the document. For example, referring again to Figure 2 which shows the hypertext for Figures 1A and 1B, the document head contains the title "NLT Home" 10. An image contained in the document is identified in the line "<br><img src--'~le:/l/n~/iowebsrv/server/8100--~1.1/server 1/image/ntl jpg" height=60 width=640></center>" 12.
As may be apparent, the process of translating a HTML document requires examination of each character in the document. Characters may be examined individually and in combination to determine whether they are markups, codes, or data characters. To process a document, the processing software examines the character stream that comprises the
document. The steps needed to translate a HTML document from one language to another may be summarized as follows:
Step 1. Preprocess the HTML document by placing boundary markers around HTML codes to be preserved during the translation process. The translation software recognizes the boundary markers and does not translate text and symbols appearing between the markers.
Step 2. Translate the preprocessed HTML document from the original language to the target language.
Step 3. Postprocess the translated HTML document to remove the boundary markers.
Step 1. The codes in a HTML document convey important information describing the characteristics of the Web page. Referring again to Figure 2, an example of the type of information contained in a hypertext document is shown. Certain information contained in the document of Figure 2 may be interpreted by a Web browser so that to the browser user, the images shown in Figures 1A and 1B appear. Certain information in the hypertext document is preserved during the translation process so that the translated page has, in general, the same appearance and behavior as the original page. Because HTML documents have a well-defined and known structure described by a grammar, automated translation of a HTML document is possible. The codes in the document may be discerned by the preprocessing software. Special boundary markers placed in the document by the preprocessing software indicate to the translation software that the intervening text should not be translated. Consequently, the resulting page may have the same appearance and behavior as the original page.
Referring to Figure 3, an example of a preprocessed HTML document is shown. The HTML document of Figure 3 is the preprocessed version of the HTML document shown in
Figure 2. In this example, the boundary markers used to identify the HTML codes are the character pairs "{." and ".}". Any character or character combination that does not normally occur in text may be used as a boundary marker. The line that appeared as "<head><title>NLT Home</title></head>" in Figure 2 (10) is preprocessed in Step 1 to the line "{.<head>.}{.<title>.}NLT Home{.</title>.}{.</head>.}" in Figure 3 (10). Other lines in the document are preprocessed similarly.
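The preprocessing of Step 1 can be sketched as follows. This is an illustrative sketch only, assuming that every span from "<" through the next ">" is a code to be preserved and that the head line uses conventional closing tags; the names `preprocess`, `OPEN_MARKER`, `CLOSE_MARKER`, and `TAG_PATTERN` are hypothetical and do not come from the preprocessing software described here.

```python
import re

# Boundary markers taken from the example in the text.
OPEN_MARKER = "{."
CLOSE_MARKER = ".}"

# Assumption: a code is any span from "<" through the next ">".
TAG_PATTERN = re.compile(r"<[^>]*>")


def preprocess(html: str) -> str:
    """Wrap every HTML code in boundary markers, leaving data characters as-is."""
    return TAG_PATTERN.sub(
        lambda match: OPEN_MARKER + match.group() + CLOSE_MARKER, html
    )


print(preprocess("<head><title>NLT Home</title></head>"))
```

A regular expression is used only for brevity; a preprocessor that follows the HTML grammar more closely would also have to decide how to treat comments, scripts, and attribute values that contain readable text.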
Step 2. Machine Translation (MT) software performs the translation of text from one language to another language. There are many commercially available MT
software packages.
Figure 4 is an illustration of a system in which MT software 10 takes as input text in one language 12 and generates a rough draft translation of the text in another language 14 using an electronic dictionary 16 and a set of linguistic and/or statistical rules encoded in the program 18. MT software can perform language conversion operations very quickly; in some cases, at speeds of up to 3,000 words per minute. The translated texts are not high quality translations, but they are usually adequate for understanding what the document is about.
Referring to Figures 5A and 5B, an example of a translated HTML document is shown. The HTML document of Figures 5A and 5B is the translated version of the preprocessed HTML document shown in Figure 3. As described above, the boundary markers used to identify the HTML codes are the character pairs "{." and ".}". Consequently, the MT software ignores all text that falls between the boundary markers. Data characters that are not surrounded by boundary markers are translated by the MT software. The preprocessed line that appeared as "{.<head>.}{.<title>.}NLT Home{.</title>.}{.</head>.}" in Figure 3 (10) is translated in Step 2 to the line "{.<head>.}{.<title>.}NLT Maison{.</title>.}{.</head>.}" in Figure 5A (10).
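The way a translator may skip marked spans can be sketched as follows. The word-for-word lookup in `TOY_DICTIONARY` is a stand-in assumed purely for illustration; it is not the dictionary-and-rules MT software of Figure 4, and all names in the sketch are hypothetical.

```python
import re

# Matches one protected span: "{." ... ".}" (shortest possible match).
PROTECTED = re.compile(r"(\{\..*?\.\})", re.DOTALL)

# Toy English-to-French dictionary, assumed only for this sketch.
TOY_DICTIONARY = {"Home": "Maison", "title": "titre"}


def translate_text(text: str) -> str:
    """Stand-in for MT software: replace known words, keep everything else."""
    return re.sub(
        r"[A-Za-z]+",
        lambda match: TOY_DICTIONARY.get(match.group(), match.group()),
        text,
    )


def translate_preprocessed(document: str) -> str:
    """Translate data characters only; copy marked spans through unchanged."""
    parts = PROTECTED.split(document)
    # With a capturing group, re.split alternates: even indexes hold data
    # characters, odd indexes hold the protected "{. ... .}" spans.
    return "".join(
        part if index % 2 == 1 else translate_text(part)
        for index, part in enumerate(parts)
    )


print(translate_preprocessed("{.<head>.}{.<title>.}NLT Home{.</title>.}{.</head>.}"))
```

Note that the word "title" inside the marked code "<title>" is left alone even though the toy dictionary contains it, which is the purpose of the boundary markers.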
Step 3. In the final step, postprocessing software removes boundary markers from the translated document. Referring to Figure 6, an example of a postprocessed HTML document is shown. The HTML document of Figure 6 is the postprocessed version of the translated HTML document shown in Figures 5A and 5B. As described above, the boundary markers used to identify the HTML codes are the character pairs "{." and ".}". During postprocessing, these boundary markers are removed. The translated line that appeared as "{.<head>.}{.<title>.}NLT Maison{.</title>.}{.</head>.}" in Figure 5A (10) is postprocessed in Step 3 to the line "<head><title>NLT Maison</title></head>" in Figure 6 (10). The postprocessed HTML document of Figure 6 may then be displayed by the browser as shown in Figures 7A and 7B.
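The postprocessing of Step 3 can be sketched in a few lines. The sketch assumes that each marker pair wraps exactly one code of the form "<...>", and the name `postprocess` is hypothetical. Restricting removal to markers that wrap a code is a design choice of this sketch, made so that a stray "{." or ".}" in translated text is not touched.

```python
import re

# A marker pair wrapping one code: "{." + "<...>" + ".}".
MARKED_CODE = re.compile(r"\{\.(<[^>]*>)\.\}")


def postprocess(document: str) -> str:
    """Remove the boundary markers, keeping the code they protected."""
    return MARKED_CODE.sub(r"\1", document)


print(postprocess("{.<head>.}{.<title>.}NLT Maison{.</title>.}{.</head>.}"))
```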
Figure 8 is a diagrammatic view of one embodiment of the present invention in which machine translation is integrated into a Web browser. MT software 10 may be combined with a browser 12 to allow the user 14 to rapidly and automatically translate online documents from the World Wide Web 16 into his native language. The MT software 10 may be bundled with the browser 12 to form an integrated multilingual browser. The user 14 of the multilingual browser 16 selects the desired target language (e.g., French if the user speaks French), and the Web document retrieved by the browser 18 may be rapidly translated on the fly with a mouse click. The Web Browser 12 then displays for the user 14 the translated document 20. Optionally, the user may be able to update and edit parts of the MT software's electronic dictionaries to include terminology common to the Web sites he visits.
Although a document may be translated at the time that a user requests access to the document, a document may also be "pre-translated" and stored in a cache for later retrieval before a user seeks access to it. Documents that have been accessed at least once may also be stored following translation. The advantage of storing documents that have been translated is
that delivery time to the user may be reduced. Although storing documents requires disk space, it may represent a better use of system resources because documents that are accessed frequently are translated once rather than every time they are accessed.
Figure 9 is a diagrammatic view of an alternative implementation in which pre-translated Web pages are stored on a Web server 14. The translation software resides on a translation server 14 (possibly the same machine as the Web server). Popular Web pages 24 are pre-translated and stored in a cache 28, with additional pages being added as they are requested by users 20. The cache is a dynamic storage device with a finite capacity. New, pre-translated pages are added to the cache, but pages will also be removed from the cache if they are used infrequently or if there are constraints on storage capacity.
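A finite-capacity cache of translated pages might be sketched as follows. The text does not specify a removal policy beyond infrequent use and storage constraints, so least-recently-used eviction is assumed here as one plausible choice; the class name `TranslationCache` and its methods are hypothetical.

```python
from collections import OrderedDict
from typing import Optional


class TranslationCache:
    """Finite cache of translated pages keyed by (url, language).

    Assumption: when the cache is full, the least recently used page
    is removed. This policy is chosen only for illustration.
    """

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._pages: "OrderedDict[tuple[str, str], str]" = OrderedDict()

    def get(self, url: str, language: str) -> Optional[str]:
        """Return the cached translation, or None if it is not cached."""
        key = (url, language)
        if key not in self._pages:
            return None
        self._pages.move_to_end(key)  # mark as most recently used
        return self._pages[key]

    def put(self, url: str, language: str, page: str) -> None:
        """Store a translated page, evicting old pages if over capacity."""
        key = (url, language)
        self._pages[key] = page
        self._pages.move_to_end(key)
        while len(self._pages) > self.capacity:
            self._pages.popitem(last=False)  # drop least recently used
```

Keying the cache on both the page address and the target language lets the same page be stored once per requested language.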
In accessing the system, a user 10 sends to the Web Server 14 a request for a specific page in a specific language 12. The Web Server 14 then sends a request to get the desired page 16. The method for servicing the request depends on where the page is located. If the page has been pre-translated 24 and stored in the cache of pages in multiple languages 28, it is retrieved from the cache 26 and returned to the user in the requested language 30. If the page has not been pre-translated, then the page is retrieved 20 from the World Wide Web 22, translated into the requested language, and cached before being sent to the user 30.
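The request flow above can be sketched as a single function. The callables `fetch` and `translate` are hypothetical stand-ins for retrieving a page from the World Wide Web and for the translation software, and a plain dictionary stands in for the cache; none of these names come from the system of Figure 9.

```python
from typing import Callable, Dict, Tuple


def serve_page(
    url: str,
    language: str,
    cache: Dict[Tuple[str, str], str],
    fetch: Callable[[str], str],
    translate: Callable[[str, str], str],
) -> str:
    """Return the page at `url` in `language`, translating only on a cache miss."""
    key = (url, language)
    if key in cache:
        # The page was pre-translated: return the stored copy.
        return cache[key]
    # Cache miss: retrieve the original, translate it, and cache the result
    # before returning it to the user.
    original = fetch(url)
    translated = translate(original, language)
    cache[key] = translated
    return translated
```

A second request for the same page and language is then answered from the cache without calling `fetch` or `translate` again, which is the saving the caching scheme is meant to provide.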
Translation of Web pages, in either the bundled browser/MT configuration or the Web Server configuration, requires processing of HTML codes containing reference, command, and display information. Preferably, the HTML codes are identified prior to translation, then surrounded by special boundary markers to block the translation process on the codes. The HTML preprocessor uses its knowledge regarding the markups, codes, data characters and the structure of HTML documents to determine which codes should be blocked from the translation process. After translation is complete, a postprocessing program removes the
special boundary markers so that the necessary references, commands, and display characteristics are available in the translated text.
The primary objective of the present invention is to allow a user of the Internet to read hypertext documents that are available only in a language foreign to the user. The readable text of the hypertext document is changed in accordance with the user's preferred language. Steps are taken to preserve the document's appearance and behavior so that the only noticeable difference between the original document and the translated document is the language of the text. Users may interact with the translated document and reference related documents in the same manner that users interact with the original document.
Claims (29)
THE EMBODIMENTS OF THE INVENTION IN WHICH AN EXCLUSIVE PROPERTY OR PRIVILEGE IS CLAIMED ARE DEFINED AS FOLLOWS:
1. A method for translating a plurality of documents comprising codes and data characters in a first language, comprising the steps of:
specifying a second language;
transmitting a request for a first document from a browser to a web server;
transmitting said first document to a preprocessor, said preprocessor adapted to insert markers around said codes in said first document;
inserting markers around said codes in said first document;
transmitting said first document to a language translator, said language translator adapted to leave said marked codes untranslated;
translating said data characters in said first language to data characters in said second language using said language translator;
transmitting said first translated document to a postprocessor, said postprocessor adapted to remove said markers from said first translated document;
removing said markers from said first translated document;
displaying said first translated document at said browser;
selecting in said first translated document a link to a second document in said first language;
transmitting a request for said second document from a browser to a web server;
transmitting said second document to said preprocessor; inserting markers around codes in said second document;
transmitting said second document to said language translator;
translating said data characters in said first language to data characters in said second language using said language translator;
transmitting said second translated document to said postprocessor;
removing said markers from said second translated document;
and displaying said second translated document at said browser.
2. The method of claim 1, wherein said codes are HyperText Markup Language codes.
3. The method of claim 1, wherein said first and second documents are translated by said language translator and cached at said server prior to transmission of said requests for said first and second documents.
4. The method of claim 1, wherein said browser performs the steps of transmitting said first and second documents to a preprocessor, inserting markers around said codes in said first and second documents, transmitting said first and second documents to a language translator, translating said data characters in said first language to data characters in a second language, transmitting said first and second documents to a post processor, and removing said markers from said first and second translated documents.
5. The method of claim 1, wherein said server performs the steps of transmitting said first and second documents to a preprocessor, inserting markers around said codes in said first and second documents, transmitting said first and second documents to a language translator, translating said data characters in said first language to data characters in a second language, transmitting said first and second documents to a post processor, and removing said markers from said first and second translated documents.
6. A document translation system for translating documents comprising codes and data characters in a first language, said system comprising: a browser for requesting said documents from a web server in accordance with links in said documents and for specifying a second language;
a preprocessor for marking said codes in said documents;
a language translator for translating into said second language said data characters in said preprocessed documents; and a postprocessor for unmarking said codes in said translated documents.
7. The system of claim 6, wherein said codes are HyperText Markup Language codes.
8. The system of claim 6, wherein said preprocessor, said language translator, and said postprocessor are integrated into said browser.
9. The system of claim 6, wherein said preprocessor, said language translator, and said postprocessor are integrated into said server.
10. A method for translating documents, comprising the steps of:
defining in an original language a document containing display and reference codes and data characters exclusive of said display and reference codes;
storing said document at a server;
selecting a target language, said target language specified by a user of a browser;
requesting said document from said server, said document requested by said user in accordance with a link;
preprocessing said document to insert markers around said reference codes;
translating said data characters in said document from said original language to said target language;
postprocessing said document to remove said markers from around said reference codes;
and transmitting said document in said target language to a browser.
11. The method of claim 10 wherein said codes are HyperText Markup Language codes.
12. The method of claim 10 further comprising the step of storing said document in said target language in a cache at said server.
13. A system for automated translation of HTML documents comprising:
a browser for requesting and displaying HTML documents;
a server for processing browser requests for HTML documents;
a connection between said browser to said server for transmitting requests from said browser to said server and HTML documents from said server to said browser;
a request from said browser for a HTML document in a source language, said HTML
document requested in accordance with a link to said HTML document in said source language;
a target language specified in accordance with said browser; and machine translation software for translating said HTML document in said source language to a HTML
document in said target language.
14. The system of claim 13 wherein said machine translation software comprises a preprocessor for inserting markers around codes in said HTML document in said source language, a language translator for translating said HTML document in said source language to a document in said target language, and a postprocessor for removing markers from around codes in said document in said target language.
15. The system of claim 13 wherein said machine translation software is integrated with said browser.
16. The system of claim 13 wherein said machine translation software is operable at said server.
17. The system of claim 13 further comprising a cache at said server for storing said HTML document in said target language.
18. A method for providing HTML documents in a plurality of languages comprising the steps of:
retrieving from a server a HTML document in a source language, said HTML
document selected in accordance with a link;
determining a target language, said target language specified by a user of a browser;
translating said HTML document in said source language to a HTML document in said target language; and displaying said HTML document in said target language at said browser.
19. The method of claim 18 further comprising the step of storing said HTML
document in said target language in a cache at said server.
20. The method of claim 19 further comprising the steps of:
requesting from said server a HTML document in a source language;
determining a target language, said target language specified by a user of a browser requesting said HTML document in a source language;
locating said HTML document in said target language in said cache at said server; and displaying said HTML document in said target language at said browser.
21. The method of claim 18 wherein the step of translating said HTML document in a source language to a HTML document in said target language is performed by said browser.
22. The method of claim 18 wherein the step of translating said HTML document in a source language to a HTML document in said target language is performed by said server.
23. The method of claim 18 wherein the step of translating said HTML document in a source language to a HTML document in said target language comprises the steps of:
preprocessing said HTML document in said source language to insert markers around codes in said HTML document in said source language;
translating said HTML document in said source language to a document in said target language; and postprocessing said document in said target language to remove markers from around codes in said document in said target language.
24. A multilingual browser comprising:
a browser for requesting and displaying HTML documents, said browser adapted for specification of a target language; and machine translation software integrated into said browser, said machine translation software adapted to translate a HTML document in a source language to a HTML
document in said target language, in accordance with a user of said browser selecting a link to said HTML
document in said source document.
25. The multilingual browser of claim 24 wherein said machine translation software comprises a preprocessor for inserting markers around codes in said HTML
document in said source language, a language translator for translating said HTML document in said source language to a document in said target language, and a postprocessor for removing markers from around codes in said document in said target language.
26. A system for automated translation of HTML documents, comprising:
a plurality of web servers adapted to store a plurality of HTML documents in a source language;
a personal computer equipped with a browser for accessing HTML documents;
a server for processing requests from said browser at said personal computer for access to said plurality of HTML documents;
a target language specified by a user of said personal computer;
a request for one of said plurality of HTML documents in said source language;
a language translator for translating said requested HTML document in said source language to a HTML document in said target language; and a display at said personal computer for displaying said HTML document in said target language.
27. The system of claim 26 wherein said language translator is integrated with said browser.
28. The system of claim 26 wherein said language translator is operable at said server.
29. The system of claim 28 wherein said HTML document in said target language is stored in a cache at said server.
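Claims 28 and 29 place the language translator at the server and store the translated document in a server-side cache. A minimal sketch of that arrangement is shown below, under stated assumptions: the class name `TranslatingServer`, the `fetch` and `translate` callables, and the choice of `(url, target_lang)` as the cache key are all hypothetical, since the claims do not specify a key or an invalidation policy.

```python
from typing import Callable, Dict, Tuple


class TranslatingServer:
    """Hypothetical server that translates requested documents and caches the result."""

    def __init__(self,
                 fetch: Callable[[str], str],
                 translate: Callable[[str, str], str]) -> None:
        self._fetch = fetch            # retrieves the source-language HTML for a URL
        self._translate = translate    # (html, target_lang) -> translated HTML
        self._cache: Dict[Tuple[str, str], str] = {}
        self.translations = 0          # counts how often the translator actually ran

    def get(self, url: str, target_lang: str) -> str:
        """Return the document at `url` in `target_lang`, translating at most once."""
        key = (url, target_lang)       # assumed cache key, not taken from the claims
        if key not in self._cache:
            source_html = self._fetch(url)
            self._cache[key] = self._translate(source_html, target_lang)
            self.translations += 1
        return self._cache[key]


if __name__ == "__main__":
    # Stand-ins for a web server and a translation engine.
    pages = {"http://example.test/a": "<p>Hello</p>"}
    server = TranslatingServer(
        fetch=lambda url: pages[url],
        translate=lambda html, lang: f"<!-- {lang} -->" + html,
    )
    server.get("http://example.test/a", "fr")
    server.get("http://example.test/a", "fr")   # second request is served from the cache
    print(server.translations)
```

Keying the cache on both the URL and the target language lets the same source document be held in several languages at once; the sketch never evicts or refreshes entries, which a real server would have to do when the source document changes.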
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08/555,916 | 1995-11-13 | ||
US08/555,916 US6993471B1 (en) | 1995-11-13 | 1995-11-13 | Integrated multilingual browser |
PCT/US1996/018102 WO1997018516A1 (en) | 1995-11-13 | 1996-11-13 | Integrated multilingual browser |
Publications (2)
Publication Number | Publication Date |
---|---|
CA2216387A1 (en) | 1997-05-22 |
CA2216387C (en) | 2003-07-15 |
Family
ID=24219113
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA002216387A Expired - Lifetime CA2216387C (en) | 1995-11-13 | 1996-11-13 | Integrated multilingual browser |
Country Status (5)
Country | Link |
---|---|
US (3) | US6993471B1 (en) |
EP (1) | EP0829053A4 (en) |
AU (1) | AU1406197A (en) |
CA (1) | CA2216387C (en) |
WO (1) | WO1997018516A1 (en) |
Families Citing this family (99)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6993471B1 (en) * | 1995-11-13 | 2006-01-31 | America Online, Inc. | Integrated multilingual browser |
US6470306B1 (en) | 1996-04-23 | 2002-10-22 | Logovista Corporation | Automated translation of annotated text based on the determination of locations for inserting annotation tokens and linked ending, end-of-sentence or language tokens |
US6996609B2 (en) * | 1996-05-01 | 2006-02-07 | G&H Nevada Tek | Method and apparatus for accessing a wide area network |
EP0810533B1 (en) * | 1996-05-29 | 2002-04-10 | Matsushita Electric Industrial Co., Ltd. | Document conversion apparatus |
JP2001503540A (en) * | 1996-06-14 | 2001-03-13 | ロゴヴィスタ株式会社 | Automatic translation of annotated text |
AU7753998A (en) | 1997-05-28 | 1998-12-30 | Shinar Linguistic Technologies Inc. | Translation system |
GB9716887D0 (en) * | 1997-08-08 | 1997-10-15 | British Telecomm | Translation |
GB9727322D0 (en) | 1997-12-29 | 1998-02-25 | Xerox Corp | Multilingual information retrieval |
US6526426B1 (en) * | 1998-02-23 | 2003-02-25 | David Lakritz | Translation management system |
US8489980B2 (en) * | 1998-02-23 | 2013-07-16 | Transperfect Global, Inc. | Translation management system |
US10541973B2 (en) * | 1998-02-23 | 2020-01-21 | Transperfect Global, Inc. | Service of cached translated content in a requested language |
US6623529B1 (en) | 1998-02-23 | 2003-09-23 | David Lakritz | Multilingual electronic document translation, management, and delivery system |
GB2337611A (en) | 1998-05-20 | 1999-11-24 | Sharp Kk | Multilingual document retrieval system |
DE19936314A1 (en) * | 1998-08-05 | 2000-02-17 | Spyglass Inc | Conversion process for document data that is communicated over the Internet uses data base of conversion preferences |
US6925595B1 (en) | 1998-08-05 | 2005-08-02 | Spyglass, Inc. | Method and system for content conversion of hypertext data using data mining |
US7191393B1 (en) | 1998-09-25 | 2007-03-13 | International Business Machines Corporation | Interface for providing different-language versions of markup-language resources |
JP3055545B1 (en) * | 1999-01-19 | 2000-06-26 | 富士ゼロックス株式会社 | Related sentence retrieval device |
US6353855B1 (en) | 1999-03-01 | 2002-03-05 | America Online | Providing a network communication status description based on user characteristics |
US7607085B1 (en) * | 1999-05-11 | 2009-10-20 | Microsoft Corporation | Client side localizations on the world wide web |
AU6405900A (en) * | 1999-06-21 | 2001-01-09 | Cleverlearn.Com | Language teaching and translation system and method |
SE9903986L (en) * | 1999-11-03 | 2001-05-04 | Tony Norman | Procedure for creating a presentation in multiple versions |
AU3741200A (en) * | 1999-12-20 | 2001-07-03 | Netzero, Inc. | Method and apparatus employing a proxy server for modifying an html document supplied by a web server to a web client |
AU765001B2 (en) * | 2000-02-02 | 2003-09-04 | Transperfect Global, Inc. | Translation ordering system |
AUPQ539700A0 (en) * | 2000-02-02 | 2000-02-24 | Worldlingo.Com Pty Ltd | Translation ordering system |
US7216072B2 (en) * | 2000-02-29 | 2007-05-08 | Fujitsu Limited | Relay device, server device, terminal device, and translation server system utilizing these devices |
KR100450881B1 (en) * | 2000-03-16 | 2004-10-01 | 주식회사 유니소프트 | System and Method for multi language translation |
JP2003529845A (en) * | 2000-03-31 | 2003-10-07 | アミカイ・インコーポレイテッド | Method and apparatus for providing multilingual translation over a network |
KR100367675B1 (en) * | 2000-04-27 | 2003-01-15 | 엘지전자 주식회사 | Tv text information translation system and control method the same |
US7437669B1 (en) * | 2000-05-23 | 2008-10-14 | International Business Machines Corporation | Method and system for dynamic creation of mixed language hypertext markup language content through machine translation |
FR2809509B1 (en) | 2000-05-26 | 2003-09-12 | Bull Sa | SYSTEM AND METHOD FOR INTERNATIONALIZING THE CONTENT OF TAGGED DOCUMENTS IN A COMPUTER SYSTEM |
WO2002001387A2 (en) * | 2000-06-23 | 2002-01-03 | Medtronic, Inc. | Human language translation of patient session information from implantable medical devices |
JP4011268B2 (en) * | 2000-07-05 | 2007-11-21 | 株式会社アイアイエス | Multilingual translation system |
KR100387918B1 (en) * | 2000-07-11 | 2003-06-18 | 이수성 | Interpreter |
US6993568B1 (en) | 2000-11-01 | 2006-01-31 | Microsoft Corporation | System and method for providing language localization for server-based applications with scripts |
AUPR329501A0 (en) * | 2001-02-22 | 2001-03-22 | Worldlingo, Inc | Translation information segment |
JP3379090B2 (en) * | 2001-03-02 | 2003-02-17 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Machine translation system, machine translation method, and machine translation program |
US6999916B2 (en) * | 2001-04-20 | 2006-02-14 | Wordsniffer, Inc. | Method and apparatus for integrated, user-directed web site text translation |
US8214196B2 (en) | 2001-07-03 | 2012-07-03 | University Of Southern California | Syntax-based statistical translation model |
JP3809863B2 (en) * | 2002-02-28 | 2006-08-16 | インターナショナル・ビジネス・マシーンズ・コーポレーション | server |
AU2003269808A1 (en) | 2002-03-26 | 2004-01-06 | University Of Southern California | Constructing a translation lexicon from comparable, non-parallel corpora |
JP2003296223A (en) * | 2002-03-29 | 2003-10-17 | Fuji Xerox Co Ltd | Method and device, and program for providing web page information |
US7627479B2 (en) | 2003-02-21 | 2009-12-01 | Motionpoint Corporation | Automation tool for web site content language translation |
JP2004280352A (en) * | 2003-03-14 | 2004-10-07 | Ricoh Co Ltd | Method and program for translating document data |
US8230112B2 (en) * | 2003-03-27 | 2012-07-24 | Siebel Systems, Inc. | Dynamic support of multiple message formats |
US8548794B2 (en) | 2003-07-02 | 2013-10-01 | University Of Southern California | Statistical noun phrase translation |
US20050010419A1 (en) * | 2003-07-07 | 2005-01-13 | Ahmad Pourhamid | System and Method for On-line Translation of documents and Advertisement |
US7321852B2 (en) * | 2003-10-28 | 2008-01-22 | International Business Machines Corporation | System and method for transcribing audio files of various languages |
US8296127B2 (en) | 2004-03-23 | 2012-10-23 | University Of Southern California | Discovery of parallel text portions in comparable collections of corpora and training using comparable texts |
US8666725B2 (en) | 2004-04-16 | 2014-03-04 | University Of Southern California | Selection and use of nonstatistical translation components in a statistical machine translation framework |
WO2006042321A2 (en) | 2004-10-12 | 2006-04-20 | University Of Southern California | Training for a text-to-text application which uses string to tree conversion for training and decoding |
US8676563B2 (en) | 2009-10-01 | 2014-03-18 | Language Weaver, Inc. | Providing human-generated and machine-generated trusted translations |
US8886517B2 (en) | 2005-06-17 | 2014-11-11 | Language Weaver, Inc. | Trust scoring for language translation systems |
US10319252B2 (en) | 2005-11-09 | 2019-06-11 | Sdl Inc. | Language capability assessment and training apparatus and techniques |
JP2007207328A (en) * | 2006-01-31 | 2007-08-16 | Toshiba Corp | Information storage medium, program, information reproducing method, information reproducing device, data transfer method, and data processing method |
US8943080B2 (en) | 2006-04-07 | 2015-01-27 | University Of Southern California | Systems and methods for identifying parallel documents and sentence fragments in multilingual document collections |
US8886518B1 (en) | 2006-08-07 | 2014-11-11 | Language Weaver, Inc. | System and method for capitalizing machine translated text |
US8433556B2 (en) | 2006-11-02 | 2013-04-30 | University Of Southern California | Semi-supervised training for statistical word alignment |
US9122674B1 (en) * | 2006-12-15 | 2015-09-01 | Language Weaver, Inc. | Use of annotations in statistical machine translation |
US8468149B1 (en) | 2007-01-26 | 2013-06-18 | Language Weaver, Inc. | Multi-lingual online community |
US8615389B1 (en) | 2007-03-16 | 2013-12-24 | Language Weaver, Inc. | Generation and exploitation of an approximate language model |
US8831928B2 (en) | 2007-04-04 | 2014-09-09 | Language Weaver, Inc. | Customizable machine translation service |
US9361294B2 (en) | 2007-05-31 | 2016-06-07 | Red Hat, Inc. | Publishing tool for translating documents |
US10296588B2 (en) * | 2007-05-31 | 2019-05-21 | Red Hat, Inc. | Build of material production system |
US8825466B1 (en) | 2007-06-08 | 2014-09-02 | Language Weaver, Inc. | Modification of annotated bilingual segment pairs in syntax-based machine translation |
US20090007128A1 (en) * | 2007-06-28 | 2009-01-01 | International Business Machines Corporation | method and system for orchestrating system resources with energy consumption monitoring |
JP5656353B2 (en) * | 2007-11-07 | 2015-01-21 | International Business Machines Corporation | Method and apparatus for controlling access of multilingual text resources |
US7974832B2 (en) * | 2007-12-12 | 2011-07-05 | Microsoft Corporation | Web translation provider |
US20090162818A1 (en) * | 2007-12-21 | 2009-06-25 | Martin Kosakowski | Method for the determination of supplementary content in an electronic device |
US9201870B2 (en) * | 2008-01-25 | 2015-12-01 | First Data Corporation | Method and system for providing translated dynamic web page content |
US9110890B2 (en) * | 2008-02-15 | 2015-08-18 | International Business Machines Corporation | Selecting a language encoding of a static communication in a virtual universe |
US7698688B2 (en) * | 2008-03-28 | 2010-04-13 | International Business Machines Corporation | Method for automating an internationalization test in a multilingual web application |
CA2755427C (en) * | 2009-03-18 | 2017-03-14 | Google Inc. | Web translation with display replacement |
US8990064B2 (en) | 2009-07-28 | 2015-03-24 | Language Weaver, Inc. | Translating documents based on content |
US8380486B2 (en) | 2009-10-01 | 2013-02-19 | Language Weaver, Inc. | Providing machine-generated translations and corresponding trust levels |
US10417646B2 (en) | 2010-03-09 | 2019-09-17 | Sdl Inc. | Predicting the cost associated with translating textual content |
EP2680159B1 (en) | 2010-07-13 | 2020-01-15 | Motionpoint Corporation | Dynamic language translation of a message |
CN102467497B (en) * | 2010-10-29 | 2014-11-05 | 国际商业机器公司 | Method and system for text translation in verification program |
US9164988B2 (en) * | 2011-01-14 | 2015-10-20 | Lionbridge Technologies, Inc. | Methods and systems for the dynamic creation of a translated website |
US9063931B2 (en) * | 2011-02-16 | 2015-06-23 | Ming-Yuan Wu | Multiple language translation system |
US11003838B2 (en) | 2011-04-18 | 2021-05-11 | Sdl Inc. | Systems and methods for monitoring post translation editing |
US8694303B2 (en) | 2011-06-15 | 2014-04-08 | Language Weaver, Inc. | Systems and methods for tuning parameters in statistical machine translation |
CN102855107B (en) | 2011-06-30 | 2015-05-27 | 国际商业机器公司 | Method and system for demonstrating file on computer |
US8812295B1 (en) * | 2011-07-26 | 2014-08-19 | Google Inc. | Techniques for performing language detection and translation for multi-language content feeds |
US8886515B2 (en) | 2011-10-19 | 2014-11-11 | Language Weaver, Inc. | Systems and methods for enhancing machine translation post edit review processes |
US8942973B2 (en) | 2012-03-09 | 2015-01-27 | Language Weaver, Inc. | Content page URL translation |
US10261994B2 (en) | 2012-05-25 | 2019-04-16 | Sdl Inc. | Method and system for automatic management of reputation of translators |
US9116886B2 (en) * | 2012-07-23 | 2015-08-25 | Google Inc. | Document translation including pre-defined term translator and translation model |
US9152622B2 (en) | 2012-11-26 | 2015-10-06 | Language Weaver, Inc. | Personalized machine translation via online adaptation |
US9213694B2 (en) | 2013-10-10 | 2015-12-15 | Language Weaver, Inc. | Efficient online domain adaptation |
US20150254236A1 (en) * | 2014-03-13 | 2015-09-10 | Michael Lewis Moravitz | Translation software built into internet |
US9690780B2 (en) | 2014-05-23 | 2017-06-27 | International Business Machines Corporation | Document translation based on predictive use |
US10713699B1 (en) * | 2014-11-14 | 2020-07-14 | Andersen Corporation | Generation of guide materials |
CN105930320A (en) * | 2016-04-15 | 2016-09-07 | 惠州Tcl移动通信有限公司 | Word crossing and searching method and system based on mobile terminals |
KR102056999B1 (en) * | 2018-02-26 | 2019-12-17 | 러브랜드 가부시키가이샤 | Web page translation system, web page translation device, web page providing device and web page translation method |
US10803257B2 (en) * | 2018-03-22 | 2020-10-13 | Microsoft Technology Licensing, Llc | Machine translation locking using sequence-based lock/unlock classification |
US10540452B1 (en) * | 2018-06-21 | 2020-01-21 | Amazon Technologies, Inc. | Automated translation of applications |
US10922496B2 (en) | 2018-11-07 | 2021-02-16 | International Business Machines Corporation | Modified graphical user interface-based language learning |
US11373048B2 (en) * | 2019-09-11 | 2022-06-28 | International Business Machines Corporation | Translation of multi-format embedded files |
US11385916B2 (en) * | 2020-03-16 | 2022-07-12 | Servicenow, Inc. | Dynamic translation of graphical user interfaces |
Family Cites Families (37)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4774655A (en) | 1984-10-24 | 1988-09-27 | Telebase Systems, Inc. | System for retrieving information from a plurality of remote databases having at least two different languages |
JPS61105671A (en) | 1984-10-29 | 1986-05-23 | Hitachi Ltd | Natural language processing device |
JP2654001B2 (en) * | 1986-05-08 | 1997-09-17 | 株式会社東芝 | Machine translation method |
GB2198565A (en) * | 1986-11-28 | 1988-06-15 | Sharp Kk | Translation apparatus |
GB2199170A (en) | 1986-11-28 | 1988-06-29 | Sharp Kk | Translation apparatus |
US4870610A (en) | 1987-08-25 | 1989-09-26 | Bell Communications Research, Inc. | Method of operating a computer system to provide customed I/O information including language translation |
US5005127A (en) * | 1987-10-26 | 1991-04-02 | Sharp Kabushiki Kaisha | System including means to translate only selected portions of an input sentence and means to translate selected portions according to distinct rules |
JP2831647B2 (en) * | 1988-03-31 | 1998-12-02 | 株式会社東芝 | Machine translation system |
US5140521A (en) | 1989-04-26 | 1992-08-18 | International Business Machines Corporation | Method for deleting a marked portion of a structured document |
US5289375A (en) | 1990-01-22 | 1994-02-22 | Sharp Kabushiki Kaisha | Translation machine |
JPH03268062A (en) * | 1990-03-19 | 1991-11-28 | Fujitsu Ltd | Register for private use word in machine translation electronic mail device |
JP3114181B2 (en) | 1990-03-27 | 2000-12-04 | 株式会社日立製作所 | Interlingual communication translation method and system |
US5175684A (en) | 1990-12-31 | 1992-12-29 | Trans-Link International Corp. | Automatic text translation and routing system |
US5497319A (en) | 1990-12-31 | 1996-03-05 | Trans-Link International Corp. | Machine translation and telecommunications system |
JP2815714B2 (en) | 1991-01-11 | 1998-10-27 | シャープ株式会社 | Translation equipment |
JP2765665B2 (en) | 1991-08-01 | 1998-06-18 | 富士通株式会社 | Translation device for documents with typographical information |
JP2848729B2 (en) * | 1991-12-06 | 1999-01-20 | 株式会社東芝 | Translation method and translation device |
US5243519A (en) | 1992-02-18 | 1993-09-07 | International Business Machines Corporation | Method and system for language translation within an interactive software application |
JP3038079B2 (en) | 1992-04-28 | 2000-05-08 | シャープ株式会社 | Automatic translation device |
JP3220560B2 (en) | 1992-05-26 | 2001-10-22 | シャープ株式会社 | Machine translation equipment |
US5373442A (en) | 1992-05-29 | 1994-12-13 | Sharp Kabushiki Kaisha | Electronic translating apparatus having pre-editing learning capability |
US5608622A (en) * | 1992-09-11 | 1997-03-04 | Lucent Technologies Inc. | System for analyzing translations |
JPH07210558A (en) | 1994-01-20 | 1995-08-11 | Fujitsu Ltd | Machine translation device |
US5822720A (en) * | 1994-02-16 | 1998-10-13 | Sentius Corporation | System amd method for linking streams of multimedia data for reference material for display |
US5740231A (en) | 1994-09-16 | 1998-04-14 | Octel Communications Corporation | Network-based multimedia communications and directory system and method of operation |
US5678039A (en) * | 1994-09-30 | 1997-10-14 | Borland International, Inc. | System and methods for translating software into localized versions |
US5675817A (en) * | 1994-12-05 | 1997-10-07 | Motorola, Inc. | Language translating pager and method therefor |
US5855015A (en) * | 1995-03-20 | 1998-12-29 | Interval Research Corporation | System and method for retrieval of hyperlinked information resources |
US5963205A (en) * | 1995-05-26 | 1999-10-05 | Iconovex Corporation | Automatic index creation for a word processor |
US5752246A (en) * | 1995-06-07 | 1998-05-12 | International Business Machines Corporation | Service agent for fulfilling requests of a web browser |
US5710918A (en) * | 1995-06-07 | 1998-01-20 | International Business Machines Corporation | Method for distributed task fulfillment of web browser requests |
US5721908A (en) * | 1995-06-07 | 1998-02-24 | International Business Machines Corporation | Computer network for WWW server data access over internet |
US5745360A (en) * | 1995-08-14 | 1998-04-28 | International Business Machines Corp. | Dynamic hypertext link converter system and process |
US5781785A (en) * | 1995-09-26 | 1998-07-14 | Adobe Systems Inc | Method and apparatus for providing an optimized document file of multiple pages |
US6993471B1 (en) * | 1995-11-13 | 2006-01-31 | America Online, Inc. | Integrated multilingual browser |
US5870610A (en) * | 1996-06-28 | 1999-02-09 | Siemens Business Communication Systems, Inc. | Autoconfigurable method and system having automated downloading |
US6493735B1 (en) * | 1998-12-15 | 2002-12-10 | International Business Machines Corporation | Method system and computer program product for storing bi-directional language data in a text string object for display on non-bidirectional operating systems |
- 1995-11-13: US application US08/555,916 (US6993471B1), not active, Expired - Fee Related
- 1996-11-13: WO application PCT/US1996/018102 (WO1997018516A1), active, Application Filing
- 1996-11-13: AU application AU14061/97A (AU1406197A), not active, Abandoned
- 1996-11-13: CA application CA002216387A (CA2216387C), not active, Expired - Lifetime
- 1996-11-13: EP application EP96944191A (EP0829053A4), not active, Ceased
- 2005-02-17: US application US11/059,752 (US7292987B2), not active, Expired - Fee Related
- 2007-11-06: US application US11/935,837 (US7716038B2), not active, Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
CA2216387A1 (en) | 1997-05-22 |
WO1997018516A1 (en) | 1997-05-22 |
US7716038B2 (en) | 2010-05-11 |
US20050149315A1 (en) | 2005-07-07 |
EP0829053A1 (en) | 1998-03-18 |
US20080059148A1 (en) | 2008-03-06 |
AU1406197A (en) | 1997-06-05 |
US6993471B1 (en) | 2006-01-31 |
US7292987B2 (en) | 2007-11-06 |
EP0829053A4 (en) | 1998-12-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CA2216387C (en) | Integrated multilingual browser | |
US6330529B1 (en) | Mark up language grammar based translation system | |
US20020065658A1 (en) | Universal translator/mediator server for improved access by users with special needs | |
RU2295150C2 (en) | Segment of translation data | |
US6119078A (en) | Systems, methods and computer program products for automatically translating web pages | |
US6405192B1 (en) | Navigation assistant-method and apparatus for providing user configured complementary information for data browsing in a viewer context | |
KR100317401B1 (en) | Apparatus and method for printing related web pages | |
US6073143A (en) | Document conversion system including data monitoring means that adds tag information to hyperlink information and translates a document when such tag information is included in a document retrieval request | |
US6925595B1 (en) | Method and system for content conversion of hypertext data using data mining | |
US5903727A (en) | Processing HTML to embed sound in a web page | |
US6961737B2 (en) | Serving signals | |
US6308198B1 (en) | Method and apparatus for dynamically adding functionality to a set of instructions for processing a web document based on information contained in the web document | |
EP0834853A2 (en) | Method and apparatus for presenting client side image maps | |
US20010014895A1 (en) | Method and apparatus for dynamic software customization | |
US7756849B2 (en) | Method of searching for text in browser frames | |
JP4990302B2 (en) | Data processing method, data processing program, and data processing apparatus | |
US20020188435A1 (en) | Interface for submitting richly-formatted documents for remote processing | |
KR20040101468A (en) | Method, system, computer program product and storage device for displaying a document | |
US6035338A (en) | Document browse support system and document processing system | |
Iaccarino et al. | Personalizable edge services for web accessibility | |
US6636235B1 (en) | Lettering adjustments for display resolution | |
WO2002080133A1 (en) | Non visual presentation of salient features in a document | |
US8806326B1 (en) | User preference based content linking | |
KR20020042026A (en) | Pre-processor and method and apparatus for processing web documents using the same | |
KR20010103545A (en) | Storage medium, system and apparatus for Internet translation with advertisement |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | EEER | Examination request | |
 | MKEX | Expiry | Effective date: 20161114 |