CN100469109C - Automatic translation method for digital video captions - Google Patents


Info

Publication number
CN100469109C
CN100469109C · CNB2006100871328A · CN200610087132A
Authority
CN
China
Prior art keywords
translation
phrase
language
digital video
captions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2006100871328A
Other languages
Chinese (zh)
Other versions
CN101090461A (en)
Inventor
Qian Yueliang (钱跃良)
Xiong Deyi (熊德意)
Liu Qun (刘群)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ningbo Heng Jie Internet of things limited company
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CNB2006100871328A
Publication of CN101090461A
Application granted
Publication of CN100469109C
Legal status: Expired - Fee Related

Abstract

This invention discloses an automatic translation method for digital video captions, comprising: extracting the caption content expressed in a source language and converting it to text format; performing language identification; selecting a translation phrase table from the source language to the target language; loading a language model of the target language according to the target language; dividing the caption sentences into phrases and looking up each phrase's translations in the target language; connecting the translated phrases in order and computing a score for each candidate translation; selecting the highest-scoring result as the target-language caption content; and converting the translated content into the format accepted by the transport stream and outputting it.

Description

An automatic translation method for digital video captions
Technical field
The present invention relates to methods for automatically translating captions, and in particular to a method for automatically translating the captions in digital video.
Background technology
With the popularization of computers and home theaters, a large number of films are produced and distributed as digital media, and viewers can quickly gain access, on a large scale, to films from different countries, regions, and languages. The main problem that follows is the viewer's comprehension of the language. In general, most of a viewer's understanding of a film comes from the characters' dialogue and monologue; only a small fraction comes from non-linguistic factors such as the picture and the music. A viewer who faces a language barrier therefore essentially cannot understand the film. For the producers, this means losing a large number of potential viewers; for viewers, it means losing the chance to appreciate many excellent films.
There are three traditional ways to address this problem. First, the producer adds captions in different languages to the film; but because of production cost, the producer cannot add captions for every language. For non-English films, the most that is generally done is to add English subtitles, yet not all non-English-speaking viewers can read English. Second, the distributor adds translated captions when importing the film; this approach is obviously more expensive, and it also delays a film's release in different regions. Its chief problem, however, is that it cannot supply translated captions for films in large quantities. The third approach can be regarded as a combination of the first two: fans of films in a given language spontaneously organize to add translated captions to foreign-language films. Because it draws on the participation of numerous fans, this method can essentially guarantee both quantity and timeliness. It nevertheless suffers from the following problems: 1) the quality of the translated captions varies widely; 2) the captions often fail to correspond well with the sound, because embedding translated captions is technically demanding work, and fans' limited technique or operational carelessness frequently causes such problems; 3) there are copyright disputes; 4) the approach cannot be used with real-time playback systems such as digital satellite receivers and DVD players.
To overcome the above problems of the traditional methods, those skilled in the art have proposed a caption translation engine, described in Chinese patent application No. 200410038308.1. The caption translation engine translates the caption content of a video source into another language, and its workflow mainly comprises the following steps:
1. extracting the captions expressed in a first language from the video source;
2. querying a language dictionary to translate the caption text from the first language into a second language with the same meaning;
3. outputting the translated captions.
In translating the captions from the first language into the second language, translations are obtained simply by dictionary lookup on a phrase-by-phrase basis; in the lookup, if a phrase in the first language has no exact second-language match in the dictionary, the phrase is translated word by word.
This application has the following problems:
a. In practice a word, or a phrase, often has several corresponding translations, and the application does not explain how to select the most suitable one among them;
b. A sentence usually admits several phrase segmentations; for example, a sentence may be split as "phrase a + phrase b" or as "phrase c + phrase d", where phrase c may be phrase a plus the front part of phrase b, and phrase d the remainder of phrase b. The application does not explain how the sentence is divided into phrases.
For the above problems, those skilled in the art usually adopt the following solutions:
1. if a word or phrase has several translations, the translation listed first in the dictionary is chosen;
2. when dividing a sentence into phrases, the longest matching phrase is selected and translated, proceeding from left to right.
Clearly, if the above selection and matching principles are adopted in translation, the translation quality will be poor, hard for viewers to understand, and may even mislead them.
Moreover, this caption translation engine can only translate the captions of a video source online in real time and cannot pre-translate captions offline, which narrows its scope of application and hinders its adoption.
Summary of the invention
The object of the present invention is to overcome the poor translation quality and narrow scope of application of existing automatic caption translation methods and devices, and thereby to provide an automatic caption translation method with higher accuracy and a wider scope of application.
To achieve this object, the invention provides an automatic translation method for digital video captions, used to translate the caption content of a digital video source into another language, the method comprising the following steps:
1) extracting the caption content expressed in a source language from the transport stream of the digital video source, and converting the caption content from image format to text format;
2) performing language identification on the converted caption content to determine which language the source language is;
3) according to the identified source language and the target language set by the user, selecting a translation phrase table from the source language to the target language, the table containing source-language phrases, target-language phrases with the same meaning as the source-language phrases, and the translation probability between each source-language phrase and target-language phrase;
4) according to the target language, loading a language model of the target language;
5) dividing the source-language sentences in the captions into phrases and looking up each phrase's meaning in the target language according to the translation phrase table obtained in step 3);
in dividing a sentence into phrases, a sentence may admit different divisions, and all divisions are listed;
when looking up a phrase's meaning in the target language, all of its target-language meanings are listed;
6) connecting the target-language meanings of the phrases in order from left to right to form a target translation, a partially connected target translation being called a partial translation; in each connection step, a partial translation is joined with a possible translation of the next phrase to form a new partial translation, and the score of the new partial translation is computed; the connection process is repeated until the whole sentence is translated; wherein
computing the score of the new partial translation comprises:
a. the initial partial translation, i.e. the empty translation covering no source-language word, has a score of 1;
b. the translation probability of the newly generated partial translation is the phrase translation probability of the previous partial translation multiplied by the translation probability of the phrase being connected, the phrase's translation probability being obtained from the translation phrase table of step 3);
the language-model probability of the newly generated partial translation is the language-model probability of the previous partial translation multiplied by the language-model probability of the phrase being connected, the latter computed using the last two words of the previous partial translation as its preceding history;
c. multiplying the partial translation's phrase translation probability by its language-model probability gives the score of the partial translation;
during the above translation process, partial translations covering the same portion of the source language are all kept in the same stack, each stack retains the top N highest-scoring results, and the value of N is between 10 and 100;
7) from the stack holding translations of the whole sentence, selecting the highest-scoring translation result as the target-language meaning of the caption sentence;
8) converting the translated caption content from text format back into the format accepted by the transport stream, multiplexing it into the transport stream, and outputting it.
In the above technical scheme, in step 1), the text format includes the Unicode format.
In the above technical scheme, in step 1), an optical character recognition engine is used to convert the caption content from image format to text format.
In the above technical scheme, in step 3), the translation phrase table is obtained from a dictionary or from a parallel corpus.
In the above technical scheme, in step 3), the translation probability is the number of times a source-language phrase and a target-language phrase are translated into each other, divided by the number of occurrences of the source-language phrase.
In the above technical scheme, in step 4), the language model of the target language is a 3-gram model.
When the language model of the target language is a 3-gram model, the language-model probability is computed as follows:
P(s) = ∏_i P(w_i | w_{i-1}, w_{i-2})
where P denotes probability, s is a sentence, and w is a word; the probabilities between words are obtained in advance from a real corpus.
In the above technical scheme, the digital video source includes a personal computer, a DVD player, a digital set-top box, or a digital satellite receiver.
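As a minimal numeric illustration of the scoring in steps a to c (the probabilities below are invented for illustration), the score of a partial translation is the running product of each connected phrase's translation probability and language-model probability, starting from the empty translation's score of 1:

```python
# Invented probabilities for a two-phrase partial translation.
phrase_probs = [0.6, 0.4]   # translation probabilities of the connected phrases (step b)
lm_probs = [0.3, 0.25]      # language-model probabilities of the connected phrases

score = 1.0                  # step a: the empty initial partial translation has score 1
for p_t, p_lm in zip(phrase_probs, lm_probs):
    score *= p_t * p_lm      # step c: translation probability times language-model probability
print(score)  # → 0.018
```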
The advantages of the invention are:
1. During translation, the method of the present invention keeps every possible translation result for each phrase and uses translation probabilities to indicate which is more likely, thereby improving translation accuracy.
2. When segmenting a sentence into phrases, the method considers every possible matching segmentation and forms different final translation results from them, making the translation more comprehensive and improving its accuracy.
3. The automatic translation method for digital video captions of the present invention can pre-translate captions before video playback and can also translate automatically during playback, widening its scope of application.
4. The method needs only the video's own source-language captions at translation time, so it can promptly provide large quantities of synchronized target-language captions intelligible to the film's audience, with the advantages of low cost and no copyright disputes.
Description of drawings
Fig. 1 is the structure diagram of the digital video caption automatic translation system;
Fig. 2 is a schematic diagram of how the automatic translation method for digital video captions of the present invention obtains the phrase table automatically from a parallel corpus;
Fig. 3 is the flow chart of the automatic translation method for digital video captions of the present invention.
Reference numerals
101 automatic caption extraction module; 102 automatic translation module; 103 multilingual knowledge base;
104 automatic caption synthesis module; 201 word alignment module; 202 automatic phrase extraction module
Embodiment
The automatic translation method for digital video captions of the present invention is described further below with reference to the drawings and a specific embodiment.
The automatic translation method for digital video captions of the present invention is applied in a digital video caption automatic translation system. As shown in Fig. 1, the system comprises an automatic caption extraction module 101, an automatic translation module 102, a multilingual knowledge base 103, and an automatic caption synthesis module 104. The multilingual knowledge base 103 comprises three parts: 1) a language-identification sub-knowledge base holding typical samples of each language; 2) translation phrase tables between the different languages; 3) language models of the target languages. The automatic caption extraction module 101 includes an optical character recognition engine.
In the present embodiment, English subtitles in a digital video are translated into Chinese subtitles using the automatic translation method for digital video captions of the present invention, as shown in Figure 2, which specifically comprises the following steps:
1) The automatic caption extraction module 101 continuously captures the caption content expressed in the source language from the transport stream of the video source and converts the captions into a text format, such as the Unicode format.
In the present embodiment, the transport stream is an MPEG-2-based system stream, in which an MPEG-2 video stream, an audio stream, and a subpicture (SPC) stream are multiplexed. The captions of a film are generally contained in the SPC stream. To capture the captions in the transport stream, the automatic caption extraction module 101 only needs to separate out the SPC stream. After obtaining the SPC stream, the module converts the picture-format caption content into text format with an optical character recognition engine.
2) After the captions are converted to text format, they are sent to the automatic translation module 102, which automatically identifies the language of the captions. The basic idea of automatic language identification is a statistical method: the first n characters are taken from the text-format captions, and the similarity between these characters and each language is computed against the typical samples of each language stored in the language-identification sub-knowledge base. This language-identification method is mature prior art; those skilled in the art who wish to understand its details may consult reference 2: T. Dunning, "Statistical Identification of Language", Technical report CRL MCCS-94-273, Computing Research Lab, New Mexico State University, March 1994.
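The statistical identification described here can be sketched with character-bigram frequency profiles compared by cosine similarity; the sample texts, the bigram order, and the similarity measure below are illustrative assumptions, not the contents of the actual sub-knowledge base:

```python
from collections import Counter
import math

def char_ngrams(text, n=2):
    """Character n-gram frequency profile of a text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(p, q):
    """Cosine similarity between two frequency profiles."""
    dot = sum(p[k] * q.get(k, 0) for k in p)
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

def identify_language(caption, samples):
    """Pick the language whose typical sample is most similar to the caption."""
    profile = char_ngrams(caption)
    return max(samples, key=lambda lang: cosine(profile, char_ngrams(samples[lang])))

# Illustrative "typical samples" standing in for the sub-knowledge base.
samples = {
    "en": "the quick brown fox jumps over the lazy dog and then the dog sleeps",
    "fr": "le renard brun saute par dessus le chien paresseux et puis le chien dort",
}
print(identify_language("The dog is sleeping over there", samples))  # → en
```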
3) The multilingual automatic translation engine in the automatic translation module 102 extracts, according to the identified language, the translation phrase table from the source language to the target language from the multilingual knowledge base 103. The target language is obtained from the user's prior selection.
Each record in the translation phrase table consists of three parts: a source-language phrase, a target-language phrase, and the translation probability between them. A phrase here is not necessarily a phrase in the linguistic sense but, more precisely, a contiguous word string; the translation probability between a source-language phrase and a target-language phrase is obtained statistically from ordinary language usage.
For example, a phrase table may contain the following records:
"Oh, I'm sorry ||| 对不起 ||| 1"
"I borrow some ||| 借给我 ||| 0.378"
Here "|||" separates the source-language phrase, the target-language phrase, and the translation probability between them. Note that "借给我" is not a proper phrase in the linguistic sense, but such phrases are helpful for automatic translation. In addition, one phrase may correspond to several different phrases in the other language; in that case, the different translations of the phrase are recorded in separate records of the translation phrase table.
The translation phrase table can be obtained in two ways: from a dictionary, or automatically from a parallel corpus. The translation probabilities of phrases obtained from a dictionary can be manually set to a high value.
The process of automatically obtaining phrase translation pairs from a parallel corpus is as follows: first, collect sentence pairs aligned between the two languages, such as translated captions; then use automatic alignment software to obtain the alignment between the words of each sentence pair; finally, a phrase extraction tool extracts phrases from the word-aligned corpus and computes the translation probabilities between them.
Fig. 2 shows how these phrases are obtained from a parallel corpus. First, an aligned sentence pair is fed to the word alignment module 201 to obtain the alignment between words; for the sentence pair "I have a dog" → "我有一条狗", the word alignment is: I → 我, have → 有, a → 一, dog → 狗. Based on the word alignment, the automatic phrase extraction module 202 extracts phrases automatically, such as "I have" → "我有" and "I have a" → "我有一", and computes the phrase translation probabilities. A phrase translation probability is the number of times the source-language and target-language phrases are translated into each other, divided by the number of occurrences of the source-language phrase. For example, if "I have a" occurs 100 times in the corpus, of which 20 are translated as "我有一" and the others as "我有一个", "我有一只", and so on, then the translation probability is 20/100 = 0.2.
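The maximum-likelihood estimate described here (co-occurrence count divided by source-phrase count) can be sketched as follows; the toy corpus is an assumption that reproduces the 20-out-of-100 example:

```python
from collections import Counter

def phrase_translation_probs(phrase_pairs):
    """Estimate P(target | source) as pair count divided by source-phrase count."""
    pair_counts = Counter(phrase_pairs)
    src_counts = Counter(src for src, _ in phrase_pairs)
    return {(s, t): c / src_counts[s] for (s, t), c in pair_counts.items()}

# Toy extracted-phrase corpus: "I have a" occurs 100 times, 20 of them paired with "我有一".
pairs = [("I have a", "我有一")] * 20 + [("I have a", "我有一个")] * 80
probs = phrase_translation_probs(pairs)
print(probs[("I have a", "我有一")])  # → 0.2
```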
4) The multilingual automatic translation engine in the automatic translation module 102 extracts the language model of the target language from the multilingual knowledge base 103 according to the target language. The language model measures the fluency of the target-language translation: the fluency of a sentence is expressed by its language-model probability, and the higher the probability, the more fluent the sentence and the closer it is to real language. In the multilingual knowledge base 103, the language model is a 3-gram model, and the probability of a sentence is computed by formula (1):
P(s) = ∏_i P(w_i | w_{i-1}, w_{i-2})    (1)
where P denotes probability, s is a sentence, and w is a word. The probabilities between words are obtained in advance from a real corpus.
For example, candidate sentences might receive the following language-model probabilities:
"我有一个狗" (I have a dog): 0.5;
"我有一条狗" (I have a dog): 0.7;
"我的狗" (my dog): 0.01;
"我是一条狗" (I am a dog): 0.001.
Among these, "我有一条狗" has the highest language-model probability and is also the closest to real language.
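Formula (1) can be evaluated in log space to avoid numeric underflow; the sketch below assumes pre-estimated conditional trigram probabilities (the values are invented) and pads the history with sentence-start markers:

```python
import math

def trigram_logprob(sentence, trigram_probs, unk=1e-6):
    """log P(s) = sum of log P(w_i | w_{i-2}, w_{i-1}), with <s> padding the history."""
    words = ["<s>", "<s>"] + sentence.split()
    logp = 0.0
    for i in range(2, len(words)):
        context = (words[i - 2], words[i - 1], words[i])  # (w_{i-2}, w_{i-1}, w_i)
        logp += math.log(trigram_probs.get(context, unk))  # back off to a floor for unseen trigrams
    return logp

# Illustrative probabilities; a real model is estimated in advance from a large corpus.
trigram_probs = {
    ("<s>", "<s>", "I"): 0.2,
    ("<s>", "I", "have"): 0.3,
    ("I", "have", "a"): 0.25,
    ("have", "a", "dog"): 0.1,
}
print(trigram_logprob("I have a dog", trigram_probs))  # log(0.2 * 0.3 * 0.25 * 0.1)
```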
5) Based on a general translation algorithm, the automatic translation module 102 divides the source-language sentences in the captions into different phrases, then looks up each phrase's meaning in the target language using the phrase table obtained in step 3). In dividing a sentence into phrases, a sentence can be divided in different ways; in the method of the invention, all possible divisions are listed, and all phrases are translated. In translating a phrase from the source language into the target language, a phrase may have different renderings; in the method of the invention, all meanings that can be found are listed, and subsequent operations make the further selection.
Suppose the input source-language sentence is "I have a dog". The sentence is cut into different phrases and each phrase is matched with its translations; for this sentence there are the following two cuttings:
1) "I have" → "我有" | "我拥有" | "我具有", "a" → "一" | "一个" | "一条", "dog" → "狗"
2) "I have" → "我有" | "我拥有" | "我具有", "a dog" → "一条狗" | "一只狗" | "狗"
where "|" separates the different translations.
6) From left to right, the target-language meanings of the phrases are connected in order to form a target translation; a partially connected target translation is called a partial translation. In each connection step, a partial translation is joined with a possible translation of the next phrase to form a new partial translation, and the score of the new partial translation is computed. The connection process is repeated until the whole sentence is translated.
Computing the score of a partial translation comprises the following steps:
a. the initial partial translation, i.e. the empty translation covering no source-language word, has a score of 1;
b. the translation probability of the newly generated partial translation is the phrase translation probability of the previous partial translation multiplied by the translation probability of the phrase being connected, the phrase's translation probability being obtained from the translation phrase table of step 3);
the language-model probability of the newly generated partial translation is the language-model probability of the previous partial translation multiplied by the language-model probability of the phrase being connected, the latter computed using the last two words of the previous partial translation as its preceding history;
c. multiplying the partial translation's phrase translation probability by its language-model probability gives the score of the partial translation.
During the above translation process, partial translations covering the same portion of the source language are all kept in the same stack, and each stack retains the top N highest-scoring results; in the present embodiment N is 50.
For example, the sentence "I have a dog" mentioned in the preceding step may have the following different translation results, each with its own score:
"我有一个狗" (I have a dog): 0.05
"我有一条狗" (I have a dog): 0.07
"我的狗" (my dog): 0.001
"我是一条狗" (I am a dog): 0.0001
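The connection-and-pruning procedure of steps 6 and 7 resembles monotone stack decoding in phrase-based statistical translation. The sketch below is a minimal illustration under that assumption; the phrase table, its probabilities, and the uniform stand-in language model are invented, and a real system would multiply in the 3-gram probability described in steps b and c:

```python
def decode(words, table, lm_prob, beam=50, max_len=4):
    """Monotone stack decoding: stacks[j] holds partial translations covering the first j source words."""
    stacks = [dict() for _ in range(len(words) + 1)]
    stacks[0][""] = 1.0  # the empty initial partial translation has score 1
    for j in range(len(words)):
        # keep only the beam highest-scoring hypotheses before expanding (the top-N retention of step 6)
        best = sorted(stacks[j].items(), key=lambda kv: -kv[1])[:beam]
        for partial, score in best:
            for k in range(1, min(max_len, len(words) - j) + 1):
                phrase = " ".join(words[j:j + k])
                for target, p_trans in table.get(phrase, []):
                    new = (partial + " " + target).strip()
                    new_score = score * p_trans * lm_prob(target, partial)
                    if new_score > stacks[j + k].get(new, 0.0):
                        stacks[j + k][new] = new_score
    full = stacks[len(words)]           # stack of whole-sentence translations
    return max(full, key=full.get)      # step 7: the highest-scoring result

table = {
    "I have": [("我有", 0.6), ("我", 0.3)],
    "a": [("一个", 0.5), ("一条", 0.4)],
    "dog": [("狗", 0.9)],
    "a dog": [("一条狗", 0.5)],
}
uniform_lm = lambda target, history: 1.0  # stand-in for the 3-gram probability of steps b and c
print(decode("I have a dog".split(), table, uniform_lm))  # → 我有 一条狗
```

With the uniform language model, "我有 一条狗" wins with score 0.6 × 0.5 = 0.30, beating "我有 一个 狗" at 0.6 × 0.5 × 0.9 = 0.27, which illustrates how longer phrase matches can outscore word-by-word connections.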
7) From the stack holding the translations of the whole sentence, the highest-scoring translation result is selected as the final translation and output from the automatic translation module 102.
In the above example, for the sentence "I have a dog", "我有一条狗" has the highest score and is therefore selected as the final translation. Linguistically, this translation is clearly also the best one.
8) After the captions have been translated into the target language, they are sent to the automatic caption synthesis module 104, which regenerates an SPC stream from the translated captions, replaces the original SPC stream in the transport stream, multiplexes it with the video stream and audio stream to form a new transport stream, and transmits it to the video terminal.
For digital video captions on a personal computer, the automatic translation method for digital video captions of the present invention can translate all the captions automatically in advance and then superimpose them on the digital video for playback, or it can translate automatically during video playback and superimpose the result on the digital video synchronously.
When the automatic translation method for digital video captions of the present invention is used in a DVD player, digital set-top box, or digital satellite receiver, the automatic translation of the digital video captions is carried out synchronously with video playback.

Claims (8)

1. An automatic translation method for digital video captions, the method comprising the following steps:
1) extracting the caption content expressed in a source language from the transport stream of a digital video source, and converting the caption content from image format to text format;
2) performing language identification on the converted caption content to determine which language the source language is;
3) according to the identified source language and the target language set by the user, selecting a translation phrase table from the source language to the target language, the table containing source-language phrases, target-language phrases with the same meaning as the source-language phrases, and the translation probability between each source-language phrase and target-language phrase;
4) according to the target language, loading a language model of the target language;
5) dividing the source-language sentences in the captions into phrases and looking up each phrase's meaning in the target language according to the translation phrase table obtained in step 3);
in dividing a sentence into phrases, a sentence may admit different divisions, and all divisions are listed;
when looking up a phrase's meaning in the target language, all of its target-language meanings are listed;
6) connecting the target-language meanings of the phrases in order from left to right to form a target translation, a partially connected target translation being called a partial translation; in each connection step, a partial translation is joined with a possible translation of the next phrase to form a new partial translation, and the score of the new partial translation is computed; the connection process is repeated until the whole sentence is translated; wherein
computing the score of the new partial translation comprises:
a. the initial partial translation, i.e. the empty translation covering no source-language word, has a score of 1;
b. the translation probability of the newly generated partial translation is the phrase translation probability of the previous partial translation multiplied by the translation probability of the phrase being connected, the phrase's translation probability being obtained from the translation phrase table of step 3);
the language-model probability of the newly generated partial translation is the language-model probability of the previous partial translation multiplied by the language-model probability of the phrase being connected, the latter computed using the last two words of the previous partial translation as its preceding history;
c. multiplying the partial translation's phrase translation probability by its language-model probability gives the score of the partial translation;
during the above translation process, partial translations covering the same portion of the source language are all kept in the same stack, each stack retains the top N highest-scoring results, and the value of N is between 10 and 100;
7) from the stack holding translations of the whole sentence, selecting the highest-scoring translation result as the target-language meaning of the caption sentence;
8) converting the translated caption content from text format back into the format accepted by the transport stream, multiplexing it into the transport stream, and outputting it.
2. The automatic translation method for digital video captions according to claim 1, characterized in that in step 1), the text format comprises the Unicode format.
3. The automatic translation method for digital video captions according to claim 1, characterized in that in step 1), an optical character recognition engine is used to convert the caption content from image format into text format.
4. The automatic translation method for digital video captions according to claim 1, characterized in that in step 3), the translation phrase table is obtained from a dictionary or extracted from a parallel corpus.
5. The automatic translation method for digital video captions according to claim 1, characterized in that in step 3), the translation probability is the number of times the source-language phrase and the target-language phrase are translated into each other, divided by the number of occurrences of the source-language phrase.
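The relative-frequency estimate of claim 5 can be sketched as follows; the function name and the example counts are illustrative, not from the patent:

```python
# Estimate p(target | source) as count(source, target) / count(source):
# the number of times the pair is inter-translated, divided by the number
# of occurrences of the source-language phrase (claim 5).
from collections import Counter

def phrase_translation_probs(phrase_pairs):
    pair_counts = Counter(phrase_pairs)                 # count(source, target)
    src_counts = Counter(src for src, _ in phrase_pairs)  # count(source)
    return {(src, tgt): c / src_counts[src]
            for (src, tgt), c in pair_counts.items()}
```

For example, if extracted pairs contain ("maison", "house") twice and ("maison", "home") once, the estimate gives p(house|maison) = 2/3 and p(home|maison) = 1/3.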
6. The automatic translation method for digital video captions according to claim 1, characterized in that in step 4), the language model of the target language is a 3-gram model.
7. The automatic translation method for digital video captions according to claim 6, characterized in that when the language model of the target language is a 3-gram model, the language model probability is computed as:
P(s) = ∏_i P(w_i | w_{i−1}, w_{i−2})
where P denotes a probability, s is a sentence, each w_i is a word, and the conditional probabilities between words are estimated in advance from a real corpus.
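The trigram model of claim 7 can be estimated from counts as sketched below; this uses unsmoothed maximum-likelihood estimates for brevity (a real system would need smoothing for unseen trigrams), and the sentence-boundary tokens are illustrative assumptions:

```python
# 3-gram language model: P(s) = prod_i P(w_i | w_{i-2}, w_{i-1}), with
# P(w_i | w_{i-2}, w_{i-1}) = count(w_{i-2}, w_{i-1}, w_i) / count(w_{i-2}, w_{i-1}).
from collections import Counter

def train_trigram(sentences):
    tri, bi = Counter(), Counter()
    for s in sentences:
        padded = ["<s>", "<s>"] + s + ["</s>"]  # pad so every word has 2-word history
        for i in range(2, len(padded)):
            tri[tuple(padded[i - 2:i + 1])] += 1
            bi[tuple(padded[i - 2:i])] += 1
    return tri, bi

def sentence_prob(s, tri, bi):
    p = 1.0
    padded = ["<s>", "<s>"] + s + ["</s>"]
    for i in range(2, len(padded)):
        ctx, full = tuple(padded[i - 2:i]), tuple(padded[i - 2:i + 1])
        if bi[ctx] == 0:       # unseen history: probability 0 without smoothing
            return 0.0
        p *= tri[full] / bi[ctx]
    return p
```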
8. The automatic translation method for digital video captions according to claim 1, characterized in that the digital video source comprises a personal computer, a DVD player, a digital set-top box, or a digital satellite receiver.
CNB2006100871328A 2006-06-13 2006-06-13 Automatic translation method for digital video captions Expired - Fee Related CN100469109C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006100871328A CN100469109C (en) 2006-06-13 2006-06-13 Automatic translation method for digital video captions


Publications (2)

Publication Number Publication Date
CN101090461A CN101090461A (en) 2007-12-19
CN100469109C true CN100469109C (en) 2009-03-11

Family

ID=38943598

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006100871328A Expired - Fee Related CN100469109C (en) 2006-06-13 2006-06-13 Automatic translation method for digital video captions

Country Status (1)

Country Link
CN (1) CN100469109C (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10409903B2 (en) 2016-05-31 2019-09-10 Microsoft Technology Licensing, Llc Unknown word predictor and content-integrated translator

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101883237B (en) * 2010-06-22 2012-08-22 武汉东太信息产业有限公司 Roll titles control method based on digital television STB (Set Top Box)
CN101882226B (en) * 2010-06-24 2013-07-24 汉王科技股份有限公司 Method and device for improving language discrimination among characters
CN102982024B (en) * 2011-09-02 2016-03-23 北京百度网讯科技有限公司 A kind of search need recognition methods and device
CN102708097A (en) * 2012-04-27 2012-10-03 曾立人 Online computer translation method and online computer translation system
CN103226947B (en) * 2013-03-27 2016-08-17 广东欧珀移动通信有限公司 A kind of audio-frequency processing method based on mobile terminal and device
CN103347126A (en) * 2013-06-27 2013-10-09 苏州创智宏云信息科技有限公司 Short message system
CN103561217A (en) * 2013-10-14 2014-02-05 深圳创维数字技术股份有限公司 Method and terminal for generating captions
CN104378692A (en) * 2014-11-17 2015-02-25 天脉聚源(北京)传媒科技有限公司 Method and device for processing video captions
CN105100665B (en) * 2015-08-21 2019-01-18 广州飞米电子科技有限公司 The method and apparatus for storing the multimedia messages of aircraft acquisition
CN110134973A (en) * 2019-04-12 2019-08-16 深圳壹账通智能科技有限公司 Video caption real time translating method, medium and equipment based on artificial intelligence
CN110516266A (en) * 2019-09-20 2019-11-29 张启 Video caption automatic translating method, device, storage medium and computer equipment
CN113343720A (en) * 2021-06-30 2021-09-03 北京搜狗科技发展有限公司 Subtitle translation method and device for subtitle translation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5477451A (en) * 1991-07-25 1995-12-19 International Business Machines Corp. Method and system for natural language translation
CN1241089A (en) * 1997-09-15 2000-01-12 美商·智译研发有限公司 Closed caption translation method and translator
US6304841B1 (en) * 1993-10-28 2001-10-16 International Business Machines Corporation Automatic construction of conditional exponential models from elementary features
CN1697515A (en) * 2004-05-14 2005-11-16 创新科技有限公司 Captions translation engine
CN2772159Y (en) * 2005-01-20 2006-04-12 英业达股份有限公司 Caption translating device


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A real-time MT system for translating broadcast captions. Eric Nyberg, Teruko Mitamura. Proceedings of Machine Translation Summit VI, Vol. 1 No. 1. 1997 *
A survey of statistical machine translation. Liu Qun. Journal of Chinese Information Processing, Vol. 17 No. 4. 2003 *


Also Published As

Publication number Publication date
CN101090461A (en) 2007-12-19

Similar Documents

Publication Publication Date Title
CN100469109C (en) Automatic translation method for digital video captions
CN111968649B (en) Subtitle correction method, subtitle display method, device, equipment and medium
KR101990023B1 (en) Method for chunk-unit separation rule and display automated key word to develop foreign language studying, and system thereof
CN101131691B (en) Domain-adaptive portable machine translation device for translating closed captions using dynamic translation resources and method thereof
CN105704538A (en) Method and system for generating audio and video subtitles
US20030065655A1 (en) Method and apparatus for detecting query-driven topical events using textual phrases on foils as indication of topic
US20100299131A1 (en) Transcript alignment
US20080126074A1 (en) Method for matching of bilingual texts and increasing accuracy in translation systems
CN102667773A (en) Search device, search method, and program
Wang et al. Improving pre-trained multilingual models with vocabulary expansion
Meftouh et al. PADIC: extension and new experiments
JP5296598B2 (en) Voice information extraction device
Levin et al. Automated closed captioning for Russian live broadcasting
Yang et al. An automated analysis and indexing framework for lecture video portal
Gautam et al. Soccer game summarization using audio commentary, metadata, and captions
TW201039149A (en) Robust algorithms for video text information extraction and question-answer retrieval
Petukhova et al. SUMAT: Data Collection and Parallel Corpus Compilation for Machine Translation of Subtitles.
KR20210138311A (en) Apparatus for generating parallel corpus data between text language and sign language and method therefor
Armstrong Improving the quality of automated DVD subtitles via example-based machine translation
Pala et al. Real-time transcription, keyword spotting, archival and retrieval for telugu TV news using ASR
Ghosh et al. Multimodal indexing of multilingual news video
JP2018072979A (en) Parallel translation sentence extraction device, parallel translation sentence extraction method and program
KR20030014804A (en) Apparatus and Method for Database Construction of News Video based on Closed Caption and Method of Content-based Retrieval/Serching It
Pinnis et al. Developing a neural machine translation service for the 2017-2018 european union presidency
Jong et al. Language-based multimedia information retrieval

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: NINGBO HENGTUO CONTENT NETWORKING CO., LTD.

Free format text: FORMER OWNER: INSTITUTE OF COMPUTING TECHNOLOGY, CHINESE ACADEMY OF SCIENCES

Effective date: 20130104

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 100080 HAIDIAN, BEIJING TO: 315455 NINGBO, ZHEJIANG PROVINCE

TR01 Transfer of patent right

Effective date of registration: 20130104

Address after: 315455 No. 188 Yangming West Road, Ningbo, Zhejiang, Yuyao

Patentee after: Ningbo Heng Jie Internet of Things Co., Ltd.

Address before: 100080 No. 6, Academy of Sciences South Road, Zhongguancun, Haidian District, Beijing

Patentee before: Institute of Computing Technology, Chinese Academy of Sciences

C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090311

Termination date: 20130613