CN100469109C - Automatic translation method for digital video captions - Google Patents


Info

Publication number
CN100469109C
CN100469109C · CNB2006100871328A · CN200610087132A
Authority
CN
China
Prior art keywords
translation
phrase
language
digital video
captions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2006100871328A
Other languages
Chinese (zh)
Other versions
CN101090461A (en)
Inventor
Qian Yueliang (钱跃良)
Xiong Deyi (熊德意)
Liu Qun (刘群)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ningbo Heng Jie Internet of things limited company
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CNB2006100871328A
Publication of CN101090461A
Application granted
Publication of CN100469109C
Legal status: Expired - Fee Related

Abstract

This invention discloses an automatic translation method for digital video captions, comprising: extracting the caption content expressed in a source language and converting it to text format; performing language identification; selecting a translation phrase table from the source language to the target language; loading a language model of the target language according to the target language; dividing the caption sentences into phrases and looking up each phrase's translations in the target language; connecting the translated phrases in order and computing a score for each candidate translation; selecting the highest-scoring result as the target-language caption content; and converting the translated content into the format accepted by the transport stream and outputting it.

Description

An automatic translation method for digital video captions
Technical field
The present invention relates to methods for automatically translating captions, and in particular to a method for automatically translating the captions in digital video.
Background technology
With the popularization of computers and home theaters, a large number of films are produced and distributed as digital media, and viewers can quickly gain access, on a large scale, to films from different countries, regions, and languages. The main problem that follows is the viewer's comprehension of the language. In general, most of a viewer's understanding of a film comes from the characters' dialogue and monologue; only a small fraction comes from non-linguistic factors such as the picture and the music. A viewer who faces a language barrier therefore essentially cannot understand the film. For the producers, this means losing a large number of potential viewers; for viewers, it means losing the chance to appreciate many excellent films.
There are three traditional ways to address this problem. First, the producer adds captions in different languages to the film; but because of production cost, the producer cannot add captions for every language. For non-English films, the most that is generally done is to add English subtitles, yet not all non-English-speaking viewers can read English. Second, the distributor adds translated captions when importing the film; this approach is obviously more expensive, and it also delays a film's release in different regions. Its chief problem, however, is that it cannot supply translated captions for films in large quantities. The third approach can be regarded as a combination of the first two: fans of films in a given language spontaneously organize to add translated captions to foreign-language films. Because it draws on the participation of numerous fans, this method can essentially guarantee both quantity and timeliness. It nevertheless suffers from the following problems: 1) the quality of the translated captions varies widely; 2) the captions often fail to correspond well with the sound, because embedding translated captions is technically demanding work, and fans' limited technique or operational carelessness frequently causes such problems; 3) there are copyright disputes; 4) the approach cannot be used with real-time playback systems such as digital satellite receivers and DVD players.
To overcome the above problems of the traditional methods, those skilled in the art have proposed a caption translation engine, described in Chinese patent application No. 200410038308.1. The caption translation engine translates the caption content of a video source into another language, and its workflow mainly comprises the following steps:
1. extracting the captions expressed in a first language from the video source;
2. querying a language dictionary to translate the caption text from the first language into a second language with the same meaning;
3. outputting the translated captions.
In translating the captions from the first language into the second language, translations are obtained simply by dictionary lookup on a phrase-by-phrase basis; in the lookup, if a phrase in the first language has no exact second-language match in the dictionary, the phrase is translated word by word.
This application has the following problems:
a. In practice a word, or a phrase, often has several corresponding translations, and the application does not explain how to select the most suitable one among them;
b. A sentence usually admits several phrase segmentations; for example, a sentence may be split as "phrase a + phrase b" or as "phrase c + phrase d", where phrase c may be phrase a plus the front part of phrase b, and phrase d the remainder of phrase b. The application does not explain how the sentence is divided into phrases.
For the above problems, those skilled in the art usually adopt the following solutions:
1. if a word or phrase has several translations, the translation listed first in the dictionary is chosen;
2. when dividing a sentence into phrases, the longest matching phrase is selected and translated, proceeding from left to right.
Clearly, if the above selection and matching principles are adopted in translation, the translation quality will be poor, hard for viewers to understand, and may even mislead them.
Moreover, this caption translation engine can only translate the captions of a video source online in real time and cannot pre-translate captions offline, which narrows its scope of application and hinders its adoption.
Summary of the invention
The object of the present invention is to overcome the poor translation quality and narrow scope of application of existing automatic caption translation methods and devices, and thereby to provide an automatic caption translation method with higher accuracy and a wider scope of application.
To achieve this object, the invention provides an automatic translation method for digital video captions, used to translate the caption content of a digital video source into another language, the method comprising the following steps:
1) extracting the caption content expressed in a source language from the transport stream of the digital video source, and converting the caption content from image format to text format;
2) performing language identification on the converted caption content to determine which language the source language is;
3) according to the identified source language and the target language set by the user, selecting a translation phrase table from the source language to the target language, the table containing source-language phrases, target-language phrases with the same meaning as the source-language phrases, and the translation probability between each source-language phrase and target-language phrase;
4) according to the target language, loading a language model of the target language;
5) dividing the source-language sentences in the captions into phrases and looking up each phrase's meaning in the target language according to the translation phrase table obtained in step 3);
in dividing a sentence into phrases, a sentence may admit different divisions, and all divisions are listed;
when looking up a phrase's meaning in the target language, all of its target-language meanings are listed;
6) connecting the target-language meanings of the phrases in order from left to right to form a target translation, a partially connected target translation being called a partial translation; in each connection step, a partial translation is joined with a possible translation of the next phrase to form a new partial translation, and the score of the new partial translation is computed; the connection process is repeated until the whole sentence is translated; wherein
computing the score of the new partial translation comprises:
a. the initial partial translation, i.e. the empty translation covering no source-language word, has a score of 1;
b. the translation probability of the newly generated partial translation is the phrase translation probability of the previous partial translation multiplied by the translation probability of the phrase being connected, the phrase's translation probability being obtained from the translation phrase table of step 3);
the language-model probability of the newly generated partial translation is the language-model probability of the previous partial translation multiplied by the language-model probability of the phrase being connected, the latter computed using the last two words of the previous partial translation as its preceding history;
c. multiplying the partial translation's phrase translation probability by its language-model probability gives the score of the partial translation;
during the above translation process, partial translations covering the same portion of the source language are all kept in the same stack, each stack retains the top N highest-scoring results, and the value of N is between 10 and 100;
7) from the stack holding translations of the whole sentence, selecting the highest-scoring translation result as the target-language meaning of the caption sentence;
8) converting the translated caption content from text format back into the format accepted by the transport stream, multiplexing it into the transport stream, and outputting it.
In the above technical scheme, in step 1), the text format includes the Unicode format.
In the above technical scheme, in step 1), an optical character recognition engine is used to convert the caption content from image format to text format.
In the above technical scheme, in step 3), the translation phrase table is obtained from a dictionary or from a parallel corpus.
In the above technical scheme, in step 3), the translation probability is the number of times a source-language phrase and a target-language phrase are translated into each other, divided by the number of occurrences of the source-language phrase.
In the above technical scheme, in step 4), the language model of the target language is a 3-gram model.
When the language model of the target language is a 3-gram model, the language-model probability is computed as follows:
P(s) = ∏_i P(w_i | w_{i-1}, w_{i-2})
where P denotes probability, s is a sentence, and w is a word; the probabilities between words are obtained in advance from a real corpus.
In the above technical scheme, the digital video source includes a personal computer, a DVD player, a digital set-top box, or a digital satellite receiver.
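As a minimal numeric illustration of the scoring in steps a to c (the probabilities below are invented for illustration), the score of a partial translation is the running product of each connected phrase's translation probability and language-model probability, starting from the empty translation's score of 1:

```python
# Invented probabilities for a two-phrase partial translation.
phrase_probs = [0.6, 0.4]   # translation probabilities of the connected phrases (step b)
lm_probs = [0.3, 0.25]      # language-model probabilities of the connected phrases

score = 1.0                  # step a: the empty initial partial translation has score 1
for p_t, p_lm in zip(phrase_probs, lm_probs):
    score *= p_t * p_lm      # step c: translation probability times language-model probability
print(score)  # → 0.018
```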
The advantages of the invention are:
1. During translation, the method of the present invention keeps every possible translation result for each phrase and uses translation probabilities to indicate which is more likely, thereby improving translation accuracy.
2. When segmenting a sentence into phrases, the method considers every possible matching segmentation and forms different final translation results from them, making the translation more comprehensive and improving its accuracy.
3. The automatic translation method for digital video captions of the present invention can pre-translate captions before video playback and can also translate automatically during playback, widening its scope of application.
4. The method needs only the video's own source-language captions at translation time, so it can promptly provide large quantities of synchronized target-language captions intelligible to the film's audience, with the advantages of low cost and no copyright disputes.
Description of drawings
Fig. 1 is the structure diagram of the digital video caption automatic translation system;
Fig. 2 is a schematic diagram of how the automatic translation method for digital video captions of the present invention obtains the phrase table automatically from a parallel corpus;
Fig. 3 is the flow chart of the automatic translation method for digital video captions of the present invention.
Reference numerals
101 automatic caption extraction module; 102 automatic translation module; 103 multilingual knowledge base;
104 automatic caption synthesis module; 201 word alignment module; 202 automatic phrase extraction module
Embodiment
The automatic translation method for digital video captions of the present invention is described further below with reference to the drawings and a specific embodiment.
The automatic translation method for digital video captions of the present invention is applied in a digital video caption automatic translation system. As shown in Fig. 1, the system comprises an automatic caption extraction module 101, an automatic translation module 102, a multilingual knowledge base 103, and an automatic caption synthesis module 104. The multilingual knowledge base 103 comprises three parts: 1) a language-identification sub-knowledge base holding typical samples of each language; 2) translation phrase tables between the different languages; 3) language models of the target languages. The automatic caption extraction module 101 includes an optical character recognition engine.
In the present embodiment, English subtitles in a digital video are translated into Chinese subtitles using the automatic translation method for digital video captions of the present invention, as shown in Figure 2, which specifically comprises the following steps:
1) The automatic caption extraction module 101 continuously captures the caption content expressed in the source language from the transport stream of the video source and converts the captions into a text format, such as the Unicode format.
In the present embodiment, the transport stream is an MPEG-2-based system stream, in which an MPEG-2 video stream, an audio stream, and a subpicture (SPC) stream are multiplexed. The captions of a film are generally contained in the SPC stream. To capture the captions in the transport stream, the automatic caption extraction module 101 only needs to separate out the SPC stream. After obtaining the SPC stream, the module converts the picture-format caption content into text format with an optical character recognition engine.
2) After the captions are converted to text format, they are sent to the automatic translation module 102, which automatically identifies the language of the captions. The basic idea of automatic language identification is a statistical method: the first n characters are taken from the text-format captions, and the similarity between these characters and each language is computed against the typical samples of each language stored in the language-identification sub-knowledge base. This language-identification method is mature prior art; those skilled in the art who wish to understand its details may consult reference 2: T. Dunning, "Statistical Identification of Language", Technical report CRL MCCS-94-273, Computing Research Lab, New Mexico State University, March 1994.
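The statistical identification described here can be sketched with character-bigram frequency profiles compared by cosine similarity; the sample texts, the bigram order, and the similarity measure below are illustrative assumptions, not the contents of the actual sub-knowledge base:

```python
from collections import Counter
import math

def char_ngrams(text, n=2):
    """Character n-gram frequency profile of a text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(p, q):
    """Cosine similarity between two frequency profiles."""
    dot = sum(p[k] * q.get(k, 0) for k in p)
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

def identify_language(caption, samples):
    """Pick the language whose typical sample is most similar to the caption."""
    profile = char_ngrams(caption)
    return max(samples, key=lambda lang: cosine(profile, char_ngrams(samples[lang])))

# Illustrative "typical samples" standing in for the sub-knowledge base.
samples = {
    "en": "the quick brown fox jumps over the lazy dog and then the dog sleeps",
    "fr": "le renard brun saute par dessus le chien paresseux et puis le chien dort",
}
print(identify_language("The dog is sleeping over there", samples))  # → en
```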
3) The multilingual automatic translation engine in the automatic translation module 102 extracts, according to the identified language, the translation phrase table from the source language to the target language from the multilingual knowledge base 103. The target language is obtained from the user's prior selection.
Each record in the translation phrase table consists of three parts: a source-language phrase, a target-language phrase, and the translation probability between them. A phrase here is not necessarily a phrase in the linguistic sense but, more precisely, a contiguous word string; the translation probability between a source-language phrase and a target-language phrase is obtained statistically from ordinary language usage.
For example, a phrase table may contain the following records:
"Oh, I'm sorry ||| 对不起 ||| 1"
"I borrow some ||| 借给我 ||| 0.378"
Here "|||" separates the source-language phrase, the target-language phrase, and the translation probability between them. Note that "借给我" is not a proper phrase in the linguistic sense, but such phrases are helpful for automatic translation. In addition, one phrase may correspond to several different phrases in the other language; in that case, the different translations of the phrase are recorded in separate records of the translation phrase table.
The translation phrase table can be obtained in two ways: from a dictionary, or automatically from a parallel corpus. The translation probabilities of phrases obtained from a dictionary can be manually set to a high value.
The process of automatically obtaining phrase translation pairs from a parallel corpus is as follows: first, collect sentence pairs aligned between the two languages, such as translated captions; then use automatic alignment software to obtain the alignment between the words of each sentence pair; finally, a phrase extraction tool extracts phrases from the word-aligned corpus and computes the translation probabilities between them.
Fig. 2 shows how these phrases are obtained from a parallel corpus. First, an aligned sentence pair is fed to the word alignment module 201 to obtain the alignment between words; for the sentence pair "I have a dog" → "我有一条狗", the word alignment is: I → 我, have → 有, a → 一, dog → 狗. Based on the word alignment, the automatic phrase extraction module 202 extracts phrases automatically, such as "I have" → "我有" and "I have a" → "我有一", and computes the phrase translation probabilities. A phrase translation probability is the number of times the source-language and target-language phrases are translated into each other, divided by the number of occurrences of the source-language phrase. For example, if "I have a" occurs 100 times in the corpus, of which 20 are translated as "我有一" and the others as "我有一个", "我有一只", and so on, then the translation probability is 20/100 = 0.2.
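The maximum-likelihood estimate described here (co-occurrence count divided by source-phrase count) can be sketched as follows; the toy corpus is an assumption that reproduces the 20-out-of-100 example:

```python
from collections import Counter

def phrase_translation_probs(phrase_pairs):
    """Estimate P(target | source) as pair count divided by source-phrase count."""
    pair_counts = Counter(phrase_pairs)
    src_counts = Counter(src for src, _ in phrase_pairs)
    return {(s, t): c / src_counts[s] for (s, t), c in pair_counts.items()}

# Toy extracted-phrase corpus: "I have a" occurs 100 times, 20 of them paired with "我有一".
pairs = [("I have a", "我有一")] * 20 + [("I have a", "我有一个")] * 80
probs = phrase_translation_probs(pairs)
print(probs[("I have a", "我有一")])  # → 0.2
```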
4) The multilingual automatic translation engine in the automatic translation module 102 extracts the language model of the target language from the multilingual knowledge base 103 according to the target language. The language model measures the fluency of the target-language translation: the fluency of a sentence is expressed by its language-model probability, and the higher the probability, the more fluent the sentence and the closer it is to real language. In the multilingual knowledge base 103, the language model is a 3-gram model, and the probability of a sentence is computed by formula (1):
P(s) = ∏_i P(w_i | w_{i-1}, w_{i-2})    (1)
where P denotes probability, s is a sentence, and w is a word. The probabilities between words are obtained in advance from a real corpus.
For example, candidate sentences might receive the following language-model probabilities:
"我有一个狗" (I have a dog): 0.5;
"我有一条狗" (I have a dog): 0.7;
"我的狗" (my dog): 0.01;
"我是一条狗" (I am a dog): 0.001.
Among these, "我有一条狗" has the highest language-model probability and is also the closest to real language.
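Formula (1) can be evaluated in log space to avoid numeric underflow; the sketch below assumes pre-estimated conditional trigram probabilities (the values are invented) and pads the history with sentence-start markers:

```python
import math

def trigram_logprob(sentence, trigram_probs, unk=1e-6):
    """log P(s) = sum of log P(w_i | w_{i-2}, w_{i-1}), with <s> padding the history."""
    words = ["<s>", "<s>"] + sentence.split()
    logp = 0.0
    for i in range(2, len(words)):
        context = (words[i - 2], words[i - 1], words[i])  # (w_{i-2}, w_{i-1}, w_i)
        logp += math.log(trigram_probs.get(context, unk))  # back off to a floor for unseen trigrams
    return logp

# Illustrative probabilities; a real model is estimated in advance from a large corpus.
trigram_probs = {
    ("<s>", "<s>", "I"): 0.2,
    ("<s>", "I", "have"): 0.3,
    ("I", "have", "a"): 0.25,
    ("have", "a", "dog"): 0.1,
}
print(trigram_logprob("I have a dog", trigram_probs))  # log(0.2 * 0.3 * 0.25 * 0.1)
```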
5) Based on a general translation algorithm, the automatic translation module 102 divides the source-language sentences in the captions into different phrases, then looks up each phrase's meaning in the target language using the phrase table obtained in step 3). In dividing a sentence into phrases, a sentence can be divided in different ways; in the method of the invention, all possible divisions are listed, and all phrases are translated. In translating a phrase from the source language into the target language, a phrase may have different renderings; in the method of the invention, all meanings that can be found are listed, and subsequent operations make the further selection.
Suppose the input source-language sentence is "I have a dog". The sentence is cut into different phrases and each phrase is matched with its translations; for this sentence there are the following two cuttings:
1) "I have" → "我有" | "我拥有" | "我具有", "a" → "一" | "一个" | "一条", "dog" → "狗"
2) "I have" → "我有" | "我拥有" | "我具有", "a dog" → "一条狗" | "一只狗" | "狗"
where "|" separates the different translations.
6) From left to right, the target-language meanings of the phrases are connected in order to form a target translation; a partially connected target translation is called a partial translation. In each connection step, a partial translation is joined with a possible translation of the next phrase to form a new partial translation, and the score of the new partial translation is computed. The connection process is repeated until the whole sentence is translated.
Computing the score of a partial translation comprises the following steps:
a. the initial partial translation, i.e. the empty translation covering no source-language word, has a score of 1;
b. the translation probability of the newly generated partial translation is the phrase translation probability of the previous partial translation multiplied by the translation probability of the phrase being connected, the phrase's translation probability being obtained from the translation phrase table of step 3);
the language-model probability of the newly generated partial translation is the language-model probability of the previous partial translation multiplied by the language-model probability of the phrase being connected, the latter computed using the last two words of the previous partial translation as its preceding history;
c. multiplying the partial translation's phrase translation probability by its language-model probability gives the score of the partial translation.
During the above translation process, partial translations covering the same portion of the source language are all kept in the same stack, and each stack retains the top N highest-scoring results; in the present embodiment N is 50.
For example, the sentence "I have a dog" mentioned in the preceding step may have the following different translation results, each with its own score:
"我有一个狗" (I have a dog): 0.05
"我有一条狗" (I have a dog): 0.07
"我的狗" (my dog): 0.001
"我是一条狗" (I am a dog): 0.0001
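The connection-and-pruning procedure of steps 6 and 7 resembles monotone stack decoding in phrase-based statistical translation. The sketch below is a minimal illustration under that assumption; the phrase table, its probabilities, and the uniform stand-in language model are invented, and a real system would multiply in the 3-gram probability described in steps b and c:

```python
def decode(words, table, lm_prob, beam=50, max_len=4):
    """Monotone stack decoding: stacks[j] holds partial translations covering the first j source words."""
    stacks = [dict() for _ in range(len(words) + 1)]
    stacks[0][""] = 1.0  # the empty initial partial translation has score 1
    for j in range(len(words)):
        # keep only the beam highest-scoring hypotheses before expanding (the top-N retention of step 6)
        best = sorted(stacks[j].items(), key=lambda kv: -kv[1])[:beam]
        for partial, score in best:
            for k in range(1, min(max_len, len(words) - j) + 1):
                phrase = " ".join(words[j:j + k])
                for target, p_trans in table.get(phrase, []):
                    new = (partial + " " + target).strip()
                    new_score = score * p_trans * lm_prob(target, partial)
                    if new_score > stacks[j + k].get(new, 0.0):
                        stacks[j + k][new] = new_score
    full = stacks[len(words)]           # stack of whole-sentence translations
    return max(full, key=full.get)      # step 7: the highest-scoring result

table = {
    "I have": [("我有", 0.6), ("我", 0.3)],
    "a": [("一个", 0.5), ("一条", 0.4)],
    "dog": [("狗", 0.9)],
    "a dog": [("一条狗", 0.5)],
}
uniform_lm = lambda target, history: 1.0  # stand-in for the 3-gram probability of steps b and c
print(decode("I have a dog".split(), table, uniform_lm))  # → 我有 一条狗
```

With the uniform language model, "我有 一条狗" wins with score 0.6 × 0.5 = 0.30, beating "我有 一个 狗" at 0.6 × 0.5 × 0.9 = 0.27, which illustrates how longer phrase matches can outscore word-by-word connections.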
7) From the stack holding the translations of the whole sentence, the highest-scoring translation result is selected as the final translation and output from the automatic translation module 102.
In the above example, for the sentence "I have a dog", "我有一条狗" has the highest score and is therefore selected as the final translation. Linguistically, this translation is clearly also the best one.
8) After the captions have been translated into the target language, they are sent to the automatic caption synthesis module 104, which regenerates an SPC stream from the translated captions, replaces the original SPC stream in the transport stream, multiplexes it with the video stream and audio stream to form a new transport stream, and transmits it to the video terminal.
For digital video captions on a personal computer, the automatic translation method for digital video captions of the present invention can translate all the captions automatically in advance and then superimpose them on the digital video for playback, or it can translate automatically during video playback and superimpose the result on the digital video synchronously.
When the automatic translation method for digital video captions of the present invention is used in a DVD player, digital set-top box, or digital satellite receiver, the automatic translation of the digital video captions is carried out synchronously with video playback.

Claims (8)

1. An automatic translation method for digital video captions, the method comprising the following steps:
1) extracting the caption content expressed in a source language from the transport stream of a digital video source, and converting the caption content from image format to text format;
2) performing language identification on the converted caption content to determine which language the source language is;
3) according to the identified source language and the target language set by the user, selecting a translation phrase table from the source language to the target language, the table containing source-language phrases, target-language phrases with the same meaning as the source-language phrases, and the translation probability between each source-language phrase and target-language phrase;
4) according to the target language, loading a language model of the target language;
5) dividing the source-language sentences in the captions into phrases and looking up each phrase's meaning in the target language according to the translation phrase table obtained in step 3);
in dividing a sentence into phrases, a sentence may admit different divisions, and all divisions are listed;
when looking up a phrase's meaning in the target language, all of its target-language meanings are listed;
6) connecting the target-language meanings of the phrases in order from left to right to form a target translation, a partially connected target translation being called a partial translation; in each connection step, a partial translation is joined with a possible translation of the next phrase to form a new partial translation, and the score of the new partial translation is computed; the connection process is repeated until the whole sentence is translated; wherein
computing the score of the new partial translation comprises:
a. the initial partial translation, i.e. the empty translation covering no source-language word, has a score of 1;
b. the translation probability of the newly generated partial translation is the phrase translation probability of the previous partial translation multiplied by the translation probability of the phrase being connected, the phrase's translation probability being obtained from the translation phrase table of step 3);
the language-model probability of the newly generated partial translation is the language-model probability of the previous partial translation multiplied by the language-model probability of the phrase being connected, the latter computed using the last two words of the previous partial translation as its preceding history;
c. multiplying the partial translation's phrase translation probability by its language-model probability gives the score of the partial translation;
during the above translation process, partial translations covering the same portion of the source language are all kept in the same stack, each stack retains the top N highest-scoring results, and the value of N is between 10 and 100;
7) from the stack holding translations of the whole sentence, selecting the highest-scoring translation result as the target-language meaning of the caption sentence;
8) converting the translated caption content from text format back into the format accepted by the transport stream, multiplexing it into the transport stream, and outputting it.
2. The automatic translation method for digital video captions according to claim 1, characterized in that in step 1), the text format comprises the Unicode format.
3. The automatic translation method for digital video captions according to claim 1, characterized in that in step 1), an optical character recognition engine is used to convert the caption content from image format into text format.
4. The automatic translation method for digital video captions according to claim 1, characterized in that in step 3), the translation phrase table is obtained from a dictionary or extracted from a parallel corpus.
5. The automatic translation method for digital video captions according to claim 1, characterized in that in step 3), the translation probability is the number of times the source-language phrase and the target-language phrase are translated into each other, divided by the number of occurrences of the source-language phrase.
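The relative-frequency estimate of claim 5 can be sketched as follows; the function name and the example counts are illustrative, not from the patent:

```python
# Estimate p(target | source) as count(source, target) / count(source):
# the number of times the pair is inter-translated, divided by the number
# of occurrences of the source-language phrase (claim 5).
from collections import Counter

def phrase_translation_probs(phrase_pairs):
    pair_counts = Counter(phrase_pairs)                 # count(source, target)
    src_counts = Counter(src for src, _ in phrase_pairs)  # count(source)
    return {(src, tgt): c / src_counts[src]
            for (src, tgt), c in pair_counts.items()}
```

For example, if extracted pairs contain ("maison", "house") twice and ("maison", "home") once, the estimate gives p(house|maison) = 2/3 and p(home|maison) = 1/3.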
6. The automatic translation method for digital video captions according to claim 1, characterized in that in step 4), the language model of the target language is a 3-gram model.
7. The automatic translation method for digital video captions according to claim 6, characterized in that when the language model of the target language is a 3-gram model, the language model probability is computed as:
P(s) = ∏_i P(w_i | w_{i−1}, w_{i−2})
where P denotes a probability, s is a sentence, each w_i is a word, and the conditional probabilities between words are estimated in advance from a real corpus.
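The trigram model of claim 7 can be estimated from counts as sketched below; this uses unsmoothed maximum-likelihood estimates for brevity (a real system would need smoothing for unseen trigrams), and the sentence-boundary tokens are illustrative assumptions:

```python
# 3-gram language model: P(s) = prod_i P(w_i | w_{i-2}, w_{i-1}), with
# P(w_i | w_{i-2}, w_{i-1}) = count(w_{i-2}, w_{i-1}, w_i) / count(w_{i-2}, w_{i-1}).
from collections import Counter

def train_trigram(sentences):
    tri, bi = Counter(), Counter()
    for s in sentences:
        padded = ["<s>", "<s>"] + s + ["</s>"]  # pad so every word has 2-word history
        for i in range(2, len(padded)):
            tri[tuple(padded[i - 2:i + 1])] += 1
            bi[tuple(padded[i - 2:i])] += 1
    return tri, bi

def sentence_prob(s, tri, bi):
    p = 1.0
    padded = ["<s>", "<s>"] + s + ["</s>"]
    for i in range(2, len(padded)):
        ctx, full = tuple(padded[i - 2:i]), tuple(padded[i - 2:i + 1])
        if bi[ctx] == 0:       # unseen history: probability 0 without smoothing
            return 0.0
        p *= tri[full] / bi[ctx]
    return p
```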
8. The automatic translation method for digital video captions according to claim 1, characterized in that the digital video source comprises a personal computer, a DVD player, a digital set-top box, or a digital satellite receiver.
CNB2006100871328A 2006-06-13 2006-06-13 Automatic translation method for digital video captions Expired - Fee Related CN100469109C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006100871328A CN100469109C (en) 2006-06-13 2006-06-13 Automatic translation method for digital video captions


Publications (2)

Publication Number Publication Date
CN101090461A CN101090461A (en) 2007-12-19
CN100469109C true CN100469109C (en) 2009-03-11

Family

ID=38943598

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006100871328A Expired - Fee Related CN100469109C (en) 2006-06-13 2006-06-13 Automatic translation method for digital video captions

Country Status (1)

Country Link
CN (1) CN100469109C (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10409903B2 (en) 2016-05-31 2019-09-10 Microsoft Technology Licensing, Llc Unknown word predictor and content-integrated translator

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101883237B (en) * 2010-06-22 2012-08-22 武汉东太信息产业有限公司 Roll titles control method based on digital television STB (Set Top Box)
CN101882226B (en) * 2010-06-24 2013-07-24 汉王科技股份有限公司 Method and device for improving language discrimination among characters
CN102982024B (en) * 2011-09-02 2016-03-23 北京百度网讯科技有限公司 A kind of search need recognition methods and device
CN102708097A (en) * 2012-04-27 2012-10-03 曾立人 Online computer translation method and online computer translation system
CN103226947B (en) * 2013-03-27 2016-08-17 广东欧珀移动通信有限公司 A kind of audio-frequency processing method based on mobile terminal and device
CN103347126A (en) * 2013-06-27 2013-10-09 苏州创智宏云信息科技有限公司 Short message system
CN103561217A (en) * 2013-10-14 2014-02-05 深圳创维数字技术股份有限公司 Method and terminal for generating captions
CN104378692A (en) * 2014-11-17 2015-02-25 天脉聚源(北京)传媒科技有限公司 Method and device for processing video captions
CN105100665B (en) * 2015-08-21 2019-01-18 广州飞米电子科技有限公司 The method and apparatus for storing the multimedia messages of aircraft acquisition
CN110134973A (en) * 2019-04-12 2019-08-16 深圳壹账通智能科技有限公司 Video caption real time translating method, medium and equipment based on artificial intelligence
CN110516266A (en) * 2019-09-20 2019-11-29 张启 Video caption automatic translating method, device, storage medium and computer equipment
CN113343720A (en) * 2021-06-30 2021-09-03 北京搜狗科技发展有限公司 Subtitle translation method and device for subtitle translation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5477451A (en) * 1991-07-25 1995-12-19 International Business Machines Corp. Method and system for natural language translation
CN1241089A (en) * 1997-09-15 2000-01-12 美商·智译研发有限公司 Closed caption translation method and translator
US6304841B1 (en) * 1993-10-28 2001-10-16 International Business Machines Corporation Automatic construction of conditional exponential models from elementary features
CN1697515A (en) * 2004-05-14 2005-11-16 创新科技有限公司 Captions translation engine
CN2772159Y (en) * 2005-01-20 2006-04-12 英业达股份有限公司 Caption translating device


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A real-time MT system for translating broadcast captions. Eric Nyberg, Teruko Mitamura. Proceedings of Machine Translation Summit VI, Vol. 1 No. 1. 1997 *
A survey of statistical machine translation. Liu Qun. Journal of Chinese Information Processing, Vol. 17 No. 4. 2003 *


Also Published As

Publication number Publication date
CN101090461A (en) 2007-12-19

Similar Documents

Publication Publication Date Title
CN100469109C (en) Automatic translation method for digital video captions
CN111968649B (en) Subtitle correction method, subtitle display method, device, equipment and medium
KR101990023B1 (en) Method for chunk-unit separation rule and display automated key word to develop foreign language studying, and system thereof
CN101131691B (en) Domain-adaptive portable machine translation device for translating closed captions using dynamic translation resources and method thereof
CN105704538A (en) Method and system for generating audio and video subtitles
US20030065655A1 (en) Method and apparatus for detecting query-driven topical events using textual phrases on foils as indication of topic
US20100299131A1 (en) Transcript alignment
US20080126074A1 (en) Method for matching of bilingual texts and increasing accuracy in translation systems
CN102667773A (en) Search device, search method, and program
Wang et al. Improving pre-trained multilingual models with vocabulary expansion
Meftouh et al. PADIC: extension and new experiments
JP5296598B2 (en) Voice information extraction device
Levin et al. Automated closed captioning for Russian live broadcasting
Yang et al. An automated analysis and indexing framework for lecture video portal
Gautam et al. Soccer game summarization using audio commentary, metadata, and captions
TW201039149A (en) Robust algorithms for video text information extraction and question-answer retrieval
Petukhova et al. SUMAT: Data Collection and Parallel Corpus Compilation for Machine Translation of Subtitles.
KR20210138311A (en) Apparatus for generating parallel corpus data between text language and sign language and method therefor
Armstrong Improving the quality of automated DVD subtitles via example-based machine translation
Pala et al. Real-time transcription, keyword spotting, archival and retrieval for telugu TV news using ASR
Ghosh et al. Multimodal indexing of multilingual news video
JP2018072979A (en) Parallel translation sentence extraction device, parallel translation sentence extraction method and program
KR20030014804A (en) Apparatus and Method for Database Construction of News Video based on Closed Caption and Method of Content-based Retrieval/Serching It
Pinnis et al. Developing a neural machine translation service for the 2017-2018 european union presidency
Jong et al. Language-based multimedia information retrieval

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: NINGBO HENGTUO CONTENT NETWORKING CO., LTD.

Free format text: FORMER OWNER: INSTITUTE OF COMPUTING TECHNOLOGY, CHINESE ACADEMY OF SCIENCES

Effective date: 20130104

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 100080 HAIDIAN, BEIJING TO: 315455 NINGBO, ZHEJIANG PROVINCE

TR01 Transfer of patent right

Effective date of registration: 20130104

Address after: 315455 No. 188 Yangming West Road, Ningbo, Zhejiang, Yuyao

Patentee after: Ningbo Heng Jie Internet of Things Co., Ltd.

Address before: 100080 No. 6, Academy of Sciences South Road, Zhongguancun, Haidian District, Beijing

Patentee before: Institute of Computing Technology, Chinese Academy of Sciences

C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090311

Termination date: 20130613