WO2015043389A1

WO2015043389A1 - Participle information push method and device based on video search

Info

Publication number: WO2015043389A1
Application number: PCT/CN2014/086519
Authority: WO
Inventors: 崔代超
Original assignee: 北京奇虎科技有限公司; 奇智软件（北京）有限公司
Priority date: 2013-09-30
Filing date: 2014-09-15
Publication date: 2015-04-02

Abstract

Disclosed are a participle information push method and device based on a video search, a push method for an on-line playing entry object based on a video search and a push method for an associated resource address based on a video search. The method comprises: mapping feature text information about a received video search character string or acquired video resource data into one or more first participles; and searching for associated second participles of which the co-occurrence rate together with that of the one or more first participles is higher than a preset threshold value, the co-occurrence rate being the probability that the current one or more first participles and the second participles appear in the same video resource data together, thereby pushing a combination of the one or more first participles and the one or more associated second participles or an entry object of a video resource data related thereto or a network connection address associated with the video resource data. In the present application, data of the video resources which are seldom searched by users but have multiple relative resources in a video database are pushed, good-quality resources in the video database are deeply dug, resource digging efficiency is improved. An index table will be continuously expanded along with the constant accumulation of Internet video contents, and the number and scope of contents produced by various video stations will be far beyond the number of words already searched by users, thereby being beneficial to expand the recall rate.

Description

Word segmentation information push method and device based on video search

Technical field

The present invention relates to the technical field of the Internet, and in particular, to a word segmentation information push method based on video search and a word segmentation information push device based on video search.

Background art

A video search engine is a vertical search technology, distinct from general-purpose search. It crawls video results on the Internet and builds an index over them. Because it returns only video results, it can greatly reduce the time users spend looking for videos.

According to relevant statistics on video search, entertainment, games, movies, news, animation, and similar types of video are the main objects that users search for. This reflects what users generally want from video search. Users often do not have a strong purpose: the search result does not have to be one specific item, but has a certain degree of flexibility, as long as the target falls within the range of what the user likes. Therefore, related recommendations are often made to users in addition to the search results.

However, existing video search engines fall short in related recommendations. Some provide no related recommendations at all; those that do simply build an association system from users' search history data and manual curation, and base their recommendations on it. Such a recommendation system depends on users' existing search habits, so its recall rate is low. In addition, the range of what users search for is generally much smaller than the resources available on the Internet, so high-quality videos on the Internet cannot be fully exploited.

Another approach to search recommendation is to manually compile a resource association system, or to obtain one from another knowledge system, and apply it to the recommendation system. For example, a search for "square dance" yields recommended words such as "social dance", "belly dance", and "aerobics", and a search for "dota" yields recommended words such as "crossing the fire line" and "World of Warcraft". However, such a system has a low recall rate and generally cannot make recommendations for long-tail searches.

Summary of the invention

In view of the above problems, the present invention is proposed to provide a video search-based word segmentation information push method and a corresponding video search-based word segmentation information push device that overcome the above problems or at least partially solve them.

According to an aspect of the present invention, a method for word segmentation information push based on video search is provided, including:

Receiving a video search string;

Mapping the video search string to one or more first word segments;

Finding an associated second participle whose co-occurrence rate with the one or more first participles is higher than a preset threshold, the co-occurrence rate being the probability that the current one or more first participles and the second participle co-occur in the same video resource data;

Pushing a combination of the one or more first word segments and the one or more associated second word segments.

According to another aspect of the present invention, a push method for an online play portal object based on video search is provided, including:

Receiving a video search string;

Mapping the video search string to one or more first word segments;

Obtaining a network address of one or more video data resources that match the one or more first word segments and the associated second word segment;

Constructing an entry object for playing the video data resource online according to the one or more video data resource network addresses;

Pushing the one or more entry objects for playing the video data resource online.

According to another aspect of the present invention, a method for pushing an associated resource address based on a video search is provided, including:

Obtaining feature text information of the first video resource data when receiving a loading or playing request of the first video resource data;

Mapping the feature text information into one or more first word segments;

Obtaining a network link address of the second video resource data that matches the one or more first word segments and the associated second word segment;

Pushing the network link address of the second video resource data.

According to another aspect of the present invention, a video search-based word segmentation information pushing apparatus is provided, including:

a video search string receiving module adapted to receive a video search string;

a first word segmentation mapping module, configured to map the video search string into one or more first word segments;

a second participle finding module, configured to find an associated second participle whose co-occurrence rate with the one or more first participles is higher than a preset threshold, the co-occurrence rate being the probability that the current one or more first participles and the second participle co-occur in the same video resource data;

a combined push module, adapted to push a combination of the one or more first word segments and the one or more associated second word segments.

According to still another aspect of the present invention, a computer program is provided, comprising computer readable code which, when run on a computing device, causes the computing device to perform the video search-based word segmentation information push method according to any one of claims 1-7.

According to still another aspect of the present invention, a computer readable medium storing the computer program according to claim 17 is provided.

The beneficial effects of the invention are:

The invention can push content based on content that has already been published, which frees the search engine from its dependence on users' search habits. Video resource data that users seldom search for, but that has many related resources in the video library, is pushed out, so that high-quality resources in the video library are mined in depth and the efficiency of resource mining is improved. In addition, the index table keeps expanding as Internet video content accumulates, and the amount and breadth of content produced by the major video sites far exceed the number of words users have already searched, which helps to expand the recall rate.

By pushing the combination of the first participle and the second participle, the user can search at further levels directly from the combination. The user obtains more results with a simple search and does not need to submit searches repeatedly, which reduces the load on the server being accessed, reduces the consumption of network resources, and improves the user experience.

The above description is only an overview of the technical solutions of the present invention. So that the above and other objects, features, and advantages of the present invention can be understood more clearly, specific embodiments of the invention are set forth below.

Description of the drawings

Various other advantages and benefits will become apparent to those skilled in the art from the detailed description of the preferred embodiments below. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be construed as limiting the invention. Throughout the drawings, the same reference numerals refer to the same parts. In the drawings:

FIG. 1 is a flow chart showing the steps of an embodiment of a word segmentation information push method based on video search according to an embodiment of the present invention;

FIG. 2 is a flow chart showing the steps of an embodiment of a method for pushing an online play portal object based on video search according to an embodiment of the present invention;

FIG. 3 is a flow chart showing the steps of an embodiment of a method for pushing an associated resource address based on video search according to an embodiment of the present invention;

FIG. 4 is a block diagram showing an embodiment of a video search-based word segmentation information pushing apparatus according to an embodiment of the present invention;

FIG. 5 schematically shows a block diagram of a computing device for performing the method according to the invention;

FIG. 6 schematically shows a storage unit for holding or carrying program code implementing the method according to the invention.

Detailed description

The invention is further described below in conjunction with the drawings and specific embodiments.

Referring to FIG. 1, a flow chart of the steps of an embodiment of a word segmentation information push method based on video search according to an embodiment of the present invention is shown.

Step 101: Receive a video search string.

It should be noted that the video search string may be video search information input by the user, and may be used to request to search for video data resources related thereto.

In practical applications, the video search string may be a single word, that is, it contains one semantically independent word, such as "Mid-Autumn Festival", "Dragon Boat Festival", or "National Day". The video search string may also be a compound word, that is, it contains two or more semantically independent words, such as "Mid-Autumn moon cakes", "Dragon Boat Festival", or "National Day Tibet Tourism".

Step 102: Map the video search string into one or more first word segments;

It should be noted that the mapped word segments may be preset and may be used to calculate the co-occurrence rate between different word segments.

The mapping rules may also be preset, and there may be one or more of them. They may include removing dirty words, modifiers, modal particles, overly broad words, and the like from the video search string. They may include setting stop words, that is, common words such as "me" and "you" that are dropped from a phrase. They may also include association correspondences, which map multiple expressions of the same thing to a single expression; for example, "August 15th", "Mid-Autumn Festival", "Moon Cake Festival", and the like are all mapped to "Mid-Autumn Festival". Other mapping rules may also be included, which are not limited by the embodiment of the present invention.

English text is made up of words separated by spaces, whereas Chinese text is made up of characters, and the characters in a sentence must be grouped together to express a meaning. For example, the English sentence "I am a student" is "我是一个学生" in Chinese. A computer can easily tell from the spaces that "student" is a word, but it cannot easily tell that the characters "学" and "生" together form one word. Dividing a sequence of Chinese characters into meaningful words is Chinese word segmentation. For example, segmenting "我是一个学生" yields: 我 (I), 是 (am), 一个 (a), 学生 (student).

Here are some common word segmentation methods:

1. Word segmentation based on string matching: the Chinese character string to be analyzed is matched, according to a certain strategy, against the entries of a preset machine dictionary. If a string is found in the dictionary, the match succeeds (a word is identified). Practical word segmentation systems use mechanical segmentation as a first pass and then use various other linguistic information to further improve segmentation accuracy.

2. Word segmentation based on feature scanning or mark segmentation: words with obvious features in the string to be analyzed are identified and split off first. Using these words as breakpoints, the original string can be divided into smaller strings that are then segmented mechanically, which reduces the matching error rate. Alternatively, word segmentation is combined with part-of-speech tagging: rich part-of-speech information helps the segmentation decisions, and the tagging process in turn checks and adjusts the segmentation results, thereby improving segmentation accuracy.

3. Word segmentation based on understanding: words are identified by having the computer simulate human understanding of the sentence. The basic idea is to perform syntactic and semantic analysis at the same time as word segmentation, and to use the syntactic and semantic information to resolve ambiguity. Such a system usually consists of three parts: a word segmentation subsystem, a syntactic and semantic subsystem, and a general control part. Under the coordination of the general control part, the word segmentation subsystem obtains syntactic and semantic information about words, sentences, and so on to resolve segmentation ambiguity; that is, it simulates the process by which a person understands the sentence. This method requires a large amount of linguistic knowledge and information.

4. Statistics-based word segmentation: the frequency or probability with which characters occur next to each other in Chinese text reflects fairly well how likely they are to form a word. The frequency of each combination of adjacently co-occurring characters in the corpus can therefore be counted and their mutual information calculated, that is, the adjacent co-occurrence probability of two Chinese characters X and Y. The mutual information reflects how closely the characters are bound together. When this closeness exceeds a certain threshold, the character group may be considered to constitute a word. This method only needs to count the frequencies of character groups in the corpus and does not need a segmentation dictionary.
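As an illustration of the first method in the list above, the following is a minimal sketch of forward maximum matching, one common string-matching strategy. The text does not say which matching strategy is used, so the strategy, the function name `forward_max_match`, the `max_len` parameter, and the toy dictionary are all assumptions made for this example. The sample sentence "我是一个学生" ("I am a student") is the presumed original-language form of the example sentence.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Greedy segmentation: at each position take the longest dictionary entry."""
    segments = []
    i = 0
    while i < len(text):
        # Try the longest window first and shrink it until a dictionary hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or candidate in dictionary:
                segments.append(candidate)
                i += size
                break
    return segments


# Hypothetical toy dictionary, for illustration only.
toy_dictionary = {"我", "是", "一个", "学生"}
print(forward_max_match("我是一个学生", toy_dictionary))
```

A real system would load a large machine dictionary and, as the text notes, refine this mechanical first pass with other linguistic information.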

In a preferred embodiment of the present invention, the step 102 may specifically include the following sub-steps:

Sub-step S11, extracting a participle mapped by the video search string;

When the video search string is a single word, the corresponding word segment can be extracted directly according to a preset mapping rule. For example, if the video search string is "Mid-Autumn Festival", "My Mid-Autumn Festival", or another variant of "Mid-Autumn Festival", the mapped first participle may be "Mid-Autumn Festival". Of course, the video search string may also be identical to the mapped first participle: if the video search string is "Mid-Autumn Festival", the mapped first participle may also be "Mid-Autumn Festival".

or,

Sub-step S12, when the received video search string is a compound word, splitting the video search string into a plurality of search sub-words;

Sub-step S13, extracting a plurality of word segments mapped by the plurality of search sub-words.

When the video search string is a compound word, it may first be segmented according to a preset mapping rule to obtain search sub-words, and the word segment corresponding to each search sub-word is then extracted separately. For example, the received video search string "Mid-Autumn Festival Mooncake" can be split into the two search sub-words "Mid-Autumn Festival" and "Mooncake"; "Mid-Autumn Festival" is then mapped to "Mid-Autumn Festival" and "Mooncake" is mapped to "moon cake", giving the first participles "Mid-Autumn Festival" and "moon cake".
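Sub-steps S11 to S13 can be sketched as follows. This is only an illustrative sketch: the rule tables `STOP_WORDS` and `SYNONYMS`, the function name `map_to_first_segments`, and the whitespace-based splitting are hypothetical stand-ins for the preset mapping rules, which the text leaves open.

```python
# Hypothetical rule tables; the concrete mapping rules are not specified in the text.
STOP_WORDS = {"my", "me", "you"}
SYNONYMS = {
    "august 15th": "Mid-Autumn Festival",
    "mid-autumn festival": "Mid-Autumn Festival",
    "moon cake festival": "Mid-Autumn Festival",
    "mooncake": "moon cake",
    "moon cake": "moon cake",
}


def map_to_first_segments(search_string):
    """Split a search string into sub-words and map each to a preset word segment."""
    tokens = [t for t in search_string.lower().split() if t not in STOP_WORDS]
    segments = []
    i = 0
    while i < len(tokens):
        # Greedily take the longest run of tokens that has a mapping rule.
        for size in range(len(tokens) - i, 0, -1):
            phrase = " ".join(tokens[i:i + size])
            if phrase in SYNONYMS:
                mapped = SYNONYMS[phrase]
                if mapped not in segments:
                    segments.append(mapped)
                i += size
                break
        else:
            # No rule matched: keep the token itself as a word segment.
            if tokens[i] not in segments:
                segments.append(tokens[i])
            i += 1
    return segments


print(map_to_first_segments("Mid-Autumn Festival Mooncake"))
print(map_to_first_segments("My Mid-Autumn Festival"))
```

With these assumed tables, a single-word query maps to one first participle and a compound query maps to several, which is the distinction the later sub-steps branch on.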

Step 103: Search for an associated second participle whose co-occurrence rate with the one or more first participles is higher than a preset threshold;

The co-occurrence rate is the probability that the one or more first participles and the second participle co-occur in the same video resource data;

It should be noted that a second participle may be any participle, among all the preset participles, other than the first participle. An associated second participle may be a second participle whose co-occurrence rate with the first participle is higher than the preset threshold.

In practical applications, the video resource data may include feature text information, which may be used to record related information of the video resource data, and may also be used to extract word segmentation.

In a preferred embodiment of the invention, the feature text information may include a video title, a video keyword, and/or a video description.

For example, for a piece of video resource data named "[Photographer] After the rainstorm Dongguan turned into Venice, more than a thousand cars flooded and stalled - online play - XX network, HD video online viewing", the feature text information may be as follows:

Title: [Photographer] After the rainstorm Dongguan turned into Venice, more than a thousand cars flooded and stalled - online play - XX network, HD video online viewing;

Video Keywords: YY Reporter Life Information Dongguan Flooding;

Description: A heavy rain yesterday morning made residents of some neighborhoods in Dongguan feel as if they had arrived in Venice. Cars on the road were caught by flooding during the heavy rain, and some neighborhoods also turned into a sea of water.

Specifically, the co-occurrence rate may be the probability that the current one or more first participles and the second participle co-occur in the feature text information of the same video resource data. It may include the co-occurrence rate of a single first participle with a second participle, and the co-occurrence rate of multiple first participles with a second participle.

In a preferred embodiment of the present invention, the step 103 may specifically include the following sub-steps:

Sub-step S21, when the video search string is mapped to one first word segment, extracting a preset index table corresponding to the first word segment; wherein the index table includes information of the video resource data to which the first word segment belongs, and all the word segments in the video resource data; all the word segments in the video resource data are obtained by capturing the video resource data, extracting the feature text information of the video resource data, and performing word segmentation on the feature text information;

In a specific implementation, the search engine may use a crawler to capture the video resource data on each website platform in advance and then build the index database: the feature text information of the video resource data is extracted and segmented, and an index table is established for each word segment. The index table may store information of the video resource data (which may be a video identifier such as an ID, an intranet address, or an external network address, or may be a record consisting of the current participle and the other participles), together with all the participles in the video resource data (that is, the first participle and the second participles other than the first participle).

For example, the index table for "Mid-Autumn Festival" can be as follows:

In this example, the first participle is "Mid-Autumn Festival", and the information of the video resource data includes a video identifier. Of course, the information of the video resource data may omit the video identifier and consist only of records formed by the first participle and the second participles (that is, the second participles on each line form one record).

Of course, the foregoing index table is only an example. When implementing the embodiment of the present invention, those skilled in the art may set or use other index tables according to actual conditions and needs, which is not limited by the embodiment of the present invention.

It should be noted that the video resource data on each platform can be captured periodically or irregularly, and then the index database is updated, that is, each index table is updated.
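The index-building step described above can be sketched roughly as follows. The function name `build_index_tables`, the in-memory dictionary layout, and the use of `str.split` as a stand-in segmenter are assumptions for illustration; the text does not prescribe a storage format, and the crawling itself is out of scope here.

```python
from collections import defaultdict


def build_index_tables(videos, segment):
    """Build one index table per word segment.

    videos:  dict mapping a video identifier to its feature text
             (title, keywords, description joined together).
    segment: callable that turns feature text into a list of word segments.
    Returns: dict mapping each segment to {video_id: set of all segments in that video}.
    """
    tables = defaultdict(dict)
    for video_id, feature_text in videos.items():
        segments = set(segment(feature_text))
        for seg in segments:
            # One record per video: its identifier plus every segment found in it.
            tables[seg][video_id] = segments
    return tables


# Hypothetical toy corpus; whitespace splitting stands in for a real segmenter.
toy_videos = {
    "v1": "square dance teaching",
    "v2": "square dance soldier",
    "v3": "soldier parade",
}
toy_tables = build_index_tables(toy_videos, str.split)
print(sorted(toy_tables["soldier"]))
```

Re-running the builder over freshly captured data corresponds to the periodic or irregular update of the index tables.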

Sub-step S22, calculating the co-occurrence rate of the first participle with each second participle in the index table, where the co-occurrence rate is the ratio of the number of occurrences of that second participle in the index table to the total number of pieces of video resource data information in the index table; wherein a second participle is any participle, among all the participles in the video resource data, other than the first participle;

Since the number of occurrences of each second participle in the index table equals the number of pieces of video resource data to which it belongs, the co-occurrence rate may also be expressed as the ratio of the number of pieces of video resource data to which the second participle belongs in the index table to the total number of pieces of video resource data information in the index table.

For example, suppose the index table of the word segment "square dance" holds 100 pieces of video resource data information in total, the index table of "soldier brother" holds 200 pieces in total, and 10 pieces of video resource data information appear in both index tables. For "square dance", the co-occurrence rate of "square dance" and "soldier brother" is 10/100=10%. For "soldier brother", the co-occurrence rate of "soldier brother" and "square dance" is 10/200=5%.
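The asymmetry in this example can be reproduced with a small sketch. The function name `co_occurrence_rate` and the table layout (video identifier mapped to the set of segments in that video) are assumptions for illustration; only the counts 100, 200, and 10 come from the text.

```python
def co_occurrence_rate(index_table, second_segment):
    """Share of records in one first segment's index table that contain second_segment."""
    if not index_table:
        return 0.0
    hits = sum(1 for segments in index_table.values() if second_segment in segments)
    return hits / len(index_table)


# Toy data reproducing the counts in the example: 100 and 200 records, 10 shared.
shared = range(10)
square_table = {i: {"square dance"} for i in range(100)}
soldier_table = {i: {"soldier brother"} for i in list(shared) + list(range(100, 290))}
for i in shared:
    square_table[i].add("soldier brother")
    soldier_table[i].add("square dance")

print(co_occurrence_rate(square_table, "soldier brother"))
print(co_occurrence_rate(soldier_table, "square dance"))
```

Because the denominator is the size of the first participle's own index table, the rate depends on which of the two words is treated as the first participle.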

Sub-step S23, extracting the second participle with the co-occurrence rate higher than the preset threshold as the associated second participle.

In a specific implementation, the preset threshold may be set by a person skilled in the art according to actual conditions, which is not limited by the embodiment of the present invention. The set of associated second participles extracted in the embodiment of the present invention may be empty, or may contain one or more participles.
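Sub-steps S21 to S23 for a single first participle might be combined as in the sketch below. The function name `associated_second_segments`, the table layout, and the threshold values are assumptions for illustration only.

```python
from collections import Counter


def associated_second_segments(index_table, first_segment, threshold):
    """Return {second segment: co-occurrence rate} for rates above the threshold.

    index_table: dict mapping a video identifier to the set of word segments
                 found in that video (the index table of first_segment).
    """
    total = len(index_table)
    if total == 0:
        return {}
    counts = Counter()
    for segments in index_table.values():
        # Count each second segment once per video, excluding the first segment.
        counts.update(set(segments) - {first_segment})
    return {seg: n / total for seg, n in counts.items() if n / total > threshold}


# Hypothetical toy index table for the first participle "dota".
dota_table = {
    "v1": {"dota", "funny"},
    "v2": {"dota", "funny", "classic"},
    "v3": {"dota"},
    "v4": {"dota", "funny"},
}
print(associated_second_segments(dota_table, "dota", threshold=0.5))
```

With a high enough threshold the result is an empty dictionary, which matches the case where no associated second participle is extracted.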

Sub-step S31, when the video search string is mapped to a plurality of first word segments, extracting the plurality of preset index tables corresponding respectively to the plurality of first word segments; wherein each index table includes information of the video resource data to which the corresponding first word segment belongs, and all the word segments in the video resource data; all the word segments in the video resource data are obtained by capturing the video resource data, extracting the feature text information of the video resource data, and performing word segmentation on the feature text information;

In a specific implementation, the search engine may use a crawler to capture the video resource data on each platform in advance and then build the index: the feature text information of the video resource data is extracted and segmented, and an index table is established for each word segment. The index table may store information of the video resource data (which may be a video identifier such as an ID, an intranet address, or an external network address, or may be a record consisting of the current participle and the other participles), together with all the participles in the video resource data (that is, the first participle and the second participles other than the first participle).

Sub-step S32, extracting, as a candidate participle, a second participle that appears together with each of the plurality of first participles; wherein a second participle is any participle, among all the participles in the video resource data, other than the first participles;

Specifically, since there are currently a plurality of first word segments, there are also a plurality of corresponding index tables. A candidate word segment must appear in every one of these index tables, that is, it appears together with each current first word segment in that word segment's index table.
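The requirement that a candidate appear in every index table amounts to a set intersection, as in the following sketch. The function name `candidate_segments` and the table layout are assumptions for illustration.

```python
def candidate_segments(index_tables, first_segments):
    """Second segments that occur in the index table of every first segment.

    index_tables: dict mapping a first segment to its index table, where each
                  index table maps a video identifier to that video's segments.
    """
    per_table = []
    for first in first_segments:
        seen = set()
        for segments in index_tables[first].values():
            seen |= set(segments)
        # The first segments themselves are not second segments.
        per_table.append(seen - set(first_segments))
    return set.intersection(*per_table) if per_table else set()


# Hypothetical toy index tables for two first participles.
toy_index = {
    "Mid-Autumn Festival": {
        "v1": {"Mid-Autumn Festival", "moon"},
        "v2": {"Mid-Autumn Festival", "lantern"},
    },
    "moon cake": {
        "v3": {"moon cake", "moon"},
        "v4": {"moon cake", "recipe"},
    },
}
print(candidate_segments(toy_index, ["Mid-Autumn Festival", "moon cake"]))
```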

Sub-step S33, calculating, in each index table, the co-occurrence rate of the corresponding first participle with the candidate participle, where the co-occurrence rate is the ratio of the number of occurrences of the candidate participle in the index table to the total number of pieces of video resource data information in the index table;

For example, the video search string "Mid-Autumn Festival Mooncake" can be mapped to the first participles "Mid-Autumn Festival" and "moon cake". If one of the extracted candidate participles is "moon", the co-occurrence rate of "Mid-Autumn Festival" and "moon" (assumed to be 70%) and the co-occurrence rate of "moon cake" and "moon" (assumed to be 60%) can be calculated separately.

Sub-step S34, configuring a corresponding weight for each of the co-occurrence rates of the plurality of first word segments with the candidate word segment;

The weights may be determined from the relative sizes of the first participles' index tables, that is, from the ratio between the total numbers of pieces of video resource data information in those index tables: the larger the total number in an index table, the greater the weight. For example, if the index table of "Mid-Autumn Festival" holds 900 pieces of video resource data information in total and the index table of "moon cake" holds 100, the weight of the co-occurrence rate of "Mid-Autumn Festival" and "moon" can be 0.9, and the weight of the co-occurrence rate of "moon cake" and "moon" can be 0.1.

Of course, the foregoing weights are only examples. When implementing the embodiment of the present invention, those skilled in the art may set other weights according to actual conditions and needs, for example weights set according to current social hotspots (news rankings, microblog rankings, and the like) or according to the user's local and/or online operation behavior (video playback, news reading, and the like), which is not limited by the embodiment of the present invention.

Sub-step S35, calculating the average value of the plurality of weighted co-occurrence rates as the co-occurrence rate of the plurality of first word segments with the candidate participle;

In the embodiment of the present invention, the weighted average of multiple co-occurrence rates may be used as the final co-occurrence rate.

For example, the co-occurrence rate of "Mid-Autumn Festival" and "moon cake" with "moon" can be (70%*0.9+60%*0.1)/2=34.5%.
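The arithmetic of sub-steps S34 and S35 can be checked with a short sketch. The function name `combined_co_occurrence` is hypothetical. The weights are taken as each index table's share of the total record count, and the weighted rates are then averaged over the number of first participles, which follows the worked example in the text (including its division by 2).

```python
def combined_co_occurrence(rates, table_sizes):
    """Average of the weighted co-occurrence rates for one candidate segment.

    rates:       co-occurrence rate of the candidate with each first segment.
    table_sizes: number of records in each first segment's index table.
    """
    total = sum(table_sizes)
    # Weight of each rate = that index table's share of all records.
    weights = [size / total for size in table_sizes]
    weighted = [rate * weight for rate, weight in zip(rates, weights)]
    # Average over the number of first segments, as in the worked example.
    return sum(weighted) / len(weighted)


# "Mid-Autumn Festival" (900 records, 70%) and "moon cake" (100 records, 60%).
result = combined_co_occurrence([0.70, 0.60], [900, 100])
print(round(result, 4))
```

Other weighting schemes mentioned in the text, such as weights driven by social hotspots or user behavior, would replace only the `weights` line.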

Sub-step S36, extracting the candidate participle with the co-occurrence rate higher than the preset threshold as the associated second participle.

Sub-step S41, when the video search string is mapped to a plurality of first word segments, extracting the plurality of preset index tables corresponding respectively to the plurality of first word segments; wherein each index table includes information of the video resource data to which the corresponding first word segment belongs, and all the word segments in the video resource data; all the word segments in the video resource data are obtained by capturing the video resource data, extracting the feature text information of the video resource data, and performing word segmentation on the feature text information;

Sub-step S42, determining the main participle by using the plurality of index tables, where the main participle is the first participle corresponding to the index table with the largest total number of pieces of information of the video resource data;

In order to improve the user experience, when the index tables of a plurality of first participles differ greatly in the amount of video resource data information they hold, the first participles with little video resource data information may be ignored. For example, for the first participles "Mid-Autumn Festival" and "moon cake" mapped from the video search string "Mid-Autumn Festival Mooncake", if the index table of "Mid-Autumn Festival" holds 900 pieces of video resource data information in total and the index table of "moon cake" holds 100, "Mid-Autumn Festival" can be set as the main participle.

Sub-step S43, calculating the co-occurrence rate of the main participle with each second participle in the index table corresponding to the main participle, where the co-occurrence rate is the ratio of the number of occurrences of that second participle in the index table to the total number of pieces of video resource data information in the index table; wherein a second participle is any participle, among all the participles in the video resource data, other than the first participle;

In the embodiment of the present invention, the co-occurrence rate of the main participle can be used as the final co-occurrence rate.

Sub-step S44, extracting the second participle with the co-occurrence rate higher than the preset threshold as the associated second participle.
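Sub-steps S41 to S44 can be sketched as follows. The function name `main_segment_associations` and the table layout are hypothetical, and excluding all of the first participles (rather than only the main participle) from the second participles is an assumption of this sketch.

```python
from collections import Counter


def main_segment_associations(index_tables, first_segments, threshold):
    """Pick the main segment and return it with its associated second segments."""
    # The main segment is the first segment whose index table has the most records.
    main = max(first_segments, key=lambda seg: len(index_tables[seg]))
    table = index_tables[main]
    counts = Counter()
    for segments in table.values():
        # Assumption: none of the first segments counts as a second segment.
        counts.update(set(segments) - set(first_segments))
    total = len(table)
    associated = {seg: n / total for seg, n in counts.items() if n / total > threshold}
    return main, associated


# Hypothetical toy index tables; "Mid-Autumn Festival" has the larger table.
festival = "Mid-Autumn Festival"
toy_main_index = {
    festival: {
        "v1": {festival, "moon"},
        "v2": {festival, "moon", "moon cake"},
        "v3": {festival, "moon"},
        "v4": {festival, "lantern"},
    },
    "moon cake": {
        "v2": {festival, "moon", "moon cake"},
    },
}
print(main_segment_associations(toy_main_index, [festival, "moon cake"], 0.5))
```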

Step 104: Push a combination of the one or more first word segments and the one or more associated second word segments.

Specifically, after sub-step S23, combinations of the current first participle and one or more associated second participles may be pushed at a position such as the drop-down menu of the input box on the web page. For example, if the video search string is "dota" and the words that co-occur with it are "funny", "egg pain", "2009", "sea Tao", "first perspective", and "classic", with co-occurrence rates of 40%, 35%, 30%, 25%, 20%, and 10% respectively, then the combinations "dota funny", "dota egg pain", "dota2009", "dota sea Tao", "dota first perspective", and "dota classic" are pushed.

After sub-step S36, combinations of the current plurality of first word segments and one or more associated second participles may be pushed at a position such as the drop-down menu of the input box on the web page. For example, the video search string "square dance soldier brother" is mapped to the first participles "square dance" and "soldier brother", and a second participle that appears together with both first participles is extracted, for example "teaching". If it qualifies as an associated second participle, the combination "square dance soldier brother teaching" is finally pushed.

In a preferred embodiment of the present invention, step 104 may specifically include the following sub-steps:

Sub-step S51, pushing the main participle and the associated second participle.

After sub-step S44, a combination of the current main participle and one or more associated second participles can be pushed at a position such as the drop-down menu of the input box of the web page. For example, for the first participles "Mid-Autumn Festival" and "moon cake" mapped from the video search string "Mid-Autumn Festival Mooncake", "Mid-Autumn Festival" can be set as the main participle; if the second participle "moon" is obtained, the combination "Mid-Autumn Festival moon" can be pushed.

Users can search for new video resource data by clicking on a pushed combination in the drop-down menu.
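The push of step 104 can be illustrated with the "dota" example above. The following is only a minimal sketch: the function name `build_suggestions`, the space used to join the participles, and the 5% threshold are assumptions made for illustration and are not specified in this description.

```python
def build_suggestions(first_participle, rates, threshold):
    """Combine a first participle with each associated second participle.

    `rates` maps a second participle to its co-occurrence rate with the
    first participle. Only rates above `threshold` are kept, and the
    combinations are ordered from the highest rate to the lowest.
    """
    ranked = sorted(
        ((second, rate) for second, rate in rates.items() if rate > threshold),
        key=lambda item: item[1],
        reverse=True,
    )
    # Joining with a space is an assumption; the description also shows
    # combinations written without a separator, such as "dota2009".
    return [f"{first_participle} {second}" for second, _ in ranked]


# Co-occurrence rates taken from the "dota" example in the description.
dota_rates = {
    "classic": 0.10,
    "funny": 0.40,
    "2009": 0.30,
    "egg pain": 0.35,
    "first perspective": 0.20,
    "sea Tao": 0.25,
}
suggestions = build_suggestions("dota", dota_rates, threshold=0.05)
```

With a higher threshold, for instance 0.28, only the three most frequent combinations would remain in the drop-down menu.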

FIG. 2 is a flow chart showing the steps of an embodiment of a method for pushing an online play entry object based on video search according to an embodiment of the present invention, which may specifically include the following steps:

Step 201: Receive a video search string.

Step 202: Map the video search string into one or more first word segments;

In a preferred embodiment of the present invention, the step 202 may specifically include the following sub-steps:

Sub-step S61, extracting a participle mapped by the video search string;

or,

Sub-step S62, when the received video search string is a compound word, splitting the video search string into a plurality of search sub-words;

Sub-step S63, extracting a plurality of word segments mapped by the plurality of search sub-words.

Step 203: Search for an associated second participle with a co-occurrence rate of the one or more first participles that is higher than a preset threshold.

In a preferred embodiment of the present invention, the step 203 may specifically include the following sub-steps:

Sub-step S71, when the video search string is mapped to one first word segment, extracting a preset index table corresponding to the first word segment; wherein the index table includes information of the video resource data to which the first word segment belongs, and all the word segments in the video resource data; all the word segments in the video resource data are obtained by capturing the video resource data, extracting feature text information of the video resource data, and performing word segmentation on the feature text information;

Sub-step S72, calculating a co-occurrence rate of the first participle and each second participle in the index table, where the co-occurrence rate is the ratio of the number of occurrences of each second participle in the index table to the total number of pieces of information of the video resource data in the index table; wherein the second participle is any participle among all the participles in the video resource data other than the first participle;

Sub-step S73, extracting the second participle with the co-occurrence rate higher than the preset threshold as the associated second participle.
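Sub-steps S71 to S73 can be sketched as follows. This is a hedged illustration rather than the implementation of the embodiment: the index table is assumed to be a plain list holding, for each piece of video resource data that contains the first participle, the list of its participles, and an occurrence is assumed to be counted once per video. The name `associated_second_participles` and the sample data are hypothetical.

```python
from collections import Counter


def associated_second_participles(index_table, first, threshold):
    """Return {second participle: co-occurrence rate} for rates above threshold.

    index_table: one participle list per video resource containing `first`.
    The rate is (videos containing the second participle) / (all videos
    in the index table).
    """
    total = len(index_table)
    if total == 0:
        return {}
    counts = Counter()
    for participles in index_table:
        # Count each second participle at most once per video (assumption).
        counts.update(set(participles) - {first})
    rates = {word: n / total for word, n in counts.items()}
    return {word: rate for word, rate in rates.items() if rate > threshold}


# Hypothetical index table for the first participle "dota".
dota_table = [
    ["dota", "funny", "classic"],
    ["dota", "funny"],
    ["dota", "2009"],
    ["dota", "funny", "2009"],
    ["dota", "first perspective"],
]
associated = associated_second_participles(dota_table, "dota", threshold=0.3)
```

In this sample, "funny" appears in 3 of 5 videos and "2009" in 2 of 5, so both exceed the 30% threshold, while "classic" and "first perspective" (1 of 5 each) are discarded.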

Sub-step S81, when the video search string is mapped to a plurality of first word segments, respectively extracting a plurality of preset index tables corresponding to the plurality of first word segments; wherein each index table includes information of the video resource data to which the corresponding first word segment belongs, and all the word segments in the video resource data; all the word segments in the video resource data are obtained by capturing the video resource data, extracting feature text information of the video resource data, and performing word segmentation on the feature text information;

Sub-step S82, extracting a second participle that appears together with the plurality of first participles as a candidate participle; wherein the second participle is any participle among all the participles in the video resource data other than the first participle;

Sub-step S83, calculating a co-occurrence rate of the first participle and the candidate participle in each index table, where the co-occurrence rate is the ratio of the number of occurrences of the candidate participle in the index table to the total number of pieces of information of the video resource data in the index table;

Sub-step S84, respectively configuring corresponding weights for the plurality of co-occurrence rates of the plurality of first word segments and the candidate participle;

Sub-step S85, calculating the average value of the plurality of weighted co-occurrence rates as the co-occurrence rate of the plurality of first word segments and the candidate participle;

Sub-step S86, extracting the candidate participle with the co-occurrence rate higher than the preset threshold as the associated second participle.
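Sub-steps S81 to S86 can be sketched in the same spirit. Several points are assumptions made for illustration only: each index table is a list of participle lists, a candidate is a participle found in every index table, and the "average of the weighted rates" is computed as a weighted mean normalized by the sum of the weights (the description does not fix the normalization). The function names and sample data are hypothetical.

```python
from collections import Counter


def rate_table(index_table, firsts):
    """Per-table co-occurrence rate of every participle not in `firsts`."""
    total = len(index_table)
    counts = Counter()
    for participles in index_table:
        counts.update(set(participles) - firsts)
    return {word: n / total for word, n in counts.items()}


def associated_for_multiple(index_tables, weights, threshold):
    """index_tables: {first participle: [participle list per video]}.
    weights: {first participle: weight of its co-occurrence rate}."""
    if not index_tables:
        return {}
    firsts = set(index_tables)
    per_first = {f: rate_table(t, firsts) for f, t in index_tables.items()}
    # S82: candidates co-occur with every first participle.
    candidates = set.intersection(*(set(r) for r in per_first.values()))
    weight_sum = sum(weights[f] for f in firsts)
    result = {}
    for cand in candidates:
        # S84/S85: weighted mean of the per-table rates (assumed normalization).
        rate = sum(per_first[f][cand] * weights[f] for f in firsts) / weight_sum
        if rate > threshold:  # S86
            result[cand] = rate
    return result


tables = {
    "square dance": [
        ["square dance", "soldier brother", "teaching"],
        ["square dance", "teaching"],
        ["square dance", "music"],
        ["square dance", "soldier brother"],
    ],
    "soldier brother": [
        ["soldier brother", "square dance", "teaching"],
        ["soldier brother", "song"],
    ],
}
weights = {"square dance": 0.6, "soldier brother": 0.4}
associated = associated_for_multiple(tables, weights, threshold=0.3)
```

Here "teaching" is the only participle present in both tables, with a rate of 0.5 in each, so its combined rate is 0.5 and it is kept as the associated second participle.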

Sub-step S91, when the video search string is mapped to a plurality of first word segments, respectively extracting a plurality of preset index tables corresponding to the plurality of first word segments; wherein each index table includes information of the video resource data to which the corresponding first word segment belongs, and all the word segments in the video resource data; all the word segments in the video resource data are obtained by capturing the video resource data, extracting feature text information of the video resource data, and performing word segmentation on the feature text information;

Sub-step S92, determining the main participle by using the plurality of index tables, where the main participle is the first participle corresponding to the index table with the largest total number of pieces of information of the video resource data;

Sub-step S93, calculating a co-occurrence rate of the main participle and each second word segment in the corresponding index table, where the co-occurrence rate is the ratio of the number of occurrences of each second word segment in the index table to the total number of pieces of information of the video resource data in the index table; wherein the second participle is any participle among all the participles in the video resource data other than the first participle;

Sub-step S94, extracting the second participle with the co-occurrence rate higher than the preset threshold as the associated second participle.
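The main participle variant of sub-steps S91 to S94 can be sketched as below. This is an illustrative sketch under stated assumptions: the index tables are lists of participle lists, at least one index table is supplied, and all first participles (not only the main one) are excluded from the second participles. The helper names and the sample data are hypothetical.

```python
from collections import Counter


def main_participle(index_tables):
    """S92: the first participle whose index table holds the most videos."""
    return max(index_tables, key=lambda first: len(index_tables[first]))


def associated_with_main(index_tables, threshold):
    """S93/S94: rates are computed only in the main participle's table."""
    main = main_participle(index_tables)
    table = index_tables[main]
    counts = Counter()
    for participles in table:
        # Exclude every first participle from the second participles (assumption).
        counts.update(set(participles) - set(index_tables))
    seconds = [w for w, n in counts.items() if n / len(table) > threshold]
    return main, seconds


tables = {
    "Mid-Autumn Festival": [
        ["Mid-Autumn Festival", "moon"],
        ["Mid-Autumn Festival", "moon", "moon cake"],
        ["Mid-Autumn Festival", "gala"],
    ],
    "moon cake": [
        ["moon cake", "recipe"],
        ["moon cake", "Mid-Autumn Festival", "moon"],
    ],
}
main, seconds = associated_with_main(tables, threshold=0.5)
combinations = [f"{main} {second}" for second in seconds]
```

Using only the largest index table avoids computing and weighting one rate per first participle, at the cost of ignoring how the second participle relates to the other first participles.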

Step 204: Obtain a network address of one or more video data resources that match the one or more first word segments and the associated second word segment;

After sub-step S73, a combination of the current first participle and one or more associated second participles can be obtained. For example, the video search string is "dota", and the participles that co-occur with it are "funny", "egg pain", "2009", "sea Tao", "first perspective" and "classic", with co-occurrence rates of 40%, 35%, 30%, 25%, 20% and 10% respectively; the combinations obtained are "dota funny", "dota egg pain", "dota2009", "dota sea Tao", "dota first perspective" and "dota classic".

After sub-step S86, a combination of the current plurality of first participles and one or more associated second participles can be obtained. For example, the video search string is "square dance soldier brother", which is mapped to the first participles "square dance" and "soldier brother"; a second participle that appears together with both first participles is extracted, for example the second participle "teaching", which can be used as the associated second participle, and the combination "square dance soldier brother teaching" is finally obtained.

In a preferred embodiment of the present invention, step 204 may specifically include the following sub-steps:

Sub-step S101: Obtain a network address of one or more video data resources that match the main participle and the associated second participle.

After sub-step S94, a combination of the current main participle and one or more associated second participles can be obtained. For example, for the first participles "Mid-Autumn Festival" and "moon cake" mapped from the video search string "Mid-Autumn Festival Mooncake", "Mid-Autumn Festival" can be set as the main participle and the second participle "moon" obtained, finally yielding the combination "Mid-Autumn Festival moon".

In the embodiment of the present invention, the search for matched video data resources may be performed based on the combination of the first participle and the second participle. During the search, the network address may be recorded, which may be an intranet address or an external network address.

Step 205: Construct an entry object for playing the video data resource online according to the network addresses of the one or more video data resources.

The entry object can be an icon or button in the web page that links to the online play URL. In a specific implementation, an icon or button may be configured in the current page or in an extended window and associated with the network address of the video data resource. When the user clicks the icon or button, the network address of the video data resource is triggered, and the corresponding video data resource under that URL is loaded from the database.
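As an illustration of step 205, an entry object could be rendered as an HTML link pointing at the online play URL. This is only a sketch under assumptions: the markup, the `play-entry` class name, the default label and the example URL are hypothetical and are not prescribed by the embodiment.

```python
from html import escape


def build_entry_object(play_url, label="Play online"):
    """Return an HTML anchor that links to the online play URL."""
    # Escape the URL and label so they are safe inside HTML markup.
    safe_url = escape(play_url, quote=True)
    return f'<a class="play-entry" href="{safe_url}">{escape(label)}</a>'


def build_entry_objects(play_urls):
    """Build one entry object per matched video network address."""
    return [build_entry_object(url) for url in play_urls]


entries = build_entry_objects(["https://video.example/play?id=3&from=search"])
```

The resulting markup can be placed anywhere in the current page; clicking it triggers the network address and loads the corresponding video data resource.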

Step 206: Push the one or more entry objects for playing the video data resource online.

In an actual application, the entry object can be placed at any position of the current page, and the user can trigger the entry object to trigger the network address of the video data resource corresponding to the entry object, thereby loading the video data resource.

For example, the user inputs the video search string "steel" in the search box, which can itself be used as the first participle; a video data resource, "Iron Man 3", matched by the combination of the first participle and the associated second participle is obtained. The entry object of this video resource is an icon labeled "Watch Now" to prompt the user. When the user clicks the icon, the play page of "Iron Man 3" is opened.

By pushing the entry object for playing the video data resource online, the user can directly obtain more video search results through the entry object. The user thus obtains more results from a single simple search without submitting the search multiple times, which reduces the access load on the server, reduces the occupation of network resources, and improves the user experience.

Referring to FIG. 3, a flow chart of the steps of an embodiment of a method for pushing an associated resource address based on video search according to an embodiment of the present invention is shown.

Step 301, when receiving a loading or playing request of the first video resource data, acquiring feature text information of the first video resource data;

It should be noted that the first video resource data may be located on the terminal device or may be located on the network, and the feature text information may be information carried by the video resource data.

In a preferred embodiment of the present invention, the step 301 may specifically include the following sub-steps:

Sub-step S111, when receiving the play request of the first video data, receiving the feature text information of the first video resource data sent by the current terminal;

When the first video resource data is located on the terminal device, the feature text information of the first video resource data may be extracted by the terminal device, and then uploaded to the corresponding server side.

or,

Sub-step S112, when receiving the first video data loading request, extracting the feature text information of the video resource data preset locally.

When the first video resource data is located on the network, the feature text information of the first video resource data may be extracted by the server side.

Step 302: Map the feature text information into one or more first word segments;

In a preferred embodiment of the present invention, the step 302 may specifically include the following sub-steps:

Sub-step S121, extracting a participle mapped by the feature text information;

or,

Sub-step S122, when the received feature text information is a compound word, splitting the feature text information into a plurality of search sub-words;

Sub-step S123, extracting a plurality of word segments mapped by the plurality of search sub-words.

Step 303: Search for an associated second participle with a co-occurrence rate of the one or more first participles that is higher than a preset threshold.

In a preferred embodiment of the present invention, the step 303 may specifically include the following sub-steps:

Sub-step S131, when the feature text information is mapped to one first word segment, extracting a preset index table corresponding to the first word segment; wherein the index table includes information of the video resource data to which the first word segment belongs, and all the word segments in the video resource data; all the word segments in the video resource data are obtained by capturing the video resource data, extracting feature text information of the video resource data, and performing word segmentation on the feature text information;

Sub-step S132, calculating a co-occurrence rate of the first participle and each second participle in the index table, where the co-occurrence rate is the ratio of the number of occurrences of each second participle in the index table to the total number of pieces of information of the video resource data in the index table; wherein the second participle is any participle among all the participles in the video resource data other than the first participle;

Sub-step S133, extracting the second participle with the co-occurrence rate higher than the preset threshold as the associated second participle.

Sub-step S141, when the feature text information is mapped to a plurality of first word segments, respectively extracting a plurality of preset index tables corresponding to the plurality of first word segments; wherein each index table includes information of the video resource data to which the corresponding first word segment belongs, and all the word segments in the video resource data; all the word segments in the video resource data are obtained by capturing the video resource data, extracting feature text information of the video resource data, and performing word segmentation on the feature text information;

Sub-step S142, extracting a second participle that appears together with the plurality of first participles as a candidate participle; wherein the second participle is any participle among all the participles in the video resource data other than the first participle;

Sub-step S143, calculating a co-occurrence rate of the first participle and the candidate participle in each index table, where the co-occurrence rate is the ratio of the number of occurrences of the candidate participle in the index table to the total number of pieces of information of the video resource data in the index table;

Sub-step S144, respectively configuring corresponding weights for the plurality of co-occurrence rates of the plurality of first word segments and the candidate participle;

Sub-step S145, calculating the average value of the plurality of weighted co-occurrence rates as the co-occurrence rate of the plurality of first word segments and the candidate participle;

Sub-step S146, extracting the candidate participle with the co-occurrence rate higher than the preset threshold as the associated second participle.

Sub-step S151, when the feature text information is mapped to a plurality of first word segments, respectively extracting a plurality of preset index tables corresponding to the plurality of first word segments; wherein each index table includes information of the video resource data to which the corresponding first word segment belongs, and all the word segments in the video resource data; all the word segments in the video resource data are obtained by capturing the video resource data, extracting feature text information of the video resource data, and performing word segmentation on the feature text information;

Sub-step S152, determining a main participle by using the plurality of index tables, where the main participle is the first participle corresponding to the index table with the largest total number of pieces of information of the video resource data;

Sub-step S153, calculating a co-occurrence rate of the main participle and each second word segment in the corresponding index table, where the co-occurrence rate is the ratio of the number of occurrences of each second word segment in the index table to the total number of pieces of information of the video resource data in the index table; wherein the second participle is any participle among all the participles in the video resource data other than the first participle;

Sub-step S154, extracting the second participle with the co-occurrence rate higher than the preset threshold as the associated second participle.

Step 304: Obtain a network link address of the second video resource data that matches the one or more first word segments and the associated second word segment.

Specifically, after sub-step S133, a combination of the current first participle and one or more associated second participles can be obtained. For example, the feature text information is "dota", and the participles that co-occur with it are "funny", "egg pain", "2009", "sea Tao", "first perspective" and "classic", with co-occurrence rates of 40%, 35%, 30%, 25%, 20% and 10% respectively; the combinations obtained are "dota funny", "dota egg pain", "dota2009", "dota sea Tao", "dota first perspective" and "dota classic".

After sub-step S146, a combination of the current plurality of first participles and one or more associated second participles can be obtained. For example, the feature text information is "square dance soldier brother", which is mapped to the first participles "square dance" and "soldier brother"; a second participle that appears together with both first participles is extracted, for example the second participle "teaching", which can be used as the associated second participle, and the combination "square dance soldier brother teaching" is finally obtained.

In a preferred embodiment of the present invention, step 304 may specifically include the following sub-steps:

Sub-step S161, acquiring a network link address of the second video resource data that matches the main participle and the associated second participle.

After sub-step S154, a combination of the current main participle and one or more associated second participles can be obtained. For example, for the first participles "Mid-Autumn Festival" and "moon cake" mapped from the feature text information "Mid-Autumn Festival Mooncake", "Mid-Autumn Festival" can be set as the main participle and the second participle "moon" obtained, finally yielding the combination "Mid-Autumn Festival moon".

In the embodiment of the present invention, the search for matched video data resources may be performed based on the combination of the first participle and the second participle. During the search, the network link address may be recorded, which may be an intranet address or an external network address.

Step 305: Push a network link address of the second video resource data.

In an actual application, the network link address of the second video resource data may be placed at any position on the current page, or may be pushed by embedding it in an icon or a button, and the user may load the video data resource by triggering the network link address of the second video resource data.

The invention obtains the network link address of the second video resource data matched by the first participle and the second participle, and the user can directly obtain the video data resource based on that address. The user thus obtains more results from a single simple search without submitting the search multiple times, which reduces the access load on the server, reduces the occupation of network resources, and improves the user experience.

For the sake of simple description, the method embodiments are all expressed as a series of action combinations, but those skilled in the art should understand that the embodiments of the present invention are not limited by the described action sequence, because according to the embodiments of the present invention some steps can be performed in other orders or at the same time. Secondly, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.

Referring to FIG. 4, a block diagram of an embodiment of a video search-based word segmentation information pushing apparatus according to an embodiment of the present invention is shown. Specifically, the following modules may be included:

The video search string receiving module 401 is adapted to receive a video search string;

The first participle mapping module 402 is adapted to map the video search string into one or more first word segments;

The second participle finding module 403 is adapted to search for an associated second participle whose co-occurrence rate with the one or more first participles is higher than a preset threshold; the co-occurrence rate is the probability that the current one or more first participles and the second participle co-occur in the same video resource data;

The pushing module 404 is adapted to push a combination of the one or more first word segments and the one or more associated second word segments.

In a preferred embodiment of the present invention, the first participle mapping module 402 may further be adapted to:

Extracting a participle mapped by the video search string;

or,

When the received video search string is a compound word, the video search string is split into a plurality of search subwords; and a plurality of word segments mapped by the plurality of search subwords are extracted.

In a preferred embodiment of the present invention, the second participle finding module 403 is further adapted to:

Extracting a preset index table corresponding to the first word segment when the video search string is mapped to one first word segment; wherein the index table includes information of the video resource data to which the first word segment belongs, and all the word segments in the video resource data; all the word segments in the video resource data are obtained by capturing the video resource data, extracting feature text information of the video resource data, and performing word segmentation on the feature text information;

Calculating a co-occurrence rate of the first participle and each second participle in the index table, where the co-occurrence rate is the ratio of the number of occurrences of each second participle in the index table to the total number of pieces of information of the video resource data in the index table; wherein the second participle is any participle among all the participles in the video resource data other than the first participle;

Extracting the second participle with the co-occurrence rate higher than the preset threshold as the associated second participle.

When the video search string is mapped to a plurality of first word segments, respectively extracting a plurality of preset index tables corresponding to the plurality of first word segments; wherein each index table includes information of the video resource data to which the corresponding first word segment belongs, and all the word segments in the video resource data; all the word segments in the video resource data are obtained by capturing the video resource data, extracting feature text information of the video resource data, and performing word segmentation on the feature text information;

Extracting a second participle that appears together with the plurality of first participles as a candidate participle; wherein the second participle is any participle among all the participles in the video resource data other than the first participle;

Calculating a co-occurrence rate of the first participle and the candidate participle in each index table, where the co-occurrence rate is the ratio of the number of occurrences of the candidate participle in the index table to the total number of pieces of information of the video resource data in the index table;

Respectively configuring corresponding weights for the plurality of co-occurrence rates of the plurality of first word segments and the candidate participle;

Calculating the average value of the plurality of weighted co-occurrence rates as the co-occurrence rate of the plurality of first word segments and the candidate participle;

Extracting the candidate participle with the co-occurrence rate higher than the preset threshold as the associated second participle.

When the video search string is mapped to a plurality of first word segments, respectively extracting a plurality of preset index tables corresponding to the plurality of first word segments; wherein each index table includes information of the video resource data to which the corresponding first word segment belongs, and all the word segments in the video resource data; all the word segments in the video resource data are obtained by capturing the video resource data, extracting feature text information of the video resource data, and performing word segmentation on the feature text information;

Determining a main participle by using the plurality of index tables, where the main participle is the first participle corresponding to the index table with the largest total number of pieces of information of the video resource data;

Calculating a co-occurrence rate of the main participle and each second participle in the corresponding index table, where the co-occurrence rate is the ratio of the number of occurrences of each second participle in the index table to the total number of pieces of information of the video resource data in the index table; wherein the second participle is any participle among all the participles in the video resource data other than the first participle;

In a preferred embodiment of the invention, the feature text information includes a video title, a video keyword, and/or a video description.

In a preferred embodiment of the present invention, the pushing module 404 can also be adapted to:

Pushing a combination of the main participle and the associated second participle.
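The cooperation of modules 401 to 404 can be sketched as a small pipeline. This is a simplified illustration, not the device itself: the class name `SegmentPushDevice`, the substring-based mapping in `map_first` (a stand-in for a real word segmenter), the in-memory index tables and the default threshold are all assumptions. The finding step uses the main participle variant described above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SegmentPushDevice:
    # Maps each first participle to the participle lists of its videos.
    index_tables: dict
    threshold: float = 0.3

    def receive(self, search_string):
        """Module 401: receive the video search string."""
        return search_string.strip()

    def map_first(self, search_string):
        """Module 402: placeholder mapping by substring match (assumption)."""
        return [p for p in self.index_tables if p in search_string]

    def find_second(self, firsts):
        """Module 403: rates in the main participle's index table."""
        main = max(firsts, key=lambda f: len(self.index_tables[f]))
        table = self.index_tables[main]
        if not table:
            return main, []
        counts = Counter()
        for participles in table:
            counts.update(set(participles) - set(firsts))
        seconds = [
            word for word, n in counts.most_common()
            if n / len(table) > self.threshold
        ]
        return main, seconds

    def push(self, main, seconds):
        """Module 404: combinations offered to the user."""
        return [f"{main} {second}" for second in seconds]

    def handle(self, raw_search_string):
        firsts = self.map_first(self.receive(raw_search_string))
        if not firsts:
            return []
        return self.push(*self.find_second(firsts))


device = SegmentPushDevice(
    index_tables={
        "dota": [
            ["dota", "funny"],
            ["dota", "funny", "classic"],
            ["dota", "2009"],
            ["dota", "funny"],
        ]
    }
)
pushed = device.handle(" dota ")
```

Keeping each module as a separate method mirrors the block diagram of FIG. 4 and lets the mapping or finding strategy be replaced independently.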

The various component embodiments of the present invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the video search based word segmentation information push device in accordance with embodiments of the present invention. The invention can also be implemented as a device or apparatus program (e.g., a computer program and a computer program product) for performing some or all of the methods described herein. Such a program implementing the invention may be stored on a computer readable medium or may be in the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.

For example, FIG. 5 illustrates a computing device, such as a user terminal device or an application server, that can implement video search based word segmentation information push in accordance with the present invention. The computing device conventionally includes a processor 510 and a computer program product or computer readable medium in the form of a memory 520. The memory 520 may be an electronic memory such as a flash memory, an EEPROM (Electrically Erasable Programmable Read Only Memory), an EPROM, a hard disk, or a ROM. Memory 520 has a storage space 530 for program code 531 for performing any of the method steps described above. For example, the storage space 530 for program code may include individual program codes 531 for respectively implementing the various steps in the above methods. The program code can be read from or written to one or more computer program products. These computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks. Such computer program products are typically portable or fixed storage units as described with reference to FIG. The storage unit may have storage segments, storage spaces, and the like that are arranged similarly to memory 520 in the computing device of FIG. 5. The program code can be compressed, for example, in an appropriate form. Typically, the storage unit includes computer readable code 531', i.e., code readable by a processor such as 510, that when executed by a computing device causes the computing device to perform each step of the methods described above.

References herein to "one embodiment," "an embodiment," or "one or more embodiments" mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. In addition, it is noted that instances of the phrase "in one embodiment" do not necessarily all refer to the same embodiment.

In the description provided herein, numerous specific details are set forth. However, it is understood that the embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of the description.

It is to be noted that the above-described embodiments illustrate rather than limit the invention, and that those skilled in the art may devise alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not recited in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means can be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order. These words can be interpreted as names.

In addition, it should be noted that the language used in the specification has been selected principally for readability and teaching purposes, rather than to delineate or limit the subject matter of the invention. Therefore, many modifications and changes will be apparent to those skilled in the art without departing from the scope of the appended claims. The disclosure of the present invention is intended to be illustrative, and not restrictive, and the scope of the invention is defined by the appended claims.

Claims

A word search information pushing method based on video search, comprising:

Receiving a video search string;

Mapping the video search string to one or more first word segments;

Finding an associated second participle whose co-occurrence rate with the one or more first participles is higher than a preset threshold; the co-occurrence rate is the probability that the current one or more first participles and the second participle co-occur in the same video resource data;

Pushing a combination of the one or more first word segments and the one or more associated second word segments.
The method of claim 1 wherein said step of mapping said video search string to one or more first word segments comprises:

Extracting a participle mapped by the video search string;

or,

When the received video search string is a compound word, the video search string is split into a plurality of search subwords; and a plurality of word segments mapped by the plurality of search subwords are extracted.
The method of claim 1 wherein said step of finding an associated second word segment having a co-occurrence rate with said one or more first word segments that is above a predetermined threshold comprises:

Extracting a preset index table corresponding to the first word segment when the video search string is mapped to a first word segment; wherein the index table includes information of video resource data to which the first word segment belongs, and All the word segments in the video resource data; all the word segments in the video resource data are obtained by capturing video resource data, extracting feature text information of the video resource data, and performing word segmentation on the feature text information;

Calculating a co-occurrence rate of the first participle and each second participle in the index table, where the co-occurrence rate is the ratio of the number of occurrences of each second participle in the index table to the total number of pieces of information of video resource data in the index table; wherein the second participle is any participle, other than the first participle, among all the participles in the video resource data;

Extracting the second participle with the co-occurrence rate higher than the preset threshold as the associated second participle.
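The single-segment case described in this claim can be illustrated with a minimal sketch. The record layout (`id`, `title`), the helper names `build_index_table` and `associated_segments`, and the whitespace-based segmentation are all assumptions made for illustration; the claims do not prescribe a particular segmenter or data structure.

```python
from collections import Counter

def build_index_table(first, videos, segment=str.split):
    """Collect, for one first word segment, every video record whose
    feature text contains it, together with all segments of that record.
    `segment` is a stand-in for a real word segmenter (assumption)."""
    table = []
    for video in videos:
        segments = set(segment(video["title"]))
        if first in segments:
            table.append({"id": video["id"], "segments": segments})
    return table

def associated_segments(first, index_table, threshold):
    """Co-occurrence rate = occurrences of a second segment in the index
    table / total number of records in the index table. Only segments whose
    rate exceeds the threshold are returned."""
    total = len(index_table)
    counts = Counter(
        seg for entry in index_table for seg in entry["segments"] if seg != first
    )
    return {seg: n / total for seg, n in counts.items() if n / total > threshold}
```

Because each record contributes a set of segments, a segment repeated within one title is counted once per record, which keeps the rate within the range 0 to 1.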
The method of claim 1 wherein said step of finding an associated second word segment having a co-occurrence rate with said one or more first word segments that is above a predetermined threshold comprises:

When the video search string is mapped to a plurality of first word segments, respectively extracting a plurality of preset index tables corresponding to the plurality of first word segments; each index table includes information of the video resource data to which the first word segment belongs, and all the word segments in the video resource data; all the word segments in the video resource data are obtained by capturing video resource data, extracting feature text information of the video resource data, and performing word segmentation on the feature text information;

Extracting a second participle that appears together with the plurality of first participles as a candidate participle; wherein the second participle is a participle of all the participles in the video resource data except the first participle;

Calculating a co-occurrence rate of the first participle and the candidate participle in each index table, where the co-occurrence rate is the ratio of the number of occurrences of the candidate participle in the index table to the total number of pieces of information of video resource data in the index table;

Configuring a plurality of corresponding weights for the co-occurrence rates of the plurality of first word segments and the candidate word segment;

Calculating the average of the plurality of weighted co-occurrence rates as the co-occurrence rate of the plurality of first word segments and the candidate word segment;

The candidate participle with the co-occurrence rate higher than the preset threshold is extracted as the associated second participle.
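As a rough illustration of the multi-segment case in this claim, the sketch below combines per-table co-occurrence rates into one weighted average. It assumes each index table is a list of segment sets, that candidates are the segments present in every table, and that the "average" is the plain mean of the weighted rates; the claims leave the choice of weights and any normalization open, and the names `rate` and `combined_associated` are hypothetical.

```python
def rate(candidate, index_table):
    """Share of records in one index table that contain the candidate."""
    total = len(index_table)
    if total == 0:
        return 0.0
    return sum(candidate in entry for entry in index_table) / total

def combined_associated(index_tables, weights, threshold):
    """index_tables: first segment -> list of segment sets (one per record).
    weights: first segment -> weight applied to that table's rate."""
    if not index_tables:
        return {}
    firsts = set(index_tables)
    # Candidates co-occur with every first segment, i.e. appear in every table.
    per_table = [set().union(*entries) for entries in index_tables.values()]
    candidates = set.intersection(*per_table) - firsts
    result = {}
    for cand in candidates:
        weighted = [weights[f] * rate(cand, entries)
                    for f, entries in index_tables.items()]
        combined = sum(weighted) / len(weighted)  # mean of weighted rates
        if combined > threshold:
            result[cand] = combined
    return result
```

With all weights set to 1.0 this reduces to the unweighted mean of the per-table rates; other weightings, such as favoring the table with more records, fit the same interface.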
The method of claim 1 wherein said step of finding an associated second word segment having a co-occurrence rate with said one or more first word segments that is above a predetermined threshold comprises:

When the video search string is mapped to a plurality of first word segments, a plurality of preset index tables corresponding to the plurality of first word segments are respectively extracted; wherein each index table includes information of the video resource data to which the first word segment belongs, and all the word segments in the video resource data; all the word segments in the video resource data are generated by capturing the video resource data, extracting the feature text information of the video resource data, and performing word segmentation on the feature text information;

Determining a main participle by using the plurality of index tables, where the main participle is a first participle corresponding to an index table with the largest total number of pieces of information of video resource data;

Calculating a co-occurrence rate of the main participle and each second participle in the index table corresponding to the main participle, the co-occurrence rate being the ratio of the number of occurrences of each second participle in the index table to the total number of pieces of information of video resource data in the index table; the second participle being any participle, other than the first participle, among all the participles in the video resource data;

Extracting the second participle with the co-occurrence rate higher than the preset threshold as the associated second participle.
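The main-participle variant of this claim can be sketched as follows. The sketch assumes each index table is a list of segment sets and picks as main participle the first segment whose table has the most records; the function names are hypothetical, and tie-breaking between equally large tables is not addressed by the claims (here the first key encountered wins).

```python
from collections import Counter

def main_segment(index_tables):
    """Return the first segment whose index table holds the most records."""
    return max(index_tables, key=lambda first: len(index_tables[first]))

def associated_with_main(index_tables, threshold):
    """Compute co-occurrence rates only within the main segment's table."""
    main = main_segment(index_tables)
    entries = index_tables[main]
    total = len(entries)
    counts = Counter(
        seg for entry in entries for seg in set(entry) if seg != main
    )
    related = {seg: n / total for seg, n in counts.items() if n / total > threshold}
    return main, related
```

Only one table is scanned, so this variant is cheaper than averaging rates across all tables, at the cost of ignoring the smaller tables.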
The method of claim 3 or 4 or 5, wherein the feature text information comprises a video title, a video keyword, and/or a video description.
The method of claim 5 wherein the step of pushing the combination of the one or more first word segments and the one or more associated second word segments comprises:

Pushing a combination of the main participle and the associated second participle.
A method for pushing an online play entry object based on video search, comprising:

Receiving a video search string;

Mapping the video search string to one or more first word segments;

Finding an associated second participle whose co-occurrence rate with the one or more first participles is higher than a preset threshold; the co-occurrence rate is the probability that the current one or more first participles and the second participle appear together in the same video resource data;

Obtaining a network address of one or more video data resources that match the one or more first word segments and the associated second word segment;

Constructing an entry object for playing the video data resource online according to the one or more video data resource network addresses;

Pushing the one or more entry objects for playing the video data resource online.
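The address-matching and entry-construction steps of this claim might look like the sketch below. The record fields (`title`, `url`, `segments`), the shape of the returned entry object, and the `video.example` addresses used when trying it out are illustrative assumptions; the claims do not define the format of an entry object.

```python
def build_entry_objects(first_segments, second_segment, videos):
    """Return one entry object per video whose segments contain every first
    segment and the associated second segment."""
    wanted = set(first_segments) | {second_segment}
    entries = []
    for video in videos:
        if wanted <= video["segments"]:  # all wanted segments are present
            entries.append({"label": video["title"], "play_url": video["url"]})
    return entries
```

For example, calling `build_entry_objects(["tiger"], "zoo", videos)` keeps only the records tagged with both segments and returns their titles paired with their network addresses.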
A method for pushing an associated resource address based on video search, comprising:

Obtaining feature text information of the first video resource data when receiving a loading or playing request of the first video resource data;

Mapping the feature text information into one or more first word segments;

Finding an associated second participle whose co-occurrence rate with the one or more first participles is higher than a preset threshold; the co-occurrence rate is the probability that the current one or more first participles and the second participle appear together in the same video resource data;

Obtaining a network link address of the second video resource data that matches the one or more first word segments and the associated second word segment;

Pushing the network link address of the second video resource data.
A word segmentation information pushing device based on video search, comprising:

a video search string receiving module adapted to receive a video search string;

a first word segmentation mapping module, configured to map the video search string into one or more first word segments;

a second participle finding module, configured to find an associated second participle whose co-occurrence rate with the one or more first participles is higher than a preset threshold; the co-occurrence rate is the probability that the current one or more first participles and the second participle appear together in the same video resource data;

A combined push module adapted to push a combination of the one or more first word segments and the one or more associated second word segments.
The apparatus according to claim 10, wherein the first word segmentation mapping module is further adapted to:

Extracting a participle mapped by the video search string;

or,

When the received video search string is a compound word, the video search string is split into a plurality of search subwords; and a plurality of word segments mapped by the plurality of search subwords are extracted.
The device of claim 10, wherein the second participle finding module is further adapted to:

Extracting a preset index table corresponding to the first word segment when the video search string is mapped to one first word segment; wherein the index table includes information of the video resource data to which the first word segment belongs, and all the word segments in the video resource data; all the word segments in the video resource data are obtained by capturing video resource data, extracting feature text information of the video resource data, and performing word segmentation on the feature text information;

Calculating a co-occurrence rate of the first participle and each second participle in the index table, where the co-occurrence rate is the ratio of the number of occurrences of each second participle in the index table to the total number of pieces of information of video resource data in the index table; wherein the second participle is any participle, other than the first participle, among all the participles in the video resource data;

Extracting the second participle with the co-occurrence rate higher than the preset threshold as the associated second participle.
The device of claim 10, wherein the second participle finding module is further adapted to:

When the video search string is mapped to a plurality of first word segments, respectively extracting a plurality of preset index tables corresponding to the plurality of first word segments; each index table includes information of the video resource data to which the first word segment belongs, and all the word segments in the video resource data; all the word segments in the video resource data are obtained by capturing video resource data, extracting feature text information of the video resource data, and performing word segmentation on the feature text information;

Extracting a second participle that appears together with the plurality of first participles as a candidate participle; wherein the second participle is a participle of all the participles in the video resource data except the first participle;

Calculating a co-occurrence rate of the first participle and the candidate participle in each index table, where the co-occurrence rate is the ratio of the number of occurrences of the candidate participle in the index table to the total number of pieces of information of video resource data in the index table;

Configuring a plurality of corresponding weights for the co-occurrence rates of the plurality of first word segments and the candidate word segment;

Calculating the average of the plurality of weighted co-occurrence rates as the co-occurrence rate of the plurality of first word segments and the candidate word segment;

The candidate participle with the co-occurrence rate higher than the preset threshold is extracted as the associated second participle.
The device of claim 10, wherein the second participle finding module is further adapted to:

When the video search string is mapped to a plurality of first word segments, a plurality of preset index tables corresponding to the plurality of first word segments are respectively extracted; wherein each index table includes information of the video resource data to which the first word segment belongs, and all the word segments in the video resource data; all the word segments in the video resource data are generated by capturing the video resource data, extracting the feature text information of the video resource data, and performing word segmentation on the feature text information;

Determining a main participle by using the plurality of index tables, where the main participle is a first participle corresponding to an index table with the largest total number of pieces of information of video resource data;

Calculating a co-occurrence rate of the main participle and each second participle in the index table corresponding to the main participle, the co-occurrence rate being the ratio of the number of occurrences of each second participle in the index table to the total number of pieces of information of video resource data in the index table; the second participle being any participle, other than the first participle, among all the participles in the video resource data;

Extracting the second participle with the co-occurrence rate higher than the preset threshold as the associated second participle.
The apparatus of claim 12 or 13 or 14, wherein the feature text information comprises a video title, a video keyword, and/or a video description.
The device according to claim 14, wherein the combined push module is further adapted to:

Pushing a combination of the main participle and the associated second participle.
A computer program comprising computer readable code which, when run on a computing device, causes the computing device to perform the word segmentation information pushing method based on video search according to any one of claims 1-7.
A computer readable medium storing the computer program of claim 17.