CN103699658A - Method and system for sorting information of video resources - Google Patents

Publication number
CN103699658A
Authority
CN
China
Prior art keywords
information
video
inverted index
file
dictionary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310739976.6A
Other languages
Chinese (zh)
Inventor
曹坤波
郑磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
LeTV Cloud Computing Co Ltd
Original Assignee
LeTV Information Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by LeTV Information Technology Beijing Co Ltd
Priority to CN201310739976.6A
Publication of CN103699658A
Priority to US15/101,698 (US20160306811A1)
Priority to PCT/CN2014/093176 (WO2015096609A1)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70: Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/71: Indexing; Data structures therefor; Storage structures

Abstract

The invention discloses a method and a system for sorting information of video resources. The method includes: acquiring an inverted index result set for video files from a previously created inverted index file of the video files; providing sorting parameter information and receiving the sorting parameter selected by a user; and sorting the inverted index result set according to the received sorting parameter. The method and the system can improve retrieval efficiency and user experience.

Description

Method and system for sorting video resource information
Technical field
The present invention relates to information retrieval technology, and in particular to a method and a system for sorting video resource information.
Background
With the development of science and technology, more and more users search for and watch videos over the Internet. Because the video information available on the Internet is abundant and is continuously changing and being updated, a variety of search engines have emerged to perform video information retrieval.
In a relational database system, an index is the most efficient way to retrieve data. For a video search engine covering the whole network, however, a database index cannot meet the special requirements. Such a search engine faces the massive video data of the whole network: the index of a large video website search engine such as LeTV covers hundreds of millions or even hundreds of billions of web pages, and a database system can hardly manage video data on that scale effectively.
A whole-network search produces a large number of retrieval results, so the useful information that the user needs cannot be found quickly and the user's need for sorting is not met.
In summary, the prior art does not provide an effective sorting scheme for the large number of retrieval results over massive video information, so an improved technical solution is needed to solve this problem.
Summary of the invention
The main purpose of the present invention is to provide a method and a system for sorting video resource information, so as to solve the technical problem in the prior art that there is no effective sorting scheme for the large number of retrieval results over massive video information.
To solve the above problem, according to one aspect of the present invention, a method for sorting video resource information is provided, comprising: obtaining an inverted index result set for video files from a pre-established inverted index file of the video files; providing sorting parameter information and receiving the sorting parameter selected by a user; and sorting the inverted index result set according to the received sorting parameter.
The sorting parameter information comprises: video type, release time, playing duration, and information related to the video file.
The method further comprises: establishing the inverted index file of the video files. Obtaining the inverted index result set for the video files from the pre-established inverted index file specifically comprises: receiving retrieval information for video resource information; matching the retrieval information in the inverted index file; and obtaining the inverted index result set according to the data in the inverted index file that matches the retrieval information.
Establishing the inverted index file of the video files comprises: performing word segmentation on video file information using a preset word segmentation mode to obtain keywords; and establishing an index relationship between each keyword and the video file information containing that keyword, thereby establishing the inverted index file of the video files.
The method further comprises: providing a dictionary, the data sources of which comprise: a basic dictionary, a video copyright dictionary, and user-generated content. The step of performing word segmentation on the video file information using the preset word segmentation mode to obtain keywords comprises: performing word segmentation on the video file information using the preset word segmentation mode to obtain preliminary segmented words; and adjusting the preliminary segmented words according to the dictionary to obtain the keywords.
According to another aspect of the present invention, a system for sorting video resource information is also provided, comprising: an acquisition module, configured to obtain an inverted index result set for video files from a pre-established inverted index file of the video files; a parameter providing module, configured to provide sorting parameter information; a parameter receiving module, configured to receive the sorting parameter selected by a user; and a sorting module, configured to sort the inverted index result set according to the sorting parameter received by the parameter receiving module.
The sorting parameter information comprises: video type, release time, playing duration, and information related to the video file.
The system further comprises: an establishing module, configured to establish the inverted index file of the video files. The acquisition module is further configured to receive retrieval information for video resource information, match the retrieval information in the inverted index file, and obtain the inverted index result set according to the data in the inverted index file that matches the retrieval information.
The establishing module comprises: a keyword acquisition module, configured to perform word segmentation on video file information using a preset word segmentation mode to obtain keywords; and an inverted index establishing module, configured to establish an index relationship between each keyword and the video file information containing that keyword, thereby establishing the inverted index file of the video files.
The system further comprises: a dictionary maintenance module, configured to establish and maintain a dictionary, the data sources of which comprise: a basic dictionary, a video copyright dictionary, and user-generated content. The keyword acquisition module performs word segmentation on the video file information using the preset word segmentation mode to obtain preliminary segmented words, and adjusts the preliminary segmented words according to the dictionary to obtain the keywords.
According to the technical solution of the present invention, the inverted index result set of the video files is obtained and then sorted according to the received sorting parameter. When facing massive video retrieval information, the result set is narrowed by the inverted index and the sorting requirement is met by a secondary, forward-index sort, which improves retrieval efficiency and user experience.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the present invention and form a part of this application. The illustrative embodiments of the present invention and their description are used to explain the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of a method for sorting video resource information according to an embodiment of the present invention;
Fig. 2 is a flowchart of a method for sorting video resource information according to another embodiment of the present invention;
Fig. 3 is a structural block diagram of a system for sorting video resource information according to an embodiment of the present invention;
Fig. 4 is a structural block diagram of a system for sorting video resource information according to another embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and specific embodiments.
According to an embodiment of the present invention, a method for sorting video resource information is provided.
Fig. 1 is a flowchart of the method for sorting video resource information according to an embodiment of the present invention. As shown in Fig. 1, the method comprises the following steps (steps S102 to S106):
Step S102: obtain an inverted index result set for video files from a pre-established inverted index file of the video files.
A data model matched to data sources of multiple origins is used to build a data structure that conforms to the search framework, thereby establishing the inverted index file of the video files. A query engine is provided externally (to the user). It receives retrieval information for video resource information, matches the retrieval information in the inverted index file, obtains inverted index results from the data in the inverted index file that matches the retrieval information, and outputs an inverted index result set containing information on a plurality of videos.
The source channels of the data sources include: DB (video database), XML (Extensible Markup Language), a file system, and the like.
Step S104: provide sorting parameter information and receive the sorting parameter selected by the user.
In practice, a user interface (UI) can be used to interact with the user, providing the parameter information available for sorting and receiving the sorting parameter that the user selects. The sorting parameter information includes but is not limited to: release time, playing duration, and information related to the video file. The release time (also called the publication time) is the time, such as the year, month, and day, at which the video was first shown or published. The playing duration is the length of the video. The information related to the video file is provided according to the characteristics of that video file; for an album, for example, it includes further details such as the issue, the episode number, the video content, and the names of people who appear in the video.
Step S106: sort the inverted index result set according to the received sorting parameter.
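Steps S104 and S106 can be illustrated with a minimal Python sketch. It is not the implementation disclosed in the application: the `VideoInfo` record, its field names, and the `SORT_KEYS` table are assumptions introduced here only to show how a user-selected sorting parameter could drive the ordering of a result set.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, List


@dataclass
class VideoInfo:
    """Hypothetical record for one entry of the inverted index result set."""
    title: str
    video_type: str
    release_date: date
    duration_s: int


# Sorting parameter information offered to the user (step S104).
# The parameter names are illustrative, not taken from the application.
SORT_KEYS: Dict[str, Callable[[VideoInfo], object]] = {
    "video_type": lambda v: v.video_type,
    "release_time": lambda v: v.release_date,
    "playing_duration": lambda v: v.duration_s,
}


def sort_result_set(results: List[VideoInfo], param: str,
                    descending: bool = False) -> List[VideoInfo]:
    """Sort the result set by the parameter the user selected (step S106)."""
    if param not in SORT_KEYS:
        raise ValueError(f"unsupported sorting parameter: {param}")
    return sorted(results, key=SORT_KEYS[param], reverse=descending)
```

Because `sorted` is stable, entries that compare equal on the selected parameter keep the order they had in the result set.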
With the above embodiment, when facing massive video retrieval information, the result set is narrowed by the inverted index and the sorting requirement is met by forward-index sorting, which improves retrieval efficiency and user experience.
An embodiment of the present invention is described in detail below with reference to Fig. 2. Fig. 2 is a flowchart of a preferred processing scheme of the method for sorting video resource information according to an embodiment of the present invention. As shown in Fig. 2, it comprises the following steps:
Step S202: provide a dictionary. The data sources of the dictionary include but are not limited to: a basic dictionary, a video copyright dictionary, and user-generated content (UGC).
The basic dictionary comprises various general dictionaries and lexicons. Because the terms in video files do not strictly match dictionary entries, a video copyright dictionary is also needed. The video copyright dictionary is derived from copyrighted video resource information and can meet the needs of word segmentation for video file information. UGC is content that users generate, provide, or originate; it supplements new words that appear in neither the basic dictionary nor the video copyright dictionary. With these dictionaries complementing one another, word segmentation yields better keywords.
Step S204: perform word segmentation on the video file information using the preset word segmentation mode to obtain preliminary segmented words. The preset word segmentation mode is, for example, bigram segmentation, maximum matching, or a statistical method; these algorithms are not described further here.
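Two of the segmentation modes named in step S204 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the `max_len` window, and the choice of overlapping bigrams and of forward (left-to-right) maximum matching are not specified by the application.

```python
from typing import List, Set


def bigram_segment(text: str) -> List[str]:
    """Overlapping two-character segmentation (one reading of bigram segmentation)."""
    if len(text) < 2:
        return [text] if text else []
    return [text[i:i + 2] for i in range(len(text) - 1)]


def forward_max_match(text: str, lexicon: Set[str], max_len: int = 4) -> List[str]:
    """Forward maximum matching: at each position take the longest lexicon entry.

    A single character is emitted when no lexicon entry matches, so the loop
    always advances.
    """
    tokens: List[str] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens
```

Bigram segmentation needs no lexicon but produces many candidate words, whereas maximum matching depends on the coverage of the lexicon it is given; either output is only the preliminary segmentation that the next step adjusts.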
Step S206: adjust the preliminary segmented words according to the dictionary to obtain the keywords.
In step S206, each preliminary segmented word obtained in step S204 is looked up in the dictionary. If the word is found, the preliminary segmentation is considered accurate and the word is taken as a keyword. If the word is not found, the preliminary segmentation is considered inaccurate and preliminary segmentation with the preset word segmentation mode is carried out again.
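The dictionary check of step S206 might look like the Python sketch below. The application does not say how a rejected word is segmented again, so the `resegment` callback and the rule of keeping only re-segmented pieces that appear in the dictionary are assumptions made for illustration; `merge_dictionaries` is likewise a hypothetical helper that simply unions the three data sources named in step S202.

```python
from typing import Callable, Iterable, List, Set


def merge_dictionaries(basic: Iterable[str], copyright_terms: Iterable[str],
                       ugc: Iterable[str]) -> Set[str]:
    """Combine the basic dictionary, video copyright dictionary, and UGC terms."""
    return set(basic) | set(copyright_terms) | set(ugc)


def adjust_tokens(preliminary: Iterable[str], dictionary: Set[str],
                  resegment: Callable[[str], Iterable[str]]) -> List[str]:
    """Keep preliminary words found in the dictionary; re-segment the others.

    Assumption: a rejected word is re-segmented once, and only the pieces
    present in the dictionary are kept as keywords.
    """
    keywords: List[str] = []
    for token in preliminary:
        if token in dictionary:
            keywords.append(token)
        else:
            keywords.extend(t for t in resegment(token) if t in dictionary)
    # Remove duplicates while preserving first-seen order.
    return list(dict.fromkeys(keywords))
```

For instance, with a dictionary containing "中国" and "好声音", the preliminary words "中国", "国好", "好声音" reduce to the keywords "中国" and "好声音" when `resegment` splits a rejected word into single characters.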
Step S208: establish an index relationship between each keyword and the video file information containing that keyword, thereby establishing the inverted index file of the video resources.
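The keyword-to-video relationship of step S208 is the classic inverted index. A minimal in-memory Python sketch follows; the integer video identifiers, the `segment` callback, and the dictionary-of-lists layout are assumptions for illustration, and the application's on-disk file format is not described here.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def build_inverted_index(videos: Dict[int, str],
                         segment: Callable[[str], Iterable[str]]) -> Dict[str, List[int]]:
    """Map each keyword to the sorted ids of the videos whose information contains it."""
    postings = defaultdict(set)
    for video_id, info in videos.items():
        for keyword in segment(info):
            postings[keyword].add(video_id)
    return {keyword: sorted(ids) for keyword, ids in postings.items()}
```

A call such as `build_inverted_index({1: "The Voice of China", 2: "Voice highlights"}, lambda s: s.lower().split())` maps `"voice"` to both videos and `"china"` to the first only.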
Step S210: provide a query engine, receive the retrieval information for video resource information entered by the user, match the retrieval information in the inverted index file, and obtain the inverted index result set according to the data in the inverted index file that matches the retrieval information.
For example, the user enters the search term "The Voice of China"; the inverted index file is used to search the whole network for video files about "The Voice of China", which yields a large number of related video files.
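Matching retrieval information against the inverted index (step S210) can be sketched as a lookup of each query keyword followed by a combination of the posting lists. The application only says that the retrieval information is "matched"; intersecting the lists (AND semantics) is an assumption chosen here, and the `search` function and its parameters are hypothetical.

```python
from typing import Callable, Dict, Iterable, List


def search(index: Dict[str, List[int]], query: str,
           segment: Callable[[str], Iterable[str]]) -> List[int]:
    """Return ids of videos that contain every keyword of the query."""
    postings = [set(index.get(keyword, ())) for keyword in segment(query)]
    if not postings:
        return []
    return sorted(set.intersection(*postings))
```

The returned ids form the inverted index result set that the subsequent sorting step operates on.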
Step S212: provide sorting parameter information and receive the sorting parameter selected by the user.
Continuing the above example, because the number of video files about "The Voice of China" on the network is very large, the result of the first search is not satisfactory. In this embodiment, multiple kinds of sorting parameter information are provided, and the user selects a suitable condition for a second sort. In practice, the sorting parameter information includes but is not limited to: release time, playing duration, and information related to the video file such as the issue, mentor name, and contestant name.
Step S214: sort the inverted index result set according to the received sorting parameter.
According to the above embodiment, the secondary sort further narrows the result set and meets the sorting requirement, which improves retrieval efficiency and user experience.
According to an embodiment of the present invention, a system for sorting video resource information is also provided.
Fig. 3 is a structural block diagram of the system for sorting video resource information according to an embodiment of the present invention. As shown in Fig. 3, the system comprises at least: an acquisition module 10, a parameter providing module 20, a parameter receiving module 30, and a sorting module 40. The structure and connections of each module are described in detail below.
The acquisition module 10 is configured to obtain an inverted index result set for video files from a pre-established inverted index file of the video files.
The parameter providing module 20 is configured to provide sorting parameter information. The sorting parameter information includes but is not limited to: video type, release time, playing duration, and information related to the video file.
The parameter receiving module 30 is coupled to the parameter providing module 20 and is configured to receive the sorting parameter selected by the user.
The sorting module 40 is coupled to the acquisition module 10 and to the parameter receiving module 30, and is configured to sort the inverted index result set according to the sorting parameter received by the parameter receiving module 30.
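The coupling between modules 10 to 40 can be pictured with the following Python sketch. The class names mirror the modules of Fig. 3, but every method name, the in-memory index, and the `records` table of per-video attributes are assumptions introduced for illustration rather than the structure claimed in the application.

```python
from typing import Dict, List


class AcquisitionModule:
    """Module 10: obtains the inverted index result set."""

    def __init__(self, index: Dict[str, List[int]]):
        self.index = index

    def acquire(self, keyword: str) -> List[int]:
        return list(self.index.get(keyword, []))


class ParameterProvidingModule:
    """Module 20: provides the sorting parameter information."""

    def __init__(self, params: List[str]):
        self.params = params

    def provide(self) -> List[str]:
        return list(self.params)


class ParameterReceivingModule:
    """Module 30: coupled to module 20; receives the user's selection."""

    def __init__(self, provider: ParameterProvidingModule):
        self.provider = provider

    def receive(self, selected: str) -> str:
        if selected not in self.provider.provide():
            raise ValueError(f"parameter not offered: {selected}")
        return selected


class SortingModule:
    """Module 40: coupled to modules 10 and 30; sorts the result set."""

    def __init__(self, acquisition: AcquisitionModule,
                 receiver: ParameterReceivingModule,
                 records: Dict[int, Dict[str, object]]):
        self.acquisition = acquisition
        self.receiver = receiver
        self.records = records

    def run(self, keyword: str, selected: str) -> List[int]:
        param = self.receiver.receive(selected)
        ids = self.acquisition.acquire(keyword)
        return sorted(ids, key=lambda video_id: self.records[video_id][param])
```

Keeping the receiving module dependent on the providing module means that only parameters actually offered to the user can reach the sorting module.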
Referring to Fig. 4, in one embodiment of the present invention, the system of Fig. 3 further comprises: an establishing module 50, configured to establish the inverted index file of the video files. On this basis, the acquisition module 10 is further configured to receive retrieval information for video resource information, match the retrieval information in the inverted index file, and obtain the inverted index result set according to the data in the inverted index file that matches the retrieval information.
The establishing module 50 further comprises: a keyword acquisition module (not shown), configured to perform word segmentation on video file information using a preset word segmentation mode to obtain keywords; and an inverted index establishing module (not shown), configured to establish an index relationship between each keyword and the video file information containing that keyword, thereby establishing the inverted index file of the video files.
In addition, in one embodiment of the present invention, the system for sorting video resource information further comprises: a dictionary maintenance module (not shown), configured to establish and maintain a dictionary, the data sources of which include but are not limited to: a basic dictionary, a video copyright dictionary, and user-generated content. On this basis, the keyword acquisition module performs word segmentation on the video file information using the preset word segmentation mode to obtain preliminary segmented words, and adjusts the preliminary segmented words according to the dictionary to obtain the keywords.
The operation steps of the method of the present invention correspond to the structural features of the system and can be cross-referenced; they are not repeated one by one here.
In summary, according to the technical solution of the present invention, the inverted index result set of the video files is obtained and then sorted according to the received sorting parameter. When facing massive video retrieval information, the result set is narrowed by the inverted index and the sorting requirement is met by a secondary, forward-index sort, which improves retrieval efficiency and user experience.
The above are only embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of the claims of the present invention.

Claims (10)

1. A method for sorting video resource information, characterized by comprising:
obtaining an inverted index result set for video files from a pre-established inverted index file of the video files;
providing sorting parameter information, and receiving the sorting parameter selected by a user;
sorting the inverted index result set according to the received sorting parameter.
2. The method according to claim 1, characterized in that the sorting parameter information comprises: video type, release time, playing duration, and information related to the video file.
3. The method according to claim 1, characterized by further comprising:
establishing the inverted index file of the video files;
wherein obtaining the inverted index result set for the video files from the pre-established inverted index file of the video files specifically comprises:
receiving retrieval information for video resource information;
matching the retrieval information in the inverted index file;
obtaining the inverted index result set according to the data in the inverted index file that matches the retrieval information.
4. The method according to claim 3, characterized in that establishing the inverted index file of the video files comprises:
performing word segmentation on video file information using a preset word segmentation mode to obtain keywords;
establishing an index relationship between each keyword and the video file information containing that keyword, thereby establishing the inverted index file of the video files.
5. The method according to claim 4, characterized by further comprising:
providing a dictionary, the data sources of which comprise: a basic dictionary, a video copyright dictionary, and user-generated content;
wherein the step of performing word segmentation on the video file information using the preset word segmentation mode to obtain keywords comprises: performing word segmentation on the video file information using the preset word segmentation mode to obtain preliminary segmented words;
adjusting the preliminary segmented words according to the dictionary to obtain the keywords.
6. A system for sorting video resource information, characterized by comprising:
an acquisition module, configured to obtain an inverted index result set for video files from a pre-established inverted index file of the video files;
a parameter providing module, configured to provide sorting parameter information;
a parameter receiving module, configured to receive the sorting parameter selected by a user;
a sorting module, configured to sort the inverted index result set according to the sorting parameter received by the parameter receiving module.
7. The system according to claim 6, characterized in that the sorting parameter information comprises: video type, release time, playing duration, and information related to the video file.
8. The system according to claim 6, characterized by further comprising:
an establishing module, configured to establish the inverted index file of the video files;
wherein the acquisition module is further configured to receive retrieval information for video resource information, match the retrieval information in the inverted index file, and obtain the inverted index result set according to the data in the inverted index file that matches the retrieval information.
9. The system according to claim 8, characterized in that the establishing module comprises:
a keyword acquisition module, configured to perform word segmentation on video file information using a preset word segmentation mode to obtain keywords;
an inverted index establishing module, configured to establish an index relationship between each keyword and the video file information containing that keyword, thereby establishing the inverted index file of the video files.
10. The system according to claim 9, characterized by further comprising:
a dictionary maintenance module, configured to establish and maintain a dictionary, the data sources of which comprise: a basic dictionary, a video copyright dictionary, and user-generated content;
wherein the keyword acquisition module performs word segmentation on the video file information using the preset word segmentation mode to obtain preliminary segmented words, and adjusts the preliminary segmented words according to the dictionary to obtain the keywords.
CN201310739976.6A 2013-12-26 2013-12-26 Method and system for sorting information of video resources Pending CN103699658A (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201310739976.6A CN103699658A (en) 2013-12-26 2013-12-26 Method and system for sorting information of video resources
US15/101,698 US20160306811A1 (en) 2013-12-26 2014-12-05 Method and system for creating inverted index file of video resource
PCT/CN2014/093176 WO2015096609A1 (en) 2013-12-26 2014-12-05 Method and system for creating inverted index file of video resource

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310739976.6A CN103699658A (en) 2013-12-26 2013-12-26 Method and system for sorting information of video resources

Publications (1)

Publication Number Publication Date
CN103699658A true CN103699658A (en) 2014-04-02

Family

ID=50361186

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310739976.6A Pending CN103699658A (en) 2013-12-26 2013-12-26 Method and system for sorting information of video resources

Country Status (1)

Country Link
CN (1) CN103699658A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015096609A1 (en) * 2013-12-26 2015-07-02 乐视网信息技术(北京)股份有限公司 Method and system for creating inverted index file of video resource

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6101492A (en) * 1998-07-02 2000-08-08 Lucent Technologies Inc. Methods and apparatus for information indexing and retrieval as well as query expansion using morpho-syntactic analysis
CN101420547A (en) * 2008-11-28 2009-04-29 深圳创维数字技术股份有限公司 Digital television receiver, resource manager and resource management method thereof
CN101888500A (en) * 2009-05-15 2010-11-17 深圳Tcl新技术有限公司 Digital television all-in-one machine and program list realization method thereof
CN102026029A (en) * 2010-11-12 2011-04-20 上海聚欣网络科技有限公司 Method and equipment for information exchange based on electronic program guide
CN102999498A (en) * 2011-09-08 2013-03-27 中兴通讯股份有限公司 Method and device for searching multi-media programs
CN103186550A (en) * 2011-12-27 2013-07-03 盛乐信息技术(上海)有限公司 Method and system for generating video-related video list

Similar Documents

Publication Publication Date Title
CN110941692B (en) Internet political outturn news event extraction method
CN100401300C (en) Searching engine with automating sorting function
CN103678694A (en) Method and system for establishing reverse index file of video resources
CN103577394B (en) A kind of machine translation method based on even numbers group searching tree and device
CN103020159A (en) Method and device for news presentation facing events
CN104423621A (en) Pinyin string processing method and device
US10387805B2 (en) System and method for ranking news feeds
CN101079024A (en) Special word list dynamic generation system and method
WO2015096609A1 (en) Method and system for creating inverted index file of video resource
CN103617174A (en) Distributed searching method based on cloud computing
CN103714158A (en) Vertical search method and system for video websites
CN102662986A (en) System and method for microblog message retrieval
CN110928903B (en) Data extraction method and device, equipment and storage medium
CN103064842A (en) Information subscription processing device and information subscription processing method
CN113239111A (en) Network public opinion visual analysis method and system based on knowledge graph
CN114528312A (en) Method and device for generating structured query language statement
CN106557483B (en) Data processing method, data query method, data processing equipment and data query equipment
CN105512270B (en) Method and device for determining related objects
CN109710730B (en) Patrol information system and analysis method based on natural language analysis processing
CN103699658A (en) Method and system for sorting information of video resources
CN116595043A (en) Big data retrieval method and device
CN109033133A (en) Event detection and tracking based on Feature item weighting growth trend
CN111046059B (en) Low-efficiency SQL statement analysis method and system based on distributed database cluster
CN112883143A (en) Elasticissearch-based digital exhibition searching method and system
CN103714147A (en) Video resource data source processing method and system thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20151229

Address after: Room six, building 19, building 68, No. 100089 South Road, Haidian District, Beijing

Applicant after: LETV CLOUD COMPUTING CO., LTD.

Address before: Room six, building 19, building 68, No. 100089 South Road, Haidian District, Beijing

Applicant before: LeTV Information Technology (Beijing) Co., Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20140402