CN101957831A - Input and process method of feature words of file content - Google Patents

Input and process method of feature words of file content

Info

Publication number
CN101957831A
Authority
CN
China
Prior art keywords
file
questions record
feature
feature speech
questions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2009102104625A
Other languages
Chinese (zh)
Inventor
刘二中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN2009102104625A
Publication of CN101957831A

Abstract

The invention discloses a computer-executed method for inputting and processing file feature judgments made by network terminal users, comprising the following operations. Operation A: a computer retrieval system provides a sequence of bibliographic entries for files that meet the retrieval request and come from a plurality of websites. Operation B: a computer system determines the input feature words according to a prescribed operation mode on the page where the entry sequence is located or on a page directly linked from that page. Operation C: the computer system determines, according to a prescribed mode on the same page or directly linked page, the entries or files to which the feature words input in operation B correspond. With the help of terminal users, the method can build an accurate feature word database or classification index of web pages, which in turn supports a better search engine that delivers better search results more conveniently.

Description

Method for inputting and processing feature words of file content
Technical Field
The present technology belongs to the field of computer retrieval technology, or search engine technology.
Background
Over the years, computer database retrieval technology has developed greatly. Advances in network technology in particular have allowed the databases that people can share to reach an astronomical scale. This has also made it very difficult for people to find the information they need.
Search engine technology, with query word search at its core, has brought convenience to users. Such a system obtains the user's keyword query through an interactive interface on the client computer and a communication network, queries a text index or text library, analyses the relevance between the keyword query and the texts, obtains and ranks the relevant results, and returns them to the interactive interface via the communication network or line. Such a search system is convenient and fast, but the number of entries in the returned results is still enormous and is difficult to review one by one.
To place the results potentially most valuable to the user at the front, U.S. Patent No. 6,285,999 proposed ranking search results by analysing the hyperlink structure of web pages (Page link). This technique surpassed other ranking techniques and achieved unprecedented success.
However, this technique and other ranking techniques only improve the efficiency of keyword search in a statistical sense. They cannot guarantee that the results each person wants will appear at the front of a huge result list. Before reaching the information we want, we are forced to read through irrelevant information whose main content repeats again and again.
To help users find the information or files they need, people have also turned to vertical classification technology and to directory retrieval systems based on it. Various computer text classification techniques have appeared for classifying massive amounts of information or determining its features. However, it is very difficult for a machine to judge which keyword, feature, or category a given page or text semantically belongs to. Reliability and accuracy are low, and in multi-level classification in particular the error rate is intolerably high. Computer classification is therefore used only for the simplest, coarse classification, for example judging from the frequency of certain groups of words or from format characteristics whether an online file is a "web page", a "map", an "MP3", and so on.
At present, vertical classification with higher accuracy still cannot do without human participation. For example, the manual information classification systems of websites such as Yahoo in the 1990s could handle the classification of only a very small part of the information on the network. Other specialised classified information of very limited quantity, such as that of "Baidu Baike", "Wikipedia", "Taobao", and "Alibaba", is compiled on each site's own dedicated database platform by registered members, registered users, or website staff, for entries within a particular scope and according to special editorial rules. The classification content attached to it applies only to the entries or texts in that database. It can be said that users searching internet content outside such specific databases get very limited help.
Therefore, internet users at home and abroad urgently need a new technology that allows a computer retrieval system or search engine system not only to provide the user with bibliographic entries for hundreds of millions of web pages from tens of thousands of websites, but also to determine accurate features or categories, or multi-level features or categories, for the many pages from different websites, and to provide search results with greatly improved accuracy and concentration according to both the user's query keywords and the user's requirements on page features or categories. To this end, a convenient technique for collecting and processing judgments about web page features is urgently needed.
Summary of the Invention
The object of the present invention is to provide a method suitable for use by a computer retrieval system or search engine system. When the system provides a network terminal or user with a sequence of bibliographic entries as the result of a query word search, the method allows users or staff to conveniently input the feature words they have determined for the relevant files from different websites, and processes the input information to produce retrieval tools, containing different feature words or classification results, that users can readily use, greatly improving the efficiency of retrieval or search.
The present invention is a computer-executed method for inputting and processing network terminal users' judgments of file features, comprising:
Operation A: according to the retrieval request made by the terminal user, the computer retrieval system provides the user terminal with a sequence of bibliographic entries, including entries of files that meet the retrieval request and originate from a plurality of websites;
Operation B: the computer system determines the input feature words according to a prescribed operation mode on the page where the entry sequence is located or on a page directly linked from that page;
wherein the prescribed operation mode is one of the following:
Operation mode one: the words selected and clicked with the cursor in a bibliographic entry of operation A, or in the content of the file the entry belongs to, are taken as the input feature words;
Operation mode two: the words selected and clicked with the cursor in a catalogue of candidate feature words, presented on or directly linked from the page where the entry sequence of operation A is located or a page directly linked from that page, are taken as the input feature words;
Operation mode three: a feature word input field is provided on the page where the entry sequence of operation A is located or on a page directly linked from that page, and the computer system determines the input feature words from the content entered in this field.
The content of the feature word input field may come from the keyboard, or may be pasted from words on the page where the entries of operation A are located, on a page directly linked from that page, or on the page containing the input field.
If required, for simplicity of operation, the feature word input field may be restricted to appear only on the page where the entry sequence is located.
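The three operation modes of operation B can be sketched as a single dispatch step. This is only an illustrative sketch: the `PageEvent` structure, its field names, and the comma-separated input convention are assumptions introduced here, not part of the patent text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageEvent:
    """Hypothetical UI event reported by the results page."""
    mode: int                             # 1, 2 or 3, matching the operation modes
    selected_text: Optional[str] = None   # modes 1 and 2: words selected with the cursor
    field_text: Optional[str] = None      # mode 3: content of the feature word input field


def determine_feature_words(event: PageEvent) -> list[str]:
    """Return the feature words entered by the user for one event."""
    if event.mode in (1, 2):
        raw = event.selected_text or ""
    elif event.mode == 3:
        raw = event.field_text or ""
    else:
        raise ValueError(f"unknown operation mode: {event.mode}")
    # Assumption: several feature words may be separated by commas.
    return [w.strip() for w in raw.split(",") if w.strip()]
```

Modes one and two differ only in where the selection happens (an entry or file versus a candidate catalogue), so the sketch treats them identically at this stage.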
The feature words may be one or more words or phrases that the terminal user selects as reflecting the content features of the corresponding bibliographic entry or file. The words may be characters, symbols, musical notes, or graphics.
The input field is a space or position on the terminal page in which words can be entered or filled in.
The computer retrieval system may be a search engine system. The computer system or retrieval system may be a component of the computer retrieval system.
The terminal user may be an internet user, the author or provider of a web page, or a member of the network or retrieval system staff.
The file may be a web page, part of a web page, or content stored by the retrieval system or another computer system (such as a web page snapshot). It may be or include text, image, audio, or video content.
The bibliographic entry may be the title of the file, its abstract, or the title plus the abstract, and may be or include image, audio, or video content.
The input and processing method of the present invention further comprises:
Operation C: the computer system determines the bibliographic entry or file corresponding to the feature words input in operation B, according to a prescribed mode on the page where the entry sequence is located or on a page directly linked from that page;
wherein the prescribed mode is one of the following:
Mode I: the entry or file containing the words selected and clicked with the cursor under operation mode one of operation B is determined to be the entry or file corresponding to the input feature words;
Mode II: the entry or file that is clicked is determined to be the entry or file corresponding to the input feature words;
Mode III: the entry or file near the clicked feature word judgment operation mark is determined to be the entry or file corresponding to the input feature words;
Mode IV: the entry or file on the page containing the feature word input field that is nearest to the input field, or located in a prescribed position relative to it, is determined to be the entry or file corresponding to the input feature words;
Mode V: the only entry or file on the page containing the feature word input field is determined to be the entry or file corresponding to the input feature words;
Mode VI: the entry or file on the page containing the candidate feature word catalogue of operation mode two of operation B that is nearest to the catalogue, or located in a prescribed position relative to it, is determined to be the entry or file corresponding to the input feature words;
Mode VII: the only entry or file on the page containing the candidate feature word catalogue of operation mode two of operation B is determined to be the entry or file corresponding to the input feature words.
The order of operation B and operation C may be prescribed as required.
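As an illustration of operation C, the following sketch resolves the target entry for a feature word input field under modes IV and V. The `Entry` class and the use of a vertical pixel position as the measure of "nearest" are assumptions made for the example; the patent does not specify how proximity is measured.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    entry_id: str
    y: int  # hypothetical vertical position of the entry on the page, in pixels


def entry_for_input_field(entries: list[Entry], field_y: int) -> Entry:
    """Pick the entry that an input field refers to (modes IV and V)."""
    if not entries:
        raise ValueError("page contains no entries")
    if len(entries) == 1:
        # Mode V: the only entry on the page.
        return entries[0]
    # Mode IV: the entry closest to the input field.
    return min(entries, key=lambda e: abs(e.y - field_y))
```

Modes VI and VII follow the same pattern with the candidate catalogue's position in place of the input field's.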
A feature word corresponding to a given bibliographic entry or file may be called a feature word belonging to that entry or file, the feature word corresponding to that entry or file, or simply the feature word of that entry or file.
In the above method, the same file or its entry may have several different category words at the same time, and one feature word may belong to several different entries or files at the same time.
In general, the feature words of a bibliographic entry can be regarded as identical to the feature words of the file the entry belongs to.
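The many-to-many relationship described above (one file with several feature words, including category words, and one feature word shared by several files) could be held in a pair of lookup tables. This is a minimal sketch; the class name and identifiers are hypothetical.

```python
from collections import defaultdict


class FeatureWordStore:
    """Many-to-many mapping between files and feature words."""

    def __init__(self):
        self.words_by_file = defaultdict(set)
        self.files_by_word = defaultdict(set)

    def add(self, file_id, word):
        # Record the association in both directions.
        self.words_by_file[file_id].add(word)
        self.files_by_word[word].add(file_id)


store = FeatureWordStore()
store.add("page-1", "jazz")
store.add("page-1", "music")
store.add("page-2", "jazz")
```

Because an entry and its file are treated as sharing the same feature words, the sketch keys everything by file identifier only.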
The feature word may be a keyword reflecting the content features of the corresponding entry or file, or a word or keyword that appears in the corresponding entry or file; the feature word input field may accordingly be a keyword input field.
The feature word may be a category word reflecting the content category of the corresponding entry or file, or category words reflecting its different levels in a multi-level classification system; the feature word input field may accordingly be a category word input field.
In the input and processing method, an additional catalogue of candidate feature words may be provided on the page where the entry sequence is located or on a page directly linked from that page.
This catalogue of candidate feature words may be a classification catalogue containing a number of different category words. The candidate feature word catalogue or classification catalogue may be a single-level catalogue, a multi-level catalogue, or a tree catalogue.
It may be arranged that, in the classification catalogue, the next-level category items belonging to an upper-level category item are displayed automatically before or after the upper-level item is clicked.
In this method, feature words may be entered into the input field by clicking or swiping over the required words in the catalogue with the cursor.
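A tree catalogue that reveals the next level when an upper-level item is chosen can be modelled as nested dictionaries. The category names below are invented for illustration only.

```python
# Hypothetical multi-level classification catalogue.
CATALOGUE = {
    "Science": {"Physics": {}, "Biology": {}},
    "Arts": {"Music": {}},
}


def children(catalogue, path):
    """Return the category words shown after the items in `path` are clicked."""
    node = catalogue
    for name in path:
        node = node[name]
    return sorted(node)
```

An empty path yields the top-level categories; each click appends one category word to the path.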
Clearly, the feature words input in this method are the judgments of the features of the relevant entry or file entered by the operator at the terminal, or the feature words corresponding to those judgments.
The method may further include: the relevant computer system may accept, take as reference, process, or reject, in its database, the feature judgments, feature words, or category words input by terminal users.
The input and processing method of the present invention may further include: when the computer system or database determines or records, on the basis of terminal users' input, the feature words or category words corresponding to any entry or file, the principles followed may take into account at least one or more of the following factors:
(1) the degree of similarity between the name of the user making the judgment, or the address of that user's website, and the name or address of the provider of the file, or the link address of the file;
(2) the number of users making the same judgment;
(3) the time at which a judgment was made;
(4) the past accuracy or score of selections made by the user making the judgment, or of selections made from the same network address;
(5) the degree of consistency between this selection of feature words and the results of other manual selection methods, computer selection methods, or selection systems;
(6) whether the judgment was made by retrieval system operators or staff, or is similar to theirs;
(7) whether the user or terminal making the judgment is registered with the relevant website or web page on which the feature word judgment or selection is carried out.
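One possible way to combine some of these factors is a weighted vote. The sketch below is only an assumption about how such a rule might look: the weights, the threshold, and the `Judgement` fields are invented for illustration, and only factors (2), (4), (6), and (7) are modelled.

```python
from dataclasses import dataclass


@dataclass
class Judgement:
    word: str
    is_staff: bool = False        # factor (6)
    is_registered: bool = False   # factor (7)
    past_accuracy: float = 0.5    # factor (4), assumed to lie in [0, 1]


def accepted_words(judgements, threshold=1.5):
    """Accept feature words whose accumulated weight reaches the threshold."""
    scores = {}
    for j in judgements:
        weight = j.past_accuracy
        weight += 1.0 if j.is_staff else 0.0
        weight += 0.25 if j.is_registered else 0.0
        # Summing over users reflects factor (2), the number of identical judgments.
        scores[j.word] = scores.get(j.word, 0.0) + weight
    return sorted(w for w, s in scores.items() if s >= threshold)
```

With these illustrative weights, a single staff judgment counts as much as three judgments from anonymous users of average past accuracy.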
The method of the present invention may further include operation D1: using all or part of the data, determined by the above method, on the feature words corresponding to a plurality of entries or files, the retrieval system generates a database that contains the feature words of a plurality of files or entries, or that classifies them according to whether their feature words or category words are the same.
The method of the present invention may further include operation D2: using all or part of the data, determined by the above method, on the feature words corresponding to a plurality of files or entries, or the database of feature word or category word content generated in operation D1, the retrieval system generates a feature word index, category word index, or classification index of the plurality of files or entries.
The feature word index means an index with which, given any selected feature word, the files corresponding to that feature word, or their entries, addresses, or related information, can be retrieved, accessed, or linked to.
The classification index means an index with which, given any selected category word, the files corresponding to that category word, or their entries, addresses, or related information, can be retrieved, accessed, or linked to.
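A feature word index of the kind described in operations D1 and D2 can be sketched as an inverted index from feature word to file identifiers. The input format, a list of `(file_id, word)` pairs, is an assumption made for this example.

```python
from collections import defaultdict


def build_feature_word_index(assignments):
    """Build an inverted index: feature word -> set of file identifiers."""
    index = defaultdict(set)
    for file_id, word in assignments:
        index[word].add(file_id)
    return dict(index)


index = build_feature_word_index([
    ("p1", "jazz"),
    ("p2", "jazz"),
    ("p1", "music"),
])
```

A classification index has the same shape, with category words as keys.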
The input and processing method of the present invention may further include: using the feature word classification or classification index of a plurality of files obtained by this method to replace or revise other existing classifications or classification indexes of those files.
The method of the present invention may further include: when accepting a query, the retrieval system uses the feature word index or classification index to provide retrieval or search results that meet the required feature words or category words. The results may include bibliographic entries, entry sequences, catalogues, or tree catalogues.
The input and processing method of the present invention may further include: when accepting a query, the retrieval system uses the feature word index or classification index together with the query word index or keyword index that the computer retrieval system uses to process the terminal user's retrieval request, to obtain or provide retrieval or search results that meet both the required feature words and the retrieval request. The results may include bibliographic entries, entry sequences, or catalogues.
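Combining the two indexes amounts to intersecting their result sets. The sketch below assumes both indexes map a word to a set of file identifiers; the function and parameter names are hypothetical.

```python
def search(keyword_index, feature_index, query_word, feature_word):
    """Return files that match the query word and carry the feature word."""
    by_query = keyword_index.get(query_word, set())
    by_feature = feature_index.get(feature_word, set())
    return sorted(by_query & by_feature)


keyword_index = {"miles": {"p1", "p2", "p3"}}
feature_index = {"jazz": {"p1", "p3"}, "unit conversion": {"p2"}}
```

Here the query word "miles" alone returns three files, while adding the feature word "jazz" narrows the result to the two files users have tagged accordingly.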
The input and processing method of the present invention may further include:
Operation E: when providing a search service, the computer retrieval system provides the user terminal with a sequence of bibliographic entries of a plurality of files according to the retrieval request made by the network user; near some or all of the entries in the sequence, there may be prompts showing one or more feature words belonging to each entry or to the file it belongs to.
The feature word prompt may be the feature word itself or a prompt containing the feature word.
The method allows feature word prompts to be added, removed, or replaced according to the terminal user's operation.
The feature word prompt may be a prompt of a keyword reflecting the content features of the corresponding entry or file, or a prompt of a keyword appearing in the corresponding entry or file; if required, keyword prompts may appear between the lines of the entries of operation E.
The feature word prompt may also be a category word prompt, and may be a category prompt of a single-level or multi-level classification system.
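Operation E can be pictured as decorating each result line with its feature word prompts. The text layout used below (prompts in square brackets after the title) is purely an assumption for illustration.

```python
def render_results(entries, words_by_file):
    """Append feature word prompts, where known, to each entry title."""
    lines = []
    for file_id, title in entries:
        words = sorted(words_by_file.get(file_id, ()))
        prompt = f" [{', '.join(words)}]" if words else ""
        lines.append(f"{title}{prompt}")
    return lines
```

Entries without any recorded feature word are shown unchanged, which matches the statement that prompts accompany only some or all of the entries.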
The input and processing method of the present invention may further include:
Operation F: each feature word prompt near the entries of operation E may be linked to a sequence of entries of a number of other files; some or all of the entries in the linked sequence, or the files they belong to, each have at least one feature word that is identical to the feature word in the prompt from which the sequence is linked.
If required, some or all of the entries in the sequence linked in operation F, or the files they belong to, may also be required to meet the retrieval request originally made by the user in operation E.
The input and processing method of the present invention may further include operation G: near the sequence of file entries that the computer retrieval system provides to the user terminal according to the network user's retrieval request, there is a navigation directory made up of a number of feature word prompts; each feature word prompt may be linked to a different sequence containing entries of a number of files; some or all of the entries in each linked sequence, or the files they belong to, each have at least one feature word identical to the feature word in the prompt from which the sequence is linked.
If required, some or all of the entries in the sequences linked in operation G, or the files they belong to, may also be required to meet the retrieval request originally made by the querying user.
The feature word prompts of the navigation directory may be prompts of keywords reflecting the content features of the corresponding entries or files, prompts of keywords appearing in the corresponding entries or files, or category word prompts.
The navigation directory may be a single-level or multi-level directory. After a feature word at an upper level of the directory has been selected, a number of candidate feature word prompts at the next level may be displayed automatically.
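The linked sequences of operations F and G can be sketched by grouping the current results by feature word. This example covers the optional case in which linked sequences are restricted to files that also meet the original retrieval request; the function name and data shapes are assumptions.

```python
from collections import defaultdict


def navigation_directory(result_ids, words_by_file):
    """Map each feature word prompt to the result files that carry it."""
    directory = defaultdict(list)
    for file_id in result_ids:            # keeps the original result order
        for word in words_by_file.get(file_id, ()):
            directory[word].append(file_id)
    return dict(sorted(directory.items()))
```

Each key corresponds to one prompt in the navigation directory, and its value is the sequence a click on that prompt would link to.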
The method allows the feature word prompts of this directory to be added, removed, or replaced according to the terminal user's operation.
The method also allows feature word prompts or a navigation directory to be provided near the entries or entry sequences linked to or displayed in operations F and G, so that updated entry sequence results can be linked to or displayed on clicking.
The method of the present invention provides a substantive solution to the problem of determining feature words for the hundreds of millions of web page entries, from a vast number of different websites, that a search engine system can collect. Any internet user, including network system staff and in particular the provider, author, or promoter of a web page, who sees in the entry sequence of a search engine's keyword search results a file entry related to their own interests, can use the technique of the present invention to determine or input feature words, keywords, or category words for that file very conveniently. Web pages with several accurate feature words are more easily found on the first search, so most valuable web pages will have relevant experts determining feature words for them. The method can also ensure that input from people connected with a file is adopted preferentially. On the basis of the present invention, a search engine system can provide high-quality feature word retrieval, and even multi-level classified retrieval, for a large proportion of high-quality web pages, obtain highly concentrated search results, greatly improve the efficiency of online search for internet users, and solve a problem that has troubled them for years. The technology therefore has outstanding practical value and effect.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of an environment suitable for embodiments of the present invention.
Fig. 2 is a schematic diagram illustrating the input of feature words on the entry sequence page in one embodiment of the present invention.
Fig. 3 is a schematic diagram of the feature word prompts (keyword prompts) attached to the entries, or to the files they belong to, on the entry sequence page of a user's query results, together with the navigation directory, in one embodiment of the present invention.
Fig. 4 is a schematic diagram of the feature word prompts (multi-level category word prompts) attached to the entries, or to the files they belong to, on the entry sequence page of a user's query results, together with the navigation directory, in another embodiment of the present invention.
Fig. 5 is a schematic flow chart of the method of implementing one embodiment of the present invention.
Detailed Description of the Embodiments
Specific implementations are described below with reference to the drawings. The search engine system 101 is a special type of computer retrieval system 102. Both communicate with the user terminal 104 via the internet 103 (see Fig. 1).
In the embodiments of Figs. 2, 3, and 4: 201 is the query field for entering query words; 202 is a bibliographic entry; 203 is the feature word input field; 204 is the feature word judgment operation mark; 205 is the cursor operated by a mouse device; 206 is the reference list; 208 is the words selected and clicked with the cursor as described in operation mode one; 301 is a feature word prompt (keyword prompt); 302 is the selection operation mark; 303 is the add-item operation mark; 304 is the navigation directory; 401 is a feature word prompt (category word prompt).
For example, implementing the method (see Fig. 5) starts with operation A. The retrieval system or search engine system first accepts the retrieval request that the network user or terminal user enters in the query field 201 (step 501) and provides the query search service to the user terminal, that is, provides it with a sequence made up of, or including, bibliographic entries 202 of a plurality of files that meet the retrieval request and originate from a plurality of websites (step 502).
The file may be a web page and may include text content, image content, audio content, or video content.
The bibliographic entry may be the title of the file, its abstract, the title plus the abstract, part of the content, or stored content such as a web page snapshot or cached page.
The entry of a file may also include other kinds of content, for example a thumbnail of an image, a syllable or score fragment, a fragment or abridged version of audio or video, or a screenshot or part of a screenshot.
The method of the present invention is of particular importance for classifying, or building classification indexes for, web pages or files with image, audio, or video content.
The method also requires operation B: the computer system determines the feature words input by the terminal user (step 503).
The feature words are words identified or entered by the terminal user that reflect the features of the corresponding entry or file, such as keywords or category words. They may be characters, symbols, musical notes, graphics, or graphic marks and, where needed, may for example be syllables or score fragments related to an audio or video file.
There are three main ways of inputting feature words (or keywords or category words). In the first, the words (208) selected and clicked with the cursor in an entry of operation A, or in the content of the file it directly links to, are taken as the input feature words.
"Selecting and clicking" may mean dragging the cursor over the relevant words while the mouse button is held down, or any other agreed operation. In a specific implementation it is preferable to click the feature word judgment operation mark 204 before or after this, or otherwise to put the terminal page into feature word operating mode, so that the computer can recognise the operation.
The feature word judgment operation mark (operation mark for short) is a piece of text, sign, graphic, or graphic button that is clicked to enter feature word operating mode, or that indicates the entry or file to which a feature word judgment applies, or that links to the candidate feature word catalogue or to other related operations. Examples are the wording "set feature word" 204 in Fig. 2, or "link to feature word catalogue", "classification operation mark", "take part in classification", and so on.
In the second way of inputting feature words, the words selected and clicked with the cursor in a candidate feature word catalogue 206 (such as the "reference list" 206 in Fig. 2), which is presented on or directly linked from the page where the entry sequence of operation A is located or a page directly linked from that page, are taken as the input feature words.
A page or catalogue directly linked from a page means a page or catalogue linked from an entry, a feature word judgment operation mark, a catalogue title, a prompt, or any other term or content on the page where the entry sequence is located.
If required, the candidate feature word catalogue may be made to appear on the page when the terminal page is in feature word operating mode, or at other times as needed.
In the third way, a feature word input field 203 or input box is provided on the page where the entry sequence of operation A is located or on a page directly linked from that page, and the computer system determines the input feature words from the content entered in this field.
In this method, the content of the feature word input field may come from the keyboard, may be pasted from words on the page where the entries of operation A are located or on a page directly linked from that page, or may be entered into the input field 203 by clicking or swiping with the cursor over the required words in the entry, the file, or the candidate feature word catalogue.
In practice, the feature word input field may be an input box, or a local space on the page near a corresponding mark or prompt (for example "feature word input", "feature word", "keyword", or "category").
Whether the terminal page is in feature word operating mode may be set in advance by the query system or selected by the terminal user with a click. If required, it may also be stipulated that the terminal page enters or is in feature word operating mode once the feature word judgment operation mark 204, the candidate feature word catalogue 206, or the feature word input field 203 on the page has been clicked or contains input.
The method also requires the computer system to determine, in one of the following manners applied on the page containing the bibliographic record sequence or on a page directly linked from it, the bibliographic record or file that corresponds to the feature word input in operation B (operation C) (flow 503).
Specifically, mode I may be used: the bibliographic record 202 or file containing the word 208 that was selected or clicked with the cursor under operation mode one of operation B is taken as the bibliographic record or file corresponding to the input feature word. In this case the terminal page should be in the feature word operation mode, to avoid confusion with other link operations.
Or mode II: the bibliographic record 202 or file that is clicked is taken as the bibliographic record or file corresponding to the input feature word. In this case the terminal page is preferably in the feature word operation mode.
Or mode III: the bibliographic record or file near the clicked feature word judgment operation marker 204 is taken as the bibliographic record or file corresponding to the input feature word. In this case bibliographic records or files preferably correspond one-to-one with feature word judgment operation markers 204.
Or mode IV: the bibliographic record or file that is nearest to the feature word input field on the page containing that field, or that lies in a specified direction from the field (for example), is taken as the bibliographic record or file corresponding to the input feature word.
Or mode V: the only bibliographic record or file on the page containing the feature word input field is taken as the bibliographic record or file corresponding to the input feature word.
Or mode VI: the bibliographic record or file that is nearest to the candidate feature word catalogue of operation mode two of operation B on the page containing that catalogue, or that lies in a specified direction from the catalogue (for example to its left or right), is taken as the bibliographic record or file corresponding to the input feature word.
Or mode VII: the only bibliographic record or file on the page containing the candidate feature word catalogue of operation mode two of operation B is taken as the bibliographic record or file corresponding to the input feature word.
In practice, the order of operations B and C and the operating rules for terminal users can be arranged as needed.
The candidate feature word catalogue may consist of a number of words that the terminal user can consult or select when inputting feature words. Depending on the kind of candidate feature words, the catalogue may be titled "reference list", "category catalogue", "keyword suggestions", or the like.
In one embodiment, a label such as "recommended keywords:" or "selected category words:" is placed below each bibliographic record to form an input field for the user. To prevent accidental operation, a "selection complete" label may follow the input field, to be clicked for confirmation. The user then only needs to type or paste a keyword or category word into the input field of the relevant bibliographic record and click "selection complete" to finish determining the feature word of that file. This embodiment uses operation mode three and mode IV.
In another embodiment of the method, the label "classify" (a feature word judgment operation marker) appears below or at the end of each bibliographic record. When the user clicks this label, the top-level category words of a candidate category word catalogue appear at one side of the page. When the user clicks one of them, the category words of the next level within that category appear in the catalogue for the user to click. This continues level by level; when the selection is finished the user clicks the "selected" label, and the system automatically records each category word of the multi-level classification for this bibliographic record. This embodiment uses operation mode two and mode III.
In a concrete implementation, it is also possible to use operation mode one with mode I,
or operation mode two with mode II,
or operation mode two with mode VI,
or operation mode two with mode VII,
or operation mode three with mode III,
or operation mode three with mode V,
or operation mode three with mode VI,
or operation mode three with mode VII, to determine the feature word, keyword, or category word of the corresponding bibliographic record or file in the bibliographic record sequence.
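The choice among modes I to VII in operation C can be illustrated with a minimal Python sketch. It is not the patent's implementation: the `Record` type, the `resolve_target` function, and the grouping of the seven modes into three strategies ("clicked" for modes I to III, "nearest" for modes IV and VI, "unique" for modes V and VII) are assumptions made for illustration, and page layout is reduced to a single vertical position.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Record:
    """Hypothetical bibliographic record shown on a results page."""
    record_id: str
    position: int  # vertical position of the record on the page


def resolve_target(
    strategy: str,
    records: Sequence[Record],
    *,
    clicked_id: Optional[str] = None,
    anchor_position: Optional[int] = None,
) -> Optional[Record]:
    """Pick the record that an input feature word should be attached to.

    strategy is an assumed grouping of the patent's modes:
      "clicked" - the record the user clicked (modes I, II, III)
      "nearest" - the record closest to the input field or catalogue (IV, VI)
      "unique"  - the only record on the page (V, VII)
    """
    if strategy == "clicked":
        return next((r for r in records if r.record_id == clicked_id), None)
    if strategy == "nearest":
        if anchor_position is None or not records:
            return None
        return min(records, key=lambda r: abs(r.position - anchor_position))
    if strategy == "unique":
        return records[0] if len(records) == 1 else None
    raise ValueError(f"unknown strategy: {strategy}")
```

For example, with records at positions 0, 10, and 20 and an input field at position 12, the "nearest" strategy selects the record at position 10; the "unique" strategy returns nothing unless the page holds exactly one record.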
The method allows one or more sets of curated candidate feature word catalogues, keyword catalogues, category word catalogues, or multi-level category word catalogues to be provided for selection at the user terminal.
In general, the feature words of a bibliographic record can be regarded as identical or similar to those of the file it belongs to; the feature words of the file can therefore be obtained directly from those of the record, or vice versa.
Clearly, the feature word input by this method is the judgment information that the clicking operator at the terminal enters about the features of the relevant bibliographic record or file, or the feature word corresponding to that information.
The method may further provide that the associated computer system can, in its database, accept, consult, process, or reject the feature judgment suggestions, feature words, or category words input by terminal users.
In this way, operations A, B, and C input the feature word, keyword, or category word information that the clicking operator at the terminal assigns to the relevant bibliographic record or file, i.e. 504 "determine the feature words of the record or file". The computer system or retrieval system can use this information directly, but the input classification information may also need to be processed.
Clearly, determining the feature words of each file from Internet users' click selections raises a problem: what should be done if several users or terminal operators have made different choices? This is the problem addressed by flow 505 of Fig. 5, "process differing input suggestions".
When the retrieval system, faced with possibly contradictory suggestions input by users or terminal operators, determines or records the feature word or category word corresponding to any bibliographic record or file, the principle it follows may take into account at least one or more of the following factors:
(1) The degree of similarity between the name of the user making the judgment, or the address of that user's website, and the name or address of the file's provider or the link address of the file;
The more similar they are, the more likely it is that the user making the classification choice is the provider of the original file.
(2) The number of users making the same judgment;
The more users share an opinion, the more reliable it is.
(3) The time at which a judgment was made;
To form the category index quickly, the system cannot wait too long; later revisions, however, may be more pertinent.
(4) The past accuracy or rating of the click selections made by the user, or made from the same network address;
The suggestions of highly rated users should be given more weight.
(5) The degree to which the selected feature word agrees with the results of other manual selection methods, computer selection methods, or selection systems;
This allows existing results to be used as a reference and avoids excessive changes.
(6) Whether the judgment was made by operators or staff of the retrieval system, or is similar to theirs;
(7) Whether the user or terminal making the judgment is registered with the website or web page on which the feature word judgment or selection is carried out.
In practice, factors (1), (6), or (7) can be given priority when needed, with the other factors considered afterwards.
An algebraic expression for an objective function can also be written whose variables include at least one or more of the seven factors above. The priority of the different classifications can then be determined from the value of the objective function.
The feature words (keywords in particular) of any bibliographic record or file may be numerous; they can be prioritised with reference to the factors above, and the maximum number retained or displayed can be limited appropriately.
In practice, there need not be only one category word or classification choice at a given level for any bibliographic record or file; there may be two or more, and they may be prioritised. The number of category words at each level for any bibliographic record or file can be limited, for example to 2 or 3.
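One way to picture such an objective function is a weighted sum over some of the factors, followed by a cap on the number of feature words kept. The Python sketch below is illustrative only: the patent does not specify a formula, so the `Suggestion` fields, the weight values, and the linear form are all assumptions. Each suggestion contributes a base score of 1, so that the number of users making the same judgment (factor 2) raises the total for that word.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, List

# Assumed weights; the patent only says that such a function may be written.
W_SIMILARITY = 2.0   # factor (1): user/provider name or address similarity
W_ACCURACY = 1.0     # factor (4): past accuracy or rating of the user
W_STAFF = 3.0        # factor (6): judgment made by retrieval-system staff
W_REGISTERED = 0.5   # factor (7): user is registered on the site


@dataclass(frozen=True)
class Suggestion:
    """One user's proposed feature word for a record (hypothetical type)."""
    word: str
    source_similarity: float  # 0.0 .. 1.0
    user_accuracy: float      # 0.0 .. 1.0
    is_staff: bool
    is_registered: bool


def score(s: Suggestion) -> float:
    """Score a single suggestion; the base of 1.0 counts the vote itself."""
    return (
        1.0
        + W_SIMILARITY * s.source_similarity
        + W_ACCURACY * s.user_accuracy
        + (W_STAFF if s.is_staff else 0.0)
        + (W_REGISTERED if s.is_registered else 0.0)
    )


def rank_feature_words(suggestions: Iterable[Suggestion], max_words: int = 3) -> List[str]:
    """Sum scores per word and keep at most max_words, highest total first."""
    totals = defaultdict(float)
    for s in suggestions:
        totals[s.word] += score(s)
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(totals, key=lambda w: (-totals[w], w))
    return ranked[:max_words]
```

The `max_words` parameter corresponds to the limit of, for example, 2 or 3 category words per level mentioned above. Factors (3) and (5) are omitted here for brevity and could be added as further weighted terms.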
The method of the invention may further comprise 506 "form file feature word data and index": from all or part of the data on the feature words determined by the method for a plurality of bibliographic records or files, the retrieval system generates a database that contains the feature word content of a plurality of files or bibliographic records, or that classifies them according to the similarities and differences of their feature words or category words; and it generates a feature word index, category word index, category index, or keyword index of a plurality of files or bibliographic records, or the commonly known inverted index or inverted bibliographic record list for feature words, keywords, or category words.
The feature word index is an index that can be used, for any selected feature word, to retrieve, access, or link to the files corresponding to that feature word, or to their bibliographic records, addresses, or related information.
The category index can likewise be used, for any selected category word, to retrieve, access, or link to the files corresponding to that category word, or to their bibliographic records, addresses, or related information.
If desired, classified files, bibliographic databases, or multi-level category indexes containing a number of different subclasses or multi-level subclasses can also be generated according to the differences in the feature words, keywords, or category words of each bibliographic record or file.
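The feature word index described above behaves like a conventional inverted index: each feature word maps to the records that carry it. A minimal Python sketch follows, assuming the determined feature words are available as a mapping from a record identifier to a list of words; the function names and the data layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def build_feature_word_index(
    record_words: Mapping[str, Iterable[str]],
) -> Dict[str, List[str]]:
    """Invert {record_id: feature words} into {feature word: record_ids}."""
    index = defaultdict(set)
    for record_id, words in record_words.items():
        for word in words:
            index[word].add(record_id)
    # Sorted lists give a stable order for display and testing.
    return {word: sorted(ids) for word, ids in index.items()}


def lookup(index: Mapping[str, List[str]], word: str) -> List[str]:
    """Return the record ids filed under a feature word, or an empty list."""
    return list(index.get(word, []))
```

A category index can be built the same way by passing category words instead of keywords; a multi-level index could use tuples of category words as keys.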
The method of the invention may further comprise: using the feature word classification or category index generated by this method for a plurality of files to replace or revise another existing classification or category index of those files.
The method may further comprise: when accepting a query, the retrieval system uses the feature word index or category index to provide retrieval or search results that satisfy the required feature word or category word. The results may comprise a bibliographic record, a bibliographic record sequence, a catalogue, or a tree catalogue.
In flow 504 or flow 506, if the terminal user wishes to begin feature word judgment on other bibliographic records or files, the process can return (flow 510) to flow 501.
Clearly, the purpose of the invention is not merely to build a feature word database or inverted feature word index of the relevant files. The method of the invention also covers any bibliographic record search that makes use of these indexes or data.
The method may therefore further comprise:
The computer retrieval system provides the user terminal with a bibliographic record sequence of a plurality of files in response to a search request from a network query user; near some or all of the bibliographic records of the sequence, a prompt (301 or 401) showing the feature words of the record or of its file may be displayed (flow 507).
The feature word prompt may be a keyword prompt 301 related to the bibliographic record or its file (Fig. 3).
The feature word prompt of each bibliographic record or its file may be a single-level or multi-level category prompt 401 (see Fig. 4).
A multi-level category word prompt displays several category words, category names, category entries, or symbols or graphic buttons representing categories, each belonging to a different level of classification and each applicable to the bibliographic record or its file.
Clearly, every category word in such a multi-level prompt, whatever the size of its category, is a category word to which the bibliographic record or its file belongs. Compared with displaying an ordinary tree catalogue or navigation catalogue, this not only takes up far less space but is also directly targeted at, and indicative of, the bibliographic record concerned.
For example, suppose a file or bibliographic record belongs to the category "physics", a subclass of the category "science", which is in turn a subclass of the major category "Xue Zhi". A label 401 such as "Xue Zhi; Academic; Physics" then appears near the bibliographic record as its multi-level category word prompt.
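The multi-level prompt in this example can be sketched in Python as a category path rendered with a separator, with each level paired with the part of the path it would search on when clicked. The function names are hypothetical, and the choice to link each level to the path prefix ending at that level is an assumption for illustration; the patent only states that each prompt can link to records sharing that category word.

```python
from typing import List, Sequence, Tuple


def category_prompt(path: Sequence[str], separator: str = "; ") -> str:
    """Render a category path, broadest level first, as a one-line prompt."""
    return separator.join(path)


def prompt_links(path: Sequence[str]) -> List[Tuple[str, Tuple[str, ...]]]:
    """Pair each level's label with the category prefix it would link to.

    Clicking a broad level links to the broad category; clicking a narrow
    level links to the full path down to that subclass.
    """
    return [(label, tuple(path[: i + 1])) for i, label in enumerate(path)]
```

With the path `["Xue Zhi", "Academic", "Physics"]`, `category_prompt` produces the label quoted above, and `prompt_links` yields one clickable entry per level.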
The keyword prompt related to the bibliographic record or its file need not be the original query keyword; preferably it is a significant keyword reflecting the characteristics or content that distinguish the record or its file from many other bibliographic records in the original sequence.
There are several ways of adding or displaying, near a bibliographic record, the multi-level feature word, keyword, or category word prompt of the record or its file. One is to use the address or network address of the file attached to the record to access the file, obtain the feature word, keyword, or category word information of the file (using the result of 506), and add it near the original record. Another is, when generating the inverted keyword or query-word bibliographic record list for files that carry their own feature word information, to attach the multi-level feature word, keyword, or category word information of the original file to each record directly, so that it is displayed with the record. Other methods may also be used.
Each feature word prompt near a bibliographic record can be linked 509 to a sequence of bibliographic records of other files; for some or all of the records in the linked sequence, or their files, the feature word (or keyword or category word) is identical to the feature word (or keyword or category word) of the prompt that links to the sequence, and the records may or may not satisfy the search request originally made by the user.
For example, when a searching user clicks one of the feature word prompts offered, a new sequence of file bibliographic records is obtained that belong to that feature word and satisfy the original search request (flow 509). The search scope can thus be greatly narrowed or freely adjusted to obtain the query results and files needed.
Clearly, the bibliographic records in the new sequence obtained for this feature word may themselves carry several different feature word, category word, or keyword prompts for the record or its file; each of these prompts can in turn be linked to a sequence of bibliographic records of other files related to it, and so on.
Existing retrieval techniques sometimes also provide a multi-level classification catalogue within a particular field (such as the IC catalogue for patent documents), but ordinary non-specialist users often cannot accurately grasp the meaning or exact coverage of each category word and frequently choose the wrong category, which seriously impairs retrieval performance.
Some search engine systems provide prompts or links such as "similar pages" or "same website" at the end of each bibliographic record in the search results, but the results obtained are too general or disorderly, and their usefulness is very limited.
By contrast, the method of the invention, which displays multi-level feature word prompts near the bibliographic records provided in response to a query, can be of great convenience to the person querying. On seeing a record of interest, a user who wants a sequence of records in the same major category (a higher-level classification) as that record can click the higher-level feature word or category word in the prompt (for example "Xue Zhi" above); a user who wants a sequence of records in the same subclass (a lower-level classification) can click the lower-level feature word in the prompt directly (for example "physics" above). The accuracy and flexibility of the user's click selection are both preserved, query efficiency is greatly improved, and so is the user's query experience.
The link between a category word prompt or keyword prompt of the invention and the new bibliographic record sequence may be direct or indirect (509).
The prompt may first link to a query search in which the feature word, category word, or keyword of the prompt is added to the original query, thereby obtaining the required bibliographic record sequence.
Alternatively, the prompt may first link to a further query search, carried out on the bibliographic record sequence resulting from the original query, that uses the feature word, category word, or keyword in the prompt as its query condition, thereby obtaining the required bibliographic record sequence.
If desired, records of the new bibliographic record sequence that did not appear in the original query result sequence can be placed at the end of the new sequence or moved suitably further back.
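The second linking strategy, combined with the ordering rule just described, can be sketched in Python as follows. The sketch assumes a feature word index that maps each feature word to a list of record identifiers; the function name and data shapes are hypothetical. Records that match the clicked feature word and were in the original results come first, in their original order, and matching records that were not in the original results are appended at the end.

```python
from typing import List, Mapping, Sequence


def refine_by_feature_word(
    original_results: Sequence[str],
    feature_index: Mapping[str, Sequence[str]],
    word: str,
) -> List[str]:
    """Build the new record sequence for a clicked feature word prompt."""
    matching = set(feature_index.get(word, []))
    # Records satisfying both the original query and the feature word,
    # kept in the order of the original result sequence.
    in_original = [r for r in original_results if r in matching]
    # Records carrying the feature word that the original query did not
    # return are moved to the back, in a stable (sorted) order.
    extras = sorted(matching - set(in_original))
    return in_original + extras
```

Dropping the `extras` part gives the variant in which the new sequence must still satisfy the original search request.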
It can be arranged, if desired, that a navigation catalogue composed of several prompts (flow 508) appears near the bibliographic record sequence that the computer retrieval system provides to the user terminal in response to a network query user's search request, each feature word prompt being linked to a different sequence comprising bibliographic records of a plurality of files. That is, if during a search the user clicks a feature word in this catalogue (optionally followed by a click on a "search" or "confirm" control, or a control of another name), a new sequence 509 of file bibliographic records matching that feature word is obtained; the feature word of the files of the records in this sequence is identical to the (clicked) feature word in the prompt that links to the sequence, and the records may or may not still satisfy the search request originally made by the user.
The navigation catalogue may be a single-level or multi-level catalogue. It may be arranged that the next level of candidate categories is displayed automatically before or after a selection at the level above is confirmed.
The feature word prompts of the navigation catalogue may be category word prompts or keyword prompts.
The link between a feature word prompt of the navigation catalogue and the new bibliographic record sequence may be direct or indirect. The prompt may first link to a query search in which the keyword of the prompt is added to the original query terms, thereby obtaining the required bibliographic record sequence. Alternatively, the prompt may first link to a further query search, carried out on the bibliographic record sequence resulting from the original search request, that uses the feature word in the prompt as its query condition, thereby obtaining the required new bibliographic record sequence. If desired, records of the new sequence that did not appear in the original query result sequence can be placed at the end of the new sequence or moved suitably further back. If desired, flow 507 or 508 can be repeated on the bibliographic record sequence of flow 509, so that it carries its own feature word prompts or navigation catalogue, which link to, or can be clicked to reveal, further updated bibliographic record sequences.
After the search is finished, the searcher can return (flow 510) and resume operation.
The above is an illustrative description of the method of the invention and is not to be used to limit the scope of rights of the invention.

Claims (10)

  1. A computer-executed method for inputting and processing file feature judgment information from network terminal users, comprising:
    Operation A: in response to a search request from a terminal user, a computer retrieval system provides the user terminal with a bibliographic record sequence made up of bibliographic records of files that satisfy the search request and originate from a plurality of websites;
    Operation B: the computer system determines the input feature word according to a prescribed operation mode on the page containing the bibliographic record sequence or on a page directly linked from that page;
    wherein the prescribed operation mode is one of the following operation modes:
    Operation mode one: the word selected or clicked with the cursor in a bibliographic record of operation A, or in the content of the file it belongs to, is taken as the input feature word;
    Operation mode two: the word selected or clicked with the cursor in a candidate feature word catalogue, presented on or directly linked from the page containing the bibliographic record sequence of operation A or a page directly linked from that page, is taken as the input feature word;
    Operation mode three: a feature word input field is provided on the page containing the bibliographic record sequence of operation A or on a page directly linked from that page, and the computer system determines the input feature word from the content entered in this input field;
    the input and processing method further comprising:
    Operation C: the computer system determines the bibliographic record or file corresponding to the feature word input in operation B according to a prescribed manner on the page containing the bibliographic record sequence or on a page directly linked from that page;
    wherein the prescribed manner is one of the following:
    Mode I: the bibliographic record or file containing the word selected or clicked with the cursor under operation mode one of operation B is taken as the bibliographic record or file corresponding to the input feature word;
    Mode II: the bibliographic record or file that is clicked is taken as the bibliographic record or file corresponding to the input feature word;
    Mode III: the bibliographic record or file near the clicked feature word judgment operation marker is taken as the bibliographic record or file corresponding to the input feature word;
    Mode IV: the bibliographic record or file that is nearest to the feature word input field on the page containing that field, or that lies in a specified direction from the field, is taken as the bibliographic record or file corresponding to the input feature word;
    Mode V: the only bibliographic record or file on the page containing the feature word input field is taken as the bibliographic record or file corresponding to the input feature word;
    Mode VI: the bibliographic record or file that is nearest to the candidate feature word catalogue of operation mode two of operation B on the page containing that catalogue, or that lies in a specified direction from the catalogue, is taken as the bibliographic record or file corresponding to the input feature word;
    Mode VII: the only bibliographic record or file on the page containing the candidate feature word catalogue of operation mode two of operation B is taken as the bibliographic record or file corresponding to the input feature word.
  2. The input and processing method according to claim 1, wherein the feature word is a keyword reflecting the content features of the corresponding bibliographic record or file.
  3. The input and processing method according to claim 1, wherein the feature word is a word that occurs in the corresponding bibliographic record or file.
  4. The input and processing method according to claim 1, wherein the feature word is a category word reflecting the content category of the corresponding bibliographic record or file.
  5. The input and processing method according to claim 1, further comprising: when the computer system determines, from suggestions input by terminal users, the feature word corresponding to any bibliographic record or file, the principle it follows takes into account at least one or more of the following factors:
    (1) the degree of similarity between the name of the user making the judgment, or the address of that user's website, and the name or address of the file's provider or the link address of the file;
    (2) the number of users making the same judgment;
    (3) the time at which a judgment was made;
    (4) the past accuracy or rating of the click selections made by the user, or made from the same network address;
    (5) the degree to which the selected feature word agrees with the results of other manual selection methods, computer selection methods, or selection systems;
    (6) whether the judgment was made by operators or staff of the retrieval system, or is similar to theirs;
    (7) whether the user or terminal making the judgment is registered with the website or web page on which the feature word judgment or selection is carried out.
  6. The input and processing method according to claim 1, further comprising:
    Operation D1: from the data on the feature words determined by the method for a plurality of bibliographic records or files, the retrieval system generates a database containing the feature word content of a plurality of files or bibliographic records.
  7. The input and processing method according to claim 1, 5, or 6, further comprising:
    Operation D2: from the data on the feature words of a plurality of files or bibliographic records determined by the method, or from the database generated in operation D1 containing the feature word content of a plurality of files or bibliographic records, the retrieval system generates a feature word index of a plurality of files or bibliographic records.
  8. The input and processing method according to claim 1, further comprising:
    Operation E: when providing a search service, the computer retrieval system provides the user terminal with a sequence of bibliographic records of a plurality of files in response to a search request from a network query user; near some or all of the bibliographic records of the sequence, a prompt of one or more feature words of the record or of its file is displayed.
  9. The input and processing method according to claim 1, further comprising:
    Operation F: each feature word prompt near the bibliographic records of operation E is linked to a sequence of bibliographic records of other files; some or all of the bibliographic records in the linked sequence, or the files they belong to, each have at least one feature word identical to the feature word in the prompt that links to the sequence.
  10. The input and processing method according to claim 1, further comprising:
    Operation G: a navigation catalogue composed of a plurality of feature word prompts is displayed near the sequence of file bibliographic records that the computer retrieval system provides to the user terminal in response to a network query user's search request, wherein each feature word prompt is linked to a different sequence comprising bibliographic records of a plurality of files, and some or all of the bibliographic records in the linked sequence, or the files they belong to, each have at least one feature word identical to the feature word in the prompt that links to the sequence.
CN2009102104625A 2009-07-17 2009-11-03 Input and process method of feature words of file content Pending CN101957831A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009102104625A CN101957831A (en) 2009-07-17 2009-11-03 Input and process method of feature words of file content

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN200910158038 2009-07-17
CN200910158038.0 2009-07-17
CN2009102104625A CN101957831A (en) 2009-07-17 2009-11-03 Input and process method of feature words of file content

Publications (1)

Publication Number Publication Date
CN101957831A true CN101957831A (en) 2011-01-26

Family

ID=43485163

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009102104625A Pending CN101957831A (en) 2009-07-17 2009-11-03 Input and process method of feature words of file content

Country Status (1)

Country Link
CN (1) CN101957831A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106202146A (en) * 2012-07-16 2016-12-07 刘二中 A kind of search engine terminal use inputs the processing method of reference paper Search Hints information
CN106202146B (en) * 2012-07-16 2019-04-16 刘二中 A kind of search engine terminal user inputs the processing method of reference paper Search Hints information
WO2014206186A1 (en) * 2013-06-28 2014-12-31 百度在线网络技术(北京)有限公司 Method and device for generating entry information
CN104252487A (en) * 2013-06-28 2014-12-31 百度在线网络技术(北京)有限公司 Method and device for generating entry information
CN104252487B (en) * 2013-06-28 2019-05-03 百度在线网络技术(北京)有限公司 A kind of method and apparatus for generating entry information

Similar Documents

Publication Publication Date Title
CN101694666B (en) Method for inputting and processing characteristic words of file contents
US11741173B2 (en) Related notes and multi-layer search in personal and shared content
US9864808B2 (en) Knowledge-based entity detection and disambiguation
US10235470B2 (en) User retrieval enhancement
JP6057476B2 (en) System, method and software for identifying relevant legal documents
US20020073079A1 (en) Method and apparatus for searching a database and providing relevance feedback
CN104794242B (en) Searching method
CN100501745C (en) Convenient method and system for electronic text-processing and searching
CN107016020A (en) The system and method for aiding in searching request using vertical suggestion
CN102375885A (en) Method and device for providing search suggestions corresponding to query sequence
JP2010055618A (en) Method and system for providing search based on topic
US20120323905A1 (en) Ranking data utilizing attributes associated with semantic sub-keys
CN107408107A (en) Text prediction is integrated
CN102314461B (en) Navigation prompt method and system
CN102063453A (en) Method and device for searching based on demands of user
US20160162583A1 (en) Apparatus and method for searching information using graphical user interface
CN101763424B (en) Method for determining characteristic words and searching according to file content
US20040193591A1 (en) Searching content information based on standardized categories and selectable categorizers
US20120323904A1 (en) Automatic generation of a search query
CN105975508B (en) Personalized meta search engine search result synthesizes sort method
CN101957831A (en) Input and process method of feature words of file content
Gretzel et al. Intelligent search support: Building search term associations for tourism-specific search engines
US20070174266A1 (en) Method of optimization of listed result of internet-based search and system based on the method
CN101692243A (en) Method for conveniently inputting and processing document class determination information
CN101692245A (en) Method for processing additional searching requirement input in retrieval system conveniently and quickly

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20110126