WO2005022410A1 - Search engine - Google Patents

Search engine

Info

Publication number
WO2005022410A1
Authority
WO
WIPO (PCT)
Prior art keywords
search engine
website address
website
internet
address
Prior art date
Application number
PCT/GB2004/003716
Other languages
French (fr)
Inventor
James Martyn Prince
Stephen Carr
Original Assignee
James Martyn Prince
Stephen Carr
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by James Martyn Prince, Stephen Carr
Publication of WO2005022410A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9538 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9566 URL specific, e.g. using aliases, detecting broken or misspelled links

Abstract

The present invention relates to an Internet search engine for locating a correct web address for a user, wherein only an incomplete or similar website address is known, the search engine being capable of locating the correct website address by using one or more of the following criteria: (a) the incomplete or similar website address; (b) the business that the website address is related to; and (c) other identifier information. The present invention also relates to a procedure for searching the Internet using the Internet search engine and to a procedure for indexing one or more website addresses.

Description

DESCRIPTION
SEARCH ENGINE
The present invention relates to an Internet search engine for locating a correct website address for a user wherein only an incomplete or similar website address is known.
Internet search engines are designed to help people find information stored on the Internet, and more particularly on websites, or the websites themselves. Due to the increasing number of websites currently available and the information contained on these sites, search engines frequently produce results which are irrelevant, and the user has to spend a considerable amount of time looking through the search results in order to find the required outcome. In order to help overcome this problem and produce results which are relevant to the user, search engines and companies providing search engine facilities have developed new ways of categorising websites and the data contained within them, in addition to how such information is presented.
Most search engines look at the contents of the webpage as well as the uniform resource locator (URL) in order to identify relevant matches for the search criteria. A search criteria is usually inputted into a single field and may comprise a number of key words or a more specific search string. Using key words, the search results frequently contain results which are irrelevant and/or only partially relevant. Although the use of search strings can help matters somewhat, they can often produce spurious results from a search engine and require the user to be more skilled at searching. For example, a search string may use standard wild cards in the search criteria, as one may use in order to search a database. A wild card such as "*" or "?" can therefore appear anywhere in the search string, but the search engine will only apply these characters in a search criteria applied to the whole website content after applying it to the website address. Therefore, should a user know only part of a website address, or a website address similar to the one which they are looking for, it is almost impossible to find that web address using an Internet search engine.
Search engines commonly rely upon software called "spiders" to build lists of words found on websites. The spider compiles its list by a process called web crawling, which involves indexing words on particular pages and following links found within the site. In this fashion, the spider quickly compiles indices of popular websites and also indexes those which would not necessarily be found. Metatags are also used by search engines in order to index and categorize websites; such tags are usually hidden code, labeled accordingly, which spiders may use in order to compile their indices. Although the use of such indexing and metatags can prove useful in search engines, they do not address the problem associated with locating a correct website address where only an incomplete or similar website address is known to the user.
EP 1128283 discloses an index system for electronic addresses for providing a virtual address of a trading entity with its geographical location, contact details for the trading entity and the trade or service category. Although the use of such an indexation system would be useful to find website addresses and contact details for companies and service providers in a chosen geographical location, the user would not necessarily know the name of the website address beforehand, and would presumably use the geographical location and service type in order to identify a suitable website.
Similarly, US 6,523,021 discloses a business directory search engine for efficiently searching directory listing information in order to obtain relevant results. This search engine relies upon a directory or database in order to locate a suitable website or service, the user again not knowing the website address beforehand.
WO 01/41001 discloses an automated method for information browsing in a networked environment based on dynamic content, such dynamic content being the product of a number of identifiers identifying information sources in order to assist a user in retrieving and browsing information over the network. The information source identifiers are dynamically assembled and based at least in part on the content of the first information page. The search engine appears to utilise unique nouns in order to determine the relevance of a given website or website page. Therefore, the website name may or may not be included in the search criteria, and the search engine would have problems in locating a correct website address for a user who only had an incomplete or similar website address, as it would also take into consideration key words on the front page of the website.
US 2002/0181466 discloses a system that completes a communication initiated with an insufficient, partial, incomplete or fuzzy destination address. The system searches information contained on a database in order to find records with similar address information; the records are compared with the original fuzzy address, and the record with the closest match is used as the record from which a complete address is obtained. The complete address is then used to forward or send the communication to the destination. The document appears to relate to the delivery of e-mails with an incorrect address to the correct address by means of looking up a directory or similar listing to establish the nearest address, and as such would require access to the mail server, which would more than likely only happen on an intranet based network. Although this document is not directed towards Internet search engines, the use of such a system would not necessarily result in the correct website being located, as a number of similarly named websites may relate to different businesses and services.
US 6,526,402 discloses a searching procedure using a search engine associated with a database and comprising submitting a request string comprising a locator (or URL) for the search engine and the search string including at least one search term. The locator will be an invalid address, so an error signal is generated, and the generation of such a signal is monitored and used to trigger parsing of the request into the locator and the search string. The search string is then submitted to the search engine having the specified locator and the data returned from the search engine is passed back to the user. Although such a search procedure may be effective in establishing whether an invalid locator has been submitted, it may provide results which are blank, as the locator provided may be wrong and relate to a different business or entity. For example, www.wpt.com may relate to a different business entity than www.wpt.co.uk. Similarly, www.wptk.co.uk may relate to car sales whereas www.wptkc.co.uk may relate to vending machines.
It is therefore an object of the present invention to overcome one or more of the problems associated with the prior art search engines. Furthermore, it is an
object of the present invention to provide an Internet search engine that can locate a correct website address for a user wherein only an incomplete or similar web address is known.
In accordance with the present invention, there is provided an Internet search engine for locating a correct website address for a user wherein only an incomplete or similar website address is known, the search engine being capable of locating the correct website address by using one or more of the following criteria:
(a) the incomplete or similar website address;
(b) the business that the website address is related to; and
(c) other identifier information.
Therefore, the present invention allows for the location of a correct website address from an incomplete or similar website address by utilising part or all of the known website address and optionally the type of business that the website address relates to, in order to establish two or more websites which the user is trying to locate. The optional use of additional identifier information further allows the results to be more prescriptive towards the search criteria.
In accordance with another aspect of the present invention there is provided an Internet search engine for locating a correct website address wherein only an incomplete or similar website address is known, the engine locating the correct website by using the following steps:
(a) allowing a user to input a search criteria comprising the incomplete or similar website address, and optionally the business that the website address relates to and/or other identifier information;
(b) sending the search criteria to one or more databases to interrogate data contained therein;
(c) receiving the search results; and
(d) providing the user with the correct website address or a list of possible website addresses.
The other identifier information may be information contained on a website page. Such information may be coded text, metatags or any other information provided on a given website page. The other identifier information may be information contained in a metatag and will more preferably be a key word.
The incomplete or similar website address may comprise part of a website address or a similar website address to that being located. Any part of the website address may be used and preferably it will comprise one or more of the following parts of a website address: a domain name, a subdomain, a URL (Uniform Resource Locator) and an IP (Internet Protocol) address. The incomplete or similar website address may additionally comprise a wild card. It will be evident that a number of wild cards may be used, such as "*" or "?". Such wild cards may be placed anywhere in the incomplete or similar website address (for example "www.*wpt.co.uk" or "www.wpt*.co.uk", which would provide website addresses ending in "wpt" or starting in "wpt" respectively).
The Internet search engine may comprise a user interface for allowing the information to be inputted into one or more search fields. Preferably, the information is converted to a search string after it has been inputted into a given field. The search string may be submitted to a database for interrogation of the data contained therein and may comprise wild cards such as "*" and "?". The wild cards may be input by the user or placed in the search string by the Internet search engine. Such a database may be produced from one or more Internet spiders. It will be evident to one skilled in the art that an Internet spider comprises a software program which trawls through the Internet and compiles databases and indexing information for data contained on websites. It will also be evident to one skilled in the art that a number of other methods for compiling and populating databases may be used, such as manual data entry.
The database may be hosted on an internal or external server or could potentially be a mixture of both. The database may be held on a server (or similar equipment) that is held on the Internet or on a client server. The database may be produced in a number of programming languages and it will be evident that the language used may reflect the functional requirement of the database and may also be in languages not yet developed. Preferably, the database is interrogated using SQL (Structured Query Language) or a similar language and may additionally have DNS access. Such a database may be populated with results from standard spiders. The database may contain data relating to website addresses which is indexable.
The search string used in the Internet search engine may be used to search the Internet. Therefore, the Internet search engine may act as a spider in order to locate the correct website address. This may be by using super computers and high volume Internet connections.
The Internet search engine may either provide a list of possible links to the correct website address or possible website addresses. Alternatively, the Internet search engine may automatically re-direct the user to the correct website address. Thus, the Internet search engine would allow seamless connection from the engine to the desired website address.
In accordance with a further aspect of the present invention, there is provided an Internet search engine as described herein above wherein the search engine is an executable computer program held on a computer readable means. A computer readable means may be a computer server linked to a network or the Internet; alternatively, such a computer readable means may comprise a storage medium such as a portable media device (e.g. a floppy disc). It will be evident to one skilled in the art that such an executable computer program may be coded in a number of languages commonly used for Internet website pages, and this may include languages which have not yet been developed. Preferably, the program will utilize one or more of the following languages: HTML (Hypertext Markup Language), XML (Extensible Markup Language), DHTML (Dynamic Hypertext Markup Language), JAVA and COBOL (Common Business Oriented Language). Preferably, such a search engine would utilise one or more of the following languages in order to interrogate a database: PHP, ASP (Active Server Pages), CGI (Common Gateway Interface), .NET, C, C++, C# or Pascal.
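
The interrogation of a database with a wild card search string, as described above, can be illustrated with a minimal sketch. It assumes a SQLite table named `sites` with `address` and `business` columns; the table layout, the function names `wildcard_to_like` and `find_addresses`, and the sample row `www.abcwpt.co.uk` are hypothetical and are not taken from the specification. The sketch maps "*" and "?" onto the SQL `LIKE` operators "%" and "_" and escapes any literal "%" or "_" in the input.

```python
import sqlite3


def wildcard_to_like(pattern: str) -> str:
    """Translate '*' and '?' wild cards into SQL LIKE syntax.

    Literal '%', '_' and backslash characters are escaped so that they
    are matched as ordinary characters.
    """
    out = []
    for ch in pattern:
        if ch == "*":
            out.append("%")          # any run of characters
        elif ch == "?":
            out.append("_")          # exactly one character
        elif ch in ("%", "_", "\\"):
            out.append("\\" + ch)    # escape LIKE metacharacters
        else:
            out.append(ch)
    return "".join(out)


def find_addresses(conn: sqlite3.Connection, pattern: str) -> list:
    """Return all stored addresses matching the wild card pattern."""
    rows = conn.execute(
        "SELECT address FROM sites WHERE address LIKE ? ESCAPE '\\' "
        "ORDER BY address",
        (wildcard_to_like(pattern),),
    )
    return [row[0] for row in rows]


# Hypothetical sample data; a real database would be populated by spiders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sites (address TEXT PRIMARY KEY, business TEXT)")
conn.executemany(
    "INSERT INTO sites VALUES (?, ?)",
    [
        ("www.wptk.co.uk", "car sales"),
        ("www.wptkc.co.uk", "vending machines"),
        ("www.abcwpt.co.uk", "printing"),
    ],
)

print(find_addresses(conn, "www.wpt*.co.uk"))  # addresses starting with "wpt"
print(find_addresses(conn, "www.*wpt.co.uk"))  # addresses ending in "wpt"
```

Passing the translated pattern as a bound parameter, rather than concatenating it into the SQL text, keeps user input out of the query itself.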
In accordance with yet another aspect of the present invention, there is provided a procedure for indexing one or more website addresses comprising data associated with the website address which relates to:
(a) an incomplete or similar website address;
(b) key words associated with the business that the website address is related to; and
(c) other identifier information.
A specific embodiment of the present invention will now be described, by way of example only, with reference to the accompanying figures:
Figure 1 illustrates a user interface which may be used in accordance with the present invention.
Figure 2 is a flow diagram to outline how the search engine may be used in accordance with the present invention.
With reference to Figure 1, there is provided a user interface which can be displayed on a computer screen or similar, which will allow access to the Internet search engine. The user interface 10 comprises a website name field 12, a business type field 14, a key words field 16 and a search button 18. The user interface 10 will commonly be accessible via a server connected to the Internet (the World Wide Web) and may be situated either on the Internet or on a server accessible via the Internet.
Figure 2 provides a flow diagram for using an Internet search engine by following a number of discrete steps in order for a user to locate a correct website address wherein only an incomplete or similar website address is known. The Internet search engine is used by a user accessing the Internet 20 and loading the Internet search engine 22 from its URL. The user then enters the website name (with wild cards if appropriate), business type and optional key word 24. The search engine then converts the search criteria 26 into a search string and submits the search string to a database for interrogation of its data 28. A database 30 is populated with data from one or more spiders 32 (or other data collection means) which input data onto the database 30 in a continuous manner. The database 30 then provides a correct website address or a number of possible website addresses 34. A user may then select the correct website address 36 and the search engine will connect the user to the required website address 38.
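
The flow of Figure 2, from entering the three criteria to receiving candidate addresses, can be sketched as follows. This is an illustration under stated assumptions rather than the applicants' implementation: the `SiteRecord` type, the in-memory `DATABASE` list standing in for database 30, the `search` function and all sample records are hypothetical, and Python's `fnmatch` module is used only as a convenient stand-in for wild card matching.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class SiteRecord:
    """One indexed entry: address, business type and key words."""
    address: str
    business: str
    keywords: frozenset = frozenset()


# Hypothetical contents of database 30; spiders 32 would normally fill it.
DATABASE = [
    SiteRecord("www.wptk.co.uk", "automobiles", frozenset({"sales", "repairs"})),
    SiteRecord("www.wptkc.co.uk", "vending machines", frozenset({"snacks"})),
    SiteRecord("www.wpt.com", "automobiles", frozenset({"sales"})),
]


def search(address_pattern: str, business: str = "", keyword: str = "") -> list:
    """Return addresses matching the pattern and any optional criteria.

    An empty business or keyword means that criterion is not applied,
    mirroring a field that the user has left blank.
    """
    pattern = address_pattern.strip().lower()
    business = business.strip().lower()
    keyword = keyword.strip().lower()
    matches = []
    for record in DATABASE:
        if not fnmatchcase(record.address, pattern):
            continue  # criterion (a): incomplete or similar address
        if business and record.business != business:
            continue  # criterion (b): business type
        if keyword and keyword not in record.keywords:
            continue  # criterion (c): other identifier information
        matches.append(record.address)
    return matches


print(search("www.wpt*", business="automobiles"))
print(search("www.wpt*", business="automobiles", keyword="repairs"))
```

Each optional criterion narrows the candidate list, which is how the business type and key word help distinguish similarly named addresses.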
The user interface 10 may have a number of additional features in order to
assist a user in obtaining the appropriate results. The website name field 12 can
comprise more than one field, so that a user may input part of the domain name
(such as "www.wpt" or "wpt") and another field can allow entry of a top level
domain suffix (such as ".co.uk" or ".com"). Furthermore, a field dedicated to a
top level domain suffix can either allow entry of the suffix or allow it to be
selected from a drop down list. The website name field 12 can also accept wild
cards in order to search for specific parts of the website address; for example,
"www.wpt*.co.uk" would result in only website addresses starting with "wpt"
being located. Alternatively, the search engine can apply wild cards to a search
string according to a protocol. The business type field 14 can either allow entry of
a business type (such as "automobiles") or allow the business type to be selected
from a number of predetermined business types held in the search engine (for
example, by selecting "automobiles" from the list). Alternatively, the field can be
left blank or an "Unknown" or similar option selected. The key word field 16 can
be used to enter further identifier information, such as "repairs", so that the
user can find the website which has an address similar to the incomplete or
similar website address entered in the website name field 12 and which relates to
a business type of automobile repairers; alternatively, this field can be left
blank. A number of key words can also be listed in a drop down menu in the key
words field 16, and this in turn can be a sub division of any given business type
listed in the drop down list in the business type field 14. When the user has
input an entry in the website name field 12 and the business type field 14, the
search may be actuated by selecting the search button 18.
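
The passage above leaves open the protocol by which the search engine applies wild cards to a search string. The sketch below shows one possible protocol, purely as an assumption: the function name `build_address_pattern` and its rules (strip a leading "www.", wrap the name in "*" only when the user supplied no wild card, and use ".*" when no suffix is given) are hypothetical and do not appear in the specification.

```python
def build_address_pattern(name_part: str, tld_suffix: str = "") -> str:
    """Combine a name field and a suffix field into one address pattern.

    Hypothetical protocol:
      * a leading "www." typed by the user is removed and re-added once;
      * if the user typed no wild card, the name is wrapped in "*" so
        that it may match anywhere within the domain label;
      * a blank suffix field becomes ".*" (any top level domain suffix).
    """
    name = name_part.strip().lower()
    if name.startswith("www."):
        name = name[len("www."):]
    if "*" not in name and "?" not in name:
        name = f"*{name}*"
    suffix = tld_suffix.strip().lower() or ".*"
    if not suffix.startswith("."):
        suffix = "." + suffix
    return f"www.{name}{suffix}"


print(build_address_pattern("wpt", ".co.uk"))
print(build_address_pattern("www.wpt*", "com"))
print(build_address_pattern("wpt?"))
```

Respecting wild cards that the user has already typed keeps a deliberate pattern such as "wpt*" from being widened by the engine.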
The search criteria entered in the user interface 10 is then converted into a
search string 26 which is submitted to the database 30 for interrogation. One or
more spiders 32 trawl the Internet in order to populate the database 30 so that
suitable results can be produced; other pertinent methods of population may
also be employed. The database 30 then provides the correct
website address or a number of possible website addresses 34 and displays them
in the user interface 10. The user can select the correct website address 36 in
order to connect to that website, and the search engine then connects to the
required website address 38. Alternatively, the Internet search engine can directly
link to the correct website without the user having to select the correct website
address. This can be in instances where only one result has been supplied by the
database 30.
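
The choice between listing candidate addresses and linking directly to a single result can be expressed as a small decision function. This is a sketch only; the `Outcome` type, the `resolve` function and the action labels "none", "redirect" and "list" are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Sequence


class Outcome(NamedTuple):
    """What the engine does with the results returned by the database."""
    action: str        # "none", "redirect" or "list"
    addresses: tuple   # candidate website addresses, duplicates removed


def resolve(results: Sequence[str]) -> Outcome:
    """Decide how to present results, following the paragraph above."""
    unique = tuple(dict.fromkeys(results))  # drop duplicates, keep order
    if not unique:
        return Outcome("none", ())
    if len(unique) == 1:
        # Only one result supplied: link directly without user selection.
        return Outcome("redirect", unique)
    # Several candidates: display them so the user can select one.
    return Outcome("list", unique)


print(resolve(["www.wptk.co.uk"]))
print(resolve(["www.wptk.co.uk", "www.wptkc.co.uk"]))
```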
The user interface 10 can also carry marketing material such as advertising
banners and the like and could be incorporated in a number of website pages in
order that searches may be performed and results provided whilst viewing another
website (not the website which houses the Internet search engine). For example,
the Internet search engine can be inserted into a frame of another party's
website.

Claims

1. An Internet search engine for locating a correct web address for a
user, wherein only an incomplete or similar website address is known, the search engine being capable of locating the correct website address by using one or more
of the following criteria:
(a) the incomplete or similar website address;
(b) the business that the website address is related to; and
(c) other identifier information.
2. An Internet search engine for locating a correct website address
wherein only an incomplete or similar website address is known, the engine
locating the correct website by using the following steps:
(a) allowing a user to input a search criteria comprising the incomplete or
similar website address, and optionally the business that the website address
relates to and/or optionally other identifier information;
(b) sending the search criteria to one or more databases to interrogate data
contained therein;
(c) receiving the search results; and
(d) providing the user with the correct website address or a list of possible
website addresses to the user.
3. An Internet search engine as claimed in either claim 1 or claim 2,
wherein the other identifier information is information contained on a website
page.
4. An Internet search engine as claimed in any of claims 1 to 3, wherein the other identifier information is information contained in a metatag on a website page.
5. An Internet search engine as claimed in any preceding claim, wherein the incomplete or similar website address comprises part of a website
address or a similar website address to that being located.
6. An Internet search engine as claimed in claim 4, wherein the incomplete or similar website address comprises one or more of the following parts of a website address: a domain name, a subdomain, a URL (Uniform Resource Locator) and an IP (Internet Protocol) address.
7. An Internet search engine as claimed in any preceding claim, wherein the incomplete or similar website address further comprises a wild card.
8. An Internet search engine as claimed in any preceding claim, wherein information is inputted into one or more search fields.
9. An Internet search engine as claimed in any preceding claim, wherein the information is converted to a single search string.
10. An Internet search engine as claimed in claim 9, wherein the search string is used to search the Internet or databases.
11. An Internet search engine as claimed in claim 10, wherein the search string is submitted to a database for interrogation of the data contained therein.
12. An Internet search engine as claimed in claim 11, wherein the
database contains data populated by one or more Internet spiders.
13. An Internet search engine as claimed in either claim 11 or 12,
wherein the database contains data relating to a Website address which is
indexable.
14. An Internet search engine as claimed in any preceding claim,
wherein the Internet search engine either provides a list of possible links to the
correct website address or possible website addresses.
15. An Internet search engine as claimed in any of claims 1 to 13, wherein the Internet search engine automatically redirects the user to the correct
website address.
16. An Internet search engine as claimed in any preceding claim,
wherein the search engine is an executable computer program held on a computer
readable means.
17. A procedure for searching the Internet to locate a correct website
address for a user, wherein only an incomplete or wrong address is known, by
using an Internet search engine as described in any of claims 1 to 16.
18. A procedure for indexing one or more website addresses
comprising data associated with the website address which relates to:
(a) an incomplete or similar website address;
(b) key words associated with the business that the website address is related to; and
(c) other identifier information.
19. An Internet search engine substantially as herein described in claims 1
to 16 with reference to and as illustrated in the accompanying figures.
PCT/GB2004/003716 2003-08-28 2004-08-31 Search engine WO2005022410A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0320106.8 2003-08-28
GB0320106A GB2405497A (en) 2003-08-28 2003-08-28 Search engine

Publications (1)

Publication Number Publication Date
WO2005022410A1 (en) 2005-03-10

Family

ID=28686438

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2004/003716 WO2005022410A1 (en) 2003-08-28 2004-08-31 Search engine

Country Status (2)

Country Link
GB (1) GB2405497A (en)
WO (1) WO2005022410A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999039280A2 (en) * 1998-01-30 1999-08-05 Net-Express Ltd. Www addressing
US6092100A (en) * 1997-11-21 2000-07-18 International Business Machines Corporation Method for intelligently resolving entry of an incorrect uniform resource locator (URL)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB0007225D0 (en) * 2000-03-24 2000-05-17 Game James D System for locating businesses organisation and individuals via the internet

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SONNENREICH W ET AL: "Guide to Search Engines, Chapter 2 and 3", 1998, WILEY COMPUTER PUBLISHING, ISBN: 0-471-24638-7, XP002313969 *

Also Published As

Publication number Publication date
GB2405497A (en) 2005-03-02
GB0320106D0 (en) 2003-10-01

Similar Documents

Publication Publication Date Title
US11443358B2 (en) Methods and systems for annotation of digital information
US6212522B1 (en) Searching and conditionally serving bookmark sets based on keywords
CN101416186B (en) Enhanced search results
US6338058B1 (en) Method for providing more informative results in response to a search of electronic documents
US8041741B1 (en) Searching content using a dimensional database
KR100719009B1 (en) Apparatus for identifying related searches in a database search system
KR100473086B1 (en) Method and system for accessing information on a network
US6490575B1 (en) Distributed network search engine
US8280868B2 (en) Method and system for monitoring domain name registrations
US6311194B1 (en) System and method for creating a semantic web and its applications in browsing, searching, profiling, personalization and advertising
US7231405B2 (en) Method and apparatus of indexing web pages of a web site for geographical searchine based on user location
US20080140626A1 (en) Method for enabling dynamic websites to be indexed within search engines
US20020129062A1 (en) Apparatus and method for cataloging data
US20070022096A1 (en) Method and system for searching a plurality of web sites
EP0817099A2 (en) Client-side, Server-side and collaborative spell check of URL's
KR20010040626A (en) Navigating network resources using metadata
JP2003518293A (en) Indexing system and method
KR20040026167A (en) Method and Apparatus for providing an advertisement based on an URL and/or search keyword input by a user
JP2007122732A (en) Method for searching dates efficiently in collection of web documents, computer program, and service method (system and method for searching dates efficiently in collection of web documents)
US20020116394A1 (en) Meta data category and a method of building an information portal
EP1993045A1 (en) Electronic document retrievel system
US7630959B2 (en) System and method for processing database queries
US20060116992A1 (en) Internet search environment number system
US10474685B1 (en) Mobile to non-mobile document correlation
KR20110069018A (en) Indexing system

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase