WO2004079485A2 - Improvements in internet site architecture

Publication number
WO2004079485A2
Authority
WO
WIPO (PCT)
Prior art keywords
data
information
page
cache
user
Prior art date
Application number
PCT/GB2004/000959
Other languages
French (fr)
Other versions
WO2004079485A3 (en)
Inventor
Martyn Whitwell
Original Assignee
Imperial College Innovations Ltd
Priority date
Filing date
Publication date
Application filed by Imperial College Innovations Ltd
Publication of WO2004079485A2
Publication of WO2004079485A3

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/951: Indexing; Web crawling techniques
    • G06F16/957: Browsing optimisation, e.g. caching or content distillation
    • G06F16/9574: Browsing optimisation, e.g. caching or content distillation of access to content, e.g. by caching
    • G06F16/958: Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Abstract

An apparatus for sending to a user data related to a page of information (17), such as a web page, uses a controller (12) and a search module (14) with an index (15) for containing data relating to the pages of information. When a page of information is updated, the controller controls the search module to update the index to reflect the fact that the page has been updated.

Description

IMPROVEMENTS IN INTERNET SITE ARCHITECTURE
The present invention relates to developments in the architecture of Internet sites. The present invention also relates to the manner in which data may be handled, retrieved from storage and prepared for sending to a user.
The Internet is a vast network of computers and computer servers, all of which store information which may be downloaded or retrieved by a user. To use the Internet, a user with a computer (or other Internet enabled device such as a palm device or mobile phone enabled with a WAP browser, or other suitable device) uses an Internet connection to browse online through the documents and files available on the Internet. He may also choose to download various files, such as text, graphic, video or music files.
Internet sites created and/or maintained by business and non-business users alike often contain "pages" of information which include large files, the content of these files being in many of the available formats mentioned above. Such pages of information come in many formats; for example, a page may comprise any or all of the following: text, audio, video or other graphics (GIF or equivalent files), Java applets etc. Often, the software applications which manage these Internet sites, when requested by a user, are required to perform intensive processing in order to retrieve the necessary data from the site in question and send the information to the user. With the advent of greater system requirements and the increasing complexity of the information which is available on the Internet, systems presently in use are often cumbersome and slow to respond to the heavy processing required of them.
As a means of browsing the Internet, a user may, typically, employ a web browser which allows access to various programs and functions for searching the Internet. One of the preferred methods of searching the Internet is to employ a search engine, which allows for the searching of pages of interest in a number of ways, for example by "key-word search".
One particular context in which search engines are known is their use within a specific web site or network, i.e. one having a defined and finite set of pages, in order to allow key-word searches. Periodically (for example every few weeks, or possibly more frequently), a search engine will deploy a program, such as a "spider" program, to visit and read every page, or representative pages, on the web site or sites that the web site owner or administrator designates as searchable, using the hypertext links on each page to discover and read other related pages. The spider program can create an index in the search engine, sometimes called a "catalogue", from the pages that are being read, populating the index with data related to the information the spider program has read from the target page. When a user enters a key-word search entry, the search engine program compares it to entries in the index which it already holds and returns results in the form of exact or close matches to the user. A problem with this system is that the spider program trawls the web only periodically, meaning that data entries in its index are often outdated, and sometimes entirely wrong.
The invention is set out in the appended claims. Because the index is updated dynamically and hence is continuously up to date, the search engine will always direct the user to a page of information which is up to date.
In a preferred embodiment of the invention, the invention further comprises a cache for storing data related to a plurality of pages of information. Caching servers and systems are used to store information temporarily or permanently, depending on the type of memory used for the cache; a common choice for this purpose is very fast volatile memory, in which web pages requested by a user are stored temporarily. Because the cache is dynamically updated according to the preferred aspect of the invention, a second user requesting the same page will be able to retrieve the desired page from cache memory rather than being directed to the slower forms of storage which are usually employed as the primary storage device. An advantage of the present arrangement is that all pages are kept updated in cache; there is then a reduced processing burden on the server system to provide the page in readable format to the user.
In embodiments of the invention, the controller is operable to request that at least one of the search module and the cache read said updated page to extract the relevant information. The controller instructs the search module and/or the cache to scan the updated web page to extract the relevant information (which will be in the form of data for the search engine index, or information in page format for the cache).
In an alternative arrangement, the controller is operable to send data related to said updated page to at least one of the searching means and caching means in order that they may extract the relevant information themselves. This further reduces the processing required of the controller; such processing is "delegated" to the search engine or caching device.
In embodiments of the invention, the search module and the cache are operable to be updated simultaneously in response to the request from the controller. The efficiency of the system may be further enhanced by having both the search engine and the cache retrieve the relevant information in response to a single command from the controller.
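A minimal sketch of the single-command update described above, assuming the search engine and the cache each expose a hypothetical `page_updated` method. The class name `Subscriber`, the function `notify_all` and the use of a thread pool are illustrative assumptions, not details taken from the specification.

```python
from concurrent.futures import ThreadPoolExecutor


class Subscriber:
    """Hypothetical stand-in for the search module or the cache."""

    def __init__(self, name):
        self.name = name
        self.seen = {}

    def page_updated(self, url, content):
        # Record the new content and acknowledge the update.
        self.seen[url] = content
        return self.name


def notify_all(subscribers, url, content):
    """Issue one update command and let every subscriber process it concurrently."""
    with ThreadPoolExecutor(max_workers=len(subscribers)) as pool:
        futures = [pool.submit(s.page_updated, url, content) for s in subscribers]
        # Results are collected in submission order.
        return [f.result() for f in futures]


search_engine = Subscriber("search")
cache = Subscriber("cache")
acknowledged = notify_all([search_engine, cache], "/courses", "<page>Courses</page>")
```

Here one call to `notify_all` plays the part of the controller's single command; both components hold the new content once it returns.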
In embodiments of the invention, the controller is operable to communicate with at least one of said search module and said cache in a cross-platform protocol. Cross-platform protocols such as Web Services, which are relatively new developments, allow Remote Procedure Calls (RPCs) between different computer systems.
In embodiments of the invention, the controller is operable to communicate with said search module and said cache over a network. As a result the various components of the system may be remote from one another, connected over a network. Any network such as Wide Area Network (WAN), or Local Area Network (LAN) or indeed wireless networks may be utilised in providing such a network.
In embodiments of the invention, the cache is operable to send data related to a page of information to a user. Once the data has been sent to the cache, the page may be sent onward from cache to a user who, being connected to the Internet, could be in any part of the world. He might also be in the same organisation as the web administrator. Aspects of the present invention lend themselves equally well to many or all types of networks.
In embodiments of the invention, the system further comprises a store for storing data related to a plurality of pages of information. Typical storage means such as a computer hard disk or server may be employed for this task.
In embodiments of the invention, the store comprises a database (for example, flat-file or relational databases) or any other type of storage program or software suitable for this purpose, e.g. spreadsheets, proprietary binary files, simple text files, or XML files (discussed further below).
Alternatively the store could comprise a content management system. Proprietary content management systems such as the Microsoft Content Management Server may be employed for this purpose.
In embodiments of the invention, the cache comprises at least two caching servers, wherein said caching servers are connected to said storage means. Multiple caching servers allow fast, easy and efficient servicing of a multiplicity of users, as may be required by popular internet sites which attract many users every day, often simultaneously. The caching servers may be used to serve the various users; for example, if a page is retrieved from the store for the purposes of sending to a user, it may be stored in cache for a further user to retrieve without placing an undue burden on the processing system. This may remove the necessity to have multiple databases or content management systems for the purposes of serving numerous users, and the associated problems inherent in such an arrangement (synchronizing data between the various data stores etc.).
In embodiments of the invention, said cache is operable to process the data prior to sending said data to a user. The caching server or system may be operable to provide secondary processing on the data before it is sent to a user; for example, if the user has limited rights, the caching server may note this and withhold all or part of the requested page.
In embodiments of the invention, said cache is operable to perform a re-organisation of the data, for example a filtering of data, prior to sending it to a user. For example, if the user has limited rights, the caching server may note this and withhold all or part of the requested page.
In embodiments of the invention, said data related to a page of information comprises data in XML format. Extensible Markup Language (XML) is an industry standard means of storing data. This format allows many heterogeneous computer platforms to communicate with one another.
It will be appreciated that features of one aspect of the invention may be applied to features of another aspect of the invention.
Embodiments of the present invention will now be described, by way of example only, and with reference to the accompanying drawings in which:
Figure 1 is a block diagram illustrating an embodiment of the present invention;
Figure 2 is a block diagram illustrating a preferred embodiment of the present invention;
Figure 3 shows the system data flows in a preferred embodiment of the present invention;
Figure 4 shows a flow of information between the various components of a preferred embodiment of the invention;
Figure 5 illustrates the software architecture of a preferred embodiment of the invention;
Figure 6 shows a flow chart of the authentication system employed by the invention; and
Figure 7 illustrates the process by which further caching servers may be added to the system.
Referring now to Figure 1, the system 10 of the present invention is illustrated. In the system 10, a controller 12 communicates with a search device 14 and pages of information 17 as shown. The pages of information comprise data in any format: text, graphics, audio, video etc. The search device 14 further comprises an index 15 which, when in use, is populated with data relating to the pages of information 17. When one or more pages of information are updated, say by a user such as a web administrator, the controller 12 is notified of this, and instructs the search device 14 to obtain data related to the updated page(s). To do this, the search device employs a program (such as the spider program described above) to read the updated page and extract the required data. When doing this the search device communicates with the pages of information either directly or via the controller. Alternatively, the controller 12 instructs the page of information to be sent to the search device 14, possibly routing the page through itself.
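The arrangement of Figure 1 can be sketched as follows. This is an illustrative sketch only: the inverted-index layout, the class names `SearchModule` and `Controller`, and the word-splitting rule are assumptions, since the specification does not prescribe how the index 15 is structured.

```python
import re
from collections import defaultdict


class SearchModule:
    """Holds an inverted index mapping each word to the pages that contain it."""

    def __init__(self):
        self.index = defaultdict(set)
        self._words_by_page = {}

    def reindex(self, page_id, text):
        # Remove stale entries for this page before adding the new ones.
        for word in self._words_by_page.get(page_id, set()):
            self.index[word].discard(page_id)
        words = set(re.findall(r"[a-z0-9]+", text.lower()))
        for word in words:
            self.index[word].add(page_id)
        self._words_by_page[page_id] = words

    def search(self, keyword):
        return sorted(self.index.get(keyword.lower(), set()))


class Controller:
    """Updates a page and instructs the search module to re-index it at once."""

    def __init__(self, pages, search_module):
        self.pages = pages
        self.search_module = search_module

    def update_page(self, page_id, text):
        self.pages[page_id] = text
        self.search_module.reindex(page_id, text)


pages = {}
search = SearchModule()
controller = Controller(pages, search)
controller.update_page("about", "Imperial College research")
controller.update_page("about", "Imperial College teaching")
```

After the second update a search for "research" no longer returns the page, which is the behaviour a periodic spider crawl cannot guarantee between crawls.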
Another embodiment of the invention is illustrated in Figure 2. The system, again generally referred to by reference 10, comprises three main systems: a controller 12, a search engine 14, and a caching system 16 coupled to a content store/database 18 where the pages of information 17 of Figure 1 may be stored. The content store/database 18 comprises a proprietary Content Management System, for example as provided by Microsoft. Alternatively, the content store/database 18 is another data storage and management tool such as a database, spreadsheet or other suitable example as discussed above. The controller 12, the search engine 14 and the caching system 16 all communicate with the outside world, preferably via the Internet 20. Alternatively, the system is put into use in other types of network such as a local area network and/or Intranet.
In use, the search engine 14 (which is provided remote from the system - i.e. connected through the Internet - or locally to the controller and content store) conducts a search of the content store/database 18 through the controller 12. The search engine contains an index of data related to the information in page format stored in the contents store/database 18. When a user local to the system updates a page the controller 12 is notified of this and sends out a request to the searching device to retrieve updated information related to the updated page. This may be done in various ways. In a first example of updating the information, the controller sends a request to the search engine to read the updated page. The search engine utilises the spider or equivalent type program for this purpose. The spider program then reads the updated page either directly, or via the controller and extracts the relevant information. In a second example, the controller downloads the updated page from the contents store/database 18 and sends the page to the search device 14 for it to perform the necessary processing locally. Additionally, or simultaneously, the cache 16 is similarly updated. However, the cache 16 will not store the information in an index format; the contents store/database arranges the requested information into page format suitable for sending to a user to view on a browser or by other means; for example, the information may be stored in XML format and translated by XSL before sending to a user (as discussed in more detail below). The information is stored in cache 16 in this format. It will be appreciated that alternatively the information is stored in the cache in a manner more suited to the user who will be requesting the information. The cache 16 then sends the requested page to the user in a suitable format. 
The cache device may perform some further processing on the page prior to sending it to a user; for example, if the user has restricted rights, the cache device can filter or block some, or all of the page content before sending it to the user. In embodiments of the invention, this is effected by the use of XSL (which is discussed further below).
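A minimal sketch of rights-based filtering at the cache. The specification performs this step with XSL; the sketch substitutes plain ElementTree manipulation from the Python standard library, and the `access` attribute, the `<section>` element and the role names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: each section carries an "access" attribute naming the role required.
PAGE_XML = """<page>
  <section access="public">Opening hours</section>
  <section access="staff">Payroll contacts</section>
</page>"""


def filter_page(xml_text, user_roles):
    """Return the page with every section the user may not see removed."""
    allowed = set(user_roles) | {"public"}
    root = ET.fromstring(xml_text)
    for section in list(root):
        if section.get("access", "public") not in allowed:
            root.remove(section)
    return root


visible = [s.text for s in filter_page(PAGE_XML, set())]
staff_view = [s.text for s in filter_page(PAGE_XML, {"staff"})]
```

A user with no roles receives only the public section, while a user holding the assumed "staff" role receives both.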
A second or further user may then request the same information as the first user has done. Instead of instructing the same heavy processing again at the contents store/database 18, the controller first checks to see if the requested page is resident in cache 16. If the requested page is resident in cache it sends a signal to the cache to send the requested page information to the user.
The controller communicates with both the search device 14 and the cache 16 by means of a cross-platform development such as Web Services. Web Services is a conceptual way of communicating which allows procedure calls (known as RPCs) between different computer systems. These effectively allow different processes running on different computers to communicate with one another. They utilise the Simple Object Access Protocol (SOAP) to provide a standardised message format, exchanging messages in XML format. The controller exposes certain commands to the rest of the system (for example, cache, search engine) as web services. This means that these commands (functions, routines, blocks of code - examples of which might be a command to retrieve a web page, or a command to check a username and password) can be executed ("called") by using a web service methodology. The system may utilise web services to synchronise the caching system and search engine system with the controller.
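By way of illustration, a SOAP message of the kind described might be built and read as below. The envelope namespace is the standard SOAP 1.1 one; the operation name `PageUpdated`, the `Url` element and the application namespace `urn:example:controller` are hypothetical and do not come from the specification.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:controller"  # hypothetical application namespace
NAMESPACES = {"soap": SOAP_NS, "app": APP_NS}


def build_page_updated(url):
    """Build a SOAP envelope telling a subscriber that a page has changed."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}PageUpdated")
    ET.SubElement(call, f"{{{APP_NS}}}Url").text = url
    return ET.tostring(envelope, encoding="unicode")


def parse_page_updated(message):
    """Extract the URL of the updated page from the envelope."""
    root = ET.fromstring(message)
    node = root.find("soap:Body/app:PageUpdated/app:Url", NAMESPACES)
    return node.text


message = build_page_updated("/research/index.xml")
```

Because the message is plain XML, the receiving search engine or cache can run on a different platform from the controller, which is the point of the cross-platform protocol.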
When a change is made to a document, the controller immediately alerts the caching systems and search engine that the document has changed and requires re-indexing. By doing so, this means that the search engine and the cache are always up to date. The controller communicates with the content store/database 18 in the same manner, or other suitable means, such as Internal Calls, whereby one function/routine (a block of code) executes ("calls") another function/routine.
Referring now to Figure 3, the system data flows of an embodiment of the present invention are now discussed. The embodiment comprises an authoring server, generally depicted 22, and a rendering server, generally depicted 24. The authoring server itself comprises the following components: the search engine 14, the contents store or database 18, SQL (Structured Query Language) Server 19 and various web pages or databases 26 for viewing by a user. SQL, which will be discussed further below, is responsible for handling web page resources. At the core of the system, the data which flows between the various components of the system comprises XML data 28, which is communicated to and from the search engine 14 and the cache 16 in a cross-platform protocol as discussed above. As a result content can be generated in a wide variety of formats (web pages, WAP pages for mobile phones, special pages for talking browsers, PDF files etc); content can be imported from other XML sources; content can be exported to other XML consumers etc. XML is also used to communicate with the controller 12, although, alternatively, XSL translates the XML data into the relevant format prior to sending from the cache. XSL (Extensible Stylesheet Language) is an industry standard language to convert XML into other formats, including XML. It is at this point that the filtration of data may take place depending upon the user's rights for access to the system; data may be filtered out as it is translated from XML. The communication which takes place between the search engine 14, the SQL server 19 and the contents store/database 18 comprises data in XML format. By enabling such an arrangement, XML is then at the "heart" of the system, allowing the actual content of the data to be distinct from the way in which it is presented. This provides "future-proofing" of the system, meaning that virtually any system which can handle XML data may be later bolted on.
In use, the data comprising XML data 28 is communicated to the cache 16, which is part of the rendering server 24, by the cross-platform protocol. The controller 12 communicates with the cache 16, again by the cross-platform protocol discussed above. The controller sends a request to the cache for a web page either by specifying the URL (Uniform Resource Locator) of the web page, or by communicating in the cross-platform protocol. The page requested from cache may be in a number of formats: XML, HTML, PDF, XHTML (Extensible Hypertext Markup Language), or other suitable format. The controller also communicates with the authentication system 40 which will be described further below. A user inputs the request at 28 for the controller to receive a web page; this is done by the user specifying the URL of the web page. The controller returns the page after the various processing in, for example, XHTML format at 30. It will be appreciated that the individual components (Cache, SQL, XML, Content Management System etc) are well known to the skilled reader and do not require detailed discussion here.
Turning now to Figure 4, the system process is further illustrated. The process begins at Start 50. The user requests the retrieval of a page from the system at process step 52. If the page is already in cache, as is queried at process step 54, the system retrieves the page from cache and sends it to the user at process step 62. If the page is not in cache, a request is sent to the controller via web services at process step 56. The system interfaces with the contents store/database and SQL database in the process at 56 by means of a request and data return to the system. The retrieved web page is, in embodiments of the invention, in XML format, and is then stored in cache at 58. This may need translating to other formats for viewing by the user. The system translates the XML page using XSL at process step 60. The cache system 16 then sends the page to the user for rendering at the user's browser (or other means) at process step 62.
The process ends at 64.
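The flow of Figure 4 can be sketched as below. Several points are assumptions made for illustration: the cache keeps the XML and translates it on every request, a hand-written `translate` function stands in for the XSL step, and the page layout (`<page>`, `<title>`, `<body>`) and the class names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape


class Controller:
    """Stands in for controller 12 together with the content store behind it."""

    def __init__(self, store):
        self.store = store
        self.fetch_count = 0

    def fetch_xml(self, url):
        self.fetch_count += 1
        return self.store[url]


def translate(xml_text):
    """Stand-in for the XSL step: turn the hypothetical page XML into XHTML."""
    page = ET.fromstring(xml_text)
    title = escape(page.findtext("title", default=""))
    body = escape(page.findtext("body", default=""))
    return f"<html><head><title>{title}</title></head><body><p>{body}</p></body></html>"


class CacheServer:
    def __init__(self, controller):
        self.controller = controller
        self.pages = {}

    def handle_request(self, url):
        if url not in self.pages:                             # step 54: is the page in cache?
            self.pages[url] = self.controller.fetch_xml(url)  # steps 56 and 58
        return translate(self.pages[url])                     # steps 60 and 62


store = {"/about": "<page><title>About</title><body>Welcome</body></page>"}
controller = Controller(store)
cache = CacheServer(controller)
first = cache.handle_request("/about")
second = cache.handle_request("/about")
```

The second request is served from the cache, so the controller and content store are consulted only once.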
Referring now to Figure 5, the software architecture is illustrated.
A user views retrieved web pages at the client browser 70 which may be provided separately. In embodiments of the present invention, the user is an authorised user of the web site, thereby having various rights in order to author data on the web site and/or, for example, perform administrative tasks. For reasons of security, in embodiments of the present invention, there is provided a further authentication facility 72 through which an authorised user will log in in order to gain full access to the web site. The authentication system 72 communicates with the rights and configuration system 74 to process the request to author data on the site.
The other various components of the system are illustrated in the architecture, including authoring 22, rendering 24, cache 16, Search Engine 14, Rights and Configuration 74, SQL 19 and content store/database 18.
With reference now to Figure 6, the authentication system procedure for allowing authoring access to the web site contents store will be discussed further.
The process begins at process step 80. A user will be prompted for his user name and password over a secured channel which may be made over the Internet (or Intranet or LAN or other suitable network connection). The connection is made using any suitable secure protocol such as Secure Sockets Layer (SSL), a preferred implementation of which is HTTPS (Hypertext Transfer Protocol Secure). This authentication request is made at process step 82 as shown in Figure 6, after which, at process step 84, there is an attempt to authenticate the user's rights via the active directory. Should the user enter an invalid user name or incorrect password, he will be returned to the log in screen through process loop 83 until a valid username and password are entered. Caller identification (CID) is retrieved from the relevant database, for example a personnel database, at process step 86. If the caller identification has been authenticated successfully, but there is no entry for that user in the people database at 86, the user is deemed not to exist, and it is assumed that the user is to have a standard view of the web site at process step 88. After this process, or if the caller identification is valid at 86, an authentication token is stored in a cookie at process step 89, and the user is granted full access to the web site as his rights allow. The process ends at process step 90.
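The authentication procedure of Figure 6 might be sketched as follows. The dictionaries `DIRECTORY` and `PEOPLE` are hypothetical stand-ins for the active directory and the personnel database, the plain-text password comparison is for illustration only, and the token format is an assumption.

```python
import secrets

DIRECTORY = {"alice": "s3cret", "bob": "hunter2"}  # hypothetical stand-in for the active directory
PEOPLE = {"alice": {"rights": {"author"}}}         # hypothetical personnel database


def log_in(username, password):
    """Return a session (token plus rights), or None if the user must log in again."""
    # Step 84: authenticate against the directory.
    if username not in DIRECTORY or DIRECTORY[username] != password:
        return None  # loop 83: back to the log in screen
    # Step 86: look the caller up in the personnel database.
    person = PEOPLE.get(username)
    # Step 88: no entry means the standard view, modelled here as no extra rights.
    rights = person["rights"] if person else set()
    # Step 89: the token would be stored in a cookie.
    return {"token": secrets.token_hex(16), "rights": rights}


rejected = log_in("alice", "wrong")
author = log_in("alice", "s3cret")
standard = log_in("bob", "hunter2")
```

In this sketch "bob" authenticates successfully but has no personnel entry, so he receives a token with the standard view only.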
Referring now to Figure 7, it will be seen that in another aspect of the invention additional caching servers allow multiple users to connect to the content database/store 18. As shown, multiple cache devices 16a, 16b, 16c, which may comprise any of various types of memory (fast, volatile memory, computer hard disks, etc.) are connected to the content store/database 18. Multiple Internet or Intranet users 18 can request pages from the content store/database 18 via any of the cache devices shown. This eases the burden on the processing requirements of the system considerably; if the required pages of information from the content store/database are stored in cache in a suitable format for a user to download and view, the processing required by the content store/database 18 to collate the information related to the required page occurs only once.
Such an arrangement becomes a particularly powerful tool if the cache servers are kept continuously up to date as described above.
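A minimal sketch of several caching servers kept up to date from one content store. It assumes the store pushes each rendered page to every cache when the page is published; the upper-casing step is a placeholder for the expensive collation, and the round-robin selection of a cache per request is an assumption rather than something the specification requires.

```python
import itertools


class CacheServer:
    def __init__(self, name):
        self.name = name
        self.pages = {}


class ContentStore:
    def __init__(self, caches):
        self.caches = caches
        self.render_count = 0

    def publish(self, url, raw):
        # Collate the page once, then push the result to every caching server.
        self.render_count += 1
        rendered = raw.upper()  # placeholder for the expensive collation step
        for cache in self.caches:
            cache.pages[url] = rendered


caches = [CacheServer(name) for name in ("16a", "16b", "16c")]
store = ContentStore(caches)
store.publish("/news", "term dates announced")

# Six user requests spread across the three caching servers in turn.
rotation = itertools.cycle(caches)
responses = [next(rotation).pages["/news"] for _ in range(6)]
```

All six requests are answered from cache while the store collates the page a single time.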
It will also be appreciated that further caching devices could be added to the illustrated arrangement. Further content store/databases might also be provided, but it is anticipated that such additions will not be frequent; the addition of extra caching devices would hopefully obviate this latter arrangement.
It will be understood that the present invention has been described above purely by way of example, and modifications of detail can be made within the scope of the invention.
Each feature disclosed in the description, and the claims and drawings may be provided independently or in any appropriate combination.

Claims

1. Apparatus for sending to a user data related to a page of information, said apparatus comprising: a controller; and a search module, said search module comprising an index for containing data relating to a plurality of pages of information; wherein, when a page of information is updated, the controller is arranged to control the search module to update data in said index relating to said updated page of information.
2. Apparatus according to claim 1, further comprising a cache for storing data related to a plurality of pages of information, wherein said controller is arranged to control the cache to update data relating to said updated page of information.
3. Apparatus according to claim 1 or claim 2, wherein the controller controls at least one of the search module and the cache to read said updated page to update the data.
4. Apparatus according to claim 1 or claim 2, wherein the controller is arranged to send data related to said updated page to at least one of the search module and cache to update the data.
5. Apparatus according to any of claims 2 to 4, wherein the search module and the cache are controlled to be updated simultaneously.
6. Apparatus according to any of claims 2 to 5, wherein said controller is arranged to communicate with at least one of said search module and said cache in a cross-platform protocol.
7. Apparatus according to claim 6, wherein said controller is arranged to communicate with said search module and said cache over a network.
8. Apparatus according to any of claims 2 to 7, wherein said cache is arranged to send data related to a page of information to a user.
9. Apparatus according to any preceding claim, further comprising a store for storing data related to a plurality of pages of information.
10. Apparatus according to claim 9, wherein the store comprises a content management system.
11. Apparatus according to claim 9, wherein the store comprises a database.
12. Apparatus according to any of claims 9 to 11, wherein the cache comprises at least two caching servers, and wherein said caching servers are connected to said store.
13. Apparatus for sending to a user data related to a page of information, said apparatus comprising: a cache comprising at least two caching servers; and a store for storing data related to a plurality of pages of information; wherein said cache is operable to store data received from said store, and to send data to a user in response to a request from said user.
14. Apparatus according to claim 13, wherein said cache is arranged to process the data prior to sending said data to a user.
15. Apparatus according to claim 14, wherein said cache is arranged to perform a re-organisation of the data, for example a filtering of data, prior to sending it to a user.
16. Apparatus according to any preceding claim, wherein said data related to a page of information comprises data in XML format.
17. Apparatus for sending to a user data related to a page of information, said apparatus comprising a controller and a store for storing data related to a plurality of pages of information, wherein said data comprises data in XML format.
18. Apparatus according to claim 16 or claim 17, further comprising core interface means utilising XSL for translating said XML data into other formats.
19. A method of making data related to a page of information available to a user, said method comprising: receiving notification that a page of information has been updated; and in response to said notification, updating an index in a search device with data related to said updated page of information.
20. A method according to claim 19, further comprising updating a caching device with data related to said updated page of information in response to said notification.
21. A method according to claim 19 or claim 20, further comprising at least one of the search device and the caching device reading said updated page to update said data.
22. A method according to claim 19 or claim 20, further comprising sending said updated page to at least one of the search device and the caching device to update said data.
23. A method according to claim 22, further comprising extracting the relevant information to update said data.
24. A method according to any of claims 20 to 23, further comprising updating the search device and the caching device simultaneously in response to said notification.
25. A method according to any of claims 20 to 24, further comprising communicating with at least one of said search device and said caching device in a cross-platform protocol.
26. A method according to any of claims 20 to 25, wherein communication with at least one of said search device and said cache device is in a cross-platform protocol.
27. A method according to claim 26, further comprising said communication taking place over a network.
28. A method according to any of claims 20 to 27, further comprising sending data related to a page of information from the caching device to a user.
29. A method according to claim 28, further comprising sending said data in response to a request from a user.
30. A method according to any of claims 19 to 29, further comprising storing data related to a plurality of pages of information.
31. A method according to claim 30, further comprising storing said data in a content management system.
32. A method according to claim 30, further comprising storing said data in a database.
33. A method according to any of claims 30 to 32, wherein the caching device comprises at least two caching servers, and the method further comprising: connecting said caching servers to said storage device; storing, at the caching servers, data received from said storage device; and sending to a user, from at least one of said caching servers, data in response to a request from said user.
34. A method of making data related to a page of information available to a user, said method comprising: storing, at a storage device, data related to a plurality of pages of information; receiving from said storage device, at a cache comprising at least two caching servers, data related to a page of information.
35. A method as claimed in claim 34 further comprising sending data to a user in response to a request from said user.
36. A method according to claim 34 or 35, further comprising processing the data at the cache prior to sending said data to a user, and preferably wherein the step of processing said data further comprises a re-organisation of the data, for example a filtering of the data.
37. A method according to any of claims 19 to 36, wherein said data related to a page of information comprises data in XML format.
38. A method according to claim 37, further comprising utilising XSL for translating said XML data into other formats.
39. A computer program configured to implement a method as claimed in any of claims 19 to 38.
40. A computer readable medium storing a computer program as claimed in claim 39.
41. Apparatus for sending to a user data related to a page of information, said apparatus comprising: a controller; and a cache for storing data related to a plurality of pages of information; wherein, when a page of information is updated, the controller is arranged to control the cache to update data relating to said updated page of information.
42. Apparatus substantially as herein described or as illustrated in the accompanying drawings.
43. A method substantially as herein described or as illustrated in the accompanying drawings.
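
By way of illustration only, the notification-driven update recited in claims 19, 20 and 24 might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation described in the specification: the names `SearchDevice`, `CachingDevice`, `Controller` and `notify_updated` are hypothetical, in-memory dictionaries stand in for the real devices, no network or cross-platform protocol is modelled, and a thread pool is used only as an approximation of updating both devices "simultaneously".

```python
from concurrent.futures import ThreadPoolExecutor


class SearchDevice:
    """Hypothetical search device: a word -> page-id inverted index."""

    def __init__(self):
        self.index = {}

    def update(self, page_id, content):
        # Remove stale entries for this page, then re-index its words.
        for pages in self.index.values():
            pages.discard(page_id)
        for word in content.lower().split():
            self.index.setdefault(word, set()).add(page_id)

    def search(self, word):
        return sorted(self.index.get(word.lower(), set()))


class CachingDevice:
    """Hypothetical caching device: stores the latest data for each page."""

    def __init__(self):
        self.pages = {}

    def update(self, page_id, content):
        self.pages[page_id] = content

    def get(self, page_id):
        return self.pages.get(page_id)


class Controller:
    """Hypothetical controller that reacts to page-update notifications."""

    def __init__(self, search_device, caching_device):
        self.devices = [search_device, caching_device]

    def notify_updated(self, page_id, content):
        # On notification, push the updated page to both devices concurrently
        # (an assumed reading of "simultaneously" in claim 24).
        with ThreadPoolExecutor(max_workers=len(self.devices)) as pool:
            futures = [pool.submit(d.update, page_id, content)
                       for d in self.devices]
            for future in futures:
                future.result()  # re-raise any error from a device update


search_device = SearchDevice()
caching_device = CachingDevice()
controller = Controller(search_device, caching_device)
controller.notify_updated("home", "Welcome to the college")
controller.notify_updated("home", "Welcome to the new site")
```

In this sketch the second notification replaces both the cached copy of the page and its index entries, so a search for a word that appeared only in the earlier version no longer returns the page.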
PCT/GB2004/000959 2003-03-06 2004-03-05 Improvements in internet site architecture WO2004079485A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0305145.5 2003-03-06
GBGB0305145.5A GB0305145D0 (en) 2003-03-06 2003-03-06 Improvements to internet site architecture

Publications (2)

Publication Number Publication Date
WO2004079485A2 (en) 2004-09-16
WO2004079485A3 (en) 2004-11-11

Family

ID=9954244

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2004/000959 WO2004079485A2 (en) 2003-03-06 2004-03-05 Improvements in internet site architecture

Country Status (2)

Country Link
GB (1) GB0305145D0 (en)
WO (1) WO2004079485A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015026667A1 (en) * 2013-08-21 2015-02-26 Alibaba Group Holding Limited Generating cache query requests

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5819271A (en) * 1996-06-04 1998-10-06 Multex Systems, Inc. Corporate information communication and delivery system and method including entitlable hypertext links
US6253198B1 (en) * 1999-05-11 2001-06-26 Search Mechanics, Inc. Process for maintaining ongoing registration for pages on a given search engine
EP1143349A1 (en) * 2000-04-07 2001-10-10 IconParc GmbH Method and apparatus for generating index data for search engines
WO2002057949A1 (en) * 2001-01-22 2002-07-25 Contrieve, Inc. Systems and methods for managing and promoting network content

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GUPTA V ET AL: "Internet search engine freshness by Web server help" APPLICATIONS AND THE INTERNET, 2001. PROCEEDINGS. 2001 SYMPOSIUM ON SAN DIEGO, CA, USA 8-12 JAN. 2001, LOS ALAMITOS, CA, USA, IEEE COMPUT. SOC, US, 8 January 2001 (2001-01-08), pages 113-119, XP010532804 ISBN: 0-7695-0942-8 *
LOESER H: "Shift it to the server! Let the database server update your Web sites" WEB INFORMATION SYSTEMS ENGINEERING, 2000. PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON HONG KONG, CHINA 19-21 JUNE 2000, LOS ALAMITOS, CA, USA, IEEE COMPUT. SOC, US, 19 June 2000 (2000-06-19), pages 50-54, XP010521836 ISBN: 0-7695-0577-5 *

Also Published As

Publication number Publication date
GB0305145D0 (en) 2003-04-09
WO2004079485A3 (en) 2004-11-11

Similar Documents

Publication Publication Date Title
US6907423B2 (en) Search engine interface and method of controlling client searches
US6195696B1 (en) Systems, methods and computer program products for assigning, generating and delivering content to intranet users
US6145003A (en) Method of web crawling utilizing address mapping
US6061686A (en) Updating a copy of a remote document stored in a local computer system
JP3983035B2 (en) User terminal authentication program
US6625624B1 (en) Information access system and method for archiving web pages
US20050246717A1 (en) Database System with Methodology for Providing Stored Procedures as Web Services
US20020078180A1 (en) Information collection server, information collection method, and recording medium
US20050086212A1 (en) Method, apparatus and computer program for key word searching
Rao et al. A proxy-based personal web archiving service
JP2002229842A (en) Http archival file
JP2004536398A (en) Apparatus and method for selectively retrieving information and then displaying that information
WO2005052811A1 (en) Searching in a computer network
WO2002077860A1 (en) Application data synchronization in telecommunications system
US20110179178A1 (en) System and Method for Managing Multiple Domain Names for a Website in a Website Indexing System
US6883020B1 (en) Apparatus and method for filtering downloaded network sites
JPH11502346A (en) Computer system and computer execution process for creating and maintaining online services
US20030149745A1 (en) Method and apparatus for accessing information from a network data source
US8135860B1 (en) Content interpolating web proxy server
US8533226B1 (en) System and method for verifying and revoking ownership rights with respect to a website in a website indexing system
RU2295762C2 (en) Method for supporting a set of languages on web-servers for inbuilt systems
US10255362B2 (en) Method for performing a search, and computer program product and user interface for same
WO2004079485A2 (en) Improvements in internet site architecture
KR20020005882A (en) The system and the method of remote controlling a computer and reading the data therein using the mobile phone
JPH11312172A (en) Information processor, its processing method and medium with control program stored therein

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
122 EP: PCT application non-entry in European phase