US20020078087A1 - Content indicator for accelerated detection of a changed web page - Google Patents

Info

Publication number
US20020078087A1
Authority
US
United States
Prior art keywords
file
digest
content indicator
content
files
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US09/737,946
Inventor
Alan Stone
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intel Corp
Original Assignee
Intel Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel Corp
Priority to US09/737,946
Assigned to INTEL CORP. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: STONE, ALAN E.
Priority to US09/865,929 (US7078766B2)
Publication of US20020078087A1
Assigned to INTEL CORPORATION. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: DIALOGIC CORPORATION
Abandoned

Classifications

    • HELECTRICITY
    • H01ELECTRIC ELEMENTS
    • H01LSEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L27/00Devices consisting of a plurality of semiconductor or other solid-state components formed in or on a common substrate
    • H01L27/02Devices consisting of a plurality of semiconductor or other solid-state components formed in or on a common substrate including semiconductor components specially adapted for rectifying, oscillating, amplifying or switching and having at least one potential-jump barrier or surface barrier; including integrated passive circuit elements with at least one potential-jump barrier or surface barrier
    • H01L27/12Devices consisting of a plurality of semiconductor or other solid-state components formed in or on a common substrate including semiconductor components specially adapted for rectifying, oscillating, amplifying or switching and having at least one potential-jump barrier or surface barrier; including integrated passive circuit elements with at least one potential-jump barrier or surface barrier the substrate being other than a semiconductor body, e.g. an insulating body
    • H01L27/1203Devices consisting of a plurality of semiconductor or other solid-state components formed in or on a common substrate including semiconductor components specially adapted for rectifying, oscillating, amplifying or switching and having at least one potential-jump barrier or surface barrier; including integrated passive circuit elements with at least one potential-jump barrier or surface barrier the substrate being other than a semiconductor body, e.g. an insulating body the substrate comprising an insulating body on a semiconductor body, e.g. SOI
    • HELECTRICITY
    • H01ELECTRIC ELEMENTS
    • H01LSEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L29/00Semiconductor devices adapted for rectifying, amplifying, oscillating or switching, or capacitors or resistors with at least one potential-jump barrier or surface barrier, e.g. PN junction depletion layer or carrier concentration layer; Details of semiconductor bodies or of electrodes thereof  ; Multistep manufacturing processes therefor
    • H01L29/66Types of semiconductor device ; Multistep manufacturing processes therefor
    • H01L29/68Types of semiconductor device ; Multistep manufacturing processes therefor controllable by only the electric current supplied, or only the electric potential applied, to an electrode which does not carry the current to be rectified, amplified or switched
    • H01L29/76Unipolar devices, e.g. field effect transistors
    • H01L29/772Field effect transistors
    • H01L29/78Field effect transistors with field effect produced by an insulated gate
    • H01L29/786Thin film transistors, i.e. transistors with a channel being at least partly a thin film
    • H01L29/78606Thin film transistors, i.e. transistors with a channel being at least partly a thin film with supplementary region or layer in the thin film or in the insulated bulk substrate supporting it for controlling or increasing the safety of the device
    • H01L29/78612Thin film transistors, i.e. transistors with a channel being at least partly a thin film with supplementary region or layer in the thin film or in the insulated bulk substrate supporting it for controlling or increasing the safety of the device for preventing the kink- or the snapback effect, e.g. discharging the minority carriers of the channel region for preventing bipolar effect
    • HELECTRICITY
    • H10SEMICONDUCTOR DEVICES; ELECTRIC SOLID-STATE DEVICES NOT OTHERWISE PROVIDED FOR
    • H10BELECTRONIC MEMORY DEVICES
    • H10B10/00Static random access memory [SRAM] devices

Abstract

Various embodiments are described for content indicators to accelerate detection of a changed web page or other file.

Description

    FIELD
  • The invention generally relates to web pages, browsers and search engines, and in particular, to a content indicator for accelerated detection of a changed web page. [0001]
  • BACKGROUND
  • Today, web pages are commonly stored on web servers. A web server is a server that stores or provides web pages, typically in Hypertext Markup Language (HTML) format, and makes these web pages available to clients upon request, such as in response to a “Get” request using Hypertext Transfer Protocol (HTTP); see HTTP/1.1, Request For Comments 2616, June 1999. A client may be any software program that may request access to the web pages. Two common web clients are web browsers and search engine indexers. A web browser is a program which can retrieve web pages from remote web servers and display them for the user. [0002]
  • The Internet is typically indexed via search engine indexers, also known as web “spiders.” These spiders are typically dedicated machines that continually visit publicly addressable Internet addresses, connecting to HyperText Transfer Protocol (HTTP) port number 80, to find “home pages” or “web pages.” Once a page is found, the spider navigates through its content, indexing both content and hyperlinks. The index may provide, for example, a correspondence between the subject matter of a web page and an address or Uniform Resource Identifier (URI) for each web page. This information is then provided to a search engine, to allow the search engine to identify addresses or locations of pertinent web pages in response to a particular search. [0003]
  • Changes to web pages can create problems for browsers and search engine indexers or spiders. Web content is frequently changed, for example by adding new content to pages, adding or removing pages, or changing a hyperlink to another page. When a browser retrieves a web page, a copy of the web page is stored in a local cache. When a second request for the cached web page is received at the browser from a user, the browser determines whether to use the cached copy of the web page, or whether to retrieve the web page from the web server. The HTTP/1.1 protocol, RFC 2068, describes a technique for the web server to provide a page content change indication. The content change indication is provided by a file size, a file date, or a file digest computed with the MD5 message digest algorithm described in RFC 1321. The client can request one or more of these values from the web server for a particular page. The web server then retrieves the page from memory, calculates the file digest, file size or file date, and returns this information to the client, which may use it to decide whether to use the cached copy or request a copy from the web server. However, this technique is slow and inefficient, because the server must retrieve the page and calculate the indicator in response to each request. Also, in some instances, web pages may be stored at a location where a web server is not available. For example, it is common to store web pages on a server or a network accessible drive, without the additional burden of an HTTP server. Thus, in such cases, it is desirable to obtain a page content change indication without querying the web server. [0004]
  • For the search engine, the changes in the web content can cause the web index to become outdated, which may create search results that include stale pages, pages that have moved or disappeared, broken links, etc. As a result, the web spider usually indexes web content relentlessly, constantly downloading and indexing the same web content over and over again in an attempt to provide updated indexes. This is very inefficient because this repetitive downloading of web pages consumes a large amount of bandwidth. As a result, it is desirable to provide a technique to obtain a page content change indication so that only the changed pages need to be downloaded and re-indexed. [0005]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The foregoing and a better understanding of the present invention will become apparent from the following detailed description of exemplary embodiments and the claims when read in connection with the accompanying drawings, all forming a part of the disclosure of this invention. While the foregoing and following written and illustrated disclosure focuses on disclosing example embodiments of the invention, it should be clearly understood that the same is by way of illustration and example only and is not limited thereto. The spirit and scope of the present invention is limited only by the terms of the appended claims. [0006]
  • The following represents brief descriptions of the drawings, wherein: [0007]
  • FIG. 1 is a diagram illustrating insertion of a digest or other content indicator into a file according to an example embodiment. [0008]
  • FIG. 2 is a diagram illustrating insertion of a digest into a file according to another example embodiment. [0009]
  • FIG. 3 is a diagram illustrating use of a digest according to yet another example embodiment. [0010]
  • FIG. 4 is a diagram illustrating an HTML document according to an example embodiment. [0011]
  • FIG. 5 is a block diagram that illustrates a network according to an example embodiment. [0012]
  • DETAILED DESCRIPTION
  • Referring to the Figures in which like numerals indicate like elements, FIG. 1 is a diagram illustrating insertion of a digest or other content indicator into a file according to an example embodiment. As shown in FIG. 1, a web page or HTML page authoring tool 112 is provided to author or generate web pages or HTML pages. HTML authoring tool 112 may typically be a software program running on a processing node, such as a computer. The processing node or computer may include a processor, memory and other components. Web page authoring tool 112 may be, for example, a software program such as FrontPage or Word, both available from Microsoft Corporation, Redmond, Washington. [0013]
  • According to the embodiment shown in FIG. 1, a page-resident content indicator may be provided for each page to allow programs or clients to detect web page changes. For example, the authoring tool 112 may include an additional program that calculates or generates a content indicator for each file. The files may be, for example, a web page or HTML page, a graphic, a script, etc. [0014]
  • According to an example embodiment, a content indicator is calculated or generated for each web page. The content indicator may then be stored in or with the file or web page. A content indicator may be anything that allows a client or other program to detect a change or update to the content of the web pages. According to an example embodiment, a content indicator, when compared to another content indicator for the same web page, provides an indication as to whether or not the content of the web page has been changed or updated. [0015]
  • A content indicator may include, for example, a file size of the web page, a date and time that the web page was last modified or changed, and a file digest. When a file digest is calculated for a web page, a digest function takes an arbitrarily sized message or file, such as a web page, and generates a number, which is typically a fixed-length quantity. A hash algorithm or hash function, also known as a message digest, is typically a one-way function. It is considered a function because it takes an input message and produces an output. It may be considered one-way because it is not practical to figure out what input corresponds to a given output. If it is cryptographically secure, it should be computationally infeasible to find two messages or files that have the same file digest. Thus, if a change is made to a web page, the digest for that page will, for practical purposes, change. The digest may be calculated, for example, using message digest algorithms, including MD2, MD4 and MD5, documented in Request for Comments 1319, 1320 and 1321, respectively. Other algorithms, such as other hash functions or Cyclic Redundancy Check (CRC) algorithms, may be used to generate the file digests. The term digest will be used hereinbelow in the various embodiments and examples. However, other types of content indicators may be used as well. [0016]
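The three kinds of content indicator named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `ContentIndicator`, `compute_indicator` and `has_changed` are hypothetical and do not come from the application, and MD5 is used simply because it is one of the algorithms the text lists.

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentIndicator:
    """Hypothetical container for the three indicators named in the text."""
    size: int       # file size in bytes
    mtime: float    # last-modified time, seconds since the epoch
    md5_hex: str    # MD5 digest as uppercase hexadecimal


def compute_indicator(path: str) -> ContentIndicator:
    """Compute size, modification time and MD5 digest for one file."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so arbitrarily large files fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            md5.update(chunk)
    st = os.stat(path)
    return ContentIndicator(st.st_size, st.st_mtime, md5.hexdigest().upper())


def has_changed(old: ContentIndicator, new: ContentIndicator) -> bool:
    """Treat differing digests as a content change (an assumed policy)."""
    return old.md5_hex != new.md5_hex
```

In this sketch only the digest decides whether content changed, since two different edits can leave the file size identical and a modification time can change without any change in content.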
  • Therefore, as shown in FIG. 1, the page authoring tool 112 includes a digest calculator 114 to calculate or generate a digest for each file or web page each time a web page or file is generated, created or updated, and then to store this digest with the corresponding web page. Files 120 include files 120A, 120B and 120C, which may be web pages, HTML pages or other types of files. Thus, according to an example embodiment, the digests may be page-resident, since the digests may reside with the corresponding web pages or files 120. [0017]
  • FIG. 4 is a diagram illustrating an HTML document according to an example embodiment. The web page authoring tool 112 (FIG. 1) generates or updates, or is used to generate or update, the HTML web page shown in FIG. 4, including the head and title of the message and the body of the message 410. The digest calculator 114 (FIG. 1) then calculates or generates the file digest 405 based on the HTML page shown in FIG. 4. The file digest 405 may then be prepended or attached to or stored within the HTML file. Thus, each file or web page may include a corresponding digest that is encoded onto the file or web page. [0018]
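A minimal sketch of a page-resident digest follows. The text does not reproduce the layout of FIG. 4, so the header format used here (an HTML comment on the first line) is purely an assumption, as are the function names. The digest is computed over the page body without the digest line, so that stamping a page does not itself change the value being recorded.

```python
import hashlib
import re
from typing import Optional, Tuple

# Hypothetical header format; the source does not specify one.
_HEADER = re.compile(r'\A<!-- MD5="([0-9A-F]{32})" -->\n')


def body_digest(body: str) -> str:
    """MD5 of the page body (digest line excluded), uppercase hex."""
    return hashlib.md5(body.encode("utf-8")).hexdigest().upper()


def split_page(page: str) -> Tuple[Optional[str], str]:
    """Return (embedded digest or None, body without the digest line)."""
    m = _HEADER.match(page)
    if m is None:
        return None, page
    return m.group(1), page[m.end():]


def embed_digest(page: str) -> str:
    """Prepend a fresh digest line, replacing any existing one."""
    _, body = split_page(page)
    return f'<!-- MD5="{body_digest(body)}" -->\n{body}'
```

Because `embed_digest` strips any existing digest line first, calling it on an already stamped, unmodified page returns the page unchanged.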
  • The page-resident file digests for each of the files or web pages allow web indexers to quickly index the web pages since the indexer can identify which pages have changed, and then update the index using only changed web pages. For example, the indexer can read the file digest for each web page. If the digest for a web page matches the digest for a previous version of the web page that has already been indexed, then the indexer can skip this page and move on to the next web page without downloading the web page. If the digest for a web page is different from a previous digest for that web page, this indicates that the web page has changed, and the indexer can download and index that page. This allows the indexer to selectively download only those web pages that have changed, resulting in a significant decrease in bandwidth usage to index a set of web pages. [0019]
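The indexer's skip-or-download decision described above reduces to a digest comparison per page. The sketch below assumes the indexer already holds two mappings from page address to digest, one freshly read and one saved from the previous indexing pass; the function name and the dictionary representation are illustrative choices, not part of the application.

```python
from typing import Dict, List


def pages_to_reindex(current: Dict[str, str], indexed: Dict[str, str]) -> List[str]:
    """Return addresses whose digest is new or differs from the indexed one.

    current: address -> digest just read from each page
    indexed: address -> digest recorded when the page was last indexed
    """
    return [
        address
        for address, digest in current.items()
        if indexed.get(address) != digest  # unseen pages also count as changed
    ]
```

Only the addresses returned here would need to be downloaded in full; every other page is skipped after reading its digest.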
  • The page-resident digests for each of the stored web pages or files are also beneficial to the browsers that may be accessing these web pages. For example, in the event that the web pages are stored on a local storage drive or if a web server is not available, the browser may compare a digest from the cache-stored page to the digest from the page stored on the storage drive to determine if the cache-stored web page is invalid. If the cached copy of the page is invalid, as indicated by different digests, then the browser will retrieve the web page from the storage device. Otherwise, if the digests are the same, then this indicates that the cached copy of the page is still valid, and the browser may then use the cached copy, and need not download the entire web page from the network drive. [0020]
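The browser-side check can be sketched as follows for pages stored on a drive with no web server. As before, the first-line comment format is an assumption rather than something the application specifies, and `read_page_digest`, `load_page` and the in-memory `cache` dictionary are hypothetical names. Only the first line of the stored page is read to obtain its digest; the full page is read only when the cached copy is found to be invalid.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical first-line digest format; not specified by the source.
_HEADER = re.compile(r'<!-- MD5="([0-9A-F]{32})" -->')


def read_page_digest(path: str) -> Optional[str]:
    """Read only the first line of the stored page and extract its digest."""
    with open(path, "r", encoding="utf-8") as f:
        m = _HEADER.match(f.readline())
    return m.group(1) if m else None


def load_page(path: str, cache: Dict[str, Tuple[str, str]]) -> Tuple[str, bool]:
    """Return (page content, served_from_cache).

    cache maps path -> (digest, content) for previously loaded pages.
    """
    stored = read_page_digest(path)
    entry = cache.get(path)
    if stored is not None and entry is not None and entry[0] == stored:
        return entry[1], True  # digests match: cached copy is still valid
    with open(path, "r", encoding="utf-8") as f:
        content = f.read()
    if stored is not None:
        cache[path] = (stored, content)
    return content, False
```

A page without a recognizable digest line is always read in full and never cached in this sketch, which is a conservative design choice rather than a requirement of the text.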
  • FIG. 2 is a diagram illustrating insertion of a digest into a file according to another example embodiment. As shown in FIG. 2, a user-programmable digest insertion tool 130, or a content indicator insertion tool in the general case, is provided. Rather than calculating a digest each time a file or web page is created, updated or saved, the digest insertion tool 130 can be programmed or directed to calculate updated digests for a plurality of files or web pages 120, and then replace the existing digest in each file with the updated digest. The digest insertion tool 130 may also include or use the digest calculator to calculate or generate a digest for each file or web page. [0021]
  • FIG. 3 is a diagram illustrating use of a digest according to yet another example embodiment. As shown in FIG. 3, a digest repository insertion tool 140 is provided to read each file or web page 120 and the file path. The file path for each file may be the path that identifies the location or address of the file in a network, for example. The file path may be a Uniform Resource Identifier (URI) or a Uniform Resource Locator (URL), for example. The digest repository insertion tool 140 includes a digest calculator 114. The digest repository insertion tool 140 then calculates or computes a digest for each web page or file, or uses the digest calculator 114 to perform these calculations. The digest repository insertion tool 140 then stores a file path and digest in a digest repository or storage 170, for each file or web page. Two example file path and digest pairs are shown below: [0022]
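A batch insertion pass of the kind described for FIG. 2 might look like the sketch below. It assumes the same hypothetical first-line comment format for the embedded digest, and the names `restamp_file` and `restamp_all` are illustrative. Each file is read, any existing digest line is dropped, the digest is recomputed over the remaining body, and the file is rewritten only when the stored digest is missing or stale.

```python
import hashlib
import re
from typing import Iterable, List

# Hypothetical first-line digest format; not specified by the source.
_HEADER = re.compile(r'\A<!-- MD5="[0-9A-F]{32}" -->\n')


def restamp_file(path: str) -> bool:
    """Recompute the digest for one file; return True if it was rewritten."""
    # newline="" keeps line endings untouched so the digest stays stable.
    with open(path, "r", encoding="utf-8", newline="") as f:
        page = f.read()
    body = _HEADER.sub("", page, count=1)
    digest = hashlib.md5(body.encode("utf-8")).hexdigest().upper()
    stamped = f'<!-- MD5="{digest}" -->\n{body}'
    if stamped == page:
        return False  # embedded digest is already current
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(stamped)
    return True


def restamp_all(paths: Iterable[str]) -> List[str]:
    """Update every file in turn and report which ones were rewritten."""
    return [p for p in paths if restamp_file(p)]
```

Such a pass could be run on demand or on a schedule over a whole directory of pages, which is the contrast the text draws with recalculating on every save.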
  • 1) home/stonea/new.html MD5=“CD25D86057DA6337090518B858D41E2” [0023]
  • 2) home/stonea/improved.html home/stonea/new.html [0024]
  • Here, “home/stonea/new.html” is the file path and “CD25D86057DA6337090518B858D41E2” is the digest for file 1), shown above as an example. [0025]
  • It may be advantageous to store such an array or listing of file path and digest pairs for each of a plurality of files or web pages. This would allow a web indexer or a browser to retrieve entries from the digest repository 170, rather than retrieve portions of the web pages or files, to quickly obtain a current digest for each page or file. The client, indexer or web browser, may then compare the digest from the repository to a local copy of the digest for the same page to determine if the web page has changed, which would typically be indicated by digests that are different. [0026]
  • The page authoring tool 112, digest insertion tools 130 or 140, the files or web pages 120 and the digest repository 170 may be provided on a single processing node, or spread across multiple processing nodes, where a processing node may be a computer, a server or similar system. [0027]
  • FIG. 5 is a block diagram that illustrates a network according to an example embodiment. For example, as shown in FIG. 5, web page authoring tool 112 may be a software program running on processing node 510, digest insertion tool 130 or 140 may be a software program running on processing node 515, files 120 may be stored in processing node 520, while digest repository 170 may be stored on processing node 525. This is just an example network; the invention is not limited in scope to such a network or arrangement. [0028]
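The digest repository of FIG. 3 can be sketched as a mapping from file path to digest, serialized one pair per line. The line format below loosely follows the `path MD5="..."` shape of the example pairs, but the exact syntax, the function names, and the restriction that paths contain no whitespace are all assumptions made for illustration.

```python
import hashlib
import re
from typing import Dict, Iterable

# Assumed entry syntax: <path> MD5="<32 uppercase hex digits>"
# Paths containing whitespace are not supported by this sketch.
_ENTRY = re.compile(r'^(\S+) MD5="([0-9A-F]{32})"$')


def build_repository(paths: Iterable[str]) -> Dict[str, str]:
    """Compute an MD5 digest for every file and key it by file path."""
    repo = {}
    for path in paths:
        with open(path, "rb") as f:
            repo[path] = hashlib.md5(f.read()).hexdigest().upper()
    return repo


def dump_repository(repo: Dict[str, str]) -> str:
    """Serialize the repository, one path and digest pair per line."""
    return "".join(
        f'{path} MD5="{digest}"\n' for path, digest in sorted(repo.items())
    )


def load_repository(text: str) -> Dict[str, str]:
    """Parse serialized entries, ignoring lines that do not match."""
    repo = {}
    for line in text.splitlines():
        m = _ENTRY.match(line)
        if m:
            repo[m.group(1)] = m.group(2)
    return repo
```

A client would fetch the serialized repository once, parse it with `load_repository`, and compare each entry against its locally stored digest for the same path, without opening any of the pages themselves.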
  • Several embodiments of the present invention are specifically illustrated and/or described herein. However, it will be appreciated that modifications and variations of the present invention are covered by the above teachings and within the purview of the appended claims without departing from the spirit and intended scope of the invention. [0029]

Claims (28)

What is claimed is:
1. An apparatus comprising:
an authoring tool to generate files;
a calculator to calculate a content indicator for one or more of the files, the page authoring tool to store each of the calculated content indicators with a corresponding file.
2. The apparatus of claim 1 wherein the authoring tool comprises a HTML authoring tool or program.
3. The apparatus of claim 1 wherein the calculator comprises a digest calculator to calculate digests for each of the files.
4. The apparatus of claim 1 wherein the apparatus encodes each of the content indicators within a corresponding file.
5. An apparatus comprising:
an insertion tool comprising a calculator to calculate a content indicator for each of a plurality of files, the insertion tool to insert each of the calculated content indicators within a corresponding one of the files.
6. The apparatus of claim 5 wherein the calculator comprises a digest calculator to calculate digests for each of the files.
7. The apparatus of claim 5 wherein the apparatus encodes each of the content indicators within a corresponding file.
8. The apparatus of claim 6 wherein the insertion tool comprises an insertion tool to insert each of the calculated digests within a corresponding one of the files.
9. The apparatus of claim 5 wherein the files comprise one or more of the following:
web pages;
HTML pages;
Graphics; and
Scripts.
10. An apparatus comprising:
an insertion tool comprising a calculator to calculate a content indicator for each of a plurality of files, the insertion tool to obtain a path for each file and store a file path and the content indicator in a repository for each of a plurality of files.
11. The apparatus of claim 10 wherein the calculator comprises a digest calculator to calculate digests for each of a plurality of files.
12. The apparatus of claim 11 wherein the calculator comprises a digest calculator to calculate digests for each of a plurality of web pages.
13. The apparatus of claim 10 wherein the path file indicates a location or address of the file.
14. A method comprising:
generating a file;
calculating a content indicator for the file;
storing the content indicator with the file to provide content change indication for the file when compared to another content indicator.
15. The method of claim 14 wherein the calculating comprises calculating a digest for the file.
16. The method of claim 15 wherein the storing comprises storing the digest with the file to provide content change indication for the file when compared to another digest.
17. The method of claim 14 and further comprises:
retrieving the content indicator for the file;
comparing the content indicator for the file to a content indicator corresponding to a previous version of the file; and
determining whether the file has changed based on the comparing.
18. The method of claim 14 and further comprises:
retrieving the content indicator for the file;
comparing the content indicator for the file to a content indicator corresponding to a previous version of the file; and
retrieving the file if the content indicator for the file and the content indicator corresponding to a previous version of the file do not match.
19. The method of claim 18 and further comprising updating an index based on the retrieved file if there was not a match.
20. A method comprising:
obtaining a path for a file;
retrieving the file;
calculating a content indicator for the file;
storing the file path and the content indicator for the file in a repository or storage.
21. The method of claim 20 wherein the calculating comprises calculating a digest for the file.
22. The method of claim 21 wherein the storing comprises storing the file path and the digest in a digest repository or storage.
23. An apparatus comprising a readable media having instructions thereon, the instructions resulting in the following when executed:
generating a file;
calculating a content indicator for the file;
storing the content indicator with the file to provide content change indication for the file when compared to another content indicator.
24. The apparatus of claim 23 wherein the calculator comprises calculating a digest for the file.
25. The apparatus of claim 23 wherein the storing comprises storing the digest with the file to provide content change indication for the file when compared to another digest.
26. An apparatus comprising a readable media having instructions thereon, the instructions resulting in the following when executed:
obtaining a path for a file;
retrieving the file;
calculating a content indicator for the file;
storing the file path and the content indicator for the file in a repository or storage.
27. The apparatus of claim 26 wherein the calculating comprises calculating a digest for the file.
28. The apparatus of claim 27 wherein the storing comprises storing the file path and the digest in a digest repository or storage.
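
For illustration only, the insertion of a content indicator within a file (claims 4 to 8) and its later comparison (claims 14 to 17) might be sketched as follows. This is a hedged example and not the claimed implementation: the HTML comment format, the choice of SHA-256 and the names `insert_indicator`, `read_indicator` and `strip_indicator` are hypothetical. The sketch assumes the digest is computed over the file content with any previously inserted indicator removed, so that re-running the tool on an unchanged file yields the same result.

```python
import hashlib
import re
from typing import Optional

# Hypothetical encoding: the digest is carried in an HTML comment on the first line.
MARKER = re.compile(r"<!-- content-digest: ([0-9a-f]{64}) -->\n?")


def strip_indicator(html: str) -> str:
    """Remove any previously inserted indicator so it is not hashed itself."""
    return MARKER.sub("", html)


def insert_indicator(html: str) -> str:
    """Calculate a digest of the content and store it at the top of the file."""
    body = strip_indicator(html)
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return f"<!-- content-digest: {digest} -->\n{body}"


def read_indicator(html: str) -> Optional[str]:
    """Return the stored digest, or None if the file carries no indicator."""
    match = MARKER.search(html)
    return match.group(1) if match else None


# Two versions of a hypothetical page.
page_v1 = insert_indicator("<html>v1</html>")
page_v2 = insert_indicator("<html>v2</html>")

# Differing indicators signal that the page content has changed.
has_changed = read_indicator(page_v1) != read_indicator(page_v2)
print(has_changed)
```

Placing the indicator at the start of the file is a design choice of this sketch: a client could then read only the first line of a page to obtain the indicator rather than retrieving the whole file.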
US09/737,946 2000-12-18 2000-12-18 Content indicator for accelerated detection of a changed web page Abandoned US20020078087A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US09/737,946 US20020078087A1 (en) 2000-12-18 2000-12-18 Content indicator for accelerated detection of a changed web page
US09/865,929 US7078766B2 (en) 2000-12-18 2001-05-24 Transistor and logic circuit on thin silicon-on-insulator wafers based on gate induced drain leakage currents

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US09/737,946 US20020078087A1 (en) 2000-12-18 2000-12-18 Content indicator for accelerated detection of a changed web page

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US09/865,929 Division US7078766B2 (en) 1999-04-08 2001-05-24 Transistor and logic circuit on thin silicon-on-insulator wafers based on gate induced drain leakage currents

Publications (1)

Publication Number Publication Date
US20020078087A1 (en) 2002-06-20

Family

ID=24965915

Family Applications (2)

Application Number Title Priority Date Filing Date
US09/737,946 Abandoned US20020078087A1 (en) 2000-12-18 2000-12-18 Content indicator for accelerated detection of a changed web page
US09/865,929 Expired - Fee Related US7078766B2 (en) 1999-04-08 2001-05-24 Transistor and logic circuit on thin silicon-on-insulator wafers based on gate induced drain leakage currents

Family Applications After (1)

Application Number Title Priority Date Filing Date
US09/865,929 Expired - Fee Related US7078766B2 (en) 1999-04-08 2001-05-24 Transistor and logic circuit on thin silicon-on-insulator wafers based on gate induced drain leakage currents

Country Status (1)

Country Link
US (2) US20020078087A1 (en)

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020161860A1 (en) * 2001-02-28 2002-10-31 Benjamin Godlin Method and system for differential distributed data file storage, management and access
US20030145281A1 (en) * 2001-10-31 2003-07-31 Metacyber.Net Hypertext page generator for a computer memory resident rapid comprehension document for original source information, and method
US20040054967A1 (en) * 2002-09-17 2004-03-18 Brandenberger Sarah M. Published web page version tracking
US20040243536A1 (en) * 2003-05-28 2004-12-02 Integrated Data Control, Inc. Information capturing, indexing, and authentication system
US20050071366A1 (en) * 2003-09-30 2005-03-31 International Business Machines Corporation Method, apparatus and computer program for retrieving data
US20050289648A1 (en) * 2004-06-23 2005-12-29 Steven Grobman Method, apparatus and system for virtualized peer-to-peer proxy services
US7107336B2 (en) * 2001-02-23 2006-09-12 International Business Machines Corporation Method and apparatus for enhanced server page execution
US20070124667A1 (en) * 2005-11-25 2007-05-31 International Business Machines Corporation Verifying content of resources in markup language documents
US20070192378A1 (en) * 2003-11-21 2007-08-16 Bellsouth Intellectual Property Corporation Method, systems and computer program products for monitoring files
US20070277045A1 (en) * 2006-05-25 2007-11-29 Kabushiki Kaisha Toshiba Data processing apparatus and a method for processing data
US20090037393A1 (en) * 2004-06-30 2009-02-05 Eric Russell Fredricksen System and Method of Accessing a Document Efficiently Through Multi-Tier Web Caching
US20090132539A1 (en) * 2005-04-27 2009-05-21 Alyn Hockey Tracking marked documents
US20100050067A1 (en) * 2006-05-20 2010-02-25 International Business Machines Corporation Bookmarking internet resources in an internet browser
US20110029899A1 (en) * 2009-08-03 2011-02-03 FasterWeb, Ltd. Systems and Methods for Acceleration and Optimization of Web Pages Access by Changing the Order of Resource Loading
US8224964B1 (en) 2004-06-30 2012-07-17 Google Inc. System and method of accessing a document efficiently through multi-tier web caching
US20120210237A1 (en) * 2011-02-16 2012-08-16 Computer Associates Think, Inc. Recording A Trail Of Webpages
US8341177B1 (en) * 2006-12-28 2012-12-25 Symantec Operating Corporation Automated dereferencing of electronic communications for archival
US8341711B1 (en) * 2004-08-24 2012-12-25 Whitehat Security, Inc. Automated login session extender for use in security analysis systems
US8346784B1 (en) 2012-05-29 2013-01-01 Limelight Networks, Inc. Java script reductor
US8370420B1 (en) * 2002-07-11 2013-02-05 Citrix Systems, Inc. Web-integrated display of locally stored content objects
US8495171B1 (en) 2012-05-29 2013-07-23 Limelight Networks, Inc. Indiscriminate virtual containers for prioritized content-object distribution
US8676922B1 (en) 2004-06-30 2014-03-18 Google Inc. Automatic proxy setting modification
US8812651B1 (en) 2007-02-15 2014-08-19 Google Inc. Systems and methods for client cache awareness
US20140359411A1 (en) * 2013-06-04 2014-12-04 X1 Discovery, Inc. Methods and systems for uniquely identifying digital content for ediscovery
US9015348B2 (en) 2013-07-19 2015-04-21 Limelight Networks, Inc. Dynamically selecting between acceleration techniques based on content request attributes
US9058402B2 (en) 2012-05-29 2015-06-16 Limelight Networks, Inc. Chronological-progression access prioritization
US20160364376A1 (en) * 2015-06-10 2016-12-15 Fuji Xerox Co., Ltd. Information processing apparatus, network system, and non-transitory computer readable medium
US20220217117A1 (en) * 2017-10-17 2022-07-07 Servicenow, Inc. Deployment of a custom address to a remotely managed computational instance

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6713819B1 (en) * 2002-04-08 2004-03-30 Advanced Micro Devices, Inc. SOI MOSFET having amorphized source drain and method of fabrication
US7635882B2 (en) * 2004-08-11 2009-12-22 Taiwan Semiconductor Manufacturing Company, Ltd. Logic switch and circuits utilizing the switch

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5481672A (en) * 1991-02-27 1996-01-02 Canon Kabushiki Kaisha Detecting rewriting of stored data, using codes based on password and the stored data
US5978842A (en) * 1997-01-14 1999-11-02 Netmind Technologies, Inc. Distributed-client change-detection tool with change-detection augmented by multiple clients
US6055522A (en) * 1996-01-29 2000-04-25 Futuretense, Inc. Automatic page converter for dynamic content distributed publishing system
US6161126A (en) * 1995-12-13 2000-12-12 Immersion Corporation Implementing force feedback over the World Wide Web and other computer networks
US20010039563A1 (en) * 2000-05-12 2001-11-08 Yunqi Tian Two-level internet search service system
US20010044820A1 (en) * 2000-04-06 2001-11-22 Scott Adam Marc Method and system for website content integrity assurance
US20010056460A1 (en) * 2000-04-24 2001-12-27 Ranjit Sahota Method and system for transforming content for execution on multiple platforms
US20020013825A1 (en) * 1997-01-14 2002-01-31 Freivald Matthew P. Unique-change detection of dynamic web pages using history tables of signatures
US20020022977A1 (en) * 1999-12-03 2002-02-21 Schiff Martin R. Systems and methods of maintaining client relationships
US6411959B1 (en) * 1999-09-29 2002-06-25 International Business Machines Corporation Apparatus and method for dynamically updating a computer-implemented table and associated objects
US6411989B1 (en) * 1998-12-28 2002-06-25 Lucent Technologies Inc. Apparatus and method for sharing information in simultaneously viewed documents on a communication system
US6460023B1 (en) * 1999-06-16 2002-10-01 Pulse Entertainment, Inc. Software authorization system and method
US6681369B2 (en) * 1999-05-05 2004-01-20 Xerox Corporation System for providing document change information for a community of users

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5448513A (en) 1993-12-02 1995-09-05 Regents Of The University Of California Capacitorless DRAM device on silicon-on-insulator substrate

Cited By (54)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7107336B2 (en) * 2001-02-23 2006-09-12 International Business Machines Corporation Method and apparatus for enhanced server page execution
US20020161860A1 (en) * 2001-02-28 2002-10-31 Benjamin Godlin Method and system for differential distributed data file storage, management and access
US20030145281A1 (en) * 2001-10-31 2003-07-31 Metacyber.Net Hypertext page generator for a computer memory resident rapid comprehension document for original source information, and method
US8370420B1 (en) * 2002-07-11 2013-02-05 Citrix Systems, Inc. Web-integrated display of locally stored content objects
US20040054967A1 (en) * 2002-09-17 2004-03-18 Brandenberger Sarah M. Published web page version tracking
US7418661B2 (en) * 2002-09-17 2008-08-26 Hewlett-Packard Development Company, L.P. Published web page version tracking
US20040243536A1 (en) * 2003-05-28 2004-12-02 Integrated Data Control, Inc. Information capturing, indexing, and authentication system
GB2406660A (en) * 2003-09-30 2005-04-06 Ibm A system for retrieving data from a partially indexed data store
US20050071366A1 (en) * 2003-09-30 2005-03-31 International Business Machines Corporation Method, apparatus and computer program for retrieving data
US8818990B2 (en) 2003-09-30 2014-08-26 International Business Machines Corporation Method, apparatus and computer program for retrieving data
US7584230B2 (en) * 2003-11-21 2009-09-01 At&T Intellectual Property, I, L.P. Method, systems and computer program products for monitoring files
US20070192378A1 (en) * 2003-11-21 2007-08-16 Bellsouth Intellectual Property Corporation Method, systems and computer program products for monitoring files
US20050289648A1 (en) * 2004-06-23 2005-12-29 Steven Grobman Method, apparatus and system for virtualized peer-to-peer proxy services
US7788713B2 (en) * 2004-06-23 2010-08-31 Intel Corporation Method, apparatus and system for virtualized peer-to-peer proxy services
US8275790B2 (en) * 2004-06-30 2012-09-25 Google Inc. System and method of accessing a document efficiently through multi-tier web caching
US8825754B2 (en) 2004-06-30 2014-09-02 Google Inc. Prioritized preloading of documents to client
US20090037393A1 (en) * 2004-06-30 2009-02-05 Eric Russell Fredricksen System and Method of Accessing a Document Efficiently Through Multi-Tier Web Caching
US8788475B2 (en) 2004-06-30 2014-07-22 Google Inc. System and method of accessing a document efficiently through multi-tier web caching
US8676922B1 (en) 2004-06-30 2014-03-18 Google Inc. Automatic proxy setting modification
US8639742B2 (en) 2004-06-30 2014-01-28 Google Inc. Refreshing cached documents and storing differential document content
US9485140B2 (en) 2004-06-30 2016-11-01 Google Inc. Automatic proxy setting modification
US8224964B1 (en) 2004-06-30 2012-07-17 Google Inc. System and method of accessing a document efficiently through multi-tier web caching
US8925051B1 (en) 2004-08-24 2014-12-30 Whitehat Security, Inc. Automated login session extender for use in security analysis systems
US8341711B1 (en) * 2004-08-24 2012-12-25 Whitehat Security, Inc. Automated login session extender for use in security analysis systems
US20090132539A1 (en) * 2005-04-27 2009-05-21 Alyn Hockey Tracking marked documents
US9002909B2 (en) * 2005-04-27 2015-04-07 Clearswift Limited Tracking marked documents
US8549390B2 (en) * 2005-11-25 2013-10-01 International Business Machines Corporation Verifying content of resources in markup language documents
US20070124667A1 (en) * 2005-11-25 2007-05-31 International Business Machines Corporation Verifying content of resources in markup language documents
US9477647B2 (en) 2005-11-25 2016-10-25 International Business Machines Corporation Verifying content of resources in markup language documents by inclusion of a hash attribute-value pair in references to the content
US9892100B2 (en) 2005-11-25 2018-02-13 International Business Machines Corporation Verifying content of resources in markup language documents
US9984052B2 (en) 2005-11-25 2018-05-29 International Business Machines Corporation Verifying content of resources in markup language documents
US9286407B2 (en) * 2006-05-20 2016-03-15 International Business Machines Corporation Bookmarking internet resources in an internet browser
US20100050067A1 (en) * 2006-05-20 2010-02-25 International Business Machines Corporation Bookmarking internet resources in an internet browser
US20070277045A1 (en) * 2006-05-25 2007-11-29 Kabushiki Kaisha Toshiba Data processing apparatus and a method for processing data
US8341177B1 (en) * 2006-12-28 2012-12-25 Symantec Operating Corporation Automated dereferencing of electronic communications for archival
US8996653B1 (en) 2007-02-15 2015-03-31 Google Inc. Systems and methods for client authentication
US8812651B1 (en) 2007-02-15 2014-08-19 Google Inc. Systems and methods for client cache awareness
US8219633B2 (en) * 2009-08-03 2012-07-10 Limelight Networks, Inc. Acceleration of web pages access using next page optimization, caching and pre-fetching
US8250457B2 (en) 2009-08-03 2012-08-21 Limelight Networks, Inc. Acceleration and optimization of web pages access by changing the order of resource loading
US20110029641A1 (en) * 2009-08-03 2011-02-03 FasterWeb, Ltd. Systems and Methods Thereto for Acceleration of Web Pages Access Using Next Page Optimization, Caching and Pre-Fetching Techniques
US20120089695A1 (en) * 2009-08-03 2012-04-12 Fainberg Leonid Acceleration of web pages access using next page optimization, caching and pre-fetching
US20110029899A1 (en) * 2009-08-03 2011-02-03 FasterWeb, Ltd. Systems and Methods for Acceleration and Optimization of Web Pages Access by Changing the Order of Resource Loading
US8321533B2 (en) 2009-08-03 2012-11-27 Limelight Networks, Inc. Systems and methods thereto for acceleration of web pages access using next page optimization, caching and pre-fetching techniques
US8346885B2 (en) 2009-08-03 2013-01-01 Limelight Networks, Inc. Systems and methods thereto for acceleration of web pages access using next page optimization, caching and pre-fetching techniques
US20120210237A1 (en) * 2011-02-16 2012-08-16 Computer Associates Think, Inc. Recording A Trail Of Webpages
US8495171B1 (en) 2012-05-29 2013-07-23 Limelight Networks, Inc. Indiscriminate virtual containers for prioritized content-object distribution
US8346784B1 (en) 2012-05-29 2013-01-01 Limelight Networks, Inc. Java script reductor
US9058402B2 (en) 2012-05-29 2015-06-16 Limelight Networks, Inc. Chronological-progression access prioritization
US9880983B2 (en) * 2013-06-04 2018-01-30 X1 Discovery, Inc. Methods and systems for uniquely identifying digital content for eDiscovery
US20140359411A1 (en) * 2013-06-04 2014-12-04 X1 Discovery, Inc. Methods and systems for uniquely identifying digital content for ediscovery
US9015348B2 (en) 2013-07-19 2015-04-21 Limelight Networks, Inc. Dynamically selecting between acceleration techniques based on content request attributes
US20160364376A1 (en) * 2015-06-10 2016-12-15 Fuji Xerox Co., Ltd. Information processing apparatus, network system, and non-transitory computer readable medium
US20220217117A1 (en) * 2017-10-17 2022-07-07 Servicenow, Inc. Deployment of a custom address to a remotely managed computational instance
US11601392B2 (en) * 2017-10-17 2023-03-07 Servicenow, Inc. Deployment of a custom address to a remotely managed computational instance

Also Published As

Publication number Publication date
US20020074599A1 (en) 2002-06-20
US7078766B2 (en) 2006-07-18

Similar Documents

Publication Publication Date Title
US20020078087A1 (en) Content indicator for accelerated detection of a changed web page
US6356906B1 (en) Standard database queries within standard request-response protocols
US6345292B1 (en) Web page rendering architecture
US9703885B2 (en) Systems and methods for managing content variations in content delivery cache
US7660844B2 (en) Network service system and program using data processing
US6233606B1 (en) Automatic cache synchronization
US7334087B2 (en) Context-sensitive caching
US7024452B1 (en) Method and system for file-system based caching
US7689667B2 (en) Protocol to fix broken links on the world wide web
US6301614B1 (en) System and method for efficient representation of data set addresses in a web crawler
JP4785838B2 (en) Web server for multi-version web documents
US8250081B2 (en) Resource access filtering system and database structure for use therewith
US7600028B2 (en) Methods and systems for opportunistic cookie caching
US20020188631A1 (en) Method, system, and software for transmission of information
US20040010543A1 (en) Cached resource validation without source server contact during validation
US20030120752A1 (en) Dynamic web page caching system and method
US8225192B2 (en) Extensible cache-safe links to files in a web page
US6351748B1 (en) File system level access source control of resources within standard request-response protocols
US20070033290A1 (en) Normalization and customization of syndication feeds
CA2369613A1 (en) Selecting a cache
EP1175651A2 (en) Handling a request for information provided by a network site
US20080147875A1 (en) System, method and program for minimizing amount of data transfer across a network
US7797376B1 (en) Arrangement for providing content operation identifiers with a specified HTTP object for acceleration of relevant content operations
US20070022082A1 (en) Search engine coverage
US20020107986A1 (en) Methods and systems for replacing data transmission request expressions

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTEL CORP., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:STONE, ALAN E.;REEL/FRAME:011366/0473

Effective date: 20001214

AS Assignment

Owner name: INTEL CORPORATION, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:DIALOGIC CORPORATION;REEL/FRAME:014120/0462

Effective date: 20031017

Owner name: INTEL CORPORATION, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:DIALOGIC CORPORATION;REEL/FRAME:014120/0451

Effective date: 20031017

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION