US20050235030A1 - System and method for estimating prevalence of digital content on the World-Wide-Web - Google Patents

Info

Publication number
US20050235030A1
Authority
US
United States
Prior art keywords
advertisement
data
advertising
web
traffic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/144,110
Inventor
Gregory Lauckhart
Craig Horman
Christa Korol
James Bartot
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to US11/144,110
Publication of US20050235030A1
Assigned to CITIBANK, N.A., AS COLATERAL AGENT (security agreement). Assignors: NETRATINGS, INC.
Assigned to NETRATINGS, LLC (release, Reel 019817 / Frame 0774). Assignors: CITIBANK, N.A.

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241: Advertisements
    • G06Q30/0242: Determining effectiveness of advertisements
    • G06Q30/0246: Traffic
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/951: Indexing; Web crawling techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241: Advertisements
    • G06Q30/0277: Online advertisement

Definitions

  • the present invention relates generally to a system, method, and computer program product for tracking and measuring digital content that is distributed on a computer network such as the Internet. More particularly, the present invention relates to a system, method, and computer program product that collects online advertisement data, analyzes the data, and uses the data to calculate measurements of the prevalence of those advertisements.
  • Web World-Wide-Web
  • these interactive technologies directly affect the Web as an advertising medium because the technologies introduced new advertising formats such as fixed icon sponsorship advertisements, rotating banners and buttons, and interstitial advertisements (i.e., online advertisements that interrupt the user's work and take over a significant percentage of the screen display).
  • a banner advertisement or logo icon on a Web page creates an impression of the product for the viewer that is equivalent to a traditional billboard advertisement that promotes a product by presenting the brand name or slogan.
  • a sponsor's logo on a Web page creates an impression of the sponsor for the viewer that is equivalent to seeing a sponsor logo on the scoreboard at a college basketball game.
  • Online advertising is one area where traditional methodologies do not lend themselves well to measurement. Each day, thousands upon thousands of electronic advertisements appear on and then disappear from millions of Web pages. The transitory nature of online advertising activity warrants a novel methodology to accurately measure advertising activity.
  • the present invention is a system, method, and computer program product for tracking and measuring digital content that is distributed on a computer network such as the Internet.
  • the system collects online advertisement data, analyzes the data, and uses the data to calculate measurements of the prevalence of those advertisements.
  • traffic data from a variety of sources and complementary methodologies fuels the traffic analysis system, an intelligent agent (i.e., software that interacts with, learns from, and adapts to an environment).
  • the traffic analysis system processes raw traffic data by cleansing and summarizing the traffic data prior to storing the processed data in a database.
  • when the statistical summarization system calculates the advertising frequency, impressions, and spending, it relies upon the processed data from the traffic analysis system.
  • the advertisement sampling system, also known as the “prober” or “Cloudprober”, uses a robust methodology that continually seeks out the most significant and influential Web sites to probe (i.e., monitor). Moreover, the site selection and definition performed by the present invention dictates the Web pages that comprise each Web site to ensure that complete, singularly branded entities are reported as such.
  • the advertisement sampling system uses intelligent agent technology to retrieve Web pages at various frequencies to obtain a representative sample. This allows the Cloudprober to accurately assess how frequently each advertisement appears in the traffic data.
  • the advertisement sampling system extracts the advertisements from the Web page.
  • the advertisement extractor, also known as the “extractor”, invokes an automatic advertisement detection (“AAD”) process, a heuristic extraction process, to automatically extract all of the advertisements from the Web page.
  • AAD automatic advertisement detection
  • the advertisement sampling system invokes a classification engine to analyze the advertisement fragments.
  • the classifier processes each fragment to determine a classification for the fragment and then stores the fragment and classification data in a database.
  • the result of the analyses and processing performed by the advertisement sampling system is a rich catalog of advertising activity that can be easily queried by a client.
  • the present invention uses a Web front end and user interface to access and update the data in the database.
  • the Web front end provides a client, or user, of the present invention with a query interface to the database populated by the traffic analysis, advertisement sampling, and the statistical summarization systems.
  • the user interface is a graphical user interface that includes a separate component for system account management, site administration, taxonomy administration, advertising content classification, and rate card collection.
  • the user interface allows an account manager and operator to maintain and administer the present invention.
  • the user interface also allows a media editor to review the data in the database to verify the accuracy and integrity of the vast amount of data collected by the present invention. This data integrity process routinely investigates unusual or outlying data points to calibrate the system and adapt it to an ever-changing environment.
  • FIG. 1 is a network diagram depicting the environment for an advertising prevalence system according to the present invention.
  • FIG. 2 depicts the network diagram of FIG. 1 , in greater detail, to show the relationships between the network environment and the elements that comprise the advertising prevalence system.
  • FIG. 3 depicts the network diagram of FIG. 2 , in greater detail, to show the elements and sub-elements that comprise the advertising prevalence system and the connections to the network environment.
  • FIG. 4A is an exemplary Web site that illustrates the expected values used in the calculation of the advertising prevalence statistics.
  • FIG. 4B is an exemplary Web site that illustrates the observed values used in the calculation of the advertising prevalence statistics.
  • FIG. 4C is an exemplary Web site that illustrates the weighted values used in the calculation of the advertising prevalence statistics.
  • FIG. 4D is an exemplary Web site that illustrates an alternative method for the calculation of the advertising prevalence statistics.
  • FIG. 5 illustrates an example of a database structure that the advertising prevalence system may use.
  • FIG. 6 is a functional block diagram of the advertising prevalence system that shows the configuration of the hardware and software components.
  • FIG. 7A is a flow diagram of a process in the advertising prevalence system that measures the quality of online advertising and the activity generated by an online advertisement.
  • FIG. 7B is a flow diagram that describes, in greater detail, the process of sampling traffic data from FIG. 7A .
  • FIG. 7C is a flow diagram that describes, in greater detail, the process of generating a probe map based on sampled traffic data from FIG. 7A .
  • FIG. 7D is a flow diagram that describes, in greater detail, the process of probing the Internet 100 to gather sample data from FIG. 7A .
  • FIG. 7E is a flow diagram that describes, in greater detail, the process of classifying the advertising data from FIG. 7A .
  • FIG. 7F is a flow diagram that describes, in greater detail, the process of calculating advertising statistics from FIG. 7A .
  • FIG. 1 depicts the environment for the preferred embodiment of the present invention that includes the Internet 100 , and a Web site 110 , traffic sampling system 120 , advertising prevalence system 130 , and client 140 .
  • the present invention uses intelligent agent technology to gather data related to the attributes, placement, and prevalence of online advertisements. This data provides a user with up-to-date estimates of advertisement statistics and helps the user to gain a competitive advantage.
  • the Internet 100 is a public communication network that allows the traffic sampling system 120 and advertising prevalence system 130 to communicate with a client 140 and a Web site 110 .
  • the present invention contemplates the use of other public or private network architectures such as an intranet or extranet.
  • An intranet is a private communication network that functions similarly to the Internet 100 .
  • An organization, such as a corporation, creates an intranet to provide a secure means for members of the organization to access the resources on the organization's network.
  • An extranet is also a private communication network that functions similarly to the Internet 100 .
  • an extranet provides a secure means for the organization to authorize non-members of the organization to access certain resources on the organization's network.
  • the present invention also contemplates using a network protocol such as Ethernet or Token Ring, as well as proprietary network protocols.
  • the traffic sampling system 120 is a program that monitors and records Web activity on the Internet 100 .
  • the traffic sampling system 120 is an intermediary repository of traffic data between a Web surfer (not shown) on the Internet 100 and a Web server 112 .
  • the Web server 112 shown in FIG. 1 is a conventional personal computer or computer workstation that includes the proper operating system, hardware, communications protocol (e.g., Transmission Control Protocol/Internet Protocol), and Web server software to host a collection of Web pages.
  • the Web surfer (not shown) communicates with the Web server 112 by requesting a Uniform Resource Locator (“URL”) 114 , 116 , 118 associated with the Web site 110 , typically using a Web browser.
  • URL Uniform Resource Locator
  • Any program or device that can record a request for a URL made by a Web surfer (not shown) to a Web server 112 can perform the functions that the present invention requires of the traffic sampling system 120 .
  • the traffic sampling system 120 then aggregates the traffic data for each Web site 110 for use by the advertising prevalence system 130 .
  • the present invention can use any commercially available traffic sampling system that provides functionality similar to the Media Metrix audience measurement product.
  • Other possible mechanisms to obtain a traffic data sample include:
  • “Proxy Cache Sampling” gathers data such as user clickstream data, and Web page requests from a global distributed hierarchy of proxy cache servers. This data passes through an intermediate mechanism that provides pre-fetch and caching services for Web objects. As of May 1999, traffic statistics calculated by the present invention represent the distillation of raw data from nine first-tier and approximately 400 second-tier caches in the United States, as well as an additional 1100 worldwide.
  • Client-Side Panel Collection retrieves sample data from each panelist via a client-side mechanism and transfers that data to a collection repository.
  • the client-side mechanism may monitor the browser location bar, or may use browser, client-side proxy, or TCP/IP stack hooks.
  • a “Transcoder” is a proxy that rewrites HTML, usually for the purpose of adding elements for generation of advertisement revenue or page headers/footers. Free Internet service providers (“ISPs”) typically use this technique.
  • Any content filtering mechanism that evaluates requests for URLs and takes actions to allow or disallow such requests.
  • FIG. 2 expands the detail of the advertising prevalence system 130 in FIG. 1 to show the relationships between the network environment and the elements that comprise the advertising prevalence system 130 .
  • the advertising prevalence system 130 includes a traffic analysis system 210 , advertisement sampling system 220 , and statistical summarization system 230 that communicate data to the database 200 for storage.
  • the account manager 260 , operator 262 , and media editor 264 can access the database 200 through the user interface 240 to perform administrative functions.
  • the client 140 can access the database 200 through the Web front end 250 .
  • the traffic analysis system 210 receives raw traffic data from the traffic sampling system 120 .
  • the traffic analysis system 210 cleanses the raw traffic data by removing information from the traffic data that may identify a particular user on the Internet 100 and then stores the anonymous data in the database 200 .
  • the traffic analysis system 210 estimates the global traffic to every significant Web site on the Internet 100 . The present invention uses this data not only for computing the number of advertising impressions given an estimate of the frequency of rotation on that page, but also in the probe mapping system 320 .
  • the traffic analysis system 210 receives traffic data from a cache site on the Internet 100 . The goal is to accurately measure the number of page views by individual users, and therefore the number of advertising impressions.
  • the advertisement sampling system 220 uses the anonymous traffic data to determine which URLs to include in the sample retrieved from the Web server 112 .
  • the advertisement sampling system 220 contacts the Web server 112 through the Internet 100 to retrieve a URL 114 , 116 , 118 and extract the advertisements therein along with the accompanying characteristics that describe the advertisements.
  • the success rate for retrieval of creatives is high. Analysis indicates that the present invention captures over 95% of creatives served.
  • the advertisement sampling system 220 stores these advertisement characteristics in the database 200 .
  • the advertisement sampling system 220 (for example, the Cloudprober or the Online Media Network Intelligent Agent Collection (“OMNIAC”)) repeatedly probes prominent Web sites, extracts advertisements from each Web page returned by the probe, and classifies the advertisements in each Web page by type, technology, and advertiser.
  • OMNIAC Online Media Network Intelligent Agent Collection
  • the traffic analysis system 210 and the advertisement sampling system also present the data retrieved from the Internet 100 to the statistical summarization system 230 for periodic processing.
  • the statistical summarization system 230 calculates the advertising frequency, impressions, and spending on a per-site, per-week basis.
  • the graphical user interface for the present invention includes the user interface 240 and Web front end 250 .
  • the account manager 260 , operator 262 , and media editor 264 access the user interface 240 to administer access by the client 140 to the Web front end 250 (e.g., account and password management), define sites and probe instructions, and manage the advertising taxonomy, content classification, and rate card collection for the advertising prevalence system 130 .
  • the Web front end 250 is the Web browser interface that a client 140 uses to retrieve the advertisement measurement results from the database 200 as generated by the traffic analysis system 210 , advertisement sampling system 220 , and the statistical summarization system 230 .
  • FIG. 3 further expands the detail of the advertising prevalence system 130 to depict the logical components comprising the elements of the advertising prevalence system 130 shown in FIG. 2 .
  • FIG. 3 also depicts the relationships between the network environment and those logical components.
  • the traffic analysis system 210 includes an anonymity system 310 and traffic summarization process 312 .
  • the anonymity system 310 cleanses the data received from the traffic sampling system 120 by removing information that identifies a particular user on the Internet.
  • the data is rendered anonymous by passing all user information (e.g., originating internet protocol (“IP”) number or cookies) through a cryptographically secure one-way hash function; this assures the utmost privacy for Web users without devaluing the resulting data.
  • IP internet protocol
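As an illustration of the anonymization step described above, the following minimal Python sketch passes user-identifying fields through a keyed one-way hash. The source specifies only "a cryptographically secure one-way hash function"; the choice of HMAC-SHA-256, the secret key, and the field names `ip` and `cookie` are assumptions made for this example.

```python
import hashlib
import hmac

# Assumption: these are the record fields that identify a user.
IDENTIFYING_FIELDS = ("ip", "cookie")


def anonymize(identifier: str, secret_key: bytes) -> str:
    """Map an identifier to a fixed-length token using keyed SHA-256."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def cleanse(record: dict, secret_key: bytes) -> dict:
    """Return a copy of a traffic record with identifying fields hashed."""
    cleaned = dict(record)
    for field in IDENTIFYING_FIELDS:
        if field in cleaned:
            cleaned[field] = anonymize(cleaned[field], secret_key)
    return cleaned
```

Because the same input always yields the same token, requests from one user can still be grouped for page-view counting without retaining the original address or cookie.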
  • the anonymity system 310 presents the cleansed data to the traffic summarization process 312 , which in turn stores the aggregated URL count information in the database 200 .
  • the traffic summarization process 312 receives cleansed data from the anonymity system 310 .
  • the anonymous traffic data is summarized to yield traffic totals by week or month for individual URLs, domains, and Web sites.
  • the traffic summarization process 312 scales the data by weighting factors to extrapolate total global traffic from the sample.
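The summarization and scaling described above might be sketched as follows. This is a simplified illustration, not the disclosed implementation: it assumes each cleansed record carries a `url` and a `day`, groups by ISO week, and applies one hypothetical scalar `weight` that extrapolates from the sample to global traffic.

```python
from collections import defaultdict
from datetime import date


def summarize_weekly(records, weight):
    """Total sampled requests per (url, ISO year, ISO week), scaled by a weight.

    `records` is an iterable of dicts with a "url" string and a "day" date.
    `weight` is an assumed scalar extrapolation factor (sample -> global).
    """
    totals = defaultdict(float)
    for rec in records:
        iso_year, iso_week, _ = rec["day"].isocalendar()
        totals[(rec["url"], iso_year, iso_week)] += weight
    return dict(totals)
```

A production system would likely use different weights per data source or panel segment; a single factor keeps the sketch short.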
  • the advertisement sampling system 220 in FIG. 3 includes a probe mapping system 320 , Web page retrieval system 322 , Web browser emulation environment 324 , advertisement extractor 326 , and a structural classifier 328 .
  • the probe mapping system 320 generates a probe map, i.e., the URLs 114 , 116 , 118 that the advertisement sampling system 220 will visit. This probe map assists the advertisement sampling system 220 with the measurement of the rotation of advertisements on individual Web sites.
  • the preferred embodiment of the present invention continuously fetches various Web pages in the probe map. In an alternative embodiment, the present invention visits each URL in the probe map approximately every 6 minutes. Another embodiment can vary the fetching rate by considering several factors including the amount of traffic that visits the Web site as a whole and the individual Web page in question, the number of advertisements historically seen on the Web page, and the similarity of the historically observed ad rotation to other sampled pages.
  • the Web page retrieval system 322 uses this probe map generated by the probe mapping system 320 to determine which Web pages it needs to sample and the frequency of the sampling. For each URL in the probe map generated by the probe mapping system 320 , the Web page retrieval system 322 fetches a Web page, extracts each advertisement from the Web page, and stores the advertisement's attributes in the database 200 . The data retrieved from each URL in the probe map is used to calculate the frequency with which each advertisement is shown on a particular Web site.
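One simple way to turn repeated probes into a display frequency, and then into an impression estimate, is sketched below. The proportional estimator (probes showing the advertisement divided by total probes) and the function names are assumptions for illustration; this section does not spell out the exact calculation.

```python
from collections import Counter


def ad_frequencies(probes):
    """Fraction of probes of one page in which each advertisement appeared.

    `probes` is a list with one entry per fetch; each entry lists the
    advertisement identifiers extracted from that fetch.
    """
    counts = Counter()
    for ads in probes:
        counts.update(set(ads))  # count each ad at most once per probe
    total = len(probes)
    return {ad: seen / total for ad, seen in counts.items()} if total else {}


def estimated_impressions(page_views, frequency):
    """Scale page traffic by an advertisement's observed display frequency."""
    return page_views * frequency
```

For example, an advertisement seen in three of four probes of a page with 1,000 page views would be credited with an estimated 750 impressions under this assumption.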
  • the Web browser emulation environment 324 simulates the display of the Web page in a browser. This simulation guarantees that the present invention will detect not only static advertisements, but also dynamic advertisements generated by software programs written in a language such as JavaScript, Perl, Java, C, C++, or HTML that can be embedded in a Web page.
  • the advertisement extractor 326 extracts the online advertisements from the result of the simulation performed by the Web browser emulation environment 324 .
  • the advertisement extractor 326 identifies features of the advertising content (i.e., “fragments”) extracted from the Web pages returned by the probe mapping system 320 that are of particular interest. Advertisements are the most interesting dynamic feature to extract; however, an alternative embodiment of the present invention may use the extraction technology to collect any type of digital content including promotions, surveys, and news stories.
  • the advertisement extractor 326 can use various advertisement extraction methods, including rule-based extraction, heuristic extraction, and comparison extraction.
  • Rule-based extraction relies upon a media editor 264 to use the user interface 240 to create rules.
  • the user interface 240 stores the rules in the database 200 and the advertisement extractor 326 applies the rules to each Web page that the Web page retrieval system 322 retrieves.
  • the effect of running a rule is to identify and extract an HTML fragment from the Web page (i.e., the part of the page containing the advertisement).
  • the advertisement extractor 326 first converts the HTML representation of the fetched Web page into a well-formed XML representation. Following this conversion, the rules are applied to the parse tree of the XML representation of the Web page.
  • Heuristic extraction relies upon the similarity of advertisements at the HTML or XML source code level because the advertisements are typically inserted by an advertisement server when the Web page is generated in response to the Web browser emulation environment 324 request to display the Web page. Heuristic extraction analyzes the source code for clues (e.g., references to the names of known advertisement servers) and extracts fragments that surround those clues. The advantage of this method is that the extraction is automatic and the media editor need not create the rules.
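A rough sketch of heuristic extraction follows. It looks for anchor elements whose markup mentions a known advertisement server and returns them as fragments. The server names are hypothetical placeholders, and the regular expression is a deliberately crude stand-in for real HTML parsing.

```python
import re

# Hypothetical advertisement-server host names used as "clues".
KNOWN_AD_SERVERS = ("ads.example.com", "adserver.example.net")

# Crude matcher for a complete <a>...</a> element (no nesting support).
ANCHOR_RE = re.compile(r"<a\b[^>]*>.*?</a>", re.IGNORECASE | re.DOTALL)


def extract_ad_fragments(html, ad_servers=KNOWN_AD_SERVERS):
    """Return anchor fragments that reference any known ad server."""
    fragments = []
    for match in ANCHOR_RE.finditer(html):
        fragment = match.group(0)
        lowered = fragment.lower()
        if any(server in lowered for server in ad_servers):
            fragments.append(fragment)
    return fragments
```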
  • Comparison extraction repeatedly fetches the same Web page. This extraction method compares the different versions of the Web page to determine whether the content varies from version to version. The portion of the Web page that varies with some degree of frequency is usually an advertisement and is extracted.
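Comparison extraction can be illustrated with the standard-library `difflib` module: two fetches of the same page are compared line by line, and the regions that differ are reported as candidate advertisements. The line-based comparison and the function name are assumptions of this sketch.

```python
import difflib


def varying_regions(version_a, version_b):
    """Return (text_in_a, text_in_b) pairs for regions that differ."""
    a_lines = version_a.splitlines()
    b_lines = version_b.splitlines()
    matcher = difflib.SequenceMatcher(a=a_lines, b=b_lines, autojunk=False)
    regions = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            regions.append(
                ("\n".join(a_lines[a_start:a_end]), "\n".join(b_lines[b_start:b_end]))
            )
    return regions
```

In practice more than two versions would be compared, and a region would be kept only if it varies with some minimum frequency, as the description suggests.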
  • the structural classifier 328 parses each advertisement and stores the structural components in the database 200 and passes those components to the statistical summarization system 230 .
  • Each advertisement fragment extracted by the advertisement extractor 326 is analyzed by the structural classifier 328 .
  • the process performed by the structural classifier 328 comprises duplicate fragment elimination, structural fragment analysis, and duplicate advertisement detection.
  • the structural classifier 328 performs duplicate fragment elimination by comparing the current advertisement fragment to other fragments in the database 200 . Two advertisement fragments are duplicates if the fragments are identical (e.g., each fragment has the exact same HTML content). If the structural classifier 328 determines that the current fragment is a duplicate of a fragment in the database, the advertisement sampling system 220 logs another observation of the fragment and continues processing fragments.
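Duplicate fragment elimination of this kind can be sketched with a content hash as the lookup key. The in-memory `FragmentStore` class below is a hypothetical stand-in for the database 200, and SHA-256 is an assumed choice of digest.

```python
import hashlib


class FragmentStore:
    """In-memory stand-in for the fragment table (illustrative only)."""

    def __init__(self):
        self.fragments = {}     # digest -> fragment HTML
        self.observations = {}  # digest -> number of times observed

    def record(self, html):
        """Log one observation; return True if the fragment is new."""
        key = hashlib.sha256(html.encode("utf-8")).hexdigest()
        is_new = key not in self.fragments
        if is_new:
            self.fragments[key] = html
        self.observations[key] = self.observations.get(key, 0) + 1
        return is_new
```

Only fragments for which `record` returns `True` would need to continue on to structural analysis.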
  • the structural classifier 328 performs structural fragment analysis on the XML representation of the Web page by determining the “physical type” of the fragment (i.e., the HTML source code used to construct the advertisement).
  • Physical types that the present invention recognizes include banner, form, single link, and embedded content.
  • Banner advertisement fragments include a single HTML link having one or two enclosed images and no FORM or IFRAME tag.
  • Form advertisement fragments include a single HTML form having no IFRAME tag.
  • Single link advertisement fragments include a link with textual content, but no IMG, FORM, or IFRAME tags.
  • Embedded content advertisement fragments reference an external entity using an IFRAME tag. After performing this analysis, the structural classifier 328 updates the advertisement fragment in the database.
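The four physical types listed above can be approximated by counting tags in a fragment. The sketch below uses Python's `html.parser` on the raw HTML rather than on an XML parse tree, and the order of the checks is an assumption chosen to respect the stated exclusions (for example, any IFRAME rules out the other types).

```python
from collections import Counter
from html.parser import HTMLParser


class _TagCounter(HTMLParser):
    """Counts start tags and notes whether any visible text is present."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()
        self.has_text = False

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1

    def handle_data(self, data):
        if data.strip():
            self.has_text = True


def physical_type(fragment):
    """Classify a fragment as banner, form, single link, or embedded content."""
    parser = _TagCounter()
    parser.feed(fragment)
    parser.close()
    tags = parser.tags
    if tags["iframe"]:
        return "embedded content"
    if tags["form"] == 1:
        return "form"
    if tags["a"] == 1 and tags["form"] == 0:
        if tags["img"] in (1, 2):
            return "banner"
        if tags["img"] == 0 and parser.has_text:
            return "single link"
    return "unknown"
```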
  • for a banner advertisement fragment, the structural classifier 328 stores the link and image URLs in the database 200 .
  • a form advertisement fragment requires the creation of a URL by simulating a user submission that sets each HTML control to its default value.
  • the structural classifier 328 stores this URL and the “form signature” (i.e., a string that uniquely describes the content of all controls in the form) in the database 200 .
  • for a single link advertisement fragment, the structural classifier 328 stores the URL for the link and all text contained within the link in the database 200 .
  • for an embedded content advertisement fragment, the structural classifier 328 stores the URL associated with the external reference in the database 200 . The system then loads this URL to retrieve the referenced document. Once the loaded document has been structurally analyzed, the original fragment inherits any attributes that result from analysis of the new fragment.
  • the structural classifier 328 performs duplicate advertisement detection on each advertisement fragment that has a known physical type because these fragments represent advertisements.
  • Each unique advertisement has information, including which site definitions are associated with the fragment, stored in the database 200 .
  • the determination of uniqueness by the structural classifier 328 depends on different criteria for each type of fragment. The first step for every type of definition is to resolve all URLs associated with the record. URLs that refer to images are loaded, and duplicate images are noted. HTML link URLs, also known as “click URLs”, are followed each time a new ad is created. The final destination for a click URL, after following all HTTP redirects, is noted. This is also done for simulated link submission URLs associated with form definitions. Once all URLs have been resolved, the structural classifier 328 determines whether the advertisement is unique.
  • Banner advertisement fragments are considered to be the same unique advertisement if they have the same number of images, if the images are identical, and if the destination URL is identical.
  • Form advertisement fragments are considered to be the same unique advertisement if they have the same signature and the same destination URL.
  • Single link advertisement fragments are considered to be the same unique advertisement if they have the same textual content and the same destination URL.
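The per-type criteria above amount to computing a comparison key for each resolved advertisement: two fragments with equal keys represent the same advertisement. The dictionary field names in this sketch (`physical_type`, `image_hashes`, `form_signature`, `text`, `destination_url`) are hypothetical, and images are assumed to have already been reduced to content hashes.

```python
def advertisement_key(ad):
    """Build a hashable identity key from an advertisement's resolved attributes."""
    kind = ad["physical_type"]
    destination = ad["destination_url"]  # final URL after following redirects
    if kind == "banner":
        # Sorted hashes capture both the number of images and their content.
        return (kind, tuple(sorted(ad["image_hashes"])), destination)
    if kind == "form":
        return (kind, ad["form_signature"], destination)
    if kind == "single link":
        return (kind, ad["text"], destination)
    raise ValueError(f"no uniqueness rule for physical type: {kind!r}")
```

Storing the key alongside each record lets duplicate detection become a single indexed lookup.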
  • the statistical summarization system 230 calculates the advertisement statistics for each unique advertisement in the database 200 .
  • the present invention calculates, for each Web site, the advertising impressions (i.e., the number of times a human being views an advertisement).
  • the present invention also calculates the spending, S, using the formula S = I × RC, where I is the advertising impressions for a Web site, and RC is the rate code for the Web site.
  • Most advertising buys are complicated deals with volume purchasing discounts so our numbers do not necessarily represent the actual cost of the total buy.
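Reading the spending formula above as the product of impressions and rate, a worked example might look like the following. The sketch assumes RC is expressed as a cost per single impression; if a rate card quotes cost per thousand impressions, the rate would first be divided by 1,000. The site names and figures are invented for illustration.

```python
def estimated_spending(impressions, rate_per_impression):
    """Spending estimate for one Web site: S = I * RC."""
    return impressions * rate_per_impression


def total_spending(sites):
    """Sum the estimate over sites given {site: (impressions, rate)}."""
    return sum(estimated_spending(i, rc) for i, rc in sites.values())
```

As the text notes, volume discounts mean such an estimate approximates list-price value rather than the actual cost of a buy.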
  • the Web front end 250 is a graphical user interface that provides a client 140 with a query interface to the database 200 populated by the traffic analysis system 210 , advertisement sampling system 220 , and the statistical summarization system 230 .
  • the client 140 can use the Web front end 250 to create, store, edit and download graphical and tabular reports for one or more industry categories depending on the level of service the client 140 selects.
  • the user interface 240 in FIG. 3 includes a separate component for system account management 340 , site administration 342 , taxonomy administration 344 , advertising content classification 346 , and rate card collection 348 .
  • the account manager 260 uses the system account management 340 module of the user interface 240 to simplify the administration of the Web front end 250 .
  • the account manager 260 uses the system account management 340 module to create and delete user accounts, manage user account passwords, and check on the overall health of the Web front end 250 .
  • the operator 262 uses the site administration 342 module of the user interface 240 to simplify the administration of the site definitions.
  • Analysts from the Internet Advertising Bureau estimate that over 90% of all Web advertising dollars are spent on the top fifty Web sites.
  • Site selection begins by choosing the top 100 advertising Web sites by considering data from Media Metrix, Nielsen//NetRatings, and the proxy traffic data in the database 200 . These lists are periodically updated to demote Web sites with low traffic levels and promote new sites with high traffic levels.
  • the present invention also includes Web sites that provide significant content in key industries.
  • a site chosen for inclusion in the site definitions must have the structure of the site analyzed to remove sections that do not serve advertisements, originate from foreign countries, or are part of a frame set.
  • Web sites that originate from a foreign country, such as yahoo.co.jp, sell advertising in the host country and therefore are not applicable to the measurements calculated by the present invention.
  • Web sites that use an HTML frameset are treated very carefully to only apply rotation rates to the traffic from the sections of the frameset that contain the advertisement. These combined exclusions are key to making accurate estimates of advertising impressions.
  • the present invention also tags sections that cannot be measured directly due to registration requirements (e.g., mail pages). Since Web sites change frequently, this structural analysis is repeated periodically. Eventually the analysis stage will automatically flag altered sites to allow even more timely updates.
  • the media editor 264 uses the taxonomy administration 344 , advertising content classification 346 , and rate card collection 348 modules of the user interface 240 .
  • the taxonomy administration 344 module simplifies the creation and maintenance of the attributes assigned to advertisements during content classification, including the advertisement's industry, company, and products.
  • the taxonomy names each attribute and specifies its type, ancestry, and segment membership. For example, a company, Honda, might be parented by the Automotive industry and belong to the industry segment Automotive Manufacturers.
  • the advertising content classification 346 component assists the media editor 264 with performing the content classification.
  • the structural classifier 328 performs automated advertisable assignment to determine what the advertisement is advertising. This process includes assigning “advertisables” (i.e., attributes describing each “thing” that the advertisement is advertising) to each advertisement fragment.
  • the advertisement sampling system 220 uses an extensible set of heuristics to assign advertisables to each advertisement. In the preferred embodiment, however, the only automatic method employed is location classification. Location classification relies on the destination URL in order to assign a set of advertisables to an advertisement. A media editor 264 uses the user interface 240 to maintain the set of classified locations.
  • a classified location comprises a host, a URL path prefix, and a set of advertisables. Location classification assigns a classified location's advertisables to an advertisement if the host in the destination URL matches the host of the classified location and the path prefix in the classified location matches the beginning of the path in the destination URL.
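  • A minimal sketch of the host and path-prefix match described above. The class name, function name, example hosts, and the "Honda Automobiles" advertisable are hypothetical and chosen only for illustration; the original does not specify an implementation.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class ClassifiedLocation:
    host: str                 # host that must equal the destination host
    path_prefix: str          # prefix that must start the destination path
    advertisables: frozenset  # advertisables assigned on a match


def classify_by_location(destination_url, classified_locations):
    """Collect the advertisables of every classified location that matches."""
    parts = urlsplit(destination_url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    matched = set()
    for location in classified_locations:
        if host == location.host and path.startswith(location.path_prefix):
            matched |= location.advertisables
    return matched


# Hypothetical classified locations maintained by a media editor.
locations = [
    ClassifiedLocation("www.honda.com", "/", frozenset({"Honda"})),
    ClassifiedLocation("www.honda.com", "/cars", frozenset({"Honda Automobiles"})),
]
result = classify_by_location("http://www.honda.com/cars/civic", locations)
```

  • Here both classified locations match, so the advertisement would receive both advertisables; a destination on an unlisted host receives none.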
  • the structural classifier 328 performs human advertisable assignment and verification as a quality check of the advertisable data. This phase is the most human intensive.
  • a media editor 264 uses a graphical user interface module in the user interface 240 to display each advertisement, verify the automatic advertisable assignments, and assign any other advertisables that appear appropriate after inspection of the advertisement and its destination.
  • the location classification database is also typically maintained at this time.
  • the media editor 264 uses the rate card collection 348 module to enter the contact and rate card information for a Web site identified by the traffic analysis system 210, as well as for designated advertisers.
  • Rate card entry includes the applicable quarter (e.g., Q4 2000), advertisement dimensions in pixels, fee structure (e.g., CPM, flat fee, or per click), and cost schedule for buys of various quantities and durations.
  • the media editor also records the URL address of the online media kit and whether rates are published therein.
  • Contact information for a Web site or advertiser includes the homepage, name, phone and facsimile numbers, email address, and street address.
  • FIGS. 4A through 4C illustrate the preferred method for calculating the advertising prevalence statistics.
  • the calculation of the advertising prevalence statistics is an iterative process that uses expected values derived by the traffic analysis system 210 and observed values derived by the advertising prevalence system 220 to calculate the weighted values and the advertising prevalence statistics.
  • FIGS. 4A through 4C each depict a network on the Internet 100 that includes two Web sites served by Web server P 410 and Web server Q 420 .
  • FIG. 4A illustrates exemplary expected traffic values for the network.
  • FIG. 4B illustrates exemplary observed traffic values for the network.
  • FIG. 4C illustrates exemplary weighted traffic values for the network.
  • the first step in the process is to normalize the results from the traffic analysis system 210 .
  • the traffic analysis system 210 provides the traffic received by each Web page in the traffic data sample.
  • the probe map generated by the probe mapping system 320 includes an entry for each Web page 411 - 416 , 421 - 424 .
  • the probe map also includes an “area” that each Web page 411 - 416 , 421 - 424 consumes in the probe map.
  • the normalized results are calculated by dividing the area that a Web page consumes in the probe map by the sum of the area for each Web page in the traffic sample.
  • the normalized value, or chance, for Web page P1 411 is the area for Web page P1 (i.e., 15) divided by the sum of the areas for Web pages P1, P2, P3, P4, P5, P6, Q1, Q2, Q3, and Q4 (i.e., 120).
  • the normalized value is, therefore, 0.125, or 12.5%.
  • the system determines the scale by dividing the traffic for a Web page by the area for the Web page.
  • the scale for Web page P1 411 is the traffic for Web page P1 (i.e., 150) divided by the area for Web page P1 (i.e., 15); therefore, the scale for Web page P1 is 10.
  • Table 1 summarizes the area, scale, and chance values for the Web pages in FIG. 4A.

    TABLE 1
    Web Page   Area   Scale   Chance
    P1         15     10      12.5%
    P2         10     1       8.3%
    P3         14     1       12%
    P4         12     0.25    10%
    P5         8      0.5     6.7%
    P6         4      1       3.3%
    Q1         30     0.5     25%
    Q2         4      0.5     3.3%
    Q3         15     2       12.5%
    Q4         8      0.5     6.7%
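  • The chance and scale arithmetic can be checked with a short sketch. The area values are those listed in Table 1 and the traffic of 150 for P1 comes from the text; the variable and function names are illustrative assumptions.

```python
# Area values for each Web page, taken from Table 1 (FIG. 4A).
AREA = {"P1": 15, "P2": 10, "P3": 14, "P4": 12, "P5": 8, "P6": 4,
        "Q1": 30, "Q2": 4, "Q3": 15, "Q4": 8}

# Chance: a page's area divided by the total area in the traffic sample.
total_area = sum(AREA.values())
chance = {page: area / total_area for page, area in AREA.items()}


def scale(traffic, area):
    """Scale: traffic for a page divided by the area it consumes in the probe map."""
    return traffic / area


p1_scale = scale(150, AREA["P1"])
```

  • With these inputs the total area is 120, the chance for P1 is 0.125, and the scale for P1 is 10, matching the values stated above.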
  • the next step in the calculation process is to calculate the Scaled Fetches for each Web site 410 , 420 by summing the product of the observed fetches from FIG. 4B and the scale from FIG.
  • the next step in the calculation process is to compute the Scaled Observations for each advertisement on each Web site 410, 420 by summing the product of the advertisement views from FIG. 4B and the scale from FIG. 4A, for each Web page 411-416, 421-424 in the Web site 410, 420.
  • the final step in the calculation is to compute the advertising prevalence statistics (i.e., Frequency, Impressions, and Spending) for each advertisement in each Web site 410 , 420 .
  • Frequency is computed by dividing the scaled observations by the scaled fetches for each advertisement in each Web site 410 , 420 .
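  • A sketch of the Scaled Fetches, Scaled Observations, and Frequency steps for one Web site. The scale values are those of site P in Table 1; the fetch and advertisement-view counts are hypothetical stand-ins, since the FIG. 4B values are not reproduced in the text.

```python
# Scale per Web page for site P, from Table 1.
SCALE = {"P1": 10, "P2": 1, "P3": 1, "P4": 0.25, "P5": 0.5, "P6": 1}

# Hypothetical observed values (not the actual FIG. 4B numbers).
fetches = {"P1": 4, "P2": 2, "P3": 2, "P4": 4, "P5": 2, "P6": 1}
ad_views = {"P1": 2, "P2": 1, "P3": 0, "P4": 4, "P5": 0, "P6": 0}  # one advertisement


def scaled_sum(counts, scale):
    """Sum of count * scale over every page of the site."""
    return sum(counts[page] * scale[page] for page in counts)


scaled_fetches = scaled_sum(fetches, SCALE)
scaled_observations = scaled_sum(ad_views, SCALE)
frequency = scaled_observations / scaled_fetches
```

  • Weighting each page by its scale means that a heavily trafficked page such as P1 dominates the result even when it is probed only a few times.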
  • FIG. 4D illustrates an alternative embodiment for calculating the advertising prevalence statistics.
  • the prober is tuned to optimize rotation measurement accuracy.
  • Statistical estimates of accuracy in the field are difficult to perform, due to the non-stationary nature of advertising servers.
  • When probing every 6 minutes, the prober has a 0.06% resolution in rotational frequency over a one-week measurement period.
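  • The 0.06% figure is consistent with simple arithmetic, assuming that resolution here means the reciprocal of the number of probes made in one week (an interpretation, not something the text states explicitly):

```python
MINUTES_PER_WEEK = 7 * 24 * 60
PROBE_INTERVAL_MINUTES = 6

# Number of probes of one URL in a one-week measurement period.
probes_per_week = MINUTES_PER_WEEK // PROBE_INTERVAL_MINUTES

# Smallest measurable change in rotation frequency: one probe out of all probes.
resolution_percent = 100 / probes_per_week
```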
  • the probes are distributed among the sites to accurately measure ad rotation on each site.
  • the number of probing URLs assigned to a site is determined from three variables. The first is a constant across all sites; a certain number of probing URLs are required to accurately measure rotation on even the smallest site. Half of the probes are assigned with this variable. The second variable, weighted at 40%, is the amount of traffic going to a site, as each probing URL represents a proportion of total Internet traffic. The twenty largest sites receive over 75% of these probes. Finally, the complexity of the site, as measured by the total number of unique URLs found in the proxy traffic data, is taken into account, with more complicated sites receiving extra probing URLs. This accounts for the remaining 10% of the probe distribution.
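  • One way to read the 50% / 40% / 10% split is as three proportional shares added together. The sketch below assumes the constant share is divided equally among sites and that the traffic and complexity shares are strictly proportional; the site names and counts are hypothetical.

```python
def allocate_probes(total_probes, traffic, unique_urls):
    """Split probes: 50% equally, 40% by traffic share, 10% by unique-URL share."""
    sites = list(traffic)
    total_traffic = sum(traffic.values())
    total_urls = sum(unique_urls.values())
    allocation = {}
    for site in sites:
        constant_share = 0.5 / len(sites)
        traffic_share = 0.4 * traffic[site] / total_traffic
        complexity_share = 0.1 * unique_urls[site] / total_urls
        allocation[site] = total_probes * (
            constant_share + traffic_share + complexity_share
        )
    return allocation


# Hypothetical inputs: page views and unique URL counts per site.
traffic = {"siteA": 900, "siteB": 100}
unique_urls = {"siteA": 300, "siteB": 700}
allocation = allocate_probes(1000, traffic, unique_urls)
```

  • In this example siteA receives about 640 probes and siteB about 360; a real allocation would also need to round to whole probing URLs.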
  • Probing URLs can be chosen using a Site Shredder algorithm to break the site into regions (i.e., sets of pages whose advertisement rotation characteristics are likely to be similar) for probing.
  • the distribution of regions is mathematically designed to maximize site coverage and, therefore, advertisement rotation accuracy.
  • a single URL is chosen to represent the advertising rotation from each region. This URL is chosen as the most heavily trafficked page containing advertisements in that region.
  • the algorithm avoids date specific pages or pages referring to a time-limited event such as the August 1999 total lunar eclipse.
  • FIG. 4D calculates advertisement impressions by combining the estimates of rotation and traffic for each Web site 430 . To do this the system breaks the site down into its constituent stems using the Site Shredder algorithm. The rotation of advertisements in each advertisement slot is calculated and applied to estimate advertising impressions on its associated stem. The advertisement rotation on stems without probes is estimated from an average, weighted by traffic, of advertisement rotation of probes on a similar level.
  • the sample site tree has five probe URLs 431-435, P1 through P5, placed on five main branches off a main page and 14 secondary branches.
  • the number on each page is the sample traffic going to that page.
  • Probe P1, on the home page “www.testsite.com”, measures the rotation, R, to be applied to the traffic going to that main page, with traffic of 88 page views.
  • Branch A has a single probe, P2, placed on the top-level page of that branch with a probing URL “www.testsite.com/A/”. The rotation of this single probing URL is estimated as R_A and is applied to the traffic for that entire stem, a total of 21 page views.
  • Branch C has a probe, P 3 , on a heavily trafficked secondary branch page, with a probing URL “www.testsite.com/C/third.html”.
  • the rotation, R_C, of this page is applied to all the secondary branch pages on that stem and also up one level in the tree, across a total of 25 page views.
  • Branch E receives a large portion of the traffic for the site, a total of 61 page views, and therefore is assigned two probes, P 4 and P 5 . These are on two secondary branch pages, “www.testsite.com/E/first.html” and “www.testsite.com/E/third.html”.
  • the rotation of each is applied to the traffic to those individual pages.
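  • For a stem without a probe, the text describes an average of probe rotations weighted by traffic. A minimal sketch follows; the traffic figures (21 and 25 page views) are those given for branches A and C, while the rotation values and the traffic of the unprobed stem are hypothetical.

```python
def weighted_rotation(probes):
    """Traffic-weighted average rotation; probes is a list of (rotation, traffic)."""
    total_traffic = sum(traffic for _, traffic in probes)
    return sum(rotation * traffic for rotation, traffic in probes) / total_traffic


# (rotation, page views) for the probes on branches A and C.
# The rotations 0.20 and 0.10 are made-up values for illustration.
branch_probes = [(0.20, 21), (0.10, 25)]
estimated_rotation = weighted_rotation(branch_probes)

# Estimated impressions on an unprobed stem with a hypothetical 10 page views.
estimated_impressions = estimated_rotation * 10
```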
  • FIG. 5 illustrates a database structure that the advertising prevalence system 130 may use to store information retrieved by the traffic sampling system 120 and the Web page retrieval system 322.
  • the preferred embodiment segments the database 200 into partitions. Each partition can perform functions similar to an independent database such as the database 200 .
  • a partitioned database simplifies the administration of the data in the partition.
  • the present invention contemplates consolidation of these partitions into a single database, as well as making each partition an independent database and distributing each database to a separate general purpose computer workstation or server.
  • the partitions for the database 200 of the present invention include sampling records 510 , probing definitions 520 , advertising support data 530 , and advertising summary 540 .
  • the preferred embodiment of the present invention uses a relational database management system, such as the Oracle8i product by Oracle Corporation, to create and manage the database and partitions. Even though the preferred embodiment uses a relational database, the present invention contemplates the use of other database architectures such as an object-oriented database management system.
  • the sampling records 510 partition of database 200 comprises database tables that are logically segmented into traffic data 512 , advertisement view logging 514 , and advertising structure 516 areas.
  • the traffic data 512 area contains data processed by the traffic sampling system 120 , anonymity system 310 , and statistical summarization system 230 .
  • the data stored in this schema includes a “munged” URL, and the count of traffic each URL receives per traffic source over a period of time.
  • a “munged” URL is an ordinary URL with the protocol field removed and the order of the dotted components in the hostname reversed.
  • the present invention transforms an ordinary URL, such as http://www.somesite.com/food, into a munged URL by removing the protocol field (i.e., “http://”) and reversing the order of the dotted components in the hostname (i.e., “www.somesite.com”).
  • the resulting munged URL in this example is “com.somesite.www/food”.
  • the present invention uses this proprietary URL format to greatly enhance the traffic data analysis process.
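  • The transformation can be sketched in a few lines. The function name is an assumption, and the handling of ports and query strings is a guess, since the text only describes removing the protocol and reversing the hostname components.

```python
from urllib.parse import urlsplit


def munge_url(url):
    """Drop the protocol and reverse the dotted components of the hostname."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    reversed_host = ".".join(reversed(host.split(".")))
    rest = parts.path
    if parts.query:  # assumption: keep the query string; any port is dropped
        rest += "?" + parts.query
    return reversed_host + rest


munged = munge_url("http://www.somesite.com/food")
```

  • Because the most significant part of the hostname comes first, all URLs of one domain share a common string prefix, which is what makes the prefix-based site definitions described later straightforward to evaluate.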
  • the traffic sampling system 120 populates the traffic data 512 area in database 200 .
  • the probe mapping system 320 accesses the data in the traffic data 512 area to assist the Web page retrieval system 322 and the statistical summarization system 230 with the calculation of the advertising impression and spending statistics.
  • the advertisement view logging 514 area logs the time, URL, and advertisement identifier for each advertisement encountered on the Internet 100 . This area also logs each time the system does not detect an advertisement in a Web page that previously included the advertisement. In addition, the system logs each time the system detects a potential advertisement, but fails to recognize the advertisement during structural classification.
  • the structural classifier 328 and the Web page retrieval system 322 of the advertisement sampling system 220 populate the advertisement view logging 514 area in database 200 .
  • the statistical summarization system 230 accesses the data in the advertisement view logging 514 area to determine the frequency that each advertisement appears on each site.
  • the advertisement structure 516 area contains data that characterizes each unique advertisement located by the system. This data includes the content of the advertisement, advertisement type (e.g., image, HTML form, Flash, etc.), the destination URL linked to the advertisement, and several items used during content classification and diagnostics, including where the advertisement was first seen, and which advertisement definition originally produced the advertisement.
  • the structural classifier 328 component of the advertisement sampling system 220 populates the advertisement structure 516 area in database 200 .
  • the user interface 240 accesses the data in the advertisement structure 516 area to display each advertisement to the media editor 264 during classification editing.
  • the Web front end 250 also accesses the data in the advertisement structure 516 area to display the advertisements to the client 140 .
  • the probing definitions 520 partition of database 200 comprises database tables that are logically segmented into site definition 522 , probe map 524 , and advertisement extraction rule definition 526 areas.
  • the site definition 522 area carves the portion of the Internet 100 that the system probes into regions.
  • the primary region definition is a “site”, a cohesive entity the system needs to analyze, sample, and summarize.
  • the system defines each site in terms of both inclusive and exclusive munged URL prefixes.
  • a “munged URL prefix” is a munged URL that represents the region of all munged URLs for which it is a prefix.
  • An “inclusive munged URL prefix” specifies that a URL is part of some entity.
  • An “exclusive munged URL prefix” specifies that a URL is not part of some entity, overriding portions of the entity included by an inclusive prefix. To illustrate, the following is a list of munged URLs that may result from the processing of a set of URLs in a traffic sample.
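  • A minimal sketch of how inclusive and exclusive munged URL prefixes could decide site membership. The prefixes and URLs are hypothetical, and the rule that any matching exclusive prefix always wins is an assumption; a fuller version would also check that a prefix ends on a hostname or path component boundary.

```python
def in_site(munged_url, inclusive_prefixes, exclusive_prefixes):
    """A URL belongs to the site if an inclusive prefix matches and no exclusive one does."""
    included = any(munged_url.startswith(p) for p in inclusive_prefixes)
    excluded = any(munged_url.startswith(p) for p in exclusive_prefixes)
    return included and not excluded


# Hypothetical site definition: all of somesite.com except its mail section.
inclusive = ["com.somesite"]
exclusive = ["com.somesite.mail"]

membership = {
    url: in_site(url, inclusive, exclusive)
    for url in [
        "com.somesite.www/food",
        "com.somesite.mail/inbox",
        "com.othersite.www/",
    ]
}
```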
  • the probe map 524 area contains a weight for each URL in each site that the system is measuring. This weight determines the likelihood that the system will choose a URL for each probe.
  • the system generates the weights by running complex iterative algorithms against the traffic data and the probing records in the database 200. An analysis of the traffic data can discern which URLs have been visited and how often users have visited those URLs. The result of the analysis guarantees that the system performs advertisement sampling of these URLs in similar proportions, given certain constraints such as a maximum number of probes to allocate to any single URL.
  • the data in the sampling records 510 partition of the database 200 is useful for determining which URLs are in need of special handling due to past behavior (e.g., a URL is sampled less frequently if the system has never detected an advertisement in the URL).
  • the probe mapping system 320 populates the probe map 524 area in the database 200.
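  • The iterative algorithms themselves are not described, so the following is only a simplified stand-in: weights proportional to observed visits, capped at a maximum, then used for a weighted random choice of the URL to probe. The function names, the cap value, and the visit counts are all hypothetical.

```python
import random


def probe_weights(visits, max_weight):
    """Weight each URL by its share of visits, capped at max_weight."""
    total = sum(visits.values())
    return {url: min(count / total, max_weight) for url, count in visits.items()}


def choose_probe_url(weights, rng):
    """Pick one URL with probability proportional to its weight."""
    urls = list(weights)
    return rng.choices(urls, weights=[weights[u] for u in urls], k=1)[0]


# Hypothetical visit counts for three munged URLs of one site.
visits = {
    "com.somesite.www/": 80,
    "com.somesite.www/food": 15,
    "com.somesite.www/about": 5,
}
weights = probe_weights(visits, max_weight=0.5)
picked = choose_probe_url(weights, random.Random(0))
```

  • The cap keeps a single dominant page from absorbing nearly every probe, which is the kind of constraint the later rotation scaling has to compensate for.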
  • the probe mapping system 320 accesses the data in the probe map 524 area to allocate the probes.
  • the statistical summarization system 230 accesses the data in the probe map 524 area to determine which URLs should have their rotations scaled to counter the effect of probe map constraint enforcement.
  • the advertisement extraction rule definition 526 area describes Extensible Markup Language (“XML”) tags, typically representing a normalized HTML document, that indicate those portions of the content that the system considers to be advertisements.
  • the system defines an extraction rule in terms of “XML structure” and “XML features”.
  • XML structure refers to the positioning of various XML nodes relative to other XML nodes. For example, an anchor (“A”) node containing an image (“IMG”) node is likely an advertisement. After using this structural detection process to match the advertisement content, the system examines the features of the content to determine if the content is an advertisement. To continue the previous example, if the image node contains a link (“href”) feature that contains the sub-string “adserver”, it is very likely an advertisement.
  • Another form of extraction rule may point to a specific node in an XML structure using some form of XML path specification, such as an “XPointer”.
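  • The structure-plus-feature rule from the anchor and image example can be sketched with Python's standard XML parser. This assumes the page has already been normalized into well-formed XML with lowercase tag names, and it reads the “href” value from the anchor node, which is where HTML places it. The function name and sample page are hypothetical.

```python
import xml.etree.ElementTree as ET


def find_ads(document, marker="adserver"):
    """Return anchors that contain an image and whose href contains marker."""
    root = ET.fromstring(document)
    ads = []
    for anchor in root.iter("a"):
        has_image = anchor.find(".//img") is not None  # XML structure test
        href = anchor.get("href", "")
        if has_image and marker in href:               # XML feature test
            ads.append(anchor)
    return ads


# Hypothetical normalized page with one advertisement and two non-advertisements.
page = """<html><body>
<a href="http://adserver.example.com/click?id=1"><img src="banner.gif"/></a>
<a href="/news"><img src="logo.gif"/></a>
<a href="http://adserver.example.com/text">text only</a>
</body></html>"""
ads = find_ads(page)
```

  • Only the first anchor satisfies both tests: the second lacks the “adserver” feature and the third lacks the image structure.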
  • the media editor 264 populates the advertisement extraction rule definition 526 area in the database 200 .
  • the advertisement extractor 326 of the advertisement sampling system 220 accesses the data in the advertisement extraction rule definition 526 area to determine which portions of each probed page represent an advertisement.
  • the advertising support data 530 partition of database 200 comprises database tables that are logically segmented into advertisable taxonomy 532 , advertising information 534 , rate card 536 , and extended advertisable information 538 areas.
  • the advertisable taxonomy 532 area contains a hierarchical taxonomy of advertisables, attributes that describe what the advertisement is advertising. This taxonomy includes industries, companies, products, Web sites, Web sub-sites, messages, etc. Each node in the hierarchy has a type that specifies what kind of entity it represents and a parent node. For example, the hierarchy may specify that products live within companies, which in turn live within industries.
  • the media editor 264 populates the advertisable taxonomy 532 area in the database 200 .
  • the user interface 240 accesses the data in the advertisable taxonomy 532 area to generate statistical data and record where companies, industries, etc. tend to advertise.
  • the Web front end 250 also accesses the data in the advertisable taxonomy 532 area to display this information to the client 140 .
  • the advertising information 534 area contains the data that describe what each unique advertisement recorded by the system advertises. The tables in this area associate advertisables with advertisements. For example, the system may associate a company type of advertisable with a specific advertisement to indicate that the advertisement is advertising the company. The system uses the following methods to associate an advertisable with an advertisement:
  • a “direct classification” assigns an advertisable directly to the advertisement. For example, a media editor 264 creates a direct classification by specifying that a particular advertisement advertises the “Honda” advertisable.
  • a “location classification” assigns an advertisable to a location prefix that the system uses to match the destination of the advertisement. For example, a media editor 264 creates a location classification by specifying that the location “com.honda” indicates an advertisement for Honda. An advertisement that points to “com.honda.www/cars” is therefore associated with Honda.
  • An “ancestral classification” assigns an ancestor of the advertisable to an advertisement. For example, if a direct classification assigns Honda to an advertisement, the “automotive” industry advertisable is an ancestor of Honda. Ancestral classification uses this relationship to associate automotive with the advertisement.
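  • Ancestral classification amounts to walking up the taxonomy from an assigned advertisable. The sketch below assumes the taxonomy is stored as a simple child-to-parent mapping; the “Civic” product entry is a hypothetical addition to the Honda example.

```python
# Hypothetical fragment of the advertisable taxonomy: child -> parent.
PARENT = {"Civic": "Honda", "Honda": "Automotive", "Automotive": None}


def ancestors(advertisable, parent=PARENT):
    """Return every ancestor of an advertisable, nearest first."""
    chain = []
    node = parent.get(advertisable)
    while node is not None:
        chain.append(node)
        node = parent.get(node)
    return chain


honda_ancestors = ancestors("Honda")
civic_ancestors = ancestors("Civic")
```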
  • the media editor 264 populates the advertising information 534 area in the database 200 .
  • the user interface 240 accesses the data in the advertising information 534 area to generate statistical data.
  • the rate card 536 area contains data describing the cost of advertisements on a Web site. These costs include monetary values for each specific shape, size, or length of run that advertisers on the Internet 100 use to determine the cost of advertisement purchases.
  • the system stores rate card data for each Web site that the system probes.
  • the media editor 264 populates the rate card 536 area in the database 200 .
  • the user interface 240 accesses the data in the rate card 536 area to generate statistical data.
  • the extended advertisable information 538 area contains additional information about specific types of advertisables not readily captured in the taxonomy hierarchy. Specifically, this includes additional information related to Web sites and companies, such as company contact information, Web site, and media kit URLs. This information extends the usefulness of the system by providing additional information to the client 140 about probed entities. For example, a client 140 may follow a hyperlink to company contact information directly from a system report. The media editor 264 populates the extended advertisable information 538 area in the database 200 . The Web front end 250 accesses the data in the extended advertisable information 538 area to deliver value-added information to a client 140 .
  • the advertising summary 540 partition of database 200 comprises database tables that are logically segmented into advertising statistics 542 , data integrity 544 , and advertising information summary 546 areas.
  • the advertising statistics 542 area describes how often an advertisement appears on each Web site.
  • the system calculates and stores the following statistics in this area.
  • the system determines this statistic by measuring traffic levels for the Web site using the site definition and traffic data, and multiplying that measurement by the proportion of page views calculated above.
  • the system determines this statistic by applying the rate card information to the number of impressions that the advertisement receives calculated above.
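  • These two calculations reduce to a pair of multiplications. The sketch assumes a CPM (cost per thousand impressions) rate card, which is only one of the fee structures the rate cards can record, and all of the numbers are hypothetical.

```python
def impressions(site_page_views, frequency):
    """Impressions: site traffic multiplied by the advertisement's frequency."""
    return site_page_views * frequency


def spending_cpm(impression_count, cpm_rate):
    """Spending under a CPM rate card: cost per thousand impressions."""
    return impression_count / 1000 * cpm_rate


# Hypothetical site with 2,000,000 page views; the advertisement appears on 25% of them.
estimated_impressions = impressions(2_000_000, 0.25)
estimated_spending = spending_cpm(estimated_impressions, 30.0)
```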
  • the statistical summarization system 230 populates the advertising statistics 542 area in the database 200.
  • the Web front end 250 accesses the data in the advertising statistics 542 area to report spending, impressions, and advertising rotation to the client 140 .
  • the data integrity 544 area contains in-depth information about statistical outliers and other potential anomalies resulting from trend and time slice analyses. This automated monitoring and analysis guarantees that the system will contain accurate analysis data. In addition, the system uses real world advertising information, as an input to the system, to verify the accuracy of the analysis data.
  • the data integrity analysis, performed by the statistical summarization system 230, populates the data integrity 544 area in the database 200.
  • the operator 262 accesses the data integrity 544 area to detect potential errors and monitor general system health.
  • the advertising information summary 546 area summarizes advertising information in a format that is compact and easy to distribute.
  • the system extracts the data in this area from the advertising support data 530 partition. While the data is not as descriptive as the data in the advertising support data 530 partition, it provides the ability to quickly perform a precise query.
  • the advertising support data 530 partition associates each advertisement with a company, product, or industry. If the system associates multiple advertisables of the same type with an advertisement, a single advertisable is chosen to summarize those associations using an assignment priority system, as follows:
  • Advertisables associated with an advertisement using direct classification receive the highest possible priority, “M”.
  • Advertisables associated with an advertisement using location classification receive a priority equal to the string length of the location prefix to which they are assigned; a long location prefix string therefore receives a higher priority than a short location prefix string.
  • Advertisables associated with an advertisement using ancestral classification receive the priority of the assigned ancestor.
  • the advertisement receives the highest priority advertisable in each type.
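  • The priority rules above can be sketched as a small selection function. Representing the highest priority “M” as infinity is an assumption, as are the function names and the example advertisables and prefixes.

```python
import math

DIRECT = math.inf  # assumption: stands in for the highest possible priority, "M"


def location_priority(prefix):
    """Location classification: priority equals the length of the location prefix."""
    return len(prefix)


def summarize(candidates):
    """Keep the highest-priority advertisable of each type.

    candidates is a list of (advertisable, type, priority) tuples.
    """
    best = {}
    for name, kind, priority in candidates:
        if kind not in best or priority > best[kind][1]:
            best[kind] = (name, priority)
    return {kind: name for kind, (name, _) in best.items()}


# Hypothetical candidates for a single advertisement.
candidates = [
    ("Honda", "company", location_priority("com.honda")),
    ("Honda Financial", "company", location_priority("com.honda.www/finance")),
    ("Civic", "product", DIRECT),
    ("Accord", "product", location_priority("com.honda.www/cars")),
    # Ancestral classification: inherits the priority of the assigned advertisable.
    ("Automotive", "industry", location_priority("com.honda")),
]
summary = summarize(candidates)
```

  • The longer location prefix wins among the company candidates, and the directly classified product outranks the location-classified one.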
  • the statistical summarization system 230 populates the advertising information summary 546 area in the database 200 .
  • the Web front end 250 accesses the advertising information summary 546 area to generate reports for the client 140 .
  • the table structure comprises three environments, the core schema, analysis schema, and front end.
  • the core schema describes the back-end environment, which allows the Cloudprober to direct live autonomous processes that continuously scour the Web noting advertising activity, and allows operators and media editors for the present invention to direct, monitor, and augment the information provided by the Cloudprober.
  • the analysis schema is the back-end environment that allows the advertisement sampling system, also known as OMNIAC, to apply rigorous data analysis procedures to information gathered from the Web.
  • the front end schema assists a client of the present invention with accessing data, building database query strings, and generating reports.
  • the database objects comprising the “core schema” are most frequently used by various components of the OMNIAC system. Code bases that rely on this schema include implementation of the back end processes that pull advertisements from the Web. Additionally, database schemas utilized by other components associated with OMNIAC are composed of some or all of the tables in the core schema.
  • the core schema is conceptually composed of four sub-schemas including advertising, advertisements, probing, and sites.
  • the advertising sub-schema holds information about “advertisable” entities along with which entities each advertisement is advertising.
  • the advertisements sub-schema describes the advertisements that the system has located and analyzed.
  • the probing sub-schema defines “when”, “where”, and “how” for the probing process.
  • the sites sub-schema describes Web sites, including structural site definitions and rate card information.
  • ADVERTISABLE which defines advertisables.
  • Many of the conceptual entities in OMNIAC's universe are advertisables: industries, companies, products, services and Web sites are all defined here.
  • the type field, referencing the ADVERTISABLE_TYPE table, differentiates between different types of advertisables, and the parent field organizes records hierarchically, establishing such relationships as industry-contains-company and company-produces-product.
  • ADVERTISABLE_GROUP_MEMBER is used to further group advertisables. Examples of groups defined in this way include automotive classes, travel industry segments, and types of computer hardware.
  • ADVERTISES is used to associate advertisables directly with advertisements.
  • LOCATION_ADVERTISES, CLASSIFIED_LOCATION and LOCATION_MATCHES also indirectly associate advertisables with advertisements via the advertisement's destination location.
  • Advertisements are references to records in AD, the primary table in the Advertisements sub-schema.
  • the Advertisements sub-schema serves to define each advertisement in OMNIAC's universe. Every unique advertisement has a record in AD, along with one or more records in AD_DEFINITION.
  • Advertisement definitions are unique XML fragments OMNIAC has retrieved from the Web.
  • Ads are unique advertisements defined by sets of advertisement definitions determined to be equivalent during automated classification.
  • Advertisements contain advertisement attributes, referenced by AD and AD_DEFINITION.
  • AD_TECHNOLOGY describes known Web technologies used to render advertisements, while TEXT describes textual content for certain advertisements.
  • FUZZY_WEB_LOCATION contains fuzzy locations found in advertisements.
  • A fuzzy location is a URL that needs to be processed by the system, such as an anchor or image. Once OMNIAC has loaded a fuzzy location, a reference is made to MIME_CONTENT if the URL references an image, or DEST_WEB_LOCATION if the URL references another HTML page.
  • a target set is a conceptual construct that instructs OMNIAC to fetch a set of pages at certain intervals, extracting advertisements using a set of rules called extraction rules.
  • Each target set is defined by a row in TARGET_SET.
  • the frequencies, locations, and extraction rules that make up each target set are defined in STROBE, AD_WEB_LOCATION, and EXTRACTION_RULE, respectively.
  • the many-to-many relationships between rows in these tables are defined in TS_RUNS_AT, TS_PROBES, and TS_APPLIES.
  • the fourth and final sub-schema is Sites, which simply records information about Web sites.
  • Each site or subsite defined in the advertisable hierarchy has a corresponding record in SITE_INFO, along with a number of rows in SITE_DOMAIN and SITE_MONTHLY_DATA.
  • SITE_DOMAIN describes the physical structure of a site in terms of inclusive and exclusive URL stems.
  • SITE_MONTHLY_DATA records advertising rate cards, third party traffic estimates, and cache statistics for each site on a monthly basis.
  • the analysis schema is an extension to the core schema that includes a number of additional tables populated by OMNIAC's analysis module.
  • the analysis module is the unit in charge of processing information held in the core schema, producing a trim dataset that accurately describes advertising activity.
  • the analysis schema is composed of four conceptual sub-schemas composed of tables implementing common functionality. These sub-schemas include advertising decomposition, advertisement view summarization, slot statistics, and site statistics.
  • the advertising decomposition sub-schema holds information about each advertisement in the system, including attributes and what the advertisement is advertising.
  • the advertisement view summarization sub-schema summarizes advertisement views, recording how many times each advertisement was seen in each slot over the course of a day.
  • the slot statistics sub-schema describes advertisement rotation for each slot during each time period.
  • the site statistics sub-schema describes site information, including advertisement rotation for each time period.
  • AD_INFO contains de-normalized records describing advertisement attributes.
  • AD_INFO records are keyed off of ID's in the AD table; an AD_INFO record exists for each AD record that has been completely classified and represents a valid advertisement.
  • AD_INFO is populated by the analysis module from the advertising relationships described in the core schema tables ADVERTISES and LOCATION_ADVERTISES.
  • Fields in AD_INFO that specify what is advertised by an advertisement are: CATEGORY (industry), ORGANIZATION (company), ORGANIZATION_GROUP (industry segment), ORGANIZATION_OVERGROUP, COMMODITY (product/service), COMMODITY_GROUP (product/service segment), COMMODITY_OVERGROUP, and MESSAGE.
  • AD_INFO also includes fields describing a number of non-advertising attributes.
  • FORMAT, referencing AD_SLOT_TYPE.ID, specifies the form factor of an advertisement.
  • TECHNOLOGY, referencing AD_TECHNOLOGY2.ID, specifies the technology used to implement the advertisement.
  • DEFINITION, IMAGE, and DESTINATION specify the AD_DEFINITION, IMAGE, and DEST_WEB_LOCATION records associated with the advertisement.
  • the Advertising Decomposition schema contains a few tables in addition to AD_INFO.
  • ADV_IMPLICATION is a cache of advertisable implications derived from the hierarchy in ADVERTISABLE. This is used to speed operation of the analysis module.
  • AD_INFO_FLATTENED is a more readily queried version of AD_INFO containing advertisement/advertisable pairs for each of the fields in AD_INFO that reference ADVERTISABLE.
  • AD_TECHNOLOGY2 describes advertisement technologies understood by the analysis module that are presentable to the user in the front end.
  • the Advertisement View Summarization sub-schema covers the single table PLACEMENT_SUMMARY.
  • PLACEMENT_SUMMARY is keyed off of day, advertisement, and slot, and contains, in the CNT field, the number of times an advertisement was seen in a slot on a particular day.
  • the analysis module populates PLACEMENT_SUMMARY by aggregating hits recorded in the APD n tables, one of which exists for each day, n being the ID of the day in question. These tables are created and populated by the back-end as advertisement hits flow into the system.
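  • The aggregation into PLACEMENT_SUMMARY can be illustrated with a minimal sketch. It assumes each recorded hit is reduced to a (day, advertisement, slot) tuple; the function name, the row layout, and the sample identifiers are hypothetical and are not taken from the schema itself.

```python
from collections import Counter


def summarize_placements(hits):
    """Aggregate raw hits into PLACEMENT_SUMMARY-style rows.

    `hits` is an iterable of (day_id, ad_id, slot) tuples, a hypothetical
    stand-in for the rows held in the per-day APD tables.
    """
    counts = Counter(hits)  # one entry per (day, advertisement, slot) key
    return [
        {"DAY": day, "AD_ID": ad, "SLOT": slot, "CNT": n}
        for (day, ad, slot), n in sorted(counts.items())
    ]


# Illustrative hits: the same advertisement seen twice in one slot on day 1.
hits = [
    (1, "ad-7", "slot-A"),
    (1, "ad-7", "slot-A"),
    (1, "ad-9", "slot-A"),
    (2, "ad-7", "slot-A"),
]
rows = summarize_placements(hits)
```

  • Each resulting row corresponds to one key of day, advertisement, and slot, with CNT holding the number of hits observed for that key.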
  • the third sub-schema in the Analysis schema is Slot Statistics. This sub-schema describes advertisement behavior in the context of slots in addition to information about the slots themselves.
  • a slot is a location on the Web in which advertisements rotate, currently defined in terms of the location ID (a reference to AD_WEB_LOCATION.ID) and extraction rule ID (a reference to EXTRACTION_RULE.ID).
  • the primary table in the Slot Statistics is SLOT_AD_VIEWS, which records the total views and relative frequency for each advertisement in each slot during each time period.
  • the primary key of this table is composed of the fields PERIOD_TYPE, PERIOD, LOCATION_ID, RULE_ID and AD_ID. Two fields exist outside of the primary key: CNT holds the total number of advertisement views, and FREQUENCY holds the relative frequency.
  • SLOT_SUMMARY records general slot information outside the context of individual advertisements. Accordingly, this table is keyed off the PERIOD_TYPE, PERIOD, LOCATION_ID and RULE_ID fields.
  • the CNT field records total advertisement views in the slot; SLOT_AD_VIEWS.CNT is divided by this value to determine relative frequency.
  • SLOT_SUMMARY also includes SLOT_TYPE, which specifies the type of advertisement seen most frequently in the slot, and SITE_ID, which specifies which site the slot resides within.
  • SLOT_TYPE_COUNT is used to determine which value to use in SLOT_SUMMARY.SLOT_TYPE. The number of times each advertisement format was seen is recorded, and the slot type that receives the most views is stored in SLOT_SUMMARY.SLOT_TYPE.
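  • The relationship between SLOT_AD_VIEWS, SLOT_SUMMARY, and SLOT_TYPE_COUNT can be sketched as follows for a single time period. The dictionary layouts, the function name, and the sample formats are assumptions made for illustration; only the field names CNT, FREQUENCY, and SLOT_TYPE come from the description above. The sketch assumes every slot has at least one view.

```python
from collections import defaultdict


def slot_statistics(views, ad_format):
    """Derive per-slot statistics for one time period.

    views:     {(location_id, rule_id, ad_id): view count}
    ad_format: {ad_id: advertisement format name}
    Returns (slot_ad_views, slot_summary).
    """
    slot_total = defaultdict(int)                        # SLOT_SUMMARY.CNT
    type_count = defaultdict(lambda: defaultdict(int))   # SLOT_TYPE_COUNT
    for (loc, rule, ad), cnt in views.items():
        slot_total[(loc, rule)] += cnt
        type_count[(loc, rule)][ad_format[ad]] += cnt

    # Relative frequency is the advertisement's views divided by the slot total.
    slot_ad_views = {
        key: {"CNT": cnt, "FREQUENCY": cnt / slot_total[key[:2]]}
        for key, cnt in views.items()
    }
    # The slot type is the format that received the most views in the slot.
    slot_summary = {
        slot: {
            "CNT": total,
            "SLOT_TYPE": max(type_count[slot], key=type_count[slot].get),
        }
        for slot, total in slot_total.items()
    }
    return slot_ad_views, slot_summary


views = {(10, 1, "ad-a"): 30, (10, 1, "ad-b"): 10}
formats = {"ad-a": "banner", "ad-b": "button"}
ad_views, summary = slot_statistics(views, formats)
```

  • In this example the slot (10, 1) has 40 total views, so the two advertisements receive relative frequencies of 0.75 and 0.25, and the slot type is the banner format.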
  • FIG. 6 is a functional block diagram of the advertising prevalence system 130 .
  • Memory 610 of the advertising prevalence system 130 stores the software components, in accordance with the present invention, that analyze traffic data on the Internet 100 , sample the advertising data from that traffic data, and generate summarization data that characterizes the advertising data.
  • the system bus 612 connects the memory 610 of the advertising prevalence system 130 to the transmission control protocol/internet protocol (“TCP/IP”) network adapter 614 , database 200 , and central processor 616 .
  • TCP/IP network adapter 614 is the mechanism that facilitates the passage of network traffic between the advertising prevalence system 130 and the Internet 100 .
  • the central processor 616 executes the programmed instructions stored in the memory 610 .
  • FIG. 6 shows the functional modules of the advertising prevalence system 130 arranged as an object model.
  • the object model groups the object-oriented software programs into components that perform the major functions and applications in the advertising prevalence system 130 .
  • a suitable implementation of the object-oriented software program components of FIG. 6 may use the Enterprise JavaBeans specification.
  • the book by Paul J. Perrone et al., entitled “Building Java Enterprise Systems with J2EE” (Sams Publishing, June 2000) provides a description of a Java enterprise application developed using the Enterprise JavaBeans specification.
  • the book by Matthew Reynolds, entitled “Beginning E-Commerce” (Wrox Press Inc., 2000) provides a description of the use of an object model in the design of a Web server for an Electronic Commerce application.
  • the object model for the memory 610 of the advertising prevalence system 130 employs a three-tier architecture that includes the presentation tier 620 , infrastructure objects partition 630 , and business logic tier 640 .
  • the object model further divides the business logic tier 640 into two partitions, the application service objects partition 650 and data objects partition 660 .
  • the presentation tier 620 retains the programs that manage the graphical user interface to the advertising prevalence system 130 for the client 140 , account manager 260 , operator 262 , and media editor 264 .
  • the presentation tier 620 includes the TCP/IP interface 622 , the Web front end 624 , and the user interface 626 .
  • a suitable implementation of the presentation tier 620 may use Java servlets to interact with the client 140 , account manager 260 , operator 262 , and media editor 264 of the present invention via the hypertext transfer protocol (“HTTP”).
  • the Java servlets run within a request/response server that handles request messages from the client 140 , account manager 260 , operator 262 , and media editor 264 and returns response messages to the client 140 , account manager 260 , operator 262 , and media editor 264 .
  • a Java servlet is a Java program that runs within a Web server environment. A Java servlet takes a request as input, parses the data, performs logic operations, and issues a response back to the client 140 , account manager 260 , operator 262 , and media editor 264 .
  • the Java runtime platform pools the Java servlets to simultaneously service many requests.
  • a TCP/IP interface 622 that uses Java servlets functions as a Web server that communicates with the client 140 , account manager 260 , operator 262 , and media editor 264 using the HTTP protocol.
  • the TCP/IP interface 622 accepts HTTP requests from the client 140 , account manager 260 , operator 262 , and media editor 264 and passes the information in the request to the visit object 642 in the business logic tier 640 .
  • Visit object 642 passes result information returned from the business logic tier 640 to the TCP/IP interface 622 .
  • the TCP/IP interface 622 sends these results back to the client 140 , account manager 260 , operator 262 , and media editor 264 in an HTTP response.
  • the TCP/IP interface 622 uses the TCP/IP network adapter 614 to exchange data via the Internet 100 .
  • the infrastructure objects partition 630 retains the programs that perform administrative and system functions on behalf of the business logic tier 640 .
  • the infrastructure objects partition 630 includes the operating system 636 , and an object oriented software program component for the database management system (“DBMS”) interface 632 , system administrator interface 634 , and Java runtime platform 638 .
  • the business logic tier 640 retains the programs that perform the substance of the present invention.
  • the business logic tier 640 in FIG. 6 includes multiple instances of the visit object 642 .
  • a separate instance of the visit object 642 exists for each client session initiated by either the Web front end 624 or user interface 626 via the TCP/IP interface 622 .
  • Each visit object 642 is a stateful session bean that includes a persistent storage area from initiation through termination of the client session, not just during a single interaction or method call.
  • the persistent storage area retains information associated with either the URL 114 , 116 , 118 or the client 140 , account manager 260 , operator 262 , and media editor 264 .
  • the persistent storage area retains data exchanged between the advertising prevalence system 130 and the traffic sampling system 120 via the TCP/IP interface 622 such as the query result sets from a database 200 query.
  • When the traffic sampling system 120 finishes collecting information about a URL 114 , 116 , 118 , it sends a message to the TCP/IP interface 622 that invokes a method to create a visit object 642 and stores information about the connection in the visit object 642 state. Visit object 642 , in turn, invokes a method in the traffic analysis application 652 to process the information retrieved by the traffic sampling system 120 .
  • the traffic analysis application 652 stores the processed data from the anonymity system 310 and probe mapping system 320 in the traffic analysis data 662 state and the database 200 .
  • FIGS. 7A and 7B describe, in greater detail, the process that the traffic analysis application 652 follows for each URL 114 , 116 , 118 obtained from the traffic sampling system 120 . Even though FIG. 6 depicts the central processor 616 as controlling the traffic analysis application 652 , it is to be understood that the function performed by the traffic analysis application 652 can be distributed to a separate system configured similarly to the advertising prevalence system 130 .
  • the visit object 642 invokes a method in the advertising sampling application 654 to retrieve the URL 114 , 116 , 118 from the Web site 110 .
  • the advertising sampling application 654 processes the retrieved Web page by extracting embedded advertisements and classifying those advertisements.
  • the advertising sampling application 654 stores the data retrieved by the Web page retrieval system 322 and processed by the Web browser emulation environment 324 , advertisement extractor 326 , and the structural classifier 328 in the advertising sampling data 664 state and the database 200 .
  • Even though FIG. 6 depicts the central processor 616 as controlling the advertising sampling application 654 , a person skilled in the art will realize that the processing performed by the advertising sampling application 654 can be distributed to a separate system configured similarly to the advertising prevalence system 130 .
  • the visit object 642 invokes a method in the statistical summarization application 656 to compute summary statistics for the data.
  • the statistical summarization application 656 computes the advertising impression, spending, and valuation statistics for each advertisement embedded in URL 114 , 116 , 118 .
  • the statistical summarization application 656 stores the statistical data in the statistical summarization data 666 state and the database 200 .
  • Even though FIG. 6 depicts the central processor 616 as controlling the statistical summarization application 656 , a person skilled in the art will realize that the function performed by the statistical summarization application 656 can be distributed to a separate system configured similarly to the advertising prevalence system 130 .
  • FIG. 7A is a flow diagram of a process in the advertising prevalence system 130 that measures the value of online advertisements by tracking and comparing online advertising activity across all major industries, channels, advertising formats, and types.
  • Process 700 begins, at step 710 , by sampling traffic data from the Internet 100 .
  • FIG. 7B describes step 710 in greater detail.
  • Step 720 uses the sampled traffic data from step 710 to perform site selection, and define and refine site definitions for the advertising prevalence system 130 .
  • Step 730 uses the result of the site selection and definition process to generate a probe map based on the sampled traffic data.
  • FIG. 7C describes step 730 in greater detail.
  • Step 740 uses the probe map from step 730 to visit the Internet 100 to gather sample data from the probe sites identified in step 730 .
  • FIG. 7D describes step 740 in greater detail.
  • step 750 extracts the advertisements from the URL.
  • step 760 classifies each advertisement.
  • step 770 calculates the statistics for each advertisement.
  • FIGS. 7E and 7F describe, respectively, steps 760 and 770 in greater detail.
  • process 700 performs data integrity checks in step 780 to verify the integrity of the data and analysis results in the system.
  • FIG. 7B is a flow diagram that describes, in greater detail, the process of sampling traffic data from FIG. 7A , step 710 .
  • Process 710 begins in step 711 by gathering data from a Web traffic monitor such as the traffic sampling system 120 .
  • Process 710 strips the user information from the data retrieved by the Web traffic monitor in step 712 to cleanse the data and guarantee the anonymity of the sample.
  • step 713 measures the number of Web page views observed in the traffic data.
  • Step 714 completes process 710 by statistically extrapolating the measured number of Web page views in the sample to the whole universe of the Internet 100 .
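  • Steps 712 through 714 can be sketched as follows. The record layout, the function name, and the single scale factor are assumptions made for illustration; the text does not specify the extrapolation method, so this sketch simply multiplies the sampled counts by a known factor.

```python
from collections import Counter


def sample_traffic(records, scale_factor):
    """Cleanse, count, and extrapolate sampled traffic records.

    records:      iterable of dicts with hypothetical keys "user" and "url"
    scale_factor: assumed ratio of total Internet traffic to sampled traffic
    """
    # Step 712: strip user information so the sample is anonymous.
    anonymous = [{"url": r["url"]} for r in records]
    # Step 713: measure the number of page views observed per URL.
    page_views = Counter(r["url"] for r in anonymous)
    # Step 714: extrapolate the sample to the whole population (assumed linear).
    return {url: n * scale_factor for url, n in page_views.items()}


records = [
    {"user": "u1", "url": "http://example.com/"},
    {"user": "u2", "url": "http://example.com/"},
    {"user": "u1", "url": "http://example.com/news"},
]
estimated = sample_traffic(records, scale_factor=1000)
```

  • With a scale factor of 1000, two sampled views of the first URL extrapolate to an estimated 2000 page views.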
  • FIG. 7C is a flow diagram that describes, in greater detail, the process of generating a probe map based on sampled traffic data from FIG. 7A , step 730 .
  • Process 730 begins in step 731 by analyzing a subset of the sample traffic data that falls within eligible site definitions. Following the analysis in step 731 , step 732 builds an initial probe map based on the sample traffic data.
  • Step 733 analyzes the historic advertisement measurement results in the database 200 for the URLs in the initial probe map.
  • Step 734 uses these historic results, as well as system parameters, to optimize the sampling plan.
  • Step 735 completes process 730 by monitoring the sample results and adjusting the system as necessary.
  • FIG. 7D is a flow diagram that describes, in greater detail, the process of probing the Internet 100 to gather sample data from FIG. 7A , step 740 .
  • Process 740 begins in step 741 by fetching a Web page from the Internet 100 .
  • the Web page from step 741 is passed to a Web browser emulation environment in step 742 to simulate the display of that Web page in a browser.
  • This simulation allows the advertising prevalence system 130 to detect advertisements embedded in the Web page. These advertisements may be embedded in JavaScript code, Java applet or servlet code, or common gateway interface code such as a Perl script.
  • the simulation in step 742 allows the advertising prevalence system 130 to detect dynamic and interactive advertisements in the Web page.
  • step 743 extracts the advertisement data from the Web page and step 744 stores the advertisement data in the database 200 .
  • Step 745 determines whether process 740 needs to fetch another Web page to gather more sample data.
  • process 740 continuously samples Web pages from the Internet 100 .
  • a person skilled in the art realizes that the functionality performed by step 745 can be associated with a scheduling system that will schedule the probing of the Internet 100 to gather the sample advertising data.
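  • The probing loop of steps 741 through 745 can be sketched as follows. The fetch, render, extract, and store callables are hypothetical placeholders supplied by the caller; in particular, the render stub below returns the page unchanged and does not emulate a browser, and the regular expression is only a toy stand-in for advertisement extraction.

```python
import re


def probe(urls, fetch, render, extract, store):
    """Run the fetch, emulate, extract, and store steps for each scheduled URL."""
    for url in urls:                   # step 745: continue while pages remain
        page = fetch(url)              # step 741: fetch the Web page
        rendered = render(page)        # step 742: browser emulation (stubbed)
        for ad in extract(rendered):   # step 743: extract advertisement data
            store(url, ad)             # step 744: store it in the database


# Toy stand-ins so the sketch runs without network access.
pages = {"http://example.com/": '<a href="/go"><img src="/ads/banner1.gif"></a>'}
stored = []

probe(
    urls=list(pages),
    fetch=pages.__getitem__,
    render=lambda page: page,
    extract=lambda html: re.findall(r'src="([^"]+)"', html),
    store=lambda url, ad: stored.append((url, ad)),
)
```

  • Passing the steps in as callables mirrors the observation above that scheduling can be handled by a separate system: the list of URLs could equally be produced by a scheduler.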
  • FIG. 7E is a flow diagram that describes, in greater detail, the process of classifying the advertising data from FIG. 7A , step 760 .
  • Process 760 begins the analysis of advertisement fragments in step 761 by determining whether the fragment is a duplicate.
  • step 762 analyzes the internal structure of the fragment.
  • step 763 retrieves the external content of the advertisement from the Web page.
  • Step 764 compares the external content to previously observed advertisements.
  • Step 765 analyzes the result of the comparison in step 764 to determine whether the advertisement is a duplicate.
  • step 766 begins processing the new advertisement by recording the structure of the new advertisement in the database 200 .
  • step 767 then performs automated advertisement classification and stores the classification types in the database 200 .
  • step 768 completes processing of a new advertisement by performing human verification of the advertisement classifications.
  • step 769 updates the advertisement viewing log in the database 200 to indicate the observation of the advertisement.
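  • The duplicate checks and logging of steps 761 through 769 can be sketched as follows. Comparing SHA-256 digests is an assumption made for illustration, since the text does not say how fragments or external content are compared; the AdCatalog class, its fields, and the callables are likewise hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return a stable key for a fragment or for external content."""
    return hashlib.sha256(data).hexdigest()


class AdCatalog:
    def __init__(self):
        self.by_fragment = {}  # fragment digest -> advertisement ID
        self.by_content = {}   # external content digest -> advertisement ID
        self.ads = {}          # advertisement ID -> stored record
        self.view_log = []     # one entry per observation

    def observe(self, fragment: bytes, fetch_content, classify):
        frag_key = digest(fragment)
        ad_id = self.by_fragment.get(frag_key)             # step 761
        if ad_id is None:
            content_key = digest(fetch_content(fragment))  # steps 762-763
            ad_id = self.by_content.get(content_key)       # steps 764-765
            if ad_id is None:                              # a new advertisement
                ad_id = len(self.ads) + 1
                self.ads[ad_id] = {
                    "structure": fragment,                 # step 766
                    "classification": classify(fragment),  # step 767
                    "verified": False,                     # step 768 is manual
                }
                self.by_content[content_key] = ad_id
            self.by_fragment[frag_key] = ad_id
        self.view_log.append(ad_id)                        # step 769
        return ad_id


catalog = AdCatalog()
fetch = lambda fragment: b"banner-image-bytes"  # same creative behind both fragments
label = lambda fragment: "banner"
first = catalog.observe(b'<img src="a.gif?x=1">', fetch, label)
second = catalog.observe(b'<img src="a.gif?x=2">', fetch, label)
third = catalog.observe(b'<img src="a.gif?x=1">', fetch, label)
```

  • Here two differently written fragments resolve to the same external content, so all three observations are logged against a single advertisement record.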
  • FIG. 7F is a flow diagram that describes, in greater detail, the process of calculating advertising statistics from FIG. 7A , step 770 .
  • Process 770 begins the calculation of the advertising statistics in step 771 by summarizing the advertising measurement results.
  • process 770 uses the probe map generated in step 730 to weight the advertising measurement results.
  • the advertising frequency is calculated in step 773 for each Web page request.
  • Step 774 uses the sample traffic data from step 710 and the advertising frequency from step 773 to calculate the advertising impressions for each advertisement.
  • Step 775 completes process 770 by calculating the advertisement spending by combining the advertising impressions from step 774 and the rate card data input by the media editor 264 with the rate card collection 348 module of the user interface 240 .
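  • Steps 773 through 775 can be sketched as follows for a single page. The sketch assumes the page has one advertising slot, that the rate card expresses cost per thousand impressions, and that the probe map weighting of the measurement results has already been applied; the function and parameter names are hypothetical.

```python
def advertising_statistics(ad_views, page_views, cost_per_thousand):
    """Estimate frequency, impressions, and spending per advertisement.

    ad_views:          {ad_id: weighted number of times the ad was observed}
    page_views:        estimated page requests from the sampled traffic data
    cost_per_thousand: assumed rate card price per 1000 impressions
    """
    total_views = sum(ad_views.values())
    stats = {}
    for ad_id, views in ad_views.items():
        frequency = views / total_views                     # step 773
        impressions = page_views * frequency                # step 774
        spending = impressions / 1000 * cost_per_thousand   # step 775
        stats[ad_id] = {
            "frequency": frequency,
            "impressions": impressions,
            "spending": spending,
        }
    return stats


stats = advertising_statistics(
    ad_views={"ad-1": 3, "ad-2": 1},
    page_views=100_000,
    cost_per_thousand=20.0,
)
```

  • In this example the first advertisement appears in three of four observations, which yields a frequency of 0.75, an estimated 75,000 impressions, and an estimated spend of 1,500 at the assumed rate.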

Abstract

The present invention is a system, method, and computer program product for tracking and measuring digital content that is distributed on a computer network such as the Internet. The system collects online advertisement data, analyzes the data, and uses the data to calculate measurements of the prevalence of those advertisements. The system processes raw traffic data by cleansing and summarizing the traffic data prior to storing the processed data in a database. An advertisement sampling system uses site selection and definition criteria and a probe map to retrieve Web pages from the Internet, extract advertisements from those Web pages, classify each advertisement, and store the data in a database. A statistical summarization system accesses the processed raw traffic data and the advertisement data in the database to calculate advertising prevalence statistics including the advertising frequency, impressions, and spending.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • The present application claims priority from, and incorporates by reference, the provisional application for letters patent, No. 60/175,665, filed in the United States Patent and Trademark Office on Jan. 12, 2000, and provisional application for letters patent, No. 60/231,195, filed in the United States Patent and Trademark Office on Sep. 7, 2000.
  • FIELD OF THE INVENTION
  • The present invention relates generally to a system, method, and computer program product for tracking and measuring digital content that is distributed on a computer network such as the Internet. More particularly, the present invention relates to a system, method, and computer program product that collects online advertisement data, analyzes the data, and uses the data to calculate measurements of the prevalence of those advertisements.
  • BACKGROUND OF THE INVENTION
  • The increase in the popularity of the Internet and the World-Wide-Web (“Web”) is due, in part, to the interactive technologies that a Web page can employ. These interactive technologies directly affect the Web as an advertising medium because the technologies introduced new advertising formats such as fixed icon sponsorship advertisements, rotating banners and buttons, and interstitial advertisements (i.e., online advertisements that interrupt the user's work and take over a significant percentage of the screen display). Even though the creation of the advertisement is different, the effect on the viewer is similar to traditional advertising. For example, a banner advertisement or logo icon on a Web page creates an impression of the product for the viewer that is equivalent to a traditional billboard advertisement that promotes a product by presenting the brand name or slogan. Similarly, a sponsor's logo on a Web page creates an impression of the sponsor for the viewer that is equivalent to seeing a sponsor logo on the scoreboard at a college basketball game.
  • The rapid and volatile growth of the Internet over the last several years has created a high demand for quality statistics quantifying its magnitude and rate of expansion. Several traditional measurement methodologies produce useful statistics about the Internet and its users, but the complexity of the Internet has left some of these methodologies unable to answer many important questions.
  • Online advertising is one area where traditional methodologies do not lend themselves well to measurement. Each day, thousands upon thousands of electronic advertisements appear and then disappear from millions of Web pages. The transitory nature of online advertising activity warrants a novel methodology to accurately measure advertising activity.
  • Existing advertisement tracking and measurement systems automate the collection of Web pages, but fail to automate the collection of the online advertisements. Since the content of an online advertisement changes or rotates over time, accurate reconstruction of the frequency of specific advertisements requires continuous sampling of relevant Web pages in the correct proportions. Furthermore, due to the sheer size of the Web, sampling algorithms must be finely tuned to optimize the allocation of resources (i.e., network bandwidth, database storage, processor time, etc.) and simultaneously enable maximum Internet coverage. The existing advertisement tracking and measurement systems fail to meet these needs because they are not optimized for resource allocation and do not continuously sample relevant Web pages in the correct proportion.
  • In view of the deficiencies of the existing systems described above, there is a need for an advertisement tracking and measurement system that uses resources more intelligently, is friendlier to the Web sites that it visits, is scalable, and produces accurate measurements. The invention disclosed herein addresses this need.
  • SUMMARY OF THE INVENTION
  • The present invention is a system, method, and computer program product for tracking and measuring digital content that is distributed on a computer network such as the Internet. The system collects online advertisement data, analyzes the data, and uses the data to calculate measurements of the prevalence of those advertisements.
  • In the preferred embodiment, traffic data from a variety of sources and complementary methodologies fuels the traffic analysis system, an intelligent agent (i.e., software that interacts with, learns from, and adapts to an environment). The traffic analysis system processes raw traffic data by cleansing and summarizing the traffic data prior to storing the processed data in a database. When the statistical summarization system calculates the advertising frequency, impressions, and spending, it relies upon the processed data from the traffic analysis system.
  • The advertisement sampling system, also known as the “prober” or “Cloudprober”, uses a robust methodology that continually seeks out the most significant and influential Web sites to probe (i.e., monitor). Moreover, the site selection and definition performed by the present invention dictates the Web pages that comprise each Web site to ensure that complete, singularly branded entities are reported as such. The advertisement sampling system uses intelligent agent technology to retrieve Web pages at various frequencies to obtain a representative sample. This allows the Cloudprober to accurately assess how frequently each advertisement appears in the traffic data. After the Cloudprober fetches a Web page, the advertisement sampling system extracts the advertisements from the Web page. In the preferred embodiment, the advertisement extractor, also known as the “extractor”, invokes an automatic advertisement detection (“AAD”) process, a heuristic extraction process, to automatically extract all of the advertisements from the Web page.
  • Following extraction of the advertisements from the Web page, the advertisement sampling system invokes a classification engine to analyze the advertisement fragments. The classifier processes each fragment to determine a classification for the fragment and then stores the fragment and classification data in a database. The result of the analyses and processing performed by the advertisement sampling system is a rich catalog of advertising activity that can be easily queried by a client.
  • The present invention uses a Web front end and user interface to access and update the data in the database. The Web front end provides a client, or user, of the present invention with a query interface to the database populated by the traffic analysis, advertisement sampling, and the statistical summarization systems. The user interface is a graphical user interface that includes a separate component for system account management, site administration, taxonomy administration, advertising content classification, and rate card collection. The user interface allows an account manager and operator to maintain and administer the present invention. The user interface also allows a media editor to review the data in the database to verify the accuracy and integrity of the vast amount of data collected by the present invention. This data integrity process routinely investigates unusual or outlying data points to calibrate the system and adapt it to an ever-changing environment.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The accompanying figures best illustrate the details of the present invention, both as to its structure and operation. Like reference numbers and designations in these figures refer to like elements.
  • FIG. 1 is a network diagram depicting the environment for an advertising prevalence system according to the present invention.
  • FIG. 2 depicts the network diagram of FIG. 1, in greater detail, to show the relationships between the network environment and the elements that comprise the advertising prevalence system.
  • FIG. 3 depicts the network diagram of FIG. 2, in greater detail, to show the elements and sub-elements that comprise the advertising prevalence system and the connections to the network environment.
  • FIG. 4A is an exemplary Web site that illustrates the expected values used in the calculation of the advertising prevalence statistics.
  • FIG. 4B is an exemplary Web site that illustrates the observed values used in the calculation of the advertising prevalence statistics.
  • FIG. 4C is an exemplary Web site that illustrates the weighted values used in the calculation of the advertising prevalence statistics.
  • FIG. 4D is an exemplary Web site that illustrates an alternative method for the calculation of the advertising prevalence statistics.
  • FIG. 5 illustrates an example of a database structure that the advertising prevalence system may use.
  • FIG. 6 is a functional block diagram of the advertising prevalence system that shows the configuration of the hardware and software components.
  • FIG. 7A is a flow diagram of a process in the advertising prevalence system that measures the quality of online advertising and the activity generated by an online advertisement.
  • FIG. 7B is a flow diagram that describes, in greater detail, the process of sampling traffic data from FIG. 7A.
  • FIG. 7C is a flow diagram that describes, in greater detail, the process of generating a probe map based on sampled traffic data from FIG. 7A.
  • FIG. 7D is a flow diagram that describes, in greater detail, the process of probing the Internet 100 to gather sample data from FIG. 7A.
  • FIG. 7E is a flow diagram that describes, in greater detail, the process of classifying the advertising data from FIG. 7A.
  • FIG. 7F is a flow diagram that describes, in greater detail, the process of calculating advertising statistics from FIG. 7A.
  • DETAILED DESCRIPTION OF THE INVENTION
  • FIG. 1 depicts the environment for the preferred embodiment of the present invention that includes the Internet 100, and a Web site 110, traffic sampling system 120, advertising prevalence system 130, and client 140. The present invention uses intelligent agent technology to gather data related to the attributes, placement, and prevalence of online advertisements. This data provides a user with up-to-date estimates of advertisement statistics and helps the user to gain a competitive advantage.
  • As shown in FIG. 1, the Internet 100 is a public communication network that allows the traffic sampling system 120 and advertising prevalence system 130 to communicate with a client 140 and a Web site 110. Even though the preferred embodiment uses the Internet 100, the present invention contemplates the use of other public or private network architectures such as an intranet or extranet. An intranet is a private communication network that functions similarly to the Internet 100. An organization, such as a corporation, creates an intranet to provide a secure means for members of the organization to access the resources on the organization's network. An extranet is also a private communication network that functions similarly to the Internet 100. In contrast to an intranet, an extranet provides a secure means for the organization to authorize non-members of the organization to access certain resources on the organization's network. The present invention also contemplates using a network protocol such as Ethernet or Token Ring, as well as proprietary network protocols.
  • The traffic sampling system 120 is a program that monitors and records Web activity on the Internet 100. The traffic sampling system 120 is an intermediary repository of traffic data between a Web surfer (not shown) on the Internet 100 and a Web server 112. The Web server 112 shown in FIG. 1 is a conventional personal computer or computer workstation that includes the proper operating system, hardware, communications protocol (e.g., Transmission Control Protocol/Internet Protocol), and Web server software to host a collection of Web pages. The Web surfer (not shown) communicates with the Web server 112 by requesting a Uniform Resource Locator (“URL”) 114, 116, 118 associated with the Web site 110, typically using a Web browser. Any program or device that can record a request for a URL made by a Web surfer (not shown) to a Web server 112 can perform the functions that the present invention requires of the traffic sampling system 120. The traffic sampling system 120 then aggregates the traffic data for each Web site 110 for use by the advertising prevalence system 130.
  • The present invention can use any commercially available traffic sampling system that provides functionality similar to the Media Metrix audience measurement product. Other possible mechanisms to obtain a traffic data sample include:
  • 1. “Proxy Cache Sampling” gathers data such as user clickstream data and Web page requests from a global distributed hierarchy of proxy cache servers. This data passes through an intermediate mechanism that provides pre-fetch and caching services for Web objects. As of May 1999, traffic statistics calculated by the present invention represent the distillation of raw data from nine first-tier and approximately 400 second-tier caches in the United States, as well as an additional 1100 worldwide.
  • 2. “Client-Side Panel Collection” retrieves sample data from each panelist via a client-side mechanism and transfers that data to a collection repository. The client-side mechanism may monitor the browser location bar, or use browser hooks, a client-side proxy, or TCP/IP stack hooks.
  • 3. A “Transcoder” is a proxy that rewrites HTML, usually for the purpose of adding elements for generation of advertisement revenue or page headers/footers. Free Internet service providers (“ISPs”) typically use this technique.
  • 4. Any content distribution mechanism that replicates Web page or site content in a manner meant to ease network congestion or improve user experience.
  • 5. Any content filtering mechanism that evaluates requests for URLs and takes actions to allow or disallow such requests.
  • 6. Server logs maintained by ISPs or by individual Web sites.
  • FIG. 2 expands the detail of the advertising prevalence system 130 in FIG. 1 to show the relationships between the network environment and the elements that comprise the advertising prevalence system 130. The advertising prevalence system 130 includes a traffic analysis system 210, advertisement sampling system 220, and statistical summarization system 230 that communicate data to the database 200 for storage. The account manager 260, operator 262, and media editor 264 can access the database 200 through the user interface 240 to perform administrative functions. The client 140 can access the database 200 through the Web front end 250.
  • The traffic analysis system 210 receives raw traffic data from the traffic sampling system 120. The traffic analysis system 210 cleanses the raw traffic data by removing information from the traffic data that may identify a particular user on the Internet 100 and then stores the anonymous data in the database 200. The traffic analysis system 210 estimates the global traffic to every significant Web site on the Internet 100. The present invention uses this data not only for computing the number of advertising impressions given an estimate of the frequency of rotation on that page, but also in the probe mapping system 320. In one embodiment, the traffic analysis system 210 receives traffic data from a cache site on the Internet 100. The goal is to accurately measure the number of page views by individual users, and therefore the number of advertising impressions.
  • The advertisement sampling system 220 uses the anonymous traffic data to determine which URLs to include in the sample retrieved from the Web server 112. The advertisement sampling system 220 contacts the Web server 112 through the Internet 100 to retrieve a URL 114, 116, 118 and extract the advertisements therein along with the accompanying characteristics that describe the advertisements. The success rate for retrieval of creatives is high. Analysis indicates that the present invention captures over 95% of creatives served. The advertisement sampling system 220 stores these advertisement characteristics in the database 200. The advertisement sampling system 220, for example, the Online Media Network Intelligent Agent Collection (“OMNIAC”) or the Cloudprober, repeatedly probes prominent Web sites, extracts advertisements from each Web page returned by the probe, and classifies the advertisements in each Web page by type, technology, and advertiser.
  • The traffic analysis system 210 and the advertisement sampling system 220 also present the data retrieved from the Internet 100 to the statistical summarization system 230 for periodic processing. The statistical summarization system 230 calculates the advertising frequency, impressions, and spending on a per-site, per-week basis.
  • The graphical user interface for the present invention includes the user interface 240 and Web front end 250. The account manager 260, operator 262, and media editor 264 access the user interface 240 to administer access by the client 140 to the Web front end 250 (e.g., account and password management), define sites and probe instructions, and manage the advertising taxonomy, content classification, and rate card collection for the advertising prevalence system 130. The Web front end 250 is the Web browser interface that a client 140 uses to retrieve the advertisement measurement results from the database 200 as generated by the traffic analysis system 210, advertisement sampling system 220, and the statistical summarization system 230.
  • FIG. 3 further expands the detail of the advertising prevalence system 130 to depict the logical components comprising the elements of the advertising prevalence system 130 shown in FIG. 2. FIG. 3 also depicts the relationships between the network environment and those logical components.
  • The traffic analysis system 210 includes an anonymity system 310 and traffic summarization process 312.
  • The anonymity system 310 cleanses the data received from the traffic sampling system 120 by removing information that identifies a particular user on the Internet. The data is rendered anonymous by passing all user information (e.g., originating internet protocol (“IP”) number or cookies) through a cryptographically secure one-way hash function; this assures the utmost privacy for Web users without devaluing the resulting data. The anonymity system 310 presents the cleansed data to the traffic summarization process 312, which in turn stores the aggregated URL count information in the database 200.
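The cleansing step described above can be sketched as follows. This is only an illustration under stated assumptions: the specification names no particular hash function, so HMAC-SHA256 is swapped in here, and the record field names (`ip`, `cookie`), the key, and the function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret key (an assumption, not from the specification).
# A keyed hash prevents recovering IP numbers by hashing every candidate.
SECRET_KEY = b"replace-with-a-private-key"


def anonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Return a one-way digest of a user identifier (IP number or cookie)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def cleanse(record: dict) -> dict:
    """Replace user-identifying fields with digests; leave the URL untouched."""
    cleaned = dict(record)
    for field in ("ip", "cookie"):  # hypothetical field names
        if field in cleaned:
            cleaned[field] = anonymize(cleaned[field])
    return cleaned
```

Because the same identifier always maps to the same digest, page views can still be grouped per user without retaining the identifier itself.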
  • The traffic summarization process 312 receives cleansed data from the anonymity system 310. The anonymous traffic data is summarized to yield traffic totals by week or month for individual URLs, domains, and Web sites. The traffic summarization process 312 scales the data by weighting factors to extrapolate total global traffic from the sample.
  • The advertisement sampling system 220 in FIG. 3 includes a probe mapping system 320, Web page retrieval system 322, Web browser emulation environment 324, advertisement extractor 326, and a structural classifier 328.
  • The probe mapping system 320 generates a probe map, i.e., the URLs 114, 116, 118 that the advertisement sampling system 220 will visit. This probe map assists the advertisement sampling system 220 with the measurement of the rotation of advertisements on individual Web sites. The preferred embodiment of the present invention continuously fetches various Web pages in the probe map. In an alternative embodiment, the present invention visits each URL in the probe map approximately every 6 minutes. Another embodiment can vary the fetching rate by considering several factors including the amount of traffic that visits the Web site as a whole and the individual Web page in question, the number of advertisements historically seen on the Web page, and the similarity of the historically observed ad rotation to other sampled pages.
  • The Web page retrieval system 322 uses this probe map generated by the probe mapping system 320 to determine which Web pages it needs to sample and the frequency of the sampling. For each URL in the probe map generated by the probe mapping system 320, the Web page retrieval system 322 fetches a Web page, extracts each advertisement from the Web page, and stores the advertisement's attributes in the database 200. The data retrieved from each URL in the probe map is used to calculate the frequency with which each advertisement is shown on a particular Web site.
  • For each Web page, the Web browser emulation environment 324 simulates the display of the Web page in a browser. This simulation guarantees that the present invention will detect not only static advertisements, but also dynamic advertisements generated by software programs written in a language such as JavaScript, Perl, Java, C, C++, or HTML that can be embedded in a Web page.
  • The advertisement extractor 326 extracts the online advertisements from the result of the simulation performed by the Web browser emulation environment 324. The advertisement extractor 326 identifies features of the advertising content (i.e., “fragments”) extracted from the Web pages returned by the probe mapping system 320 that are of particular interest. Advertisements are the most interesting dynamic feature to extract; however, an alternative embodiment of the present invention may use the extraction technology to collect any type of digital content including promotions, surveys, and news stories. The advertisement extractor 326 can use various advertisement extraction methods, including rule-based extraction, heuristic extraction, and comparison extraction.
  • Rule-based extraction relies upon a media editor 264 to use the user interface 240 to create rules. The user interface 240 stores the rules in the database 200 and the advertisement extractor 326 applies the rules to each Web page that the Web page retrieval system 322 retrieves. The effect of running a rule is to identify and extract an HTML fragment from the Web page (i.e., the part of the page containing the advertisement). The advertisement extractor 326 first converts the HTML representation of the fetched Web page into a well-formed XML representation. Following this conversion, the rules are applied to the parse tree of the XML representation of the Web page.
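A minimal sketch of rule-based extraction, assuming the page has already been converted to well-formed XML and that each rule is expressed as an ElementTree path expression. The rule syntax, the sample page, and the function name `apply_rules` are illustrative assumptions; the specification does not define the rule language.

```python
import xml.etree.ElementTree as ET


def apply_rules(xml_page: str, rules: list[str]) -> list[str]:
    """Apply each path rule to the parse tree and return matching fragments."""
    root = ET.fromstring(xml_page)
    fragments = []
    for rule in rules:
        for node in root.findall(rule):
            fragments.append(ET.tostring(node, encoding="unicode"))
    return fragments


# Hypothetical page and rule, for illustration only.
page = (
    "<html><body>"
    "<div class='content'><p>Story text</p></div>"
    "<div class='ad'><a href='http://ads.example.com/click'>"
    "<img src='http://ads.example.com/banner.gif'/></a></div>"
    "</body></html>"
)
rules = [".//div[@class='ad']"]
fragments = apply_rules(page, rules)
```

Here the single rule selects the `div` that holds the advertisement, so `fragments` contains one XML fragment and excludes the story text.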
  • Heuristic extraction relies upon the similarity of advertisements at the HTML or XML source code level because the advertisements are typically inserted by an advertisement server when the Web page is generated in response to the Web browser emulation environment 324 request to display the Web page. Heuristic extraction analyzes the source code for clues (e.g., references to the names of known advertisement servers) and extracts fragments that surround those clues. The advantage of this method is that the extraction is automatic and the media editor need not create the rules.
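Heuristic extraction might look roughly like the sketch below, which scans the page source for links that mention a known advertisement server and returns the surrounding anchor element. The server names and the choice of the anchor element as the "surrounding fragment" are assumptions made for illustration.

```python
import re

# Hypothetical list of known advertisement server host names.
KNOWN_AD_SERVERS = ("ads.example.com", "adserver.example.net")

# Matches a whole <a ...>...</a> element (non-greedy, across newlines).
ANCHOR = re.compile(r"<a\b[^>]*>.*?</a>", re.IGNORECASE | re.DOTALL)


def heuristic_extract(source: str, servers=KNOWN_AD_SERVERS) -> list[str]:
    """Return anchor fragments whose markup references a known ad server."""
    return [
        match.group(0)
        for match in ANCHOR.finditer(source)
        if any(server in match.group(0).lower() for server in servers)
    ]
```

A regular expression is used only to keep the sketch short; a production extractor would more plausibly walk the parsed XML tree.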
  • Comparison extraction repeatedly fetches the same Web page. This extraction method compares the different versions of the Web page to determine whether the content varies from version to version. The portion of the Web page that varies with some degree of frequency is usually an advertisement and is extracted.
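Comparison extraction can be illustrated with a line-level diff between two fetches of the same page. This sketch assumes the page source is compared line by line with Python's `difflib`; the granularity of comparison and the function name are assumptions, since the specification does not say how versions are compared.

```python
import difflib


def varying_lines(version_a: str, version_b: str) -> list[str]:
    """Return the lines of version_b that differ from version_a."""
    a = version_a.splitlines()
    b = version_b.splitlines()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    changed = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changed.extend(b[j1:j2])
    return changed
```

Lines that change between fetches are candidate advertisement fragments; repeating the comparison over many fetches would help separate rotating advertisements from content that changes only occasionally.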
  • The structural classifier 328 parses each advertisement and stores the structural components in the database 200 and passes those components to the statistical summarization system 230. Each advertisement fragment extracted by the advertisement extractor 326 is analyzed by the structural classifier 328. The process performed by the structural classifier 328 comprises duplicate fragment elimination, structural fragment analysis, and duplicate advertisement detection.
  • The structural classifier 328 performs duplicate fragment elimination by comparing the current advertisement fragment to other fragments in the database 200. Two advertisement fragments are duplicates if the fragments are identical (e.g., each fragment has the exact same HTML content). If the structural classifier 328 determines that the current fragment is a duplicate of a fragment in the database, the advertisement sampling system 220 logs another observation of the fragment and continues processing fragments.
  • The structural classifier 328 performs structural fragment analysis on the XML representation of the Web page by determining the “physical type” of the fragment (i.e., the HTML source code used to construct the advertisement). Physical types that the present invention recognizes include banner, form, single link, and embedded content. Banner advertisement fragments include a single HTML link having one or two enclosed images and no FORM or IFRAME tag. Form advertisement fragments include a single HTML form having no IFRAME tag. Single link advertisement fragments include a link with textual content, but no IMG, FORM, or IFRAME tags. Embedded content advertisement fragments reference an external entity using an IFRAME tag. After performing this analysis, the structural classifier 328 updates the advertisement fragment in the database. For a banner advertisement fragment, the structural classifier 328 stores the link and image URLs in the database 200. A form advertisement fragment requires the creation of a URL by simulating a user submission that sets each HTML control to its default value. The structural classifier 328 stores this URL and the “form signature” (i.e., a string that uniquely describes the content of all controls in the form) in the database 200. For a single link advertisement fragment, the structural classifier 328 stores the URL for the link and all text contained within the link in the database 200. For embedded content advertisement fragments, the structural classifier 328 stores the URL associated with the external reference in the database 200. The system loads this URL and the document that it references. Once the loaded document has been structurally analyzed, the original fragment inherits any attributes that result from analysis of the new fragment.
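The physical-type rules above translate fairly directly into tag counts on the fragment's parse tree. The sketch below assumes each fragment is a well-formed XML string; the function name, the `"unknown"` fallback, and the simplification of counting images anywhere in the fragment (rather than only those enclosed by the link) are assumptions.

```python
import xml.etree.ElementTree as ET


def physical_type(fragment_xml: str) -> str:
    """Classify a fragment as banner, form, single link, or embedded content."""
    root = ET.fromstring(fragment_xml)
    tags = [element.tag.lower() for element in root.iter()]
    count = {name: tags.count(name) for name in ("a", "img", "form", "iframe")}

    if count["iframe"] > 0:
        return "embedded content"      # references an external entity
    if count["form"] == 1:
        return "form"                  # single form, no IFRAME
    if count["a"] == 1 and count["form"] == 0:
        if count["img"] in (1, 2):
            return "banner"            # one link with one or two images
        link = next(el for el in root.iter() if el.tag.lower() == "a")
        if count["img"] == 0 and "".join(link.itertext()).strip():
            return "single link"       # textual link without images
    return "unknown"                   # no recognized physical type
```

Fragments that fall through to `"unknown"` correspond to potential advertisements that fail structural classification.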
  • The structural classifier 328 performs duplicate advertisement detection on each advertisement fragment that has a known physical type because these fragments represent advertisements. Each unique advertisement has information, including which site definitions are associated with the fragment, stored in the database 200. The criteria that the structural classifier 328 uses to determine uniqueness differ for each type of fragment. The first step for every type of definition is to resolve all URLs associated with the record. URLs that refer to images are loaded, and duplicate images are noted. HTML link URLs, also known as “click URLs”, are followed each time a new ad is created. The final destination for a click URL, after following all HTTP redirects, is noted. This is also done for simulated link submission URLs associated with form definitions. Once all URLs have been resolved, the structural classifier 328 determines whether the advertisement is unique. Two banner advertisement fragments are considered the same advertisement if they have the same number of images, if the images are identical, and if the destination URL is identical. Two form advertisement fragments are considered the same advertisement if they have the same signature and the same destination URL. Two single link advertisement fragments are considered the same advertisement if they have the same textual content and the same destination URL.
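One way to apply these per-type criteria is to reduce each resolved fragment to a comparison key, so that two fragments represent the same advertisement exactly when their keys are equal. The key layout, the use of SHA-256 digests to compare image content, and the function names are assumptions for illustration.

```python
import hashlib


def banner_key(images: list[bytes], destination_url: str) -> tuple:
    """Key on the image contents (in order) and the final destination URL."""
    digests = tuple(hashlib.sha256(data).hexdigest() for data in images)
    return ("banner", digests, destination_url)


def form_key(form_signature: str, destination_url: str) -> tuple:
    """Key on the form signature and the simulated-submission destination."""
    return ("form", form_signature, destination_url)


def single_link_key(text: str, destination_url: str) -> tuple:
    """Key on the link text and the final destination URL."""
    return ("single link", text, destination_url)
```

Storing the key alongside each advertisement record would let a new fragment be matched with a single lookup.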
  • The statistical summarization system 230 calculates the advertisement statistics for each unique advertisement in the database 200. The present invention calculates, for each Web site, the advertising impressions (i.e., the number of times a human being views an advertisement). The present invention calculates the advertising impressions, I, using the formula I=T×R, where T is the traffic going to the site, and R is the rotation of advertisements on that site. The present invention also calculates the spending, S, using the formula S=I×RC, where I is the advertising impressions for a Web site, and RC is the rate card for the Web site. Most advertising buys are complicated deals with volume purchasing discounts, so these numbers do not necessarily represent the actual cost of the total buy.
  • The Web front end 250 is a graphical user interface that provides a client 140 with a query interface to the database 200 populated by the traffic analysis system 210, advertisement sampling system 220, and the statistical summarization system 230. The client 140 can use the Web front end 250 to create, store, edit and download graphical and tabular reports for one or more industry categories depending on the level of service the client 140 selects.
  • The user interface 240 in FIG. 3 includes a separate component for system account management 340, site administration 342, taxonomy administration 344, advertising content classification 346, and rate card collection 348.
  • The account manager 260 uses the system account management 340 module of the user interface 240 to simplify the administration of the Web front end 250. The account manager 260 uses the system account management 340 module to create and delete user accounts, manage user account passwords, and check on the overall health of the Web front end 250.
  • The operator 262 uses the site administration 342 module of the user interface 240 to simplify the administration of the site definitions. Analysts from the Internet Advertising Bureau estimate that over 90% of all Web advertising dollars are spent on the top fifty Web sites. Site selection begins by choosing the top 100 advertising sites, considering data from Media Metrix, Nielsen/NetRatings, and the proxy traffic data in the database 200. These lists are periodically updated to demote Web sites with low traffic levels and promote new sites with high traffic levels. The present invention also includes Web sites that provide significant content in key industries. A site chosen for inclusion in the site definitions must have the structure of the site analyzed to remove sections that do not serve advertisements, originate from foreign countries, or are part of a frame set. Sites that originate from a foreign country, such as yahoo.co.jp, sell advertising in the host country, and therefore are not applicable to the measurements calculated by the present invention. Web sites that use an HTML frameset are treated very carefully to only apply rotation rates to the traffic from the sections of the frameset that contain the advertisement. These combined exclusions are key to making accurate estimates of advertising impressions. The present invention also tags sections that cannot be measured directly, due to registration requirements (e.g., mail pages). Since Web sites change frequently, this structural analysis is repeated periodically. Eventually the analysis stage will automatically flag altered sites to allow even more timely updates.
  • The media editor 264 uses the taxonomy administration 344, advertising content classification 346, and rate card collection 348 modules of the user interface 240. The taxonomy administration 344 module simplifies the creation and maintenance of the attributes assigned to advertisements during content classification, including the advertisement's industry, company, and products. The taxonomy names each attribute and specifies its type, ancestry, and segment membership. For example, a company, Honda, might be parented by the Automotive industry and belong to the industry segment Automotive Manufacturers. The advertising content classification 346 component assists the media editor 264 with performing the content classification.
  • The structural classifier 328 performs automated advertisable assignment to determine what the advertisement is advertising. This process includes assigning “advertisables” (i.e., attributes describing each “thing” that the advertisement is advertising) to each advertisement fragment. In another embodiment of the present invention, the advertisement sampling system 220 uses an extensible set of heuristics to assign advertisables to each advertisement. In the preferred embodiment, however, the only automatic method employed is location classification. Location classification relies on the destination URL in order to assign a set of advertisables to an advertisement. A media editor 264 uses the user interface 240 to maintain the set of classified locations. For example, the first time a media editor observes an advertisement in which the click-thru URL is www.honda.com, he can enter this URL as pertaining to the advertiser “Honda Motors”. Any subsequent advertisement that includes the same click-thru URL will also be recognized as a Honda advertisement. A classified location comprises a host, URL path prefix, and set of advertisables. Location classification assigns a classified location's advertisables to an advertisement if the host in the destination URL matches the host of the classified location and the path prefix in the classified location matches the beginning of the path in the destination URL.
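The host and path-prefix matching rule for location classification can be sketched as follows. The in-memory list of classified locations, its tuple layout, and the function name are assumptions; only the Honda example comes from the text.

```python
from urllib.parse import urlsplit

# Hypothetical classified locations: (host, path prefix, advertisables).
CLASSIFIED_LOCATIONS = [
    ("www.honda.com", "/", {"Honda Motors"}),
]


def classify_location(destination_url: str, locations=CLASSIFIED_LOCATIONS) -> set:
    """Return the advertisables of every classified location that matches."""
    parts = urlsplit(destination_url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    assigned = set()
    for location_host, path_prefix, advertisables in locations:
        # Host must match exactly; the prefix must match the start of the path.
        if host == location_host and path.startswith(path_prefix):
            assigned |= advertisables
    return assigned
```

For example, `classify_location("http://www.honda.com/models/civic")` yields the set containing "Honda Motors", while a destination on an unclassified host yields an empty set and is left for the media editor.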
  • The structural classifier 328 performs human advertisable assignment and verification as a quality check of the advertisable data. This phase is the most human intensive. A media editor 264 uses a graphical user interface module in the user interface 240 to display each advertisement, verify the automatic advertisable assignments, and assign any other advertisables that appear appropriate after inspection of the advertisement and the destination of the advertisement. The location classification database is also typically maintained at this time.
  • The media editor 264 uses the rate card collection 348 module to enter the contact and rate card information for a Web site identified by the traffic analysis system 210, as well as for designated advertisers. Rate card entry includes the applicable quarter (e.g., Q4 2000), advertisement dimensions in pixels, fee structure (e.g., CPM, flat fee, or per click), and cost schedule for buys of various quantities and duration. The media editor also records the URL address of the online media kit and whether rates are published therein. Contact information for a Web site or advertiser includes the homepage, name, phone and facsimile numbers, email address, and street address.
  • FIGS. 4A through 4C illustrate the preferred method for calculating the advertising prevalence statistics. The calculation of the advertising prevalence statistics is an iterative process that uses expected values derived by the traffic analysis system 210 and observed values derived by the advertisement sampling system 220 to calculate the weighted values and the advertising prevalence statistics. FIGS. 4A through 4C each depict a network on the Internet 100 that includes two Web sites served by Web server P 410 and Web server Q 420. FIG. 4A illustrates exemplary expected traffic values for the network. FIG. 4B illustrates exemplary observed traffic values for the network. FIG. 4C illustrates exemplary weighted traffic values for the network.
  • The first step in the process is to normalize the results from the traffic analysis system 210. The traffic analysis system 210 provides the traffic received by each Web page in the traffic data sample. FIG. 4A depicts the exemplary traffic received at each Web page 411-416, 421-424 in the Internet 100 with the label “Traffic=”. The probe map generated by the probe mapping system 320 includes an entry for each Web page 411-416, 421-424. The probe map also includes an “area” that each Web page 411-416, 421-424 consumes in the probe map. FIG. 4A depicts the exemplary area that each Web page 411-416, 421-424 consumes in the probe map with the label “Area=”. The normalized results are calculated by dividing the area that a Web page consumes in the probe map by the sum of the area for each Web page in the traffic sample. In FIG. 4A, the normalized value, or chance, for Web page P1 411 is the area for Web page P1 (i.e., 15) divided by the sum of the area for Web pages P1, P2, P3, P4, P5, P6, Q1, Q2, Q3, and Q4 (i.e., 120). The normalized value is, therefore, 0.125, or 12.5%. In addition to the normalized value, the system also determines the scale by dividing the traffic for a Web page by the area for the Web page. In FIG. 4A, the scale for Web page P1 411 is the traffic for Web page P1 (i.e., 150) divided by the area for Web page P1 (i.e., 15); therefore, the scale for Web page P1 is 10. Table 1 summarizes the scale and chance values for each Web page in FIG. 4A.
    TABLE 1
    Web Page Area Scale Chance
    P1 15 10 12.5%
    P2 10 1 8.3%
    P3 14 1 12%
    P4 12 0.25 10%
    P5 8 0.5 6.7%
    P6 4 1 3.3%
    Q1 30 0.5 25%
    Q2 4 0.5 3.3%
    Q3 15 2 12.5%
    Q4 8 0.5 6.7%
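The chance and scale columns of Table 1 can be reproduced with a few lines of arithmetic. Note the assumption: only the traffic for P1 (150) is stated in the text, so the remaining traffic values below are back-computed as scale × area from Table 1 rather than read from FIG. 4A. The dictionary and function names are hypothetical.

```python
# Areas from Table 1.
AREA = {"P1": 15, "P2": 10, "P3": 14, "P4": 12, "P5": 8, "P6": 4,
        "Q1": 30, "Q2": 4, "Q3": 15, "Q4": 8}

# Traffic per page; values other than P1 are inferred as scale x area.
TRAFFIC = {"P1": 150, "P2": 10, "P3": 14, "P4": 3, "P5": 4, "P6": 4,
           "Q1": 15, "Q2": 2, "Q3": 30, "Q4": 4}


def chance_and_scale(area: dict, traffic: dict) -> dict:
    """Chance = area / total area; scale = traffic / area, for each page."""
    total_area = sum(area.values())
    return {
        page: {"chance": area[page] / total_area,
               "scale": traffic[page] / area[page]}
        for page in area
    }


table1 = chance_and_scale(AREA, TRAFFIC)
```

With these inputs the total area is 120, and the inferred per-site traffic totals (185 for site P and 51 for site Q) agree with the Traffic column of Table 2.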
  • FIG. 4B depicts the exemplary Web page fetches at each Web page 411-416, 421-424 in the Internet 100 with the label “Fetches=”. FIG. 4B also depicts the exemplary number of views of advertisement that appear on each Web page 411-416, 421-424 with the label “A1 Views=” to indicate the number of views of advertisement A1, “A2 Views=” to indicate the number of views of advertisement A2, etc.
  • FIG. 4C depicts the exemplary Web page weighted fetches at each Web page 411-416, 421-424 in the Internet 100 with the label “Fetches=”. FIG. 4C also depicts the exemplary number of views of advertisement that appear on each Web page 411-416, 421-424 with the label “A1 Views=” to indicate the number of views of advertisement A1, “A2 Views=” to indicate the number of views of advertisement A2, etc. The next step in the calculation process is to calculate the Scaled Fetches for each Web site 410, 420 by summing the product of the observed fetches from FIG. 4B and the scale from FIG. 4A, for each Web page 411-416, 421-424 in the Web site. Next, the calculation computes the Traffic for each Web site 410, 420 by summing the traffic from FIG. 4A for each Web page 411-416, 421-424 in the Web site. The rate card, or CPM, is a value assigned by the media editor 264 for each Web site 410, 420. Table 2 summarizes the Scaled Fetches, Traffic, and CPM for FIGS. 4A through 4C.
    TABLE 2
    Site Scaled Fetches Traffic CPM
    P 193.5 185 $35.00
    Q 43 51 $50.00
  • The next step in the calculation process is to compute the Scaled Observations for each advertisement on each Web site 410, 420 by summing the product of the advertisement views from FIG. 4B and the scale from FIG. 4A, for each Web page 411-416, 421-424 in the Web site 410, 420. The final step in the calculation is to compute the advertising prevalence statistics (i.e., Frequency, Impressions, and Spending) for each advertisement in each Web site 410, 420. Frequency is computed by dividing the scaled observations by the scaled fetches for each advertisement in each Web site 410, 420. Impressions is computed by multiplying the Frequency by the Traffic from Table 2 above for each advertisement in each Web site 410, 420. Spending is computed by multiplying the Impressions by the CPM from Table 2 above for each advertisement in each Web site 410, 420; because the CPM is a cost per thousand impressions, the dollar figures in Tables 3 and 4 equal Impressions × CPM ÷ 1000. Table 3 summarizes the Scaled Observations, Frequency, Impressions, and Spending for Web site P 410 using the data in FIGS. 4A through 4C. Table 4 summarizes the Scaled Observations, Frequency, Impressions, and Spending for Web site Q 420 using the data in FIGS. 4A through 4C.
    TABLE 3
    Advertisement Scaled Observations Frequency Impressions Spending
    A1 55.0 0.28 52.58 $1.84
    A2 85.0 0.44 81.27 $2.84
    A3 6.0 0.03 5.74 $0.20
    A4 3.5 0.02 3.35 $0.12
    A5
    TABLE 4
    Advertisement Scaled Observations Frequency Impressions Spending
    A1 29.5 0.69 34.99 $1.75
    A2 12.0 0.28 14.23 $0.71
    A3 12.0 0.28 14.23 $0.71
    A4 12.0 0.28 14.23 $0.71
    A5 1.5 0.03 1.78 $0.09
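The final step of the calculation can be checked against the tables with a short sketch. The function names are hypothetical, and the division by 1000 is an inference: it treats the CPM as a cost per thousand impressions, which is what reproduces the Spending column.

```python
def scaled_sum(per_page_counts: dict, scale: dict) -> float:
    """Sum of count x scale over the pages of one site (fetches or views)."""
    return sum(count * scale[page] for page, count in per_page_counts.items())


def prevalence(scaled_observations: float, scaled_fetches: float,
               traffic: float, cpm: float) -> tuple:
    """Return (frequency, impressions, spending) for one advertisement."""
    frequency = scaled_observations / scaled_fetches
    impressions = frequency * traffic
    spending = impressions * cpm / 1000  # CPM is a cost per thousand impressions
    return frequency, impressions, spending


# Advertisement A1 on site P: Table 3 row, with site values from Table 2.
freq_p, imp_p, spend_p = prevalence(55.0, 193.5, 185, 35.00)
# Advertisement A1 on site Q: Table 4 row, with site values from Table 2.
freq_q, imp_q, spend_q = prevalence(29.5, 43, 51, 50.00)
```

Rounded to two decimals, the results are 0.28, 52.58, and 1.84 for site P and 0.69, 34.99, and 1.75 for site Q, matching the first rows of Tables 3 and 4.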
  • FIG. 4D illustrates an alternative embodiment for calculating the advertising prevalence statistics. In this embodiment, the prober is tuned to optimize rotation measurement accuracy. Statistical estimates of accuracy in the field are difficult to perform, due to the non-stationary nature of advertising servers. When probing every 6 minutes, the prober has a 0.06% resolution in rotational frequency over a one-week measurement period.
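The 0.06% figure follows from the number of probes in a week: one probe every 6 minutes gives 1,680 probes, and a single observation therefore represents 1/1680 of the samples. A small sketch of that arithmetic, with hypothetical names:

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080 minutes


def rotation_resolution(probe_interval_minutes: int,
                        period_minutes: int = MINUTES_PER_WEEK) -> float:
    """Smallest rotation frequency step: one observation out of all probes."""
    probes = period_minutes // probe_interval_minutes
    return 1 / probes


resolution_percent = rotation_resolution(6) * 100
```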
  • Also in the alternative embodiment of FIG. 4D, the probes are distributed among the sites to accurately measure ad rotation on each site. The number of probing URLs assigned to a site is determined from three variables. The first is a constant across all sites; a certain number of probing URLs are required to accurately measure rotation on even the smallest site. Half of the probes are assigned with this variable. The second variable, weighted at 40%, is the amount of traffic going to a site, as each probing URL represents a proportion of total Internet traffic. The twenty largest sites receive over 75% of these probes. Finally, the complexity of the site, as measured by the total number of unique URLs found in the proxy traffic data, is taken into account, with more complicated sites receiving extra probing URLs. This accounts for the remaining 10% of the probe distribution. Probing URLs can be chosen using a Site Shredder algorithm to break the site into regions (i.e., sets of pages whose advertisement rotation characteristics are likely to be similar) for probing. The distribution of regions is mathematically designed to maximize site coverage and, therefore, advertisement rotation accuracy. A single URL is chosen to represent the advertising rotation from each region. This URL is chosen as the most heavily trafficked page containing advertisements in that region. The algorithm avoids date-specific pages or pages referring to a time-limited event such as the August 1999 total lunar eclipse.
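The 50/40/10 split of probing URLs can be sketched as a weighted allocation. This is a simplified reading of the text: the even split of the constant share, the proportional treatment of traffic and complexity, and the function name are assumptions, and the result is left fractional rather than rounded to whole probes.

```python
def allocate_probes(total_probes: float, traffic: dict, unique_urls: dict) -> dict:
    """Split probes 50% evenly, 40% by traffic share, 10% by URL-count share."""
    site_count = len(traffic)
    total_traffic = sum(traffic.values())
    total_urls = sum(unique_urls.values())
    return {
        site: total_probes * (
            0.5 / site_count
            + 0.4 * traffic[site] / total_traffic
            + 0.1 * unique_urls[site] / total_urls
        )
        for site in traffic
    }


# Hypothetical two-site example.
allocation = allocate_probes(
    100,
    traffic={"big.example": 900, "small.example": 100},
    unique_urls={"big.example": 50, "small.example": 50},
)
```

In this example the larger site receives about 66 probes and the smaller about 34; the constant share guarantees the small site a usable minimum.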
  • The alternative embodiment of FIG. 4D calculates advertisement impressions by combining the estimates of rotation and traffic for each Web site 430. To do this the system breaks the site down into its constituent stems using the Site Shredder algorithm. The rotation of advertisements in each advertisement slot is calculated and applied to estimate advertising impressions on its associated stem. The advertisement rotation on stems without probes is estimated from an average, weighted by traffic, of advertisement rotation of probes on a similar level.
  • For instance, in FIG. 4D, the sample site tree has five probe URLs 431-435, P1 through P5, placed on five main branches off a main page and 14 secondary branches. The number on each page is the sample traffic going to that page. Probe P1 on the home page, “www.testsite.com”, measures the rotation, R, to be applied to the traffic going to that main page, with traffic of 88 page views. Branch A has a single probe, P2, placed on the top-level page of that branch with a probing URL “www.testsite.com/A/”. The rotation of this single probing URL is estimated as RA and is applied to the traffic for that entire stem, a total of 21 page views. Branch C has a probe, P3, on a heavily trafficked secondary branch page, with a probing URL “www.testsite.com/C/third.html”. The rotation, RC, of this page is applied to all the secondary branch pages on that stem and also up one level in the tree, across a total of 25 page views. Branch E receives a large portion of the traffic for the site, a total of 61 page views, and therefore is assigned two probes, P4 and P5. These are on two secondary branch pages, “www.testsite.com/E/first.html” and “www.testsite.com/E/third.html”. The rotation of each is applied to the traffic to those individual pages. For the remaining 18 page views on that branch (ten page views from two secondary pages and eight from the top level page of that branch) a weighted rotation is calculated, RE=((13×RE1)+(30×RE3))/(13+30). The analysis of stem rotation results in advertising impressions for over 96% of the site. The impressions for the final two branches, B and D, are calculated with an average rotation from adjacent branches, weighted by traffic,
    RB=RD=((21×RA)+(25×RC)+(61×RE))÷(21+25+61).
    This analysis results in total impressions across the site for each unique advertisement. The final calculation performed by the alternative embodiment of FIG. 4D is spending, the product of the Impressions and the Rate Card.
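The traffic-weighted averages used for branch E and for the unprobed branches B and D can be sketched with one helper. The traffic weights come from the FIG. 4D example in the text; the rotation values are hypothetical placeholders chosen only so the code runs, and the function name is an assumption.

```python
def weighted_rotation(pairs: list) -> float:
    """Traffic-weighted average of rotations; pairs are (traffic, rotation)."""
    total_traffic = sum(traffic for traffic, _ in pairs)
    return sum(traffic * rotation for traffic, rotation in pairs) / total_traffic


# Hypothetical measured rotations for the probed pages (not from the text).
R_A, R_C, R_E1, R_E3 = 0.10, 0.20, 0.30, 0.40

# Remaining pages on branch E: weighted by the two probed pages' traffic.
R_E = weighted_rotation([(13, R_E1), (30, R_E3)])

# Unprobed branches B and D: weighted by traffic of branches A, C, and E.
R_B = R_D = weighted_rotation([(21, R_A), (25, R_C), (61, R_E)])

# Impressions on branch E: probed pages use their own rotation,
# and the remaining 18 page views use the weighted rotation.
impressions_E = 13 * R_E1 + 30 * R_E3 + 18 * R_E
```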
  • FIG. 5 illustrates a database structure that the advertising prevalence system 130 may use to store information retrieved by the traffic sampling system 120 and the Web page retrieval system 322. The preferred embodiment segments the database 200 into partitions. Each partition can perform functions similar to those of an independent database such as the database 200. In addition, a partitioned database simplifies the administration of the data in the partition. Even though the preferred embodiment uses database partitions, the present invention contemplates consolidation of these partitions into a single database, as well as making each partition an independent database and distributing each database to a separate general purpose computer workstation or server. The partitions for the database 200 of the present invention include sampling records 510, probing definitions 520, advertising support data 530, and advertising summary 540. The preferred embodiment of the present invention uses a relational database management system, such as the Oracle8i product by Oracle Corporation, to create and manage the database and partitions. Even though the preferred embodiment uses a relational database, the present invention contemplates the use of other database architectures such as an object-oriented database management system.
  • The sampling records 510 partition of database 200 comprises database tables that are logically segmented into traffic data 512, advertisement view logging 514, and advertising structure 516 areas.
  • The traffic data 512 area contains data processed by the traffic sampling system 120, anonymity system 310, and statistical summarization system 230. The data stored in this schema includes a “munged” URL, and the count of traffic each URL receives per traffic source over a period of time. A “munged” URL is an ordinary URL with the protocol field removed and the order of the dotted components in the hostname reversed. For example, the present invention transforms an ordinary URL, such as http://www.somesite.com/food, into a munged URL by removing the protocol field (i.e., “http://”) and reversing the order of the dotted components in the hostname (i.e., “www.somesite.com”). The resulting munged URL in this example is “com.somesite.www/food”. The present invention uses this proprietary URL format to greatly enhance the traffic data analysis process. The traffic sampling system 120 populates the traffic data 512 area in database 200. The probe mapping system 320 accesses the data in the traffic data 512 area to assist the Web page retrieval system 322 and the statistical summarization system 230 with the calculation of the advertising impression and spending statistics.
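  • The munging transformation described above can be sketched in a few lines. The function name `munge_url` is hypothetical, and the handling of ports and query strings is an assumption, since the text only describes removing the protocol field and reversing the hostname components; this sketch keeps the hostname and path and drops everything else.

```python
from urllib.parse import urlsplit


def munge_url(url):
    """Drop the protocol and reverse the dotted hostname components."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # hostname without port, lower-cased
    reversed_host = ".".join(reversed(host.split(".")))
    # Assumption: only the path is kept after the reversed hostname.
    return reversed_host + parts.path


print(munge_url("http://www.somesite.com/food"))  # com.somesite.www/food
```

    One practical effect of reversing the hostname is that all URLs under the same domain share a common string prefix, so a region of the Web can be selected with a simple prefix comparison.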
  • The advertisement view logging 514 area logs the time, URL, and advertisement identifier for each advertisement encountered on the Internet 100. This area also logs each time the system does not detect an advertisement in a Web page that previously included the advertisement. In addition, the system logs each time the system detects a potential advertisement, but fails to recognize the advertisement during structural classification. The structural classifier 328 and the Web page retrieval system 322 of the advertisement sampling system 220 populate the advertisement view logging 514 area in database 200. The statistical summarization system 230 accesses the data in the advertisement view logging 514 area to determine the frequency that each advertisement appears on each site.
  • The advertisement structure 516 area contains data that characterizes each unique advertisement located by the system. This data includes the content of the advertisement, advertisement type (e.g., image, HTML form, Flash, etc.), the destination URL linked to the advertisement, and several items used during content classification and diagnostics, including where the advertisement was first seen, and which advertisement definition originally produced the advertisement. The structural classifier 328 component of the advertisement sampling system 220 populates the advertisement structure 516 area in database 200. The user interface 240 accesses the data in the advertisement structure 516 area to display each advertisement to the media editor 264 during classification editing. The Web front end 250 also accesses the data in the advertisement structure 516 area to display the advertisements to the client 140.
  • The probing definitions 520 partition of database 200 comprises database tables that are logically segmented into site definition 522, probe map 524, and advertisement extraction rule definition 526 areas.
  • The site definition 522 area carves the portion of the Internet 100 that the system probes into regions. The primary region definition is a “site”, a cohesive entity the system needs to analyze, sample, and summarize. The system defines each site in terms of both inclusive and exclusive munged URL prefixes. A “munged URL prefix” is a munged URL that represents the region of all munged URLs for which it is a prefix. An “inclusive munged URL prefix” specifies that a URL is part of some entity. An “exclusive munged URL prefix” specifies that a URL is not part of some entity, overriding portions of the entity included by an inclusive prefix. To illustrate, the following is a list of munged URLs that may result from the processing of a set of URLs in a traffic sample.
      • 1. com.somesite/
      • 2. com.somesite/foo
      • 3. com.somesite/foo/bar
      • 4. com.somesite/foo/blah
      • 5. com.someothersite/
        If the site definition for “somesite” includes the inclusive URL prefix “com.somesite/” and the exclusive URL prefix “com.somesite/foo/bar”, the application of this site definition to the above sample URLs yields a site that includes URLs 1, 2, and 4. URL 3 is not part of the site definition due to the explicit exclusion of “com.somesite/foo/bar”. URL 5 is not part of the site definition because it is not covered by the inclusive URL prefix “com.somesite/”. The user interface 240 populates the site definition 522 area in database 200. The probe mapping system 320 accesses the data in the site definition 522 area to determine which URLs to probe. The statistical summarization system 230 accesses the data in the site definition 522 area to determine traffic levels to sites by summing traffic to URLs included in a site.
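  • The inclusive and exclusive prefix test in this example can be sketched as follows. The function name `in_site` is hypothetical, and the sketch assumes plain string-prefix matching in which any matching exclusive prefix overrides any matching inclusive prefix.

```python
def in_site(munged_url, inclusive, exclusive):
    """True if the URL matches an inclusive prefix and no exclusive prefix."""
    included = any(munged_url.startswith(p) for p in inclusive)
    excluded = any(munged_url.startswith(p) for p in exclusive)
    return included and not excluded


sample_urls = [
    "com.somesite/",
    "com.somesite/foo",
    "com.somesite/foo/bar",
    "com.somesite/foo/blah",
    "com.someothersite/",
]

members = [
    url for url in sample_urls
    if in_site(url, ["com.somesite/"], ["com.somesite/foo/bar"])
]
print(members)
# ['com.somesite/', 'com.somesite/foo', 'com.somesite/foo/blah']
```

    Summing the traffic counts of the URLs that pass this test would give the site-level traffic used by the statistical summarization system 230.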
  • The probe map 524 area contains a weight for each URL in each site that the system is measuring. This weight determines the likelihood that the system will choose a URL for each probe. The system generates the weights by running complex iterative algorithms against the traffic data and the probing records in the database 200. An analysis of the traffic data can discern which URLs have been visited and how often users have visited those URLs. The result of the analysis guarantees that the system performs advertisement sampling of these URLs in similar proportions, given certain constraints such as a maximum number of probes to allocate to any single URL. The data in the sampling records 510 partition of the database 200 is useful for determining which URLs are in need of special handling due to past behavior (e.g., a URL is sampled less frequently if the system has never detected an advertisement in the URL). The probe mapping system 320 populates the probe map 524 area in the database 200. The probe mapping system 320 accesses the data in the probe map 524 area to allocate the probes. The statistical summarization system 230 accesses the data in the probe map 524 area to determine which URLs should have their rotations scaled to counter the effect of probe map constraint enforcement.
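  • The text does not disclose the weighting algorithm itself, so the following is only one plausible sketch of traffic-proportional weights under a per-URL cap. The function name `probe_weights`, the `max_share` parameter, and the redistribution strategy are all assumptions made for illustration.

```python
def probe_weights(traffic, max_share):
    """Assign each URL a probe weight proportional to its traffic.

    traffic: dict mapping URL -> page views (all positive).
    max_share: largest fraction of probes any single URL may receive.
    Weight removed by the cap is redistributed over the uncapped URLs.
    """
    if any(views <= 0 for views in traffic.values()):
        raise ValueError("traffic counts must be positive")
    if max_share * len(traffic) < 1:
        raise ValueError("cap too small to distribute all probes")

    weights = {}
    remaining = dict(traffic)
    budget = 1.0
    while remaining:
        total = sum(remaining.values())
        capped = [url for url, views in remaining.items()
                  if budget * views / total > max_share]
        if not capped:
            # No remaining URL exceeds the cap: share the budget by traffic.
            for url, views in remaining.items():
                weights[url] = budget * views / total
            break
        for url in capped:
            weights[url] = max_share
            budget -= max_share
            del remaining[url]
    return weights


weights = probe_weights({"a": 80, "b": 15, "c": 5}, max_share=0.5)
```

    Because a capped URL receives fewer probes than its traffic share would suggest, a summarization step would need to scale its rotation back up, which is the correction the statistical summarization system 230 is described as applying.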
  • The advertisement extraction rule definition 526 area describes Extensible Markup Language (“XML”) tags, typically representing a normalized HTML document, that indicate those portions of the content that the system considers to be advertisements. The system defines an extraction rule in terms of “XML structure” and “XML features”. “XML structure” refers to the positioning of various XML nodes relative to other XML nodes. For example, an anchor (“A”) node containing an image (“IMG”) node is likely an advertisement. After using this structural detection process to match the advertisement content, the system examines the features of the content to determine if the content is an advertisement. To continue the previous example, if the image node contains a link (“href”) feature that contains the sub-string “adserver”, it is very likely an advertisement. Features may match based on a simple sub-string, as in the example, or a more complicated regular expression. Another form of extraction rule may point to a specific node in an XML structure using some form of XML path specification, such as an “XPointer”. The media editor 264 populates the advertisement extraction rule definition 526 area in the database 200. The advertisement extractor 326 of the advertisement sampling system 220 accesses the data in the advertisement extraction rule definition 526 area to determine which portions of each probed page represent an advertisement.
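  • The two-stage rule in this example (a structural match followed by a feature match) can be sketched with Python's standard XML parser. The function name `find_ads` and the sample page are hypothetical. The sketch assumes the page has already been normalized to well-formed XML with lower-case tag names, and it tests the “href” attribute on the anchor node, which is where that attribute normally appears in HTML.

```python
import re
import xml.etree.ElementTree as ET


def find_ads(xml_text, pattern="adserver"):
    """Return anchor elements that wrap an image and whose href matches."""
    root = ET.fromstring(xml_text)
    ads = []
    for anchor in root.iter("a"):
        # Structure: the anchor must directly contain an image node.
        if anchor.find("img") is None:
            continue
        # Feature: the link must match a sub-string or regular expression.
        if re.search(pattern, anchor.get("href", "")):
            ads.append(anchor)
    return ads


page = """<html><body>
<a href="http://adserver.example.com/click?id=1"><img src="banner.gif"/></a>
<a href="http://www.example.com/about"><img src="logo.gif"/></a>
<a href="http://adserver.example.com/text">text only</a>
</body></html>"""

ads = find_ads(page)
```

    A plain sub-string such as “adserver” is itself a valid regular expression, so the same function covers both kinds of feature match mentioned above.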
  • The advertising support data 530 partition of database 200 comprises database tables that are logically segmented into advertisable taxonomy 532, advertising information 534, rate card 536, and extended advertisable information 538 areas.
  • The advertisable taxonomy 532 area contains a hierarchical taxonomy of advertisables, attributes that describe what the advertisement is advertising. This taxonomy includes industries, companies, products, Web sites, Web sub-sites, messages, etc. Each node in the hierarchy has a type that specifies what kind of entity it represents and a parent node. For example, the hierarchy may specify that products live within companies, which in turn live within industries. The media editor 264 populates the advertisable taxonomy 532 area in the database 200. The user interface 240 accesses the data in the advertisable taxonomy 532 area to generate statistical data and record where companies, industries, etc. tend to advertise. The Web front end 250 also accesses the data in the advertisable taxonomy 532 area to display this information to the client 140.
  • The advertising information 534 area contains the data that describe what each unique advertisement recorded by the system advertises. The tables in this area associate advertisables with advertisements. For example, the system may associate a company type of advertisable with a specific advertisement to indicate that the advertisement is advertising the company. The system uses the following methods to associate an advertisable with an advertisement:
  • 1. A “direct classification” assigns an advertisable directly to the advertisement. For example, a media editor 264 creates a direct classification by specifying that a particular advertisement advertises the “Honda” advertisable.
  • 2. A “location classification” assigns an advertisable to a location prefix that the system uses to match the destination of the advertisement. For example, a media editor 264 creates a location classification by specifying that the location “com.honda” indicates an advertisement for Honda. An advertisement that points to “com.honda.www/cars”, therefore, is associated with Honda.
  • 3. An “ancestral classification” assigns an ancestor of the advertisable to an advertisement. For example, if a direct classification assigns Honda to an advertisement, the “automotive” industry advertisable is an ancestor of Honda. Ancestral classification uses this relationship to associate automotive with the advertisement.
  • The media editor 264 populates the advertising information 534 area in the database 200. The user interface 240 accesses the data in the advertising information 534 area to generate statistical data.
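  • Location and ancestral classification can be sketched as a prefix lookup followed by a walk up the taxonomy. The dictionaries `TAXONOMY` and `LOCATIONS` and both function names are hypothetical stand-ins for the database tables; only the “com.honda” prefix, the Honda company, and the automotive industry come from the examples above.

```python
# Hypothetical taxonomy: advertisable -> (type, parent advertisable).
TAXONOMY = {
    "Automotive": ("industry", None),
    "Honda": ("company", "Automotive"),
}

# Hypothetical location classifications: munged URL prefix -> advertisable.
LOCATIONS = {"com.honda": "Honda"}


def location_matches(munged_destination):
    """Advertisables whose location prefix matches the ad's destination."""
    return [advertisable for prefix, advertisable in LOCATIONS.items()
            if munged_destination.startswith(prefix)]


def ancestors(advertisable):
    """All ancestors of an advertisable, nearest first."""
    result = []
    parent = TAXONOMY[advertisable][1]
    while parent is not None:
        result.append(parent)
        parent = TAXONOMY[parent][1]
    return result


matched = location_matches("com.honda.www/cars")
implied = ancestors("Honda")
```

    Here `matched` holds the Honda advertisable found through the destination prefix, and `implied` holds the automotive industry that ancestral classification would add.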
  • The rate card 536 area contains data describing the cost of advertisements on a Web site. These costs include monetary values for each specific shape, size, or length of run that advertisers on the Internet 100 use to determine the cost of advertisement purchases. The system stores rate card data for each Web site that the system probes. The media editor 264 populates the rate card 536 area in the database 200. The user interface 240 accesses the data in the rate card 536 area to generate statistical data.
  • The extended advertisable information 538 area contains additional information about specific types of advertisables not readily captured in the taxonomy hierarchy. Specifically, this includes additional information related to Web sites and companies, such as company contact information, Web site, and media kit URLs. This information extends the usefulness of the system by providing additional information to the client 140 about probed entities. For example, a client 140 may follow a hyperlink to company contact information directly from a system report. The media editor 264 populates the extended advertisable information 538 area in the database 200. The Web front end 250 accesses the data in the extended advertisable information 538 area to deliver value-added information to a client 140.
  • The advertising summary 540 partition of database 200 comprises database tables that are logically segmented into advertising statistics 542, data integrity 544, and advertising information summary 546 areas.
  • The advertising statistics 542 area describes how often an advertisement appears on each Web site. The system calculates and stores the following statistics in this area.
  • 1. The proportion of page views that display an advertisement to the total number of page views. The system determines this statistic by analyzing the probing records.
  • 2. The number of impressions that an advertisement received. The system determines this statistic by measuring traffic levels for the Web site using the site definition and traffic data, and multiplying that measurement by the proportion of page views calculated above.
  • 3. The amount of spending that an advertisement received. The system determines this statistic by applying the rate card information to the number of impressions that the advertisement receives calculated above.
  • The statistical summarization system 230 populates the advertising statistics 542 area in the database 200. The Web front end 250 accesses the data in the advertising statistics 542 area to report spending, impressions, and advertising rotation to the client 140.
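  • The three statistics chain together as shown in the sketch below. The function and parameter names are hypothetical, and the rate card is assumed to reduce to a single cost per impression; an actual rate card varies by shape, size, and length of run.

```python
def ad_statistics(probe_views, ad_views, site_page_views, rate_per_impression):
    """Return (proportion, impressions, spending) for one advertisement.

    probe_views: number of probed page views for the site.
    ad_views: number of those page views that displayed the advertisement.
    site_page_views: measured traffic for the site.
    rate_per_impression: assumed flat rate taken from the rate card.
    """
    if probe_views <= 0:
        raise ValueError("at least one probed page view is required")
    proportion = ad_views / probe_views
    impressions = site_page_views * proportion
    spending = impressions * rate_per_impression
    return proportion, impressions, spending


# Hypothetical figures: the ad appeared in 50 of 200 probes on a site
# with 10,000 page views and a rate of 0.02 per impression.
proportion, impressions, spending = ad_statistics(200, 50, 10_000, 0.02)
```

    With these inputs the proportion is one quarter, giving 2,500 impressions and a spending estimate of about 50 currency units.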
  • The data integrity 544 area contains in-depth information about statistical outliers and other potential anomalies resulting from trend and time slice analyses. This automated monitoring and analysis guarantees that the system will contain accurate analysis data. In addition, the system uses real world advertising information, as an input to the system, to verify the accuracy of the analysis data. The data integrity analysis system, performed by the statistical summarization system 230, populates the data integrity 544 area in the database 200. The operator 262 accesses the data integrity 544 area to detect potential errors and monitor general system health.
  • The advertising information summary 546 area summarizes advertising information in a format that is compact and easy to distribute. The system extracts the data in this area from the advertising support data 530 partition. While the data is not as descriptive as the data in the advertising support data 530 partition, it provides the ability to quickly perform a precise query. The advertising support data 530 partition associates each advertisement with a company, product, or industry. If the system associates multiple advertisables of the same type with an advertisement, a single advertisable is chosen to summarize those associations using an assignment priority system, as follows:
  • 1. Advertisables associated with an advertisement using direct classification receive the highest possible priority, “M”.
  • 2. Advertisables associated with an advertisement using location classification receive a priority equal to the string length of the location prefix to which they are assigned; therefore, a long location prefix string receives a higher priority than a short location prefix string.
  • 3. Advertisables associated with an advertisement using ancestral classification receive the priority of the assigned ancestor.
  • 4. The advertisement receives the highest priority advertisable in each type.
  • 5. When two ancestors having the same type and priority are assigned to an advertisement, a conflict occurs and must be corrected by the media editor 264.
  • The statistical summarization system 230 populates the advertising information summary 546 area in the database 200. The Web front end 250 accesses the advertising information summary 546 area to generate reports for the client 140.
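  • The priority rules above can be sketched as a single pass over an advertisement's associations. The names `DIRECT` and `summarize` are hypothetical, infinity stands in for the maximum priority “M”, and the sketch assumes that an ancestral association arrives already carrying the priority it inherited. The advertisable names in the example are placeholders.

```python
import math

DIRECT = math.inf  # stands in for the highest possible priority, "M"


def summarize(associations):
    """Pick one advertisable per type for a single advertisement.

    associations: list of (advertisable_type, advertisable, priority).
    Returns (chosen, conflicts): chosen maps each type to its winning
    advertisable, and conflicts is the set of types whose top priority is
    shared by different advertisables and needs a media editor's decision.
    """
    best = {}
    conflicts = set()
    for adv_type, advertisable, priority in associations:
        current = best.get(adv_type)
        if current is None or priority > current[1]:
            best[adv_type] = (advertisable, priority)
            conflicts.discard(adv_type)
        elif priority == current[1] and advertisable != current[0]:
            conflicts.add(adv_type)
    chosen = {adv_type: name for adv_type, (name, _) in best.items()}
    return chosen, conflicts


chosen, conflicts = summarize([
    ("company", "CompanyA", len("com.example")),       # location, priority 11
    ("company", "CompanyB", len("com.example.shop")),  # location, priority 16
    ("industry", "IndustryX", DIRECT),                 # direct classification
])
```

    In this example the longer location prefix wins the company slot, the direct classification fills the industry slot, and no conflicts are reported.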
  • The following description discusses one embodiment of the database structure illustrated in FIG. 5. This data model is encoded in an Oracle database. The table structure comprises three environments: the core schema, the analysis schema, and the front end schema. The core schema describes the back-end environment that allows the Cloudprober to direct live autonomous processes that continuously scour the Web noting advertising activity, and that allows operators and media editors of the present invention to direct, monitor, and augment the information provided by the Cloudprober. The analysis schema is the back-end environment that allows the advertisement sampling system, also known as OMNIAC, to apply rigorous data analysis procedures to information gathered from the Web. The front end schema assists a client of the present invention with accessing data, building database query strings, and generating reports.
  • The database objects comprising the “core schema” are most frequently used by various components of the OMNIAC system. Code bases that rely on this schema include implementation of the back end processes that pull advertisements from the Web. Additionally, database schemas utilized by other components associated with OMNIAC are composed of some or all of the tables in the core schema. The core schema is conceptually composed of four sub-schemas including advertising, advertisements, probing, and sites. The advertising sub-schema holds information about “advertisable” entities along with which entities each advertisement is advertising. The advertisements sub-schema describes the advertisements that the system has located and analyzed. The probing sub-schema defines “when”, “where”, and “how” for the probing process. The sites sub-schema describes Web sites, including structural site definitions and rate card information.
  • Of the four sub-schemas, Advertising serves the most general purpose and is therefore the most frequently referenced. The primary table in this sub-schema is ADVERTISABLE, which defines advertisables. Many of the conceptual entities in OMNIAC's universe are advertisables: industries, companies, products, services and Web sites are all defined here. The type field, referencing the ADVERTISABLE_TYPE table, differentiates between different types of advertisables, and the parent field organizes records hierarchically, establishing such relationships as industry-contains-company and company-produces-product.
  • In addition to the inherent grouping implied by the parent-child relationship defined in ADVERTISABLE, ADVERTISABLE_GROUP_MEMBER is used to further group advertisables. Examples of groups defined in this way include automotive classes, travel industry segments, and types of computer hardware.
  • Other tables in the Advertising sub-schema serve to define what is advertised by each advertisement. ADVERTISES is used to associate advertisables directly with advertisements. LOCATION_ADVERTISES, CLASSIFIED_LOCATION and LOCATION_MATCHES also indirectly associate advertisables with advertisements via the advertisement's destination location.
  • “Advertisements” referred to above are references to records in AD, the primary table in the Advertisements sub-schema. The Advertisements sub-schema serves to define each advertisement in OMNIAC's universe. Every unique advertisement has a record in AD, along with one or more records in AD_DEFINITION. Advertisement definitions are unique XML fragments OMNIAC has retrieved from the Web. Ads are unique advertisements defined by sets of advertisement definitions determined to be equivalent during automated classification.
  • Other tables in Advertisements contain advertisement attributes, referenced by AD and AD_DEFINITION. AD_TECHNOLOGY describes known Web technologies used to render advertisements, while TEXT describes textual content for certain advertisements. FUZZY_WEB_LOCATION contains fuzzy locations found in advertisements. A fuzzy location is a URL that needs to be processed by the system, such as an anchor or image. Once OMNIAC has loaded a fuzzy location, a reference is made to MIME_CONTENT if the URL references an image, or DEST_WEB_LOCATION if the URL references another HTML page.
  • Moving on, the Probing sub-schema controls the behavior of OMNIAC's probing and advertisement extraction components. The primary purpose of this schema is to define target sets. A target set is a conceptual construct that instructs OMNIAC to fetch a set of pages at certain intervals, extracting advertisements using a set of rules called extraction rules. Each target set is defined by a row in TARGET_SET.
  • The frequencies, locations, and extraction rules that make up each target set are defined in STROBE, AD_WEB_LOCATION, and EXTRACTION_RULE, respectively. The many-to-many relationships between rows in these tables are defined in TS_RUNS_AT, TS_PROBES, and TS_APPLIES.
  • The fourth and final sub-schema is Sites, which simply records information about Web sites. Each site or subsite defined in the advertisable hierarchy has a corresponding record in SITE_INFO, along with a number of rows in SITE_DOMAIN and SITE_MONTHLY_DATA. SITE_DOMAIN describes the physical structure of a site in terms of inclusive and exclusive URL stems. SITE_MONTHLY_DATA records advertising rate cards, third party traffic estimates, and cache statistics for each site on a monthly basis.
  • The analysis schema is an extension to the core schema that includes a number of additional tables populated by OMNIAC's analysis module. The analysis module is the unit in charge of processing information held in the core schema, producing a trim dataset that accurately describes advertising activity.
  • Like the core schema, the analysis schema is composed of four conceptual sub-schemas composed of tables implementing common functionality. These sub-schemas include advertising decomposition, advertisement view summarization, slot statistics, and site statistics. The advertising decomposition sub-schema holds information about each advertisement in the system, including attributes and what the advertisement is advertising. The advertisement view summarization sub-schema summarizes advertisement views, recording how many times each advertisement was seen in each slot over the course of a day. The slot statistics sub-schema describes advertisement rotation for each slot during each time period. The site statistics sub-schema describes site information, including advertisement rotation for each time period.
  • The primary table in the Advertising Decomposition sub-schema is AD_INFO, which contains de-normalized records describing advertisement attributes. AD_INFO records are keyed off of ID's in the AD table; an AD_INFO record exists for each AD record that has been completely classified and represents a valid advertisement. AD_INFO is populated by the analysis module from the advertising relationships described in the core schema tables ADVERTISES and LOCATION_ADVERTISES.
  • Fields in AD_INFO that specify what is advertised by an advertisement are: CATEGORY (industry), ORGANIZATION (company), ORGANIZATION_GROUP (industry segment), ORGANIZATION_OVERGROUP, COMMODITY (product/service), COMMODITY_GROUP (product/service segment), COMMODITY_OVERGROUP, and MESSAGE.
  • AD_INFO also includes fields describing a number of non-advertising attributes. FORMAT, referencing AD_SLOT_TYPE.ID, specifies the form factor of an advertisement. TECHNOLOGY, referencing AD_TECHNOLOGY2.ID, specifies the technology used to implement the advertisement. DEFINITION, IMAGE, and DESTINATION specify the AD_DEFINITION, IMAGE, and DEST_WEB_LOCATION records associated with the advertisement. These three fields mirror fields in the AD table.
  • The Advertising Decomposition schema contains a few tables in addition to AD_INFO. ADV_IMPLICATION is a cache of advertisable implications derived from the hierarchy in ADVERTISABLE. This is used to speed operation of the analysis module. AD_INFO_FLATTENED is a more readily queried version of AD_INFO containing advertisement/advertisable pairs for each of the fields in AD_INFO that reference ADVERTISABLE. Finally, AD_TECHNOLOGY2 describes advertisement technologies understood by the analysis module that are presentable to the user in the front end.
  • The Advertisement View Summarization sub-schema covers the single table PLACEMENT_SUMMARY. PLACEMENT_SUMMARY is keyed off of day, advertisement, and slot, and contains, in the CNT field, the number of times an advertisement was seen in a slot on a particular day.
  • The analysis module populates PLACEMENT_SUMMARY by aggregating hits recorded in the APD n tables, one of which exists for each day, n being the ID of the day in question. These tables are created and populated by the back-end as advertisement hits flow into the system.
  • The third sub-schema in the Analysis schema is Slot Statistics. This sub-schema describes advertisement behavior in the context of slots in addition to information about the slots themselves. A slot is a location on the Web in which advertisements rotate, currently defined in terms of the location ID (a reference to AD_WEB_LOCATION.ID) and extraction rule ID (a reference to EXTRACTION_RULE.ID).
  • The primary table in the Slot Statistics is SLOT_AD_VIEWS, which records the total views and relative frequency for each advertisement in each slot during each time period. The primary key of this table is composed of the fields PERIOD_TYPE, PERIOD, LOCATION_ID, RULE_ID and AD_ID. Two fields exist outside of the primary key: CNT holds the total number of advertisement views, and FREQUENCY holds the relative frequency.
  • Also in this sub-schema is SLOT_SUMMARY, which records general slot information outside the context of individual advertisements. Accordingly, this table is keyed off the PERIOD_TYPE, PERIOD, LOCATION_ID and RULE_ID fields. The CNT field records total advertisement views in the slot; this field is divided into the SLOT_AD_VIEWS.CNT to determine relative frequency. Also in SLOT_SUMMARY is a SLOT_TYPE field that specifies the type of advertisement seen most frequently in the slot, and SITE_ID, which specifies which site the slot resides within.
  • The final table in the Slot Statistics sub-schema is SLOT_TYPE_COUNT. This table is used to determine which value to use in SLOT_SUMMARY.SLOT_TYPE. The number of times each advertisement format was seen is recorded, and the slot type that receives the most views is stored in SLOT_SUMMARY.SLOT_TYPE.
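  • The relationship between the three slot tables can be sketched in memory for a single slot and time period. The function name `slot_statistics` and the sample data are hypothetical; the returned values correspond to SLOT_SUMMARY.CNT, the per-advertisement FREQUENCY in SLOT_AD_VIEWS, and SLOT_SUMMARY.SLOT_TYPE.

```python
from collections import Counter


def slot_statistics(views):
    """Summarize one slot for one period.

    views: non-empty list of (ad_id, ad_format) pairs, one per ad view.
    Returns (total_views, frequency_by_ad, slot_type).
    """
    if not views:
        raise ValueError("slot has no advertisement views")
    ad_counts = Counter(ad_id for ad_id, _ in views)
    format_counts = Counter(ad_format for _, ad_format in views)
    total_views = sum(ad_counts.values())
    # Relative frequency: per-advertisement count over the slot total.
    frequency_by_ad = {ad_id: count / total_views
                       for ad_id, count in ad_counts.items()}
    # Slot type: the advertisement format seen most often in the slot.
    slot_type = format_counts.most_common(1)[0][0]
    return total_views, frequency_by_ad, slot_type


# Hypothetical views: ad 1 seen twice, ads 2 and 3 seen once each.
views = [(1, "banner"), (1, "banner"), (2, "banner"), (3, "button")]
total_views, frequency_by_ad, slot_type = slot_statistics(views)
```

    For this sample the slot total is 4, advertisement 1 has a relative frequency of one half, and the slot type is the banner format.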
  • FIG. 6 is a functional block diagram of the advertising prevalence system 130. Memory 610 of the advertising prevalence system 130 stores the software components, in accordance with the present invention, that analyze traffic data on the Internet 100, sample the advertising data from that traffic data, and generate summarization data that characterizes the advertising data. The system bus 612 connects the memory 610 of the advertising prevalence system 130 to the transmission control protocol/internet protocol (“TCP/IP”) network adapter 614, database 200, and central processor 616. The TCP/IP network adapter 614 is the mechanism that facilitates the passage of network traffic between the advertising prevalence system 130 and the Internet 100. The central processor 616 executes the programmed instructions stored in the memory 610.
  • FIG. 6 shows the functional modules of the advertising prevalence system 130 arranged as an object model. The object model groups the object-oriented software programs into components that perform the major functions and applications in the advertising prevalence system 130. A suitable implementation of the object-oriented software program components of FIG. 6 may use the Enterprise JavaBeans specification. The book by Paul J. Perrone et al., entitled “Building Java Enterprise Systems with J2EE” (Sams Publishing, June 2000) provides a description of a Java enterprise application developed using the Enterprise JavaBeans specification. The book by Matthew Reynolds, entitled “Beginning E-Commerce” (Wrox Press Inc., 2000) provides a description of the use of an object model in the design of a Web server for an Electronic Commerce application.
  • The object model for the memory 610 of the advertising prevalence system 130 employs a three-tier architecture that includes the presentation tier 620, infrastructure objects partition 630, and business logic tier 640. The object model further divides the business logic tier 640 into two partitions, the application service objects partition 650 and data objects partition 660.
  • The presentation tier 620 retains the programs that manage the graphical user interface to the advertising prevalence system 130 for the client 140, account manager 260, operator 262, and media editor 264. In FIG. 6, the presentation tier 620 includes the TCP/IP interface 622, the Web front end 624, and the user interface 626. A suitable implementation of the presentation tier 620 may use Java servlets to interact with the client 140, account manager 260, operator 262, and media editor 264 of the present invention via the hypertext transfer protocol (“HTTP”). The Java servlets run within a request/response server that handles request messages from the client 140, account manager 260, operator 262, and media editor 264 and returns response messages to the client 140, account manager 260, operator 262, and media editor 264. A Java servlet is a Java program that runs within a Web server environment. A Java servlet takes a request as input, parses the data, performs logic operations, and issues a response back to the client 140, account manager 260, operator 262, and media editor 264. The Java runtime platform pools the Java servlets to simultaneously service many requests. A TCP/IP interface 622 that uses Java servlets functions as a Web server that communicates with the client 140, account manager 260, operator 262, and media editor 264 using the HTTP protocol. The TCP/IP interface 622 accepts HTTP requests from the client 140, account manager 260, operator 262, and media editor 264 and passes the information in the request to the visit object 642 in the business logic tier 640. Visit object 642 passes result information returned from the business logic tier 640 to the TCP/IP interface 622. The TCP/IP interface 622 sends these results back to the client 140, account manager 260, operator 262, and media editor 264 in an HTTP response. The TCP/IP interface 622 uses the TCP/IP network adapter 614 to exchange data via the Internet 100.
  • The infrastructure objects partition 630 retains the programs that perform administrative and system functions on behalf of the business logic tier 640. The infrastructure objects partition 630 includes the operating system 636, and an object oriented software program component for the database management system (“DBMS”) interface 632, system administrator interface 634, and Java runtime platform 638.
  • The business logic tier 640 retains the programs that perform the substance of the present invention. The business logic tier 640 in FIG. 6 includes multiple instances of the visit object 642. A separate instance of the visit object 642 exists for each client session initiated by either the Web front end 624 or user interface 626 via the TCP/IP interface 622. Each visit object 642 is a stateful session bean that includes a persistent storage area from initiation through termination of the client session, not just during a single interaction or method call. The persistent storage area retains information associated with either the URL 114, 116, 118 or the client 140, account manager 260, operator 262, and media editor 264. In addition, the persistent storage area retains data exchanged between the advertising prevalence system 130 and the traffic sampling system 120 via the TCP/IP interface 622 such as the query result sets from a database 200 query.
  • When the traffic sampling system 120 finishes collecting information about a URL 114, 116, 118, it sends a message to the TCP/IP interface 622 that invokes a method to create a visit object 642 and stores information about the connection in the visit object 642 state. Visit object 642, in turn, invokes a method in the traffic analysis application 652 to process the information retrieved by the traffic sampling system 120. The traffic analysis application 652 stores the processed data from the anonymity system 310 and probe mapping system 320 in the traffic analysis data 662 state and the database 200. FIGS. 7A and 7B describe, in greater detail, the process that the traffic analysis application 652 follows for each URL 114, 116, 118 obtained from the traffic sampling system 120. Even though FIG. 6 depicts the central processor 616 as controlling the traffic analysis application 652, it is to be understood that the function performed by the traffic analysis application 652 can be distributed to a separate system configured similarly to the advertising prevalence system 130.
  • After the traffic analysis application 652 processes a URL 114, 116, 118 identified by the traffic sampling system 120, the visit object 642 invokes a method in the advertising sampling application 654 to retrieve the URL 114, 116, 118 from the Web site 110. The advertising sampling application 654 processes the retrieved Web page by extracting embedded advertisements and classifying those advertisements. The advertising sampling application 654 stores the data retrieved by the Web page retrieval system 322 and processed by the Web browser emulation environment 324, advertisement extractor 326, and the structural classifier 328 in the advertising sampling data 664 state and the database 200. FIGS. 7A, 7C, and 7D describe, in greater detail, the process that the advertising sampling application follows for each URL 114, 116, 118 identified by the traffic sampling system 120. Even though FIG. 6 depicts the central processor 616 as controlling the advertising sampling application 654, a person skilled in the art will realize that the processing performed by the advertising sampling application 654 can be distributed to a separate system configured similarly to the advertising prevalence system 130.
  • After the traffic analysis application 652 and the advertising sampling application 654 process the URL 114, 116, 118 identified by the traffic sampling system 120, the visit object 642 invokes a method in the statistical summarization application 656 to compute summary statistics for the data. The statistical summarization application 656 computes the advertising impression, spending, and valuation statistics for each advertisement embedded in the URL 114, 116, 118. The statistical summarization application 656 stores the statistical data in the statistical summarization data 666 state and the database 200. Even though FIG. 6 depicts the central processor 616 as controlling the statistical summarization application 656, a person skilled in the art realizes that the function performed by the statistical summarization application 656 can be distributed to a separate system configured similarly to the advertising prevalence system 130.
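The order in which the visit object 642 invokes the three applications can be illustrated with a short sketch. This is only an illustration under stated assumptions: the class name `Visit`, the `process` method, and the three callables passed to it are hypothetical stand-ins for the traffic analysis application 652, the advertising sampling application 654, and the statistical summarization application 656, and a plain dictionary stands in for the persistent storage area of the session bean.

```python
class Visit:
    """Hypothetical stand-in for visit object 642: one instance per client session."""

    def __init__(self, session_id):
        self.session_id = session_id
        # State kept from session start to termination, not just for one call.
        self.state = {}

    def process(self, url, analyze_traffic, sample_ads, summarize):
        """Invoke the three applications in the order described in the text."""
        self.state["traffic"] = analyze_traffic(url)   # stand-in for application 652
        self.state["ads"] = sample_ads(url)            # stand-in for application 654
        self.state["stats"] = summarize(               # stand-in for application 656
            self.state["traffic"], self.state["ads"]
        )
        return self.state["stats"]


# Toy callables (assumed for illustration only).
visit = Visit(session_id="session-1")
result = visit.process(
    "http://example.com/",
    analyze_traffic=lambda url: {"page_views": 1000},
    sample_ads=lambda url: ["ad-1", "ad-2"],
    summarize=lambda traffic, ads: {"ads_seen": len(ads), "page_views": traffic["page_views"]},
)
```

Because the intermediate results stay in `visit.state`, a later call in the same session can reuse them without repeating the earlier steps.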
  • FIG. 7A is a flow diagram of a process in the advertising prevalence system 130 that measures the value of online advertisements by tracking and comparing online advertising activity across all major industries, channels, advertising formats, and types. Process 700 begins, at step 710, by sampling traffic data from the Internet 100. FIG. 7B describes step 710 in greater detail. Step 720 uses the sampled traffic data from step 710 to perform site selection, and define and refine site definitions for the advertising prevalence system 130. Step 730 uses the result of the site selection and definition process to generate a probe map based on the sampled traffic data. FIG. 7C describes step 730 in greater detail. Step 740 uses the probe map from step 730 to visit the Internet 100 to gather sample data from the probe sites identified in step 730. FIG. 7D describes step 740 in greater detail. For each URL retrieved in step 740, step 750 extracts the advertisements from the URL, step 760 classifies each advertisement, and step 770 calculates the statistics for each advertisement. FIGS. 7E and 7F describe, respectively, steps 760 and 770 in greater detail. Finally, process 700 performs data integrity checks in step 780 to verify the integrity of the data and analysis results in the system.
  • FIG. 7B is a flow diagram that describes, in greater detail, the process of sampling traffic data from FIG. 7A, step 710. Process 710 begins in step 711 by gathering data from a Web traffic monitor such as the traffic sampling system 120. Process 710 strips the user information from the data retrieved by the Web traffic monitor in step 712 to cleanse the data and guarantee the anonymity of the sample. For each URL in the cleansed sample, step 713 measures the number of Web page views observed in the traffic data. Step 714 completes process 710 by statistically extrapolating the measured number of Web page views in the sample to the whole universe of the Internet 100.
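Steps 712 through 714 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the record layout (`user_id`, `url`), the function names, and the simple ratio estimator based on an assumed panel size and universe size are all hypothetical, since the text does not specify how the extrapolation is computed.

```python
from collections import Counter


def strip_user_info(records):
    """Step 712: drop user-identifying fields, keeping only the URL."""
    return [{"url": r["url"]} for r in records]


def count_page_views(cleansed):
    """Step 713: count the page views observed for each URL."""
    return Counter(r["url"] for r in cleansed)


def extrapolate(sample_views, panel_size, universe_size):
    """Step 714: scale sample counts to the whole population.

    Assumption: a plain ratio estimator (universe_size / panel_size).
    """
    scale = universe_size / panel_size
    return {url: views * scale for url, views in sample_views.items()}


# Toy traffic records (invented for illustration).
records = [
    {"user_id": "u1", "url": "http://example.com/a"},
    {"user_id": "u2", "url": "http://example.com/a"},
    {"user_id": "u1", "url": "http://example.com/b"},
]
cleansed = strip_user_info(records)
views = count_page_views(cleansed)
estimated = extrapolate(views, panel_size=2, universe_size=1000)
```

A production estimator would normally weight panel members by demographic strata; the uniform scale factor here only shows where the extrapolation fits in the flow.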
  • FIG. 7C is a flow diagram that describes, in greater detail, the process of generating a probe map based on sampled traffic data from FIG. 7A, step 730. Process 730 begins in step 731 by analyzing a subset of the sample traffic data that falls within eligible site definitions. Following the analysis in step 731, step 732 builds an initial probe map based on the sample traffic data. Step 733 analyzes the historic advertisement measurement results in the database 200 for the URLs in the initial probe map. Step 734 uses these historic results, as well as system parameters, to optimize the sampling plan. Step 735 completes process 730 by monitoring the sample results and adjusting the system as necessary.
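Steps 731 and 732 can be illustrated with a sketch that filters the traffic estimates to eligible sites and then allocates a fixed probe budget in proportion to traffic. The proportional allocation rule, the `probe_budget` parameter, and the function name are assumptions made for illustration; the text does not state how the initial probe map is weighted, and the optimization of steps 733 through 735 is not shown.

```python
from urllib.parse import urlparse


def build_probe_map(traffic, eligible_sites, probe_budget):
    """Allocate probes per URL in proportion to its estimated page views.

    traffic:        dict of URL -> estimated page views
    eligible_sites: set of host names that fall within the site definitions
    probe_budget:   total number of probes to distribute (assumed parameter)
    """
    # Step 731: keep only URLs whose host is an eligible site.
    eligible = {
        url: views
        for url, views in traffic.items()
        if urlparse(url).hostname in eligible_sites
    }
    total = sum(eligible.values())
    if total == 0:
        return {}
    # Step 732: initial probe map, weighted by traffic share.
    return {url: probe_budget * views / total for url, views in eligible.items()}


# Toy traffic estimates (invented for illustration).
traffic = {
    "http://news.example/home": 600.0,
    "http://news.example/sports": 200.0,
    "http://other.example/": 900.0,
}
probe_map = build_probe_map(traffic, {"news.example"}, probe_budget=100)
```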
  • FIG. 7D is a flow diagram that describes, in greater detail, the process of probing the Internet 100 to gather sample data from FIG. 7A, step 740. Process 740 begins in step 741 by fetching a Web page from the Internet 100. The Web page from step 741 is passed to a Web browser emulation environment in step 742 to simulate the display of that Web page in a browser. This simulation allows the advertising prevalence system 130 to detect advertisements embedded in the Web page. These advertisements may be embedded in JavaScript code, Java applet or servlet code, or common gateway interface code such as a Perl script. In addition, the simulation in step 742 allows the advertising prevalence system 130 to detect dynamic and interactive advertisements in the Web page. After the simulation in step 742, step 743 extracts the advertisement data from the Web page and step 744 stores the advertisement data in the database 200. Step 745 determines whether process 740 needs to fetch another Web page to gather more sample data. In the preferred embodiment, process 740 continuously samples Web pages from the Internet 100. A person skilled in the art realizes that the functionality performed by step 745 can be associated with a scheduling system that will schedule the probing of the Internet 100 to gather the sample advertising data.
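The fetch, extract, and store loop of process 740 can be sketched as below. This sketch makes several simplifying assumptions: it performs only static extraction of images nested inside links, so it stands in for step 743 alone and does not reproduce the browser emulation of step 742 (no scripts are executed); the `fetch` callable and the list used as storage are hypothetical substitutes for the Web page retrieval and the database 200.

```python
from html.parser import HTMLParser


class LinkedImageExtractor(HTMLParser):
    """Collects (link target, image source) pairs for images nested in anchors."""

    def __init__(self):
        super().__init__()
        self._href = None     # href of the anchor currently open, if any
        self.fragments = []   # list of (href, img_src) tuples

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self._href = attrs.get("href")
        elif tag == "img" and self._href is not None:
            self.fragments.append((self._href, attrs.get("src")))

    def handle_endtag(self, tag):
        if tag == "a":
            self._href = None


def probe(urls, fetch, store):
    """Fetch each page, extract candidate ad fragments, and store them."""
    for url in urls:
        page = fetch(url)                      # step 741 (injected, no network here)
        extractor = LinkedImageExtractor()
        extractor.feed(page)                   # static stand-in for step 743
        for href, src in extractor.fragments:  # step 744
            store.append({"page": url, "click_url": href, "image_url": src})


# Toy page served from memory (invented for illustration).
pages = {
    "http://site.example/": (
        '<html><body>'
        '<a href="http://ads.example/click?id=7">'
        '<img src="http://ads.example/banner7.gif"></a>'
        '<img src="/logo.gif">'
        '</body></html>'
    )
}
store = []
probe(list(pages), pages.__getitem__, store)
```

Passing `fetch` as a parameter keeps the loop testable without network access and leaves room for a scheduler, as the text suggests for step 745, to decide which URLs are probed and when.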
  • FIG. 7E is a flow diagram that describes, in greater detail, the process of classifying the advertising data from FIG. 7A, step 760. Process 760 begins the analysis of advertisement fragments in step 761 by determining whether the fragment is a duplicate. When the advertising prevalence system 130 encounters an advertisement fragment for the first time, step 762 analyzes the internal structure of the fragment. Following step 762, or when step 761 determines that the advertisement fragment is a duplicate, step 763 retrieves the external content of the advertisement from the Web page. Step 764 then compares the external content to previously observed advertisements. Step 765 analyzes the result of the comparison in step 764 to determine whether the advertisement is a duplicate. When the advertising prevalence system 130 encounters an advertisement for the first time, step 766 begins processing the new advertisement by recording the structure of the new advertisement in the database 200. Step 767 then performs automated advertisement classification and stores the classification types in the database 200. Step 768 completes processing of a new advertisement by performing human verification of the advertisement classifications. Following step 768, or when step 765 determines that the advertisement is a duplicate, step 769 updates the advertisement viewing log in the database 200 to indicate the observation of the advertisement.
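The duplicate check and logging in steps 764 through 769 can be illustrated with a small registry. The use of a SHA-256 digest of the external content as the comparison key is an assumption; the text says only that the content is compared to previously observed advertisements. The class name, the `classify` callable, and the review queue are likewise hypothetical.

```python
import hashlib


class AdRegistry:
    """Hypothetical in-memory stand-in for the advertisement tables in database 200."""

    def __init__(self):
        self.ads = {}           # content digest -> advertisement record
        self.view_log = []      # (digest, page_url) observations
        self.review_queue = []  # digests awaiting human verification

    def observe(self, content: bytes, page_url: str, classify):
        # Steps 764-765: compare against previously observed advertisements.
        digest = hashlib.sha256(content).hexdigest()
        if digest not in self.ads:
            # Steps 766-767: record the new advertisement and classify it.
            self.ads[digest] = {
                "classification": classify(content),
                "verified": False,
            }
            # Step 768: queue the automated classification for human review.
            self.review_queue.append(digest)
        # Step 769: log the observation whether or not the ad is new.
        self.view_log.append((digest, page_url))
        return digest


registry = AdRegistry()
classify = lambda content: "banner"  # toy classifier, for illustration only
first = registry.observe(b"GIF89a-banner-7", "http://site.example/", classify)
second = registry.observe(b"GIF89a-banner-7", "http://site.example/news", classify)
third = registry.observe(b"GIF89a-banner-8", "http://site.example/", classify)
```

An exact digest treats any byte-level change as a new advertisement; a deployed system would likely need a more tolerant comparison, which is outside this sketch.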
  • FIG. 7F is a flow diagram that describes, in greater detail, the process of calculating advertising statistics from FIG. 7A, step 770. Process 770 begins the calculation of the advertising statistics in step 771 by summarizing the advertising measurement results. In step 772, process 770 uses the probe map generated in step 730 to weight the advertising measurement results. The advertising frequency is calculated in step 773 for each Web page request. Step 774 uses the sample traffic data from step 710 and the advertising frequency from step 773 to calculate the advertising impressions for each advertisement. Step 775 completes process 770 by calculating the advertisement spending by combining the advertising impressions from step 774 and the rate card data input by the media editor 264 with the rate card collection 348 module of the user interface 240.
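The arithmetic of steps 773 through 775 can be sketched as follows. The formulas are assumptions chosen for illustration: frequency as observations per probe, impressions as estimated page views multiplied by frequency, and spending computed from a rate-card price per thousand impressions (CPM). The text does not give the exact formulas, and the probe-map weighting of step 772 is omitted.

```python
def ad_frequency(observations, probes):
    """Step 773: average appearances of the ad per probed page request."""
    return observations / probes


def impressions(page_views, frequency):
    """Step 774: estimated impressions = estimated page views x frequency."""
    return page_views * frequency


def spending(impression_count, cpm_rate):
    """Step 775: cost at an assumed rate-card CPM (price per 1,000 impressions)."""
    return impression_count / 1000 * cpm_rate


# Toy figures (invented for illustration).
freq = ad_frequency(observations=25, probes=100)
imps = impressions(page_views=1_000_000, frequency=freq)
cost = spending(imps, cpm_rate=20.0)
```

With these toy figures, an advertisement seen on 25 of 100 probes of a page with one million estimated views yields 250,000 impressions and a spending estimate of 5,000 at a CPM of 20.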
  • Although embodiments disclosed in the present invention describe a fully functioning system, it is to be understood that other embodiments exist which are equivalent to the embodiments disclosed herein. Since numerous modifications and variations will occur to those who review the instant application, the present invention is not limited to the exact construction and operation illustrated and described herein. Accordingly, all suitable modifications and equivalents which may be resorted to are intended to fall within the scope of the claims.

Claims (2)

1. A system for estimating the prevalence of digital content on the World-Wide-Web, comprising:
an estimating device for estimating the global traffic to a plurality of Web sites to provide traffic data;
a sampling device for statistically sampling the contents of said plurality of Web sites to provide sampling data;
a storage device for storing said traffic data and said sampling data; and
an accessing device for accessing said traffic data and said sampling data stored in said storage device.
2-53. (canceled)
US11/144,110 2000-01-12 2005-06-03 System and method for estimating prevalence of digital content on the World-Wide-Web Abandoned US20050235030A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US11/144,110 US20050235030A1 (en) 2000-01-12 2005-06-03 System and method for estimating prevalence of digital content on the World-Wide-Web

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US17566500P 2000-01-12 2000-01-12
US23119500P 2000-09-07 2000-09-07
US09/695,216 US8661111B1 (en) 2000-01-12 2000-10-25 System and method for estimating prevalence of digital content on the world-wide-web
US11/144,110 US20050235030A1 (en) 2000-01-12 2005-06-03 System and method for estimating prevalence of digital content on the World-Wide-Web

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/695,216 Continuation US8661111B1 (en) 2000-01-12 2000-10-25 System and method for estimating prevalence of digital content on the world-wide-web

Publications (1)

Publication Number Publication Date
US20050235030A1 true US20050235030A1 (en) 2005-10-20

Family

ID=26871460

Family Applications (3)

Application Number Title Priority Date Filing Date
US09/695,216 Active 2024-11-08 US8661111B1 (en) 2000-01-12 2000-10-25 System and method for estimating prevalence of digital content on the world-wide-web
US11/144,110 Abandoned US20050235030A1 (en) 2000-01-12 2005-06-03 System and method for estimating prevalence of digital content on the World-Wide-Web
US14/147,618 Expired - Lifetime US9514479B2 (en) 2000-01-12 2014-01-06 System and method for estimating prevalence of digital content on the world-wide-web

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US09/695,216 Active 2024-11-08 US8661111B1 (en) 2000-01-12 2000-10-25 System and method for estimating prevalence of digital content on the world-wide-web

Family Applications After (1)

Application Number Title Priority Date Filing Date
US14/147,618 Expired - Lifetime US9514479B2 (en) 2000-01-12 2014-01-06 System and method for estimating prevalence of digital content on the world-wide-web

Country Status (7)

Country Link
US (3) US8661111B1 (en)
EP (1) EP1252735B1 (en)
JP (1) JP5072160B2 (en)
AT (1) ATE522036T1 (en)
AU (1) AU2001217524A1 (en)
CA (1) CA2396565A1 (en)
WO (1) WO2001052462A2 (en)

Cited By (98)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030212667A1 (en) * 2002-05-10 2003-11-13 International Business Machines Corporation Systems, methods, and computer program products to browse database query information
US20040104925A1 (en) * 2002-12-03 2004-06-03 Lockheed Martin Corporation Visualization toolkit for data cleansing applications
US20070088693A1 (en) * 2003-09-30 2007-04-19 Google Inc. Document scoring based on traffic associated with a document
US20070239532A1 (en) * 2006-03-31 2007-10-11 Scott Benson Determining advertising statistics for advertisers and/or advertising networks
US20080004955A1 (en) * 2006-06-28 2008-01-03 Microsoft Corporation Use of business heuristics and data to optimize online advertisement and marketing
US20080004947A1 (en) * 2006-06-28 2008-01-03 Microsoft Corporation Online keyword buying, advertisement and marketing
US20080056575A1 (en) * 2006-08-30 2008-03-06 Bradley Jeffery Behm Method and system for automatically classifying page images
US20080071612A1 (en) * 2006-09-18 2008-03-20 Microsoft Corporation Logocons: ad product for brand advertisers
WO2008080104A1 (en) * 2006-12-21 2008-07-03 Google Inc. Estimating statistics for online advertising campaigns
US20080183561A1 (en) * 2007-01-26 2008-07-31 Exelate Media Ltd. Marketplace for interactive advertising targeting events
US20080235622A1 (en) * 2007-03-21 2008-09-25 Yahoo! Inc. Traffic production index and related metrics for analysis of a network of related web sites
WO2008118441A1 (en) * 2007-03-26 2008-10-02 Mix & Burn, Llc Systems and methods for enabling users to sample and acquire content
US20080249832A1 (en) * 2007-04-04 2008-10-09 Microsoft Corporation Estimating expected performance of advertisements
US20090037253A1 (en) * 2007-07-30 2009-02-05 Davidow Dorothy Young System and method for online lead generation
US20090070336A1 (en) * 2007-09-07 2009-03-12 Sap Ag Method and system for managing transmitted requests
US20090138427A1 (en) * 2007-11-27 2009-05-28 Umber Systems Method and apparatus for storing data on application-level activity and other user information to enable real-time multi-dimensional reporting about user of a mobile data network
US20090216579A1 (en) * 2008-02-22 2009-08-27 Microsoft Corporation Tracking online advertising using payment services
US20090248680A1 (en) * 2008-03-26 2009-10-01 Umber Systems System and Method for Sharing Anonymous User Profiles with a Third Party
US20100094860A1 (en) * 2008-10-09 2010-04-15 Google Inc. Indexing online advertisements
US20100145902A1 (en) * 2008-12-09 2010-06-10 Ita Software, Inc. Methods and systems to train models to extract and integrate information from data sources
US20100205215A1 (en) * 2009-02-11 2010-08-12 Cook Robert W Systems and methods for enforcing policies to block search engine queries for web-based proxy sites
US20100205291A1 (en) * 2009-02-11 2010-08-12 Richard Baldry Systems and methods for enforcing policies in the discovery of anonymizing proxy communications
US20100205297A1 (en) * 2009-02-11 2010-08-12 Gurusamy Sarathy Systems and methods for dynamic detection of anonymizing proxies
US20100205665A1 (en) * 2009-02-11 2010-08-12 Onur Komili Systems and methods for enforcing policies for proxy website detection using advertising account id
WO2010138512A1 (en) * 2009-05-26 2010-12-02 Facebook, Inc. Measuring impact of online advertising campaigns
US20100318418A1 (en) * 2009-06-16 2010-12-16 Microsoft Corporation Advertising inventory prediction for frequency-capped lines
US20110016121A1 (en) * 2009-07-16 2011-01-20 Hemanth Sambrani Activity Based Users' Interests Modeling for Determining Content Relevance
US7895076B2 (en) 1995-06-30 2011-02-22 Sony Computer Entertainment Inc. Advertisement insertion, profiling, impression, and feedback
US20110044663A1 (en) * 2009-08-19 2011-02-24 Sony Corporation Moving image recording apparatus, moving image recording method and program
US7996519B1 (en) 2007-03-07 2011-08-09 Comscore, Inc. Detecting content and user response to content
US20110209216A1 (en) * 2010-01-25 2011-08-25 Meir Zohar Method and system for website data access monitoring
US20110302164A1 (en) * 2010-05-05 2011-12-08 Saileshwar Krishnamurthy Order-Independent Stream Query Processing
US20120095842A1 (en) * 2001-06-21 2012-04-19 Fogelson Bruce A Method and system for creating ad-books
US20120150628A1 (en) * 2001-04-30 2012-06-14 Ari Rosenberg System and method for the presentation of advertisements
US20120158525A1 (en) * 2010-12-20 2012-06-21 Yahoo! Inc. Automatic classification of display ads using ad images and landing pages
US8267783B2 (en) 2005-09-30 2012-09-18 Sony Computer Entertainment America Llc Establishing an impression area
EP2510487A2 (en) * 2009-12-08 2012-10-17 comScore, Inc. Systems and methods for identification and reporting of ad delivery hierarchy
US8316446B1 (en) * 2005-04-22 2012-11-20 Blue Coat Systems, Inc. Methods and apparatus for blocking unwanted software downloads
US20120303349A1 (en) * 2008-11-07 2012-11-29 Roy H Scott Enhanced matching through explore/exploit schemes
US8416247B2 (en) 2007-10-09 2013-04-09 Sony Computer Entertaiment America Inc. Increasing the number of advertising impressions in an interactive environment
US8554602B1 (en) 2009-04-16 2013-10-08 Exelate, Inc. System and method for behavioral segment optimization based on data exchange
US20130290854A1 (en) * 2012-04-27 2013-10-31 Adobe Systems Inc. Method and apparatus for isolating analytics logic from content creation in a rich internet application
US8621068B2 (en) 2009-08-20 2013-12-31 Exelate Media Ltd. System and method for monitoring advertisement assignment
US8626584B2 (en) 2005-09-30 2014-01-07 Sony Computer Entertainment America Llc Population of an advertisement reference list
US8645992B2 (en) 2006-05-05 2014-02-04 Sony Computer Entertainment America Llc Advertisement rotation
US8676900B2 (en) 2005-10-25 2014-03-18 Sony Computer Entertainment America Llc Asynchronous advertising placement based on metadata
US8763090B2 (en) 2009-08-11 2014-06-24 Sony Computer Entertainment America Llc Management of ancillary content delivery and presentation
US8763157B2 (en) 2004-08-23 2014-06-24 Sony Computer Entertainment America Llc Statutory license restricted digital media playback on portable devices
US20140181303A1 (en) * 2012-12-21 2014-06-26 Scott Andrew Meyer Custom local content provision
US8769558B2 (en) 2008-02-12 2014-07-01 Sony Computer Entertainment America Llc Discovery and analytics for episodic downloaded media
US20140214560A1 (en) * 2006-09-06 2014-07-31 Mediamath, Inc. System and method for dynamic online advertisement creation and management
US8838784B1 (en) 2010-08-04 2014-09-16 Zettics, Inc. Method and apparatus for privacy-safe actionable analytics on mobile data usage
US8892495B2 (en) 1991-12-23 2014-11-18 Blanding Hovenweep, Llc Adaptive pattern recognition based controller apparatus and method and human-interface therefore
US9269049B2 (en) 2013-05-08 2016-02-23 Exelate, Inc. Methods, apparatus, and systems for using a reduced attribute vector of panel data to determine an attribute of a user
US9349134B1 (en) * 2007-05-31 2016-05-24 Google Inc. Detecting illegitimate network traffic
US9436953B1 (en) * 2009-10-01 2016-09-06 2Kdirect, Llc Automatic generation of electronic advertising messages containing one or more automatically selected stock photography images
US9535563B2 (en) 1999-02-01 2017-01-03 Blanding Hovenweep, Llc Internet appliance system and method
US9578044B1 (en) * 2014-03-24 2017-02-21 Amazon Technologies, Inc. Detection of anomalous advertising content
WO2017058276A1 (en) * 2015-09-29 2017-04-06 Fastly, Inc. Persistent edge state of end user devices at cache nodes
US9621472B1 (en) 2013-03-14 2017-04-11 Moat, Inc. System and method for dynamically controlling sample rates and data flow in a networked measurement system by dynamic determination of statistical significance
US9734508B2 (en) 2012-02-28 2017-08-15 Microsoft Technology Licensing, Llc Click fraud monitoring based on advertising traffic
US20170316466A1 (en) * 2012-06-30 2017-11-02 Oracle America, Inc. System and Methods for Discovering Advertising Traffic Flow and Impinging Entities
US9858526B2 (en) 2013-03-01 2018-01-02 Exelate, Inc. Method and system using association rules to form custom lists of cookies
US9864998B2 (en) 2005-10-25 2018-01-09 Sony Interactive Entertainment America Llc Asynchronous advertising
US9873052B2 (en) 2005-09-30 2018-01-23 Sony Interactive Entertainment America Llc Monitoring advertisement impressions
US10037543B2 (en) * 2012-08-13 2018-07-31 Amobee, Inc. Estimating conversion rate in display advertising from past performance data
US10049391B2 (en) 2010-03-31 2018-08-14 Mediamath, Inc. Systems and methods for providing a demand side platform
US10049377B1 (en) * 2011-06-29 2018-08-14 Google Llc Inferring interactions with advertisers
US10068188B2 (en) 2016-06-29 2018-09-04 Visual Iq, Inc. Machine learning techniques that identify attribution of small signal stimulus in noisy response channels
US10068250B2 (en) 2013-03-14 2018-09-04 Oracle America, Inc. System and method for measuring mobile advertising and content by simulating mobile-device usage
US10115124B1 (en) * 2007-10-01 2018-10-30 Google Llc Systems and methods for preserving privacy
US10223703B2 (en) 2010-07-19 2019-03-05 Mediamath, Inc. Systems and methods for determining competitive market values of an ad impression
US10332156B2 (en) 2010-03-31 2019-06-25 Mediamath, Inc. Systems and methods for using server side cookies by a demand side platform
US10346871B2 (en) * 2016-04-22 2019-07-09 Facebook, Inc. Automatic targeting of content by clustering based on user feedback data
US10354276B2 (en) 2017-05-17 2019-07-16 Mediamath, Inc. Systems, methods, and devices for decreasing latency and/or preventing data leakage due to advertisement insertion
US20190335327A1 (en) * 2018-04-27 2019-10-31 T-Mobile Usa, Inc. Partitioning network addresses in network cell data to address user privacy
US10467659B2 (en) 2016-08-03 2019-11-05 Mediamath, Inc. Methods, systems, and devices for counterfactual-based incrementality measurement in digital ad-bidding platform
US10467652B2 (en) 2012-07-11 2019-11-05 Oracle America, Inc. System and methods for determining consumer brand awareness of online advertising using recognition
US10504155B2 (en) * 2015-04-27 2019-12-10 Google Llc System and method of detection and recording of realization actions in association with content rendering
US10600089B2 (en) 2013-03-14 2020-03-24 Oracle America, Inc. System and method to measure effectiveness and consumption of editorial content
US10657538B2 (en) 2005-10-25 2020-05-19 Sony Interactive Entertainment LLC Resolution of advertising rules
CN111260341A (en) * 2020-05-06 2020-06-09 武汉中科通达高新技术股份有限公司 Traffic violation data auditing method, computer equipment and readable storage medium
US10715864B2 (en) 2013-03-14 2020-07-14 Oracle America, Inc. System and method for universal, player-independent measurement of consumer-online-video consumption behaviors
US10755300B2 (en) 2011-04-18 2020-08-25 Oracle America, Inc. Optimization of online advertising assets
US10812612B2 (en) 2015-09-09 2020-10-20 Fastly, Inc. Execution of per-user functions at cache nodes
US10846779B2 (en) 2016-11-23 2020-11-24 Sony Interactive Entertainment LLC Custom product categorization of digital media content
US10860987B2 (en) 2016-12-19 2020-12-08 Sony Interactive Entertainment LLC Personalized calendar for digital media content-related events
US10891661B2 (en) 2008-01-22 2021-01-12 2Kdirect, Llc Automatic generation of electronic advertising messages
US10931991B2 (en) 2018-01-04 2021-02-23 Sony Interactive Entertainment LLC Methods and systems for selectively skipping through media content
US11004089B2 (en) 2005-10-25 2021-05-11 Sony Interactive Entertainment LLC Associating media content files with advertisements
US11068925B2 (en) * 2013-01-13 2021-07-20 Adfin Solutions, Inc. Real-time digital asset sampling apparatuses, methods and systems
US11164219B1 (en) 2009-08-06 2021-11-02 2Kdirect, Inc. Automatic generation of electronic advertising messages
US11182829B2 (en) 2019-09-23 2021-11-23 Mediamath, Inc. Systems, methods, and devices for digital advertising ecosystems implementing content delivery networks utilizing edge computing
US11348142B2 (en) 2018-02-08 2022-05-31 Mediamath, Inc. Systems, methods, and devices for componentization, modification, and management of creative assets for diverse advertising platform environments
US11516277B2 (en) 2019-09-14 2022-11-29 Oracle International Corporation Script-based techniques for coordinating content selection across devices
US11580163B2 (en) * 2019-08-16 2023-02-14 Palo Alto Networks, Inc. Key-value storage for URL categorization
US11748433B2 (en) 2019-08-16 2023-09-05 Palo Alto Networks, Inc. Communicating URL categorization information
US11829901B2 (en) 2014-10-31 2023-11-28 The Nielsen Company (Us), Llc Methods and apparatus to identify publisher advertising behavior

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7493655B2 (en) 2000-03-22 2009-02-17 Comscore Networks, Inc. Systems for and methods of placing user identification in the header of data packets usable in user demographic reporting and collecting usage data
US7930285B2 (en) 2000-03-22 2011-04-19 Comscore, Inc. Systems for and methods of user demographic reporting usable for identifying users and collecting usage data
US7260837B2 (en) 2000-03-22 2007-08-21 Comscore Networks, Inc. Systems and methods for user identification, user demographic reporting and collecting usage data usage biometrics
US7181412B1 (en) 2000-03-22 2007-02-20 Comscore Networks Inc. Systems and methods for collecting consumer data
JP2003044685A (en) * 2001-07-27 2003-02-14 Hitachi Kokusai Electric Inc Information display device
DE10332717A1 (en) * 2003-07-18 2005-02-03 Abb Research Ltd. User guidance method e.g. for web portal, involves web portal exhibiting, in hierarchical structure and being linked with one another and user information about popularity of all sides of web portal and subordinate branches indicated
US20050021472A1 (en) * 2003-07-25 2005-01-27 David Gettman Transactions in virtual property
US7467356B2 (en) 2003-07-25 2008-12-16 Three-B International Limited Graphical user interface for 3d virtual display browser using virtual display windows
GB2404546B (en) 2003-07-25 2005-12-14 Purple Interactive Ltd A method of organising and displaying material content on a display to a viewer
US8341259B2 (en) * 2005-06-06 2012-12-25 Adobe Systems Incorporated ASP for web analytics including a real-time segmentation workbench
US20090150198A1 (en) * 2007-12-10 2009-06-11 Yaroslav Volovich Estimating tv ad impressions
US8171156B2 (en) * 2008-07-25 2012-05-01 JumpTime, Inc. Method and system for determining overall content values for content elements in a web network and for optimizing internet traffic flow through the web network
US8185431B2 (en) 2008-11-13 2012-05-22 Kwabena Benoni Abboa-Offei System and method for forecasting and pairing advertising with popular web-based media
JP5238612B2 (en) * 2009-05-29 2013-07-17 デジタル・アドバタイジング・コンソーシアム株式会社 Advertising volume estimation device and program
WO2011146391A2 (en) * 2010-05-16 2011-11-24 Access Business Group International Llc Data collection, tracking, and analysis for multiple media including impact analysis and influence tracking
US8910259B2 (en) 2010-08-14 2014-12-09 The Nielsen Company (Us), Llc Systems, methods, and apparatus to monitor mobile internet activity
US8886773B2 (en) 2010-08-14 2014-11-11 The Nielsen Company (Us), Llc Systems, methods, and apparatus to monitor mobile internet activity
US9124920B2 (en) 2011-06-29 2015-09-01 The Nielson Company (Us), Llc Methods, apparatus, and articles of manufacture to identify media presentation devices
US8594617B2 (en) 2011-06-30 2013-11-26 The Nielsen Company (Us), Llc Systems, methods, and apparatus to monitor mobile internet activity
US8972460B2 (en) * 2012-10-23 2015-03-03 Oracle International Corporation Data model optimization using multi-level entity dependencies
US10356579B2 (en) 2013-03-15 2019-07-16 The Nielsen Company (Us), Llc Methods and apparatus to credit usage of mobile devices
US9301173B2 (en) 2013-03-15 2016-03-29 The Nielsen Company (Us), Llc Methods and apparatus to credit internet usage
US10878457B2 (en) * 2014-08-21 2020-12-29 Oracle International Corporation Tunable statistical IDs
US9762688B2 (en) 2014-10-31 2017-09-12 The Nielsen Company (Us), Llc Methods and apparatus to improve usage crediting in mobile devices
US11423420B2 (en) 2015-02-06 2022-08-23 The Nielsen Company (Us), Llc Methods and apparatus to credit media presentations for online media distributions
US20160232579A1 (en) 2015-02-11 2016-08-11 The Nielsen Company (Us), Llc Methods and apparatus to detect advertisements embedded in online media
CN104866555A (en) * 2015-05-15 2015-08-26 浪潮软件集团有限公司 Automatic acquisition method based on web crawler
CN105677764B (en) 2015-12-30 2020-05-08 百度在线网络技术(北京)有限公司 Information extraction method and device
US11595275B2 (en) * 2021-06-30 2023-02-28 The Nielsen Company (Us), Llc Methods and apparatus to determine main pages from network traffic

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5878426A (en) * 1996-12-23 1999-03-02 Unisys Corporation Statistical database query using random sampling of records
US5995943A (en) * 1996-04-01 1999-11-30 Sabre Inc. Information aggregation and synthesization system

Family Cites Families (324)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3540003A (en) 1968-06-10 1970-11-10 Ibm Computer monitoring system
US3696297A (en) 1970-09-01 1972-10-03 Richard J Otero Broadcast communication system including a plurality of subscriber stations for selectively receiving and reproducing one or more of a plurality of transmitted programs each having a unique identifying cone associated therewith
US3818458A (en) 1972-11-08 1974-06-18 Comress Method and apparatus for monitoring a general purpose digital computer
US3906454A (en) 1973-05-18 1975-09-16 Bell Telephone Labor Inc Computer monitoring system
JPS5248046B2 (en) 1974-04-17 1977-12-07
US4044376A (en) 1976-08-13 1977-08-23 Control Data Corporation TV monitor
US4058829A (en) 1976-08-13 1977-11-15 Control Data Corporation TV monitor
US4166290A (en) 1978-05-10 1979-08-28 Tesdata Systems Corporation Computer monitoring system
US4236209A (en) 1978-10-31 1980-11-25 Honeywell Information Systems Inc. Intersystem transaction identification logic
US4356545A (en) 1979-08-02 1982-10-26 Data General Corporation Apparatus for monitoring and/or controlling the operations of a computer from a remote location
US4283709A (en) 1980-01-29 1981-08-11 Summit Systems, Inc. (Interscience Systems) Cash accounting and surveillance system for games
US4355372A (en) 1980-12-24 1982-10-19 Npd Research Inc. Market survey data collection method
US4516216A (en) 1981-02-02 1985-05-07 Paradyne Corporation In-service monitoring system for data communications network
US4814979A (en) 1981-04-01 1989-03-21 Teradata Corporation Network to transmit prioritized subtask pockets to dedicated processors
US4757456A (en) 1981-05-19 1988-07-12 Ralph Benghiat Device and method for utility meter reading
US4473824A (en) 1981-06-29 1984-09-25 Nelson B. Hunter Price quotation system
US4740912A (en) 1982-08-02 1988-04-26 Whitaker Ranald O Quinews-electronic replacement for the newspaper
US4725886A (en) 1983-04-21 1988-02-16 The Weather Channel, Inc. Communications system having an addressable receiver
US4916539A (en) 1983-04-21 1990-04-10 The Weather Channel, Inc. Communications system having receivers which can be addressed in selected classes
US4566030A (en) 1983-06-09 1986-01-21 Ctba Associates Television viewer data collection system
US4658290A (en) 1983-12-08 1987-04-14 Ctba Associates Television and market research data collection system and method
US4713791A (en) 1984-09-24 1987-12-15 Gte Communication Systems Corporation Real time usage meter for a processor system
US4603232A (en) 1984-09-24 1986-07-29 Npd Research, Inc. Rapid market survey collection and dissemination method
US4677552A (en) 1984-10-05 1987-06-30 Sibley Jr H C International commodity trade exchange
US4868866A (en) 1984-12-28 1989-09-19 Mcgraw-Hill Inc. Broadcast data distribution system
US4718025A (en) 1985-04-15 1988-01-05 Centec Corporation Computer management control system
US4751578A (en) 1985-05-28 1988-06-14 David P. Gordon System for electronically controllably viewing on a television updateable television programming information
JPH0727349B2 (en) 1985-07-01 1995-03-29 株式会社日立製作所 Multi-window display control method
US4706121B1 (en) 1985-07-12 1993-12-14 Insight Telecast, Inc. Tv schedule system and process
US4677466A (en) 1985-07-29 1987-06-30 A. C. Nielsen Company Broadcast program identification method and apparatus
US4695880A (en) 1985-07-30 1987-09-22 Postron Corp. Electronic information dissemination system
US4700378A (en) 1985-08-08 1987-10-13 Brown Daniel G Data base accessing system
US4907188A (en) 1985-09-12 1990-03-06 Kabushiki Kaisha Toshiba Image information search network system
US4745559A (en) 1985-12-27 1988-05-17 Reuters Limited Method and system for dynamically controlling the content of a local receiver data base from a transmitted data base in an information retrieval communication network
US4792921A (en) 1986-03-18 1988-12-20 Wang Laboratories, Inc. Network event identifiers
JPH0648811B2 (en) 1986-04-04 1994-06-22 株式会社日立製作所 Complex network data communication system
US4849879A (en) 1986-09-02 1989-07-18 Digital Equipment Corp Data processor performance advisor
US4827508A (en) 1986-10-14 1989-05-02 Personal Library Software, Inc. Database usage metering and protection system and method
US5050213A (en) 1986-10-14 1991-09-17 Electronic Publishing Resources, Inc. Database usage metering and protection system and method
US4977594A (en) 1986-10-14 1990-12-11 Electronic Publishing Resources, Inc. Database usage metering and protection system and method
US4831582A (en) 1986-11-07 1989-05-16 Allen-Bradley Company, Inc. Database access machine for factory automation network
US4845658A (en) 1986-12-01 1989-07-04 Massachusetts Institute Of Technology Information method and apparatus using simplex and duplex communications
US4935870A (en) 1986-12-15 1990-06-19 Keycom Electronic Publishing Apparatus for downloading macro programs and executing a downloaded macro program responding to activation of a single key
JPH0738183B2 (en) 1987-01-29 1995-04-26 日本電気株式会社 Communication processing method between central processing units
US4774658A (en) 1987-02-12 1988-09-27 Thomas Lewin Standardized alarm notification transmission alternative system
US4817080A (en) 1987-02-24 1989-03-28 Digital Equipment Corporation Distributed local-area-network monitoring system
GB2203573A (en) 1987-04-02 1988-10-19 Ibm Data processing network with upgrading of files
US5062147A (en) 1987-04-27 1991-10-29 Votek Systems Inc. User programmable computer monitoring system
US4887308A (en) 1987-06-26 1989-12-12 Dutton Bradley C Broadcast data storage and retrieval system
US4823290A (en) 1987-07-21 1989-04-18 Honeywell Bull Inc. Method and apparatus for monitoring the operating environment of a computer system
US4924488A (en) 1987-07-28 1990-05-08 Enforcement Support Incorporated Multiline computerized telephone monitoring system
CA1288516C (en) 1987-07-31 1991-09-03 Leendert M. Bijnagte Apparatus and method for communicating textual and image information between a host computer and a remote display terminal
US4972367A (en) 1987-10-23 1990-11-20 Allen-Bradley Company, Inc. System for generating unsolicited messages on high-tier communication link in response to changed states at station-level computers
GB8801628D0 (en) 1988-01-26 1988-02-24 British Telecomm Evaluation system
US5049873A (en) 1988-01-29 1991-09-17 Network Equipment Technologies, Inc. Communications network state and topology monitor
US4972504A (en) 1988-02-11 1990-11-20 A. C. Nielsen Company Marketing research system and method for obtaining retail data on a real time basis
SE460449B (en) 1988-02-29 1989-10-09 Ericsson Telefon Ab L M Cell divided digital mobile radio system and procedure to transfer information in a digital cell divided mobile radio system
US4954699A (en) 1988-04-13 1990-09-04 Npd Research, Inc. Self-administered survey questionnaire and method
US4912552A (en) 1988-04-19 1990-03-27 Control Data Corporation Distributed monitoring system
US5101402A (en) 1988-05-24 1992-03-31 Digital Equipment Corporation Apparatus and method for realtime monitoring of network sessions in a local area network
CA1337132C (en) 1988-07-15 1995-09-26 Robert Filepp Reception system for an interactive computer network and method of operation
US4977455B1 (en) 1988-07-15 1993-04-13 System and process for vcr scheduling
US5249260A (en) 1988-08-12 1993-09-28 Hitachi, Ltd. Data input system
US5247575A (en) 1988-08-16 1993-09-21 Sprague Peter J Information distribution system
US4912522A (en) 1988-08-17 1990-03-27 Asea Brown Boveri Inc. Light driven remote system and power supply therefor
JP2865675B2 (en) 1988-09-12 1999-03-08 株式会社日立製作所 Communication network control method
US5023929A (en) 1988-09-15 1991-06-11 Npd Research, Inc. Audio frequency based market survey method
US4912466A (en) 1988-09-15 1990-03-27 Npd Research Inc. Audio frequency based data capture tablet
US4989230A (en) 1988-09-23 1991-01-29 Motorola, Inc. Cellular cordless telephone
US5023907A (en) 1988-09-30 1991-06-11 Apollo Computer, Inc. Network license server
US4958284A (en) 1988-12-06 1990-09-18 Npd Group, Inc. Open ended question analysis system and method
US5161109A (en) 1988-12-16 1992-11-03 Pitney Bowes Inc. Up/down loading of databases
JP2702769B2 (en) 1989-03-28 1998-01-26 松下電器産業株式会社 Information input / output device and information input / output method
CA2053261A1 (en) 1989-04-28 1990-10-29 Gary D. Hornbuckle Method and apparatus for remotely controlling and monitoring the use of computer software
US5220522A (en) 1989-05-09 1993-06-15 Ansan Industries, Ltd. Peripheral data acquisition, monitor, and control device for a personal computer
US5047867A (en) 1989-06-08 1991-09-10 North American Philips Corporation Interface for a TV-VCR system
US5038211A (en) 1989-07-05 1991-08-06 The Superguide Corporation Method and apparatus for transmitting and receiving television program information
JP2584113B2 (en) 1989-07-21 1997-02-19 松下電器産業株式会社 Data transfer method and data transfer device
GB2236454A (en) 1989-09-01 1991-04-03 Philips Electronic Associated Communications system for radio telephones
US5063610A (en) 1989-09-27 1991-11-05 Ing Communications, Inc. Broadcasting system with supplemental data transmission and storage
US5214792A (en) 1989-09-27 1993-05-25 Alwadish David J Broadcasting system with supplemental data transmission and storage
GB8922702D0 (en) 1989-10-09 1989-11-22 Videologic Ltd Radio television receiver
US5301350A (en) 1989-10-10 1994-04-05 Unisys Corporation Real time storage/retrieval subsystem for document processing in banking operations
US5339239A (en) 1989-10-13 1994-08-16 Mitsubishi Plastics Industries Limited Information collecting and/or service furnishing systems by which a user can request information from a central data base using a portable personal terminal and an access terminal
US5247517A (en) 1989-10-20 1993-09-21 Novell, Inc. Method and apparatus for analyzing networks
US5099319A (en) 1989-10-23 1992-03-24 Esch Arthur G Video information delivery method and apparatus
US5241671C1 (en) 1989-10-26 2002-07-02 Encyclopaedia Britannica Educa Multimedia search system using a plurality of entry path means which indicate interrelatedness of information
JPH03142678A (en) 1989-10-30 1991-06-18 Toshiba Corp Electronic filing system
JP2804125B2 (en) 1989-11-08 1998-09-24 株式会社日立製作所 Fault monitoring device and control method for information processing system
JPH03161873A (en) 1989-11-20 1991-07-11 Ricoh Co Ltd Electronic filing device having data base constructing function
US5159685A (en) 1989-12-06 1992-10-27 Racal Data Communications Inc. Expert system for communications network
US5267351A (en) 1989-12-22 1993-11-30 Avid Technology, Inc. Media storage and retrieval system
US5038374A (en) 1990-01-08 1991-08-06 Dynamic Broadcasting Network, Inc. Data transmission and storage
US5008929A (en) 1990-01-18 1991-04-16 U.S. Intelco Networks, Inc. Billing system for telephone signaling network
JPH03260757A (en) 1990-03-09 1991-11-20 Toshiba Corp Decentralized computer network
US5408607A (en) 1990-03-19 1995-04-18 Hitachi, Ltd. Information transfer system
JP2799038B2 (en) 1990-04-10 1998-09-17 株式会社東芝 Continuous scrolling device for large-scale images
US5150116A (en) 1990-04-12 1992-09-22 West Harold B Traffic-light timed advertising center
KR920010811B1 (en) 1990-05-10 1992-12-17 주식회사 금성사 Tv teletext apparatus
US5367677A (en) 1990-05-11 1994-11-22 Thinking Machines Corporation System for iterated generation from an array of records of a posting file with row segments based on column entry value ranges
US5276458A (en) 1990-05-14 1994-01-04 International Business Machines Corporation Display system
US5276789A (en) 1990-05-14 1994-01-04 Hewlett-Packard Co. Graphic display of network topology
US5226120A (en) 1990-05-21 1993-07-06 Synoptics Communications, Inc. Apparatus and method of monitoring the status of a local area network
CA2036205C (en) 1990-06-01 1996-11-19 Russell J. Welsh Program monitoring unit
US5032979A (en) 1990-06-22 1991-07-16 International Business Machines Corporation Distributed security auditing subsystem for an operating system
JPH04109352A (en) 1990-08-29 1992-04-10 Nec Corp On-line information processor
US5388252A (en) 1990-09-07 1995-02-07 Eastman Kodak Company System for transparent monitoring of processors in a network with display of screen images at a remote station for diagnosis by technical support personnel
DE69020899T2 (en) 1990-09-28 1995-12-07 Hewlett Packard Co Network monitoring system and device
EP0553285B1 (en) 1990-10-16 2000-03-01 Consilium, Inc. Object-oriented architecture for factory floor management
US5204947A (en) 1990-10-31 1993-04-20 International Business Machines Corporation Application independent (open) hypermedia enablement services
US5297249A (en) 1990-10-31 1994-03-22 International Business Machines Corporation Hypermedia link marker abstract and search services
US5287363A (en) 1991-07-01 1994-02-15 Disk Technician Corporation System for locating and anticipating data storage media failures
US5241625A (en) 1990-11-27 1993-08-31 Farallon Computing, Inc. Screen image sharing among heterogeneous computers
US5239540A (en) 1990-11-27 1993-08-24 Scientific-Atlanta, Inc. Method and apparatus for transmitting, receiving and communicating digital data signals with corresponding program data signals which describe the digital data signals
US5327554A (en) 1990-11-29 1994-07-05 Palazzi Iii Michael A Interactive terminal for the access of remote database information
EP0491068A1 (en) 1990-12-18 1992-06-24 International Business Machines Corporation Selective data broadcasting receiver adapter for personal computers
US5210530A (en) 1991-01-04 1993-05-11 Codex Corporation Network management interface with internal dsd
US5231593A (en) 1991-01-11 1993-07-27 Hewlett-Packard Company Maintaining historical lan traffic statistics
US5321838A (en) 1991-02-28 1994-06-14 Hensley Billy W Event capturing for computer software evaluation
US5333302A (en) 1991-02-28 1994-07-26 Hensley Billy W Filtering event capture data for computer software evaluation
KR940007649B1 (en) 1991-04-03 1994-08-22 삼성전자 주식회사 Semiconductor device
US5223827A (en) 1991-05-23 1993-06-29 International Business Machines Corporation Process and apparatus for managing network event counters
US5237681A (en) 1991-05-24 1993-08-17 Bell Communications Research, Inc. Relational data base memory utilization analyzer
US5327237A (en) 1991-06-14 1994-07-05 Wavephore, Inc. Transmitting data with video
US5406269A (en) 1991-07-05 1995-04-11 David Baran Method and apparatus for the remote verification of the operation of electronic devices by standard transmission mediums
US5355484A (en) 1991-08-12 1994-10-11 International Business Machines Corporation Dynamically established event monitors in event management services of a computer system
US5237684A (en) 1991-08-12 1993-08-17 International Business Machines Corporation Customized and versatile event monitor within event management services of a computer system
US5260878A (en) 1991-09-06 1993-11-09 Automation, Inc. Web press monitoring system
US5371846A (en) 1991-10-16 1994-12-06 International Business Machines Corporation Non-linear scroll bar
US5355327A (en) 1991-11-26 1994-10-11 Davox Corporation Automated statistical data collection system
JPH06510150A (en) 1991-11-27 1994-11-10 テレフオンアクチーボラゲツト エル エム エリクソン Software structure of communication switching system
FR2685511B1 (en) 1991-12-19 1994-02-04 Bull Sa Method for classifying computer architectures
US5315093A (en) 1992-02-05 1994-05-24 A. C. Nielsen Company Market research method and system for collecting retail store market research data
US5495581A (en) 1992-02-25 1996-02-27 Tsai; Irving Method and apparatus for linking a document with associated reference information using pattern matching
US5351278A (en) 1992-03-09 1994-09-27 Hitachi, Ltd. X-ray tomography method and apparatus thereof
US5331544A (en) 1992-04-23 1994-07-19 A. C. Nielsen Company Market research method and system for collecting retail store and shopper market research data
US5262860A (en) 1992-04-23 1993-11-16 International Business Machines Corporation Method and system communication establishment utilizing captured and processed visually perceptible data within a broadcast video signal
US5281962A (en) 1992-05-08 1994-01-25 Motorola, Inc. Method and apparatus for automatic generation and notification of tag information corresponding to a received message
US5349662A (en) 1992-05-21 1994-09-20 International Business Machines Corporation Method of and apparatus for providing automatic detection of user activity
US5223924A (en) 1992-05-27 1993-06-29 North American Philips Corporation System and method for automatically correlating user preferences with a T.V. program information database
US5390281A (en) 1992-05-27 1995-02-14 Apple Computer, Inc. Method and apparatus for deducing user intent and providing computer implemented services
US5309243A (en) 1992-06-10 1994-05-03 Eastman Kodak Company Method and apparatus for extending the dynamic range of an electronic imaging system
US5361359A (en) 1992-08-31 1994-11-01 Trusted Information Systems, Inc. System and method for controlling the use of a computer
US5267314A (en) 1992-11-17 1993-11-30 Leon Stambler Secure transaction system and method utilized therein
US5485897A (en) 1992-11-24 1996-01-23 Sanyo Electric Co., Ltd. Elevator display system using composite images to display car position
US5317140A (en) 1992-11-24 1994-05-31 Dunthorn David I Diffusion-assisted position location particularly for visual pen detection
US5351293A (en) 1993-02-01 1994-09-27 Wave Systems Corp. System method and apparatus for authenticating an encrypted signal
US5483658A (en) 1993-02-26 1996-01-09 Grube; Gary W. Detection of unauthorized use of software applications in processing devices
US5375070A (en) 1993-03-01 1994-12-20 International Business Machines Corporation Information collection architecture and method for a data communications network
US5414809A (en) 1993-04-30 1995-05-09 Texas Instruments Incorporated Graphical display of data
US5461708A (en) 1993-08-06 1995-10-24 Borland International, Inc. Systems and methods for automated graphing of spreadsheet information
JP3165765B2 (en) 1993-09-20 2001-05-14 富士通株式会社 CAD design support equipment
US5499340A (en) 1994-01-12 1996-03-12 Isogon Corporation Method and apparatus for computer program usage monitoring
US5799292A (en) 1994-04-29 1998-08-25 International Business Machines Corporation Adaptive hypermedia presentation method and system
JPH07302236A (en) 1994-05-06 1995-11-14 Hitachi Ltd Information processing system, method therefor and service providing method in the information processing system
US5594911A (en) 1994-07-13 1997-01-14 Bell Communications Research, Inc. System and method for preprocessing and delivering multimedia presentations
US5604867A (en) 1994-07-22 1997-02-18 Network Peripherals System for transmitting data between bus and network having device comprising first counter for providing transmitting rate and second counter for limiting frames exceeding rate
US5655140A (en) 1994-07-22 1997-08-05 Network Peripherals Apparatus for translating frames of data transferred between heterogeneous local area networks
US5623652A (en) 1994-07-25 1997-04-22 Apple Computer, Inc. Method and apparatus for searching for information in a network and for controlling the display of searchable information on display devices in the network
US5926168A (en) 1994-09-30 1999-07-20 Fan; Nong-Qiang Remote pointers for interactive televisions
US5724521A (en) 1994-11-03 1998-03-03 Intel Corporation Method and apparatus for providing electronic advertisements to end users in a consumer best-fit pricing manner
US5717923A (en) 1994-11-03 1998-02-10 Intel Corporation Method and apparatus for dynamically customizing electronic information to individual end users
US5491820A (en) 1994-11-10 1996-02-13 At&T Corporation Distributed, intermittently connected, object-oriented database and management system
US5638443A (en) 1994-11-23 1997-06-10 Xerox Corporation System for controlling the distribution and use of composite digital works
US6029195A (en) * 1994-11-29 2000-02-22 Herz; Frederick S. M. System for customized electronic identification of desirable objects
US5682525A (en) 1995-01-11 1997-10-28 Civix Corporation System and methods for remotely accessing a selected group of items of interest from a database
US5835758A (en) * 1995-02-28 1998-11-10 Vidya Technologies, Inc. Method and system for representing and processing physical and conceptual entities
JPH08256174A (en) 1995-03-16 1996-10-01 Hitachi Ltd Electronic transmission and read system for publication
US5696702A (en) 1995-04-17 1997-12-09 Skinner; Gary R. Time and work tracker
US5963914A (en) 1995-04-17 1999-10-05 Skinner; Gary R. Network time and work tracker
US5748954A (en) * 1995-06-05 1998-05-05 Carnegie Mellon University Method for searching a queued and ranked constructed catalog of files stored on a network
US5675510A (en) 1995-06-07 1997-10-07 Pc Meter L.P. Computer use meter and analyzer
US5708780A (en) 1995-06-07 1998-01-13 Open Market, Inc. Internet server access control and monitoring systems
US5710918A (en) 1995-06-07 1998-01-20 International Business Machines Corporation Method for distributed task fulfillment of web browser requests
US5671283A (en) 1995-06-08 1997-09-23 Wave Systems Corp. Secure communication system with cross linked cryptographic codes
US5615264A (en) 1995-06-08 1997-03-25 Wave Systems Corp. Encrypted data package record for use in remote transaction metered data system
US5740549A (en) 1995-06-12 1998-04-14 Pointcast, Inc. Information and advertising distribution system and method
US6807558B1 (en) 1995-06-12 2004-10-19 Pointcast, Inc. Utilization of information “push” technology
US5724571A (en) * 1995-07-07 1998-03-03 Sun Microsystems, Inc. Method and apparatus for generating query responses in a computer-based document retrieval system
US5648965A (en) 1995-07-07 1997-07-15 Sun Microsystems, Inc. Method and apparatus for dynamic distributed packet tracing and analysis
US5634100A (en) 1995-08-07 1997-05-27 Apple Computer, Inc. System and method for event parameter interdependence and adjustment with pen input
US5568471A (en) 1995-09-06 1996-10-22 International Business Machines Corporation System and method for a workstation monitoring and control of multiple networks having different protocols
US5717860A (en) 1995-09-20 1998-02-10 Infonautics Corporation Method and apparatus for tracking the navigation path of a user on the world wide web
US5819285A (en) 1995-09-20 1998-10-06 Infonautics Corporation Apparatus for capturing, storing and processing co-marketing information associated with a user of an on-line computer service using the world-wide-web
US5712979A (en) 1995-09-20 1998-01-27 Infonautics Corporation Method and apparatus for attaching navigational history information to universal resource locator links on a world wide web page
JPH0991308A (en) * 1995-09-28 1997-04-04 Nippon Telegr & Teleph Corp <Ntt> Information search system
US5572643A (en) 1995-10-19 1996-11-05 Judson; David H. Web browser with dynamic display of information objects during linking
US5737619A (en) 1995-10-19 1998-04-07 Judson; David Hugh World wide web browsing with content delivery over an idle connection and interstitial content display
US6279112B1 (en) 1996-10-29 2001-08-21 Open Market, Inc. Controlled transfer of information in computer networks
US5872588A (en) 1995-12-06 1999-02-16 International Business Machines Corporation Method and apparatus for monitoring audio-visual materials presented to a subscriber
US5708709A (en) 1995-12-08 1998-01-13 Sun Microsystems, Inc. System and method for managing try-and-buy usage of application programs
US5710915A (en) 1995-12-21 1998-01-20 Electronic Data Systems Corporation Method for accelerating access to a database clustered partitioning
US6264560B1 (en) 1996-01-19 2001-07-24 Sheldon F. Goldberg Method and system for playing games on a network
US5823879A (en) 1996-01-19 1998-10-20 Sheldon F. Goldberg Network gaming system
US5878213A (en) * 1996-02-15 1999-03-02 International Business Machines Corporation Methods, systems and computer program products for the synchronization of time coherent caching system
US6189030B1 (en) 1996-02-21 2001-02-13 Infoseek Corporation Method and apparatus for redirection of server external hyper-link references
US5751956A (en) 1996-02-21 1998-05-12 Infoseek Corporation Method and apparatus for redirection of server external hyper-link references
US5706502A (en) 1996-03-25 1998-01-06 Sun Microsystems, Inc. Internet-enabled portfolio manager system and method
US5964839A (en) 1996-03-29 1999-10-12 At&T Corp System and method for monitoring information flow and performing data collection
US5878384A (en) 1996-03-29 1999-03-02 At&T Corp System and method for monitoring information flow and performing data collection
US5848396A (en) 1996-04-26 1998-12-08 Freedom Of Information, Inc. Method and apparatus for determining behavioral profile of a computer user
US6018619A (en) 1996-05-24 2000-01-25 Microsoft Corporation Method, system and apparatus for client-side usage tracking of information server systems
US5787253A (en) 1996-05-28 1998-07-28 The Ag Group Apparatus and method of analyzing internet activity
US6014638A (en) 1996-05-29 2000-01-11 America Online, Inc. System for customizing computer displays in accordance with user preferences
US5673382A (en) 1996-05-30 1997-09-30 International Business Machines Corporation Automated management of off-site storage volumes for disaster recovery
US5715453A (en) 1996-05-31 1998-02-03 International Business Machines Corporation Web server mechanism for processing function calls for dynamic data queries in a web page
US5799100A (en) 1996-06-03 1998-08-25 University Of South Florida Computer-assisted method and apparatus for analysis of x-ray images using wavelet transforms
US5935207A (en) 1996-06-03 1999-08-10 Webtv Networks, Inc. Method and apparatus for providing remote site administrators with user hits on mirrored web sites
US5727129A (en) * 1996-06-04 1998-03-10 International Business Machines Corporation Network system for profiling and actively facilitating user activities
US5956483A (en) 1996-06-28 1999-09-21 Microsoft Corporation System and method for making function calls from a web browser to a local application
US6070145A (en) 1996-07-12 2000-05-30 The Npd Group, Inc. Respondent selection method for network-based survey
JP3996673B2 (en) * 1996-08-08 2007-10-24 義宇 江 Information collection method and information collection system on the Internet
US5931912A (en) 1996-08-09 1999-08-03 International Business Machines Corporation Traversal path-based approach to understanding user-oriented hypertext object usage
US5933811A (en) 1996-08-20 1999-08-03 Paul D. Angles System and method for delivering customized advertisements within interactive communication systems
US6108637A (en) * 1996-09-03 2000-08-22 Nielsen Media Research, Inc. Content display monitor
US5838919A (en) 1996-09-10 1998-11-17 Ganymede Software, Inc. Methods, systems and computer program products for endpoint pair based communications network performance testing
US6012083A (en) * 1996-09-24 2000-01-04 Ricoh Company Ltd. Method and apparatus for document processing using agents to process transactions created based on document content
US5960409A (en) * 1996-10-11 1999-09-28 Wexler; Daniel D. Third-party on-line accounting system and method therefor
JPH10124491A (en) * 1996-10-24 1998-05-15 Fujitsu Ltd System for sharing and aligning document and device for managing shared document and device for performing access to document
US5948061A (en) 1996-10-29 1999-09-07 Double Click, Inc. Method of delivery, targeting, and measuring advertising over networks
US6108782A (en) 1996-12-13 2000-08-22 3Com Corporation Distributed remote monitoring (dRMON) for networks
US5784635A (en) * 1996-12-31 1998-07-21 Integration Concepts, Inc. System and method for the rationalization of physician data
US5732218A (en) * 1997-01-02 1998-03-24 Lucent Technologies Inc. Management-data-gathering system for gathering on clients and servers data regarding interactions between the servers, the clients, and users of the clients during real use of a network of clients and servers
US6052730A (en) 1997-01-10 2000-04-18 The Board Of Trustees Of The Leland Stanford Junior University Method for monitoring and/or modifying web browsing sessions
US5819156A (en) 1997-01-14 1998-10-06 Compaq Computer Corp. PC/TV usage tracking and reporting device
US5986653A (en) 1997-01-21 1999-11-16 Netiq Corporation Event signaling in a foldable object tree
US5829001A (en) 1997-01-21 1998-10-27 Netiq Corporation Database updates over a network
US5999178A (en) 1997-01-21 1999-12-07 Netiq Corporation Selection, type matching and manipulation of resource objects by a computer program
US6049821A (en) * 1997-01-24 2000-04-11 Motorola, Inc. Proxy host computer and method for accessing and retrieving information between a browser and a proxy
US6366956B1 (en) * 1997-01-29 2002-04-02 Microsoft Corporation Relevance access of Internet information services
US6173311B1 (en) 1997-02-13 2001-01-09 Pointcast, Inc. Apparatus, method and article of manufacture for servicing client requests on a network
US6112238A (en) 1997-02-14 2000-08-29 Webtrends Corporation System and method for analyzing remote traffic data in a distributed computing environment
US5913030A (en) * 1997-03-18 1999-06-15 International Business Machines Corporation Method and system for client/server communications with user information revealed as a function of willingness to reveal and whether the information is required
US5958010A (en) 1997-03-20 1999-09-28 Firstsense Software, Inc. Systems and methods for monitoring distributed applications including an interface running in an operating system kernel
US5796952A (en) 1997-03-21 1998-08-18 Dot Com Development, Inc. Method and apparatus for tracking client interaction with a network resource and creating client profiles and resource database
US6643696B2 (en) 1997-03-21 2003-11-04 Owen Davis Method and apparatus for tracking client interaction with a network resource and creating client profiles and resource database
US6094684A (en) * 1997-04-02 2000-07-25 Alpha Microsystems, Inc. Method and apparatus for data communication
US6115718A (en) * 1998-04-01 2000-09-05 Xerox Corporation Method and apparatus for predicting document access in a collection of linked documents featuring link probabilities and spreading activation
US6044376A (en) * 1997-04-24 2000-03-28 Imgis, Inc. Content stream analysis
US5878223A (en) * 1997-05-07 1999-03-02 International Business Machines Corporation System and method for predictive caching of information pages
US5999940A (en) 1997-05-28 1999-12-07 Home Information Services, Inc. Interactive information discovery tool and methodology
US6250930B1 (en) 1997-05-30 2001-06-26 Picante Communications Corporation Multi-functional communication and aggregation platform
US6091956A (en) 1997-06-12 2000-07-18 Hollenberg; Dennis D. Situation information system
US6353929B1 (en) 1997-06-23 2002-03-05 One River Worldtrek, Inc. Cooperative system for measuring electronic media
JP3470861B2 (en) * 1997-07-17 2003-11-25 株式会社日立情報システムズ Reference access information acquisition system
US5937392A (en) * 1997-07-28 1999-08-10 Switchboard Incorporated Banner advertising display system and method with frequency of advertisement control
US6112240A (en) 1997-09-03 2000-08-29 International Business Machines Corporation Web site client information tracker
US6115608A (en) 1997-09-10 2000-09-05 Northern Telecom Limited Intersystem handover method and apparatus
US6112212A (en) * 1997-09-15 2000-08-29 The Pangea Project Llc Systems and methods for organizing and analyzing information stored on a computer network
US5999929A (en) * 1997-09-29 1999-12-07 Continuum Software, Inc World wide web link referral system and method for generating and providing related links for links identified in web pages
US5951643A (en) 1997-10-06 1999-09-14 Ncr Corporation Mechanism for dependably organizing and managing information for web synchronization and tracking among multiple browsers
US6084875A (en) * 1997-10-29 2000-07-04 Ericsson Inc. Routing of internet traffic and related internet service provider services
US6230204B1 (en) * 1997-12-19 2001-05-08 Micron Electronics, Inc. Method and system for estimating usage of computer resources
US6167358A (en) 1997-12-19 2000-12-26 Nowonder, Inc. System and method for remotely monitoring a plurality of computer-based systems
US6467089B1 (en) 1997-12-23 2002-10-15 Nielsen Media Research, Inc. Audience measurement system incorporating a mobile handset
US6434532B2 (en) 1998-03-12 2002-08-13 Aladdin Knowledge Systems, Ltd. Interactive customer support for computer programs using network connection of user machine
US6141686A (en) * 1998-03-13 2000-10-31 Deterministic Networks, Inc. Client-side application-classifier gathering network-traffic statistics and application and user names using extensible-service provider plugin for policy-based network control
US6167402A (en) * 1998-04-27 2000-12-26 Sun Microsystems, Inc. High performance message store
US6275854B1 (en) 1998-05-15 2001-08-14 International Business Machines Corporation Method and apparatus for detecting actual viewing of electronic advertisements
US6572662B2 (en) * 1998-05-15 2003-06-03 International Business Machines Corporation Dynamic customized web tours
US6279036B1 (en) 1998-05-15 2001-08-21 International Business Machines Corporation Method and apparatus for detecting actual viewing or electronic advertisements
US6182097B1 (en) 1998-05-21 2001-01-30 Lucent Technologies Inc. Method for characterizing and visualizing patterns of usage of a web site by network users
US6434614B1 (en) * 1998-05-29 2002-08-13 Nielsen Media Research, Inc. Tracking of internet advertisements using banner tags
US6327619B1 (en) 1998-07-08 2001-12-04 Nielsen Media Research, Inc. Metering of internet content using a control
US6272176B1 (en) 1998-07-16 2001-08-07 Nielsen Media Research, Inc. Broadcast encoding system and method
US6609102B2 (en) 1998-07-20 2003-08-19 Usa Technologies, Inc. Universal interactive advertizing and payment system for public access electronic commerce and business related products and services
US6317787B1 (en) 1998-08-11 2001-11-13 Webtrends Corporation System and method for analyzing web-server log files
US6324546B1 (en) 1998-10-12 2001-11-27 Microsoft Corporation Automatic logging of application program launches
US6285983B1 (en) * 1998-10-21 2001-09-04 Lend Lease Corporation Ltd. Marketing systems and methods that preserve consumer privacy
US6487538B1 (en) * 1998-11-16 2002-11-26 Sun Microsystems, Inc. Method and apparatus for local advertising
US6564251B2 (en) * 1998-12-03 2003-05-13 Microsoft Corporation Scalable computing system for presenting customized aggregation of information
US6397359B1 (en) 1999-01-19 2002-05-28 Netiq Corporation Methods, systems and computer program products for scheduled network performance testing
US6892238B2 (en) 1999-01-27 2005-05-10 International Business Machines Corporation Aggregating and analyzing information about content requested in an e-commerce web environment to determine conversion rates
US6466970B1 (en) 1999-01-27 2002-10-15 International Business Machines Corporation System and method for collecting and analyzing information about content requested in a network (World Wide Web) environment
US6366298B1 (en) * 1999-06-03 2002-04-02 Netzero, Inc. Monitoring of individual internet usage
WO2000055783A1 (en) * 1999-03-12 2000-09-21 Netratings, Inc. Method and apparatus for measuring user access to image data
US6393479B1 (en) 1999-06-04 2002-05-21 Webside Story, Inc. Internet website traffic flow analysis
US6606657B1 (en) * 1999-06-22 2003-08-12 Comverse, Ltd. System and method for processing and presenting internet usage information
AU1354901A (en) * 1999-11-10 2001-06-06 Amazon.Com, Inc. Method and system for allocating display space
US7017143B1 (en) 1999-12-01 2006-03-21 Microsoft Corporation External resource files for application development and management
FR2802368B1 (en) 1999-12-14 2002-01-18 Net Value AUDIENCE MEASUREMENT ON COMMUNICATION NETWORK
US6625648B1 (en) 2000-01-07 2003-09-23 Netiq Corporation Methods, systems and computer program products for network performance testing through active endpoint pair based testing and passive application monitoring
US6671715B1 (en) 2000-01-21 2003-12-30 Microstrategy, Inc. System and method for automatic, real-time delivery of personalized informational and transactional data to users via high throughput content delivery device
US6662195B1 (en) 2000-01-21 2003-12-09 Microstrategy, Inc. System and method for information warehousing supporting the automatic, real-time delivery of personalized informational and transactional data to users via content delivery device
US6651063B1 (en) * 2000-01-28 2003-11-18 Andrei G. Vorobiev Data organization and management system and method
US6587835B1 (en) 2000-02-09 2003-07-01 G. Victor Treyz Shopping assistance with handheld computing device
US6834308B1 (en) 2000-02-17 2004-12-21 Audible Magic Corporation Method and apparatus for identifying media content presented on a media playing device
US6499565B1 (en) 2000-03-15 2002-12-31 Case Corporation Apparatus and method for cooling an axle
GB0008908D0 (en) 2000-04-11 2000-05-31 Hewlett Packard Co Shopping assistance service
US6738808B1 (en) 2000-06-30 2004-05-18 Bell South Intellectual Property Corporation Anonymous location service for wireless networks
US6647269B2 (en) 2000-08-07 2003-11-11 Telcontar Method and system for analyzing advertisements delivered to a mobile unit
US7225246B2 (en) 2000-08-21 2007-05-29 Webtrends, Inc. Data tracking using IP address filtering over a wide area network
US6745011B1 (en) 2000-09-01 2004-06-01 Telephia, Inc. System and method for measuring wireless device and network usage and performance metrics
US7680672B2 (en) 2000-10-20 2010-03-16 Adobe Systems, Incorporated Event collection architecture
US7600014B2 (en) 2000-11-16 2009-10-06 Symantec Corporation Method and system for monitoring the performance of a distributed application
US20020112048A1 (en) 2000-12-11 2002-08-15 Francois Gruyer System and method for providing behavioral information of a user accessing on-line resources
AU3072902A (en) 2000-12-18 2002-07-01 Wireless Valley Comm Inc Textual and graphical demarcation of location, and interpretation of measurements
US20020078191A1 (en) 2000-12-20 2002-06-20 Todd Lorenz User tracking in a Web session spanning multiple Web resources without need to modify user-side hardware or software or to store cookies at user-side hardware
US20020133393A1 (en) 2001-03-15 2002-09-19 Hidenori Tatsumi Viewing information collection system and method using data broadcasting, and broadcast receiver, viewing information server, shop terminal, and advertiser terminal used therein
US20020144283A1 (en) 2001-03-30 2002-10-03 Intertainer, Inc. Content distribution system
WO2002084507A1 (en) 2001-04-13 2002-10-24 Netiq Corporation User-side tracking of multimedia application usage within a web page
US6569095B2 (en) 2001-04-23 2003-05-27 Cardionet, Inc. Adaptive selection of a warning limit in patient monitoring
US6968178B2 (en) 2001-04-27 2005-11-22 Hewlett-Packard Development Company, L.P. Profiles for information acquisition by devices in a wireless network
US6684206B2 (en) 2001-05-18 2004-01-27 Hewlett-Packard Development Company, L.P. OLAP-based web access analysis method and system
US6609239B1 (en) 2001-08-23 2003-08-19 National Semiconductor Corporation Efficient integrated circuit layout for improved matching between I and Q paths in radio receivers
US6719660B2 (en) 2001-08-27 2004-04-13 Visteon Global Technologies, Inc. Power train assembly
US20030054757A1 (en) 2001-09-19 2003-03-20 Kolessar Ronald S. Monitoring usage of media data with non-program data elimination
EP1435058A4 (en) 2001-10-11 2005-12-07 Visualsciences Llc System, method, and computer program product for processing and visualization of information
US7155210B2 (en) 2001-12-13 2006-12-26 Ncr Corporation System and method for short-range wireless retail advertising aimed at short-range wireless protocol-enabled personal devices
US7185085B2 (en) 2002-02-27 2007-02-27 Webtrends, Inc. On-line web traffic sampling
US20030177488A1 (en) 2002-03-12 2003-09-18 Smith Geoff S. Systems and methods for media audience measurement
US7206647B2 (en) 2002-03-21 2007-04-17 Ncr Corporation E-appliance for mobile online retailing
US20030187677A1 (en) 2002-03-28 2003-10-02 Commerce One Operations, Inc. Processing user interaction data in a collaborative commerce environment
US20030208578A1 (en) 2002-05-01 2003-11-06 Steven Taraborelli Web marketing method and system for increasing volume of quality visitor traffic on a web site
US7143365B2 (en) 2002-06-18 2006-11-28 Webtrends, Inc. Method and apparatus for using a browser to configure a software program
US20040260470A1 (en) 2003-06-14 2004-12-23 Rast Rodger H. Conveyance scheduling and logistics system
US7483975B2 (en) 2004-03-26 2009-01-27 Arbitron, Inc. Systems and methods for gathering data concerning usage of media data
US20060004627A1 (en) 2004-06-30 2006-01-05 Shumeet Baluja Advertisements for devices with call functionality, such as mobile phones
US20070038516A1 (en) 2005-08-13 2007-02-15 Jeff Apple Systems, methods, and computer program products for enabling an advertiser to measure user viewing of and response to an advertisement
US20080086356A1 (en) 2005-12-09 2008-04-10 Steve Glassman Determining advertisements using user interest information and map-based location information
AU2006327157B2 (en) 2005-12-20 2013-03-07 Arbitron Inc. Methods and systems for conducting research operations
EP3010167B1 (en) 2006-03-27 2017-07-05 Nielsen Media Research, Inc. Methods and systems to meter media content presented on a wireless communication device
US7702317B2 (en) 2006-04-27 2010-04-20 M:Metrics, Inc. System and method to query wireless network offerings
CN101467171A (en) 2006-06-29 2009-06-24 尼尔逊媒介研究股份有限公司 Methods and apparatus to monitor consumer behavior associated with location-based web services

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5995943A (en) * 1996-04-01 1999-11-30 Sabre Inc. Information aggregation and synthesization system
US5878426A (en) * 1996-12-23 1999-03-02 Unisys Corporation Statistical database query using random sampling of records

Cited By (196)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8892495B2 (en) 1991-12-23 2014-11-18 Blanding Hovenweep, Llc Adaptive pattern recognition based controller apparatus and method and human-interface therefore
US7895076B2 (en) 1995-06-30 2011-02-22 Sony Computer Entertainment Inc. Advertisement insertion, profiling, impression, and feedback
US9535563B2 (en) 1999-02-01 2017-01-03 Blanding Hovenweep, Llc Internet appliance system and method
US9015747B2 (en) 1999-12-02 2015-04-21 Sony Computer Entertainment America Llc Advertisement rotation
US10390101B2 (en) 1999-12-02 2019-08-20 Sony Interactive Entertainment America Llc Advertisement rotation
US8272964B2 (en) 2000-07-04 2012-09-25 Sony Computer Entertainment America Llc Identifying obstructions in an impression area
US9195991B2 (en) 2001-02-09 2015-11-24 Sony Computer Entertainment America Llc Display of user selected advertising content in a digital environment
US9984388B2 (en) 2001-02-09 2018-05-29 Sony Interactive Entertainment America Llc Advertising impression determination
US9466074B2 (en) 2001-02-09 2016-10-11 Sony Interactive Entertainment America Llc Advertising impression determination
US8799059B2 (en) * 2001-04-30 2014-08-05 Performance Pricing Holdings, Llc System and method for the presentation of advertisements
US20120179554A1 (en) * 2001-04-30 2012-07-12 Ari Rosenberg System and method for the presentation of advertisements
US20120150628A1 (en) * 2001-04-30 2012-06-14 Ari Rosenberg System and method for the presentation of advertisements
US10929869B2 (en) * 2001-04-30 2021-02-23 Performance Pricing Holdings, Llc System and method for the presentation of advertisements
US20120095842A1 (en) * 2001-06-21 2012-04-19 Fogelson Bruce A Method and system for creating ad-books
US20030212667A1 (en) * 2002-05-10 2003-11-13 International Business Machines Corporation Systems, methods, and computer program products to browse database query information
US7225412B2 (en) * 2002-12-03 2007-05-29 Lockheed Martin Corporation Visualization toolkit for data cleansing applications
US20040104925A1 (en) * 2002-12-03 2004-06-03 Lockheed Martin Corporation Visualization toolkit for data cleansing applications
US9767478B2 (en) * 2003-09-30 2017-09-19 Google Inc. Document scoring based on traffic associated with a document
US8316029B2 (en) * 2003-09-30 2012-11-20 Google Inc. Document scoring based on traffic associated with a document
US20070088693A1 (en) * 2003-09-30 2007-04-19 Google Inc. Document scoring based on traffic associated with a document
US8763157B2 (en) 2004-08-23 2014-06-24 Sony Computer Entertainment America Llc Statutory license restricted digital media playback on portable devices
US10042987B2 (en) 2004-08-23 2018-08-07 Sony Interactive Entertainment America Llc Statutory license restricted digital media playback on portable devices
US9531686B2 (en) 2004-08-23 2016-12-27 Sony Interactive Entertainment America Llc Statutory license restricted digital media playback on portable devices
US9325738B2 (en) 2005-04-22 2016-04-26 Blue Coat Systems, Inc. Methods and apparatus for blocking unwanted software downloads
US8316446B1 (en) * 2005-04-22 2012-11-20 Blue Coat Systems, Inc. Methods and apparatus for blocking unwanted software downloads
US9873052B2 (en) 2005-09-30 2018-01-23 Sony Interactive Entertainment America Llc Monitoring advertisement impressions
US8574074B2 (en) 2005-09-30 2013-11-05 Sony Computer Entertainment America Llc Advertising impression determination
US8267783B2 (en) 2005-09-30 2012-09-18 Sony Computer Entertainment America Llc Establishing an impression area
US10789611B2 (en) 2005-09-30 2020-09-29 Sony Interactive Entertainment LLC Advertising impression determination
US10046239B2 (en) 2005-09-30 2018-08-14 Sony Interactive Entertainment America Llc Monitoring advertisement impressions
US10467651B2 (en) 2005-09-30 2019-11-05 Sony Interactive Entertainment America Llc Advertising impression determination
US8626584B2 (en) 2005-09-30 2014-01-07 Sony Computer Entertainment America Llc Population of an advertisement reference list
US11436630B2 (en) 2005-09-30 2022-09-06 Sony Interactive Entertainment LLC Advertising impression determination
US9129301B2 (en) 2005-09-30 2015-09-08 Sony Computer Entertainment America Llc Display of user selected advertising content in a digital environment
US8795076B2 (en) 2005-09-30 2014-08-05 Sony Computer Entertainment America Llc Advertising impression determination
US10657538B2 (en) 2005-10-25 2020-05-19 Sony Interactive Entertainment LLC Resolution of advertising rules
US11004089B2 (en) 2005-10-25 2021-05-11 Sony Interactive Entertainment LLC Associating media content files with advertisements
US9864998B2 (en) 2005-10-25 2018-01-09 Sony Interactive Entertainment America Llc Asynchronous advertising
US10410248B2 (en) 2005-10-25 2019-09-10 Sony Interactive Entertainment America Llc Asynchronous advertising placement based on metadata
US11195185B2 (en) 2005-10-25 2021-12-07 Sony Interactive Entertainment LLC Asynchronous advertising
US8676900B2 (en) 2005-10-25 2014-03-18 Sony Computer Entertainment America Llc Asynchronous advertising placement based on metadata
US9367862B2 (en) 2005-10-25 2016-06-14 Sony Interactive Entertainment America Llc Asynchronous advertising placement based on metadata
US8412569B1 (en) * 2006-03-31 2013-04-02 Google Inc. Determining advertising statistics for advertisers and/or advertising networks
US20070239532A1 (en) * 2006-03-31 2007-10-11 Scott Benson Determining advertising statistics for advertisers and/or advertising networks
US8645992B2 (en) 2006-05-05 2014-02-04 Sony Computer Entertainment America Llc Advertisement rotation
US20080004955A1 (en) * 2006-06-28 2008-01-03 Microsoft Corporation Use of business heuristics and data to optimize online advertisement and marketing
US20080004947A1 (en) * 2006-06-28 2008-01-03 Microsoft Corporation Online keyword buying, advertisement and marketing
US20080056575A1 (en) * 2006-08-30 2008-03-06 Bradley Jeffery Behm Method and system for automatically classifying page images
US8306326B2 (en) * 2006-08-30 2012-11-06 Amazon Technologies, Inc. Method and system for automatically classifying page images
US9594833B2 (en) 2006-08-30 2017-03-14 Amazon Technologies, Inc. Automatically classifying page images
US20140214560A1 (en) * 2006-09-06 2014-07-31 Mediamath, Inc. System and method for dynamic online advertisement creation and management
US20080071612A1 (en) * 2006-09-18 2008-03-20 Microsoft Corporation Logocons: ad product for brand advertisers
US8103547B2 (en) * 2006-09-18 2012-01-24 Microsoft Corporation Logocons: AD product for brand advertisers
WO2008080104A1 (en) * 2006-12-21 2008-07-03 Google Inc. Estimating statistics for online advertising campaigns
US20110015992A1 (en) * 2006-12-21 2011-01-20 Mark Liffiton Estimating statistics for online advertising campaigns
US20080183561A1 (en) * 2007-01-26 2008-07-31 Exelate Media Ltd. Marketplace for interactive advertising targeting events
US8402133B1 (en) 2007-03-07 2013-03-19 Comscore, Inc. Detecting content and user response to content
US8874563B1 (en) 2007-03-07 2014-10-28 Comscore, Inc. Detecting content and user response to content
US8972565B1 (en) 2007-03-07 2015-03-03 Comscore, Inc. Detecting content and user response to content
US7996519B1 (en) 2007-03-07 2011-08-09 Comscore, Inc. Detecting content and user response to content
US8060601B1 (en) 2007-03-07 2011-11-15 Comscore, Inc. Detecting content and user response to content
US10002369B2 (en) 2007-03-07 2018-06-19 Comscore, Inc. Detecting content and user response to content
US9578118B2 (en) 2007-03-07 2017-02-21 Comscore, Inc. Detecting content and user response to content
US20080235622A1 (en) * 2007-03-21 2008-09-25 Yahoo! Inc. Traffic production index and related metrics for analysis of a network of related web sites
US7885942B2 (en) * 2007-03-21 2011-02-08 Yahoo! Inc. Traffic production index and related metrics for analysis of a network of related web sites
US20080249872A1 (en) * 2007-03-26 2008-10-09 Russell Stephen A Systems and Methods for Enabling Users to Sample and Acquire Content
WO2008118441A1 (en) * 2007-03-26 2008-10-02 Mix & Burn, Llc Systems and methods for enabling users to sample and acquire content
US20080249832A1 (en) * 2007-04-04 2008-10-09 Microsoft Corporation Estimating expected performance of advertisements
US9349134B1 (en) * 2007-05-31 2016-05-24 Google Inc. Detecting illegitimate network traffic
US20090037253A1 (en) * 2007-07-30 2009-02-05 Davidow Dorothy Young System and method for online lead generation
US8229780B2 (en) * 2007-07-30 2012-07-24 Silvercarrot, Inc. System and method for online lead generation
US20090070336A1 (en) * 2007-09-07 2009-03-12 Sap Ag Method and system for managing transmitted requests
US10929874B1 (en) * 2007-10-01 2021-02-23 Google Llc Systems and methods for preserving privacy
US11526905B1 (en) * 2007-10-01 2022-12-13 Google Llc Systems and methods for preserving privacy
US10115124B1 (en) * 2007-10-01 2018-10-30 Google Llc Systems and methods for preserving privacy
US8416247B2 (en) 2007-10-09 2013-04-09 Sony Computer Entertainment America Inc. Increasing the number of advertising impressions in an interactive environment
US9272203B2 (en) 2007-10-09 2016-03-01 Sony Computer Entertainment America, LLC Increasing the number of advertising impressions in an interactive environment
US8935381B2 (en) * 2007-11-27 2015-01-13 Zettics, Inc. Method and apparatus for real-time collection of information about application level activity and other user information on a mobile data network
US8958313B2 (en) 2007-11-27 2015-02-17 Zettics, Inc. Method and apparatus for storing data on application-level activity and other user information to enable real-time multi-dimensional reporting about user of a mobile data network
US20090138427A1 (en) * 2007-11-27 2009-05-28 Umber Systems Method and apparatus for storing data on application-level activity and other user information to enable real-time multi-dimensional reporting about user of a mobile data network
US8732170B2 (en) 2007-11-27 2014-05-20 Zettics, Inc. Method and apparatus for real-time multi-dimensional reporting and analyzing of data on application level activity and other user information on a mobile data network
US8755297B2 (en) 2007-11-27 2014-06-17 Zettics, Inc. System and method for collecting, reporting, and analyzing data on application-level activity and other user information on a mobile data network
US20090138447A1 (en) * 2007-11-27 2009-05-28 Umber Systems Method and apparatus for real-time collection of information about application level activity and other user information on a mobile data network
US8195661B2 (en) 2007-11-27 2012-06-05 Umber Systems Method and apparatus for storing data on application-level activity and other user information to enable real-time multi-dimensional reporting about user of a mobile data network
US10891661B2 (en) 2008-01-22 2021-01-12 2Kdirect, Llc Automatic generation of electronic advertising messages
US11580578B2 (en) 2008-01-22 2023-02-14 2Kdirect, Inc. Generation of electronic advertising messages based on model web pages
US8769558B2 (en) 2008-02-12 2014-07-01 Sony Computer Entertainment America Llc Discovery and analytics for episodic downloaded media
US9525902B2 (en) 2008-02-12 2016-12-20 Sony Interactive Entertainment America Llc Discovery and analytics for episodic downloaded media
US20090216579A1 (en) * 2008-02-22 2009-08-27 Microsoft Corporation Tracking online advertising using payment services
US20090248680A1 (en) * 2008-03-26 2009-10-01 Umber Systems System and Method for Sharing Anonymous User Profiles with a Third Party
US8775391B2 (en) 2008-03-26 2014-07-08 Zettics, Inc. System and method for sharing anonymous user profiles with a third party
US20100094860A1 (en) * 2008-10-09 2010-04-15 Google Inc. Indexing online advertisements
US20120303349A1 (en) * 2008-11-07 2012-11-29 Roy H Scott Enhanced matching through explore/exploit schemes
US8560293B2 (en) * 2008-11-07 2013-10-15 Yahoo! Inc. Enhanced matching through explore/exploit schemes
US8805861B2 (en) 2008-12-09 2014-08-12 Google Inc. Methods and systems to train models to extract and integrate information from data sources
US20100145902A1 (en) * 2008-12-09 2010-06-10 Ita Software, Inc. Methods and systems to train models to extract and integrate information from data sources
US20100205665A1 (en) * 2009-02-11 2010-08-12 Onur Komili Systems and methods for enforcing policies for proxy website detection using advertising account id
US20100205291A1 (en) * 2009-02-11 2010-08-12 Richard Baldry Systems and methods for enforcing policies in the discovery of anonymizing proxy communications
US9734125B2 (en) 2009-02-11 2017-08-15 Sophos Limited Systems and methods for enforcing policies in the discovery of anonymizing proxy communications
US10803005B2 (en) 2009-02-11 2020-10-13 Sophos Limited Systems and methods for enforcing policies in the discovery of anonymizing proxy communications
US8695091B2 (en) 2009-02-11 2014-04-08 Sophos Limited Systems and methods for enforcing policies for proxy website detection using advertising account ID
US20100205297A1 (en) * 2009-02-11 2010-08-12 Gurusamy Sarathy Systems and methods for dynamic detection of anonymizing proxies
US20100205215A1 (en) * 2009-02-11 2010-08-12 Cook Robert W Systems and methods for enforcing policies to block search engine queries for web-based proxy sites
US8554602B1 (en) 2009-04-16 2013-10-08 Exelate, Inc. System and method for behavioral segment optimization based on data exchange
WO2010138512A1 (en) * 2009-05-26 2010-12-02 Facebook, Inc. Measuring impact of online advertising campaigns
US20100306043A1 (en) * 2009-05-26 2010-12-02 Robert Taaffe Lindsay Measuring Impact Of Online Advertising Campaigns
US20100318418A1 (en) * 2009-06-16 2010-12-16 Microsoft Corporation Advertising inventory prediction for frequency-capped lines
US20110016121A1 (en) * 2009-07-16 2011-01-20 Hemanth Sambrani Activity Based Users' Interests Modeling for Determining Content Relevance
US8612435B2 (en) 2009-07-16 2013-12-17 Yahoo! Inc. Activity based users' interests modeling for determining content relevance
US11164219B1 (en) 2009-08-06 2021-11-02 2Kdirect, Inc. Automatic generation of electronic advertising messages
US9474976B2 (en) 2009-08-11 2016-10-25 Sony Interactive Entertainment America Llc Management of ancillary content delivery and presentation
US10298703B2 (en) 2009-08-11 2019-05-21 Sony Interactive Entertainment America Llc Management of ancillary content delivery and presentation
US8763090B2 (en) 2009-08-11 2014-06-24 Sony Computer Entertainment America Llc Management of ancillary content delivery and presentation
US8532465B2 (en) * 2009-08-19 2013-09-10 Sony Corporation Moving image recording apparatus, moving image recording method and program
US20110044663A1 (en) * 2009-08-19 2011-02-24 Sony Corporation Moving image recording apparatus, moving image recording method and program
US8621068B2 (en) 2009-08-20 2013-12-31 Exelate Media Ltd. System and method for monitoring advertisement assignment
US11574343B2 (en) 2009-10-01 2023-02-07 2Kdirect, Inc. Automatic generation of electronic advertising messages containing one or more automatically selected stock photography images
US9436953B1 (en) * 2009-10-01 2016-09-06 2Kdirect, Llc Automatic generation of electronic advertising messages containing one or more automatically selected stock photography images
US10672037B1 (en) 2009-10-01 2020-06-02 2Kdirect, Llc Automatic generation of electronic advertising messages containing one or more automatically selected stock photography images
EP2510487A2 (en) * 2009-12-08 2012-10-17 comScore, Inc. Systems and methods for identification and reporting of ad delivery hierarchy
EP2510487A4 (en) * 2009-12-08 2014-11-19 Comscore Inc Systems and methods for identification and reporting of ad delivery hierarchy
US9390438B2 (en) 2009-12-08 2016-07-12 Comscore, Inc. Systems and methods for capturing and reporting metrics regarding user engagement including a canvas model
US20110209216A1 (en) * 2010-01-25 2011-08-25 Meir Zohar Method and system for website data access monitoring
US8949980B2 (en) * 2010-01-25 2015-02-03 Exelate Method and system for website data access monitoring
US11080763B2 (en) 2010-03-31 2021-08-03 Mediamath, Inc. Systems and methods for using server side cookies by a demand side platform
US10049391B2 (en) 2010-03-31 2018-08-14 Mediamath, Inc. Systems and methods for providing a demand side platform
US11055748B2 (en) 2010-03-31 2021-07-06 Mediamath, Inc. Systems and methods for providing a demand side platform
US10636060B2 (en) 2010-03-31 2020-04-28 Mediamath, Inc. Systems and methods for using server side cookies by a demand side platform
US11720929B2 (en) 2010-03-31 2023-08-08 Mediamath, Inc. Systems and methods for providing a demand side platform
US10332156B2 (en) 2010-03-31 2019-06-25 Mediamath, Inc. Systems and methods for using server side cookies by a demand side platform
US11610232B2 (en) 2010-03-31 2023-03-21 Mediamath, Inc. Systems and methods for using server side cookies by a demand side platform
US10628859B2 (en) 2010-03-31 2020-04-21 Mediamath, Inc. Systems and methods for providing a demand side platform
US11308526B2 (en) 2010-03-31 2022-04-19 Mediamath, Inc. Systems and methods for using server side cookies by a demand side platform
US8484243B2 (en) * 2010-05-05 2013-07-09 Cisco Technology, Inc. Order-independent stream query processing
US20110302164A1 (en) * 2010-05-05 2011-12-08 Saileshwar Krishnamurthy Order-Independent Stream Query Processing
US11521218B2 (en) 2010-07-19 2022-12-06 Mediamath, Inc. Systems and methods for determining competitive market values of an ad impression
US11049118B2 (en) 2010-07-19 2021-06-29 Mediamath, Inc. Systems and methods for determining competitive market values of an ad impression
US11195187B1 (en) 2010-07-19 2021-12-07 Mediamath, Inc. Systems and methods for determining competitive market values of an ad impression
US10592910B2 (en) 2010-07-19 2020-03-17 Mediamath, Inc. Systems and methods for determining competitive market values of an ad impression
US10223703B2 (en) 2010-07-19 2019-03-05 Mediamath, Inc. Systems and methods for determining competitive market values of an ad impression
US8838784B1 (en) 2010-08-04 2014-09-16 Zettics, Inc. Method and apparatus for privacy-safe actionable analytics on mobile data usage
US8732014B2 (en) * 2010-12-20 2014-05-20 Yahoo! Inc. Automatic classification of display ads using ad images and landing pages
US20120158525A1 (en) * 2010-12-20 2012-06-21 Yahoo! Inc. Automatic classification of display ads using ad images and landing pages
US10810613B1 (en) 2011-04-18 2020-10-20 Oracle America, Inc. Ad search engine
US10755300B2 (en) 2011-04-18 2020-08-25 Oracle America, Inc. Optimization of online advertising assets
US10049377B1 (en) * 2011-06-29 2018-08-14 Google Llc Inferring interactions with advertisers
US10719846B1 (en) * 2011-06-29 2020-07-21 Google Llc Inferring interactions with advertisers
US11120468B2 (en) * 2011-06-29 2021-09-14 Google Llc Inferring interactions with advertisers
US9734508B2 (en) 2012-02-28 2017-08-15 Microsoft Technology Licensing, Llc Click fraud monitoring based on advertising traffic
US20130290854A1 (en) * 2012-04-27 2013-10-31 Adobe Systems Inc. Method and apparatus for isolating analytics logic from content creation in a rich internet application
US9679297B2 (en) * 2012-04-27 2017-06-13 Adobe Systems Incorporated Method and apparatus for isolating analytics logic from content creation in a rich internet application
US20170316466A1 (en) * 2012-06-30 2017-11-02 Oracle America, Inc. System and Methods for Discovering Advertising Traffic Flow and Impinging Entities
US11023933B2 (en) * 2012-06-30 2021-06-01 Oracle America, Inc. System and methods for discovering advertising traffic flow and impinging entities
US10467652B2 (en) 2012-07-11 2019-11-05 Oracle America, Inc. System and methods for determining consumer brand awareness of online advertising using recognition
US10037543B2 (en) * 2012-08-13 2018-07-31 Amobee, Inc. Estimating conversion rate in display advertising from past performance data
US20140181303A1 (en) * 2012-12-21 2014-06-26 Scott Andrew Meyer Custom local content provision
US11068925B2 (en) * 2013-01-13 2021-07-20 Adfin Solutions, Inc. Real-time digital asset sampling apparatuses, methods and systems
US20220012767A1 (en) * 2013-01-13 2022-01-13 Adfin Solutions Real-time digital asset sampling apparatuses, methods and systems
US9858526B2 (en) 2013-03-01 2018-01-02 Exelate, Inc. Method and system using association rules to form custom lists of cookies
US9621472B1 (en) 2013-03-14 2017-04-11 Moat, Inc. System and method for dynamically controlling sample rates and data flow in a networked measurement system by dynamic determination of statistical significance
US10068250B2 (en) 2013-03-14 2018-09-04 Oracle America, Inc. System and method for measuring mobile advertising and content by simulating mobile-device usage
US10600089B2 (en) 2013-03-14 2020-03-24 Oracle America, Inc. System and method to measure effectiveness and consumption of editorial content
US10742526B2 (en) 2013-03-14 2020-08-11 Oracle America, Inc. System and method for dynamically controlling sample rates and data flow in a networked measurement system by dynamic determination of statistical significance
US10715864B2 (en) 2013-03-14 2020-07-14 Oracle America, Inc. System and method for universal, player-independent measurement of consumer-online-video consumption behaviors
US10075350B2 (en) 2013-03-14 2018-09-11 Oracle America, Inc. System and method for dynamically controlling sample rates and data flow in a networked measurement system by dynamic determination of statistical significance
US9269049B2 (en) 2013-05-08 2016-02-23 Exelate, Inc. Methods, apparatus, and systems for using a reduced attribute vector of panel data to determine an attribute of a user
US9578044B1 (en) * 2014-03-24 2017-02-21 Amazon Technologies, Inc. Detection of anomalous advertising content
US11829901B2 (en) 2014-10-31 2023-11-28 The Nielsen Company (Us), Llc Methods and apparatus to identify publisher advertising behavior
US11610230B2 (en) 2015-04-27 2023-03-21 Google Llc System and method of detection and recording of realization actions in association with content rendering
US10504155B2 (en) * 2015-04-27 2019-12-10 Google Llc System and method of detection and recording of realization actions in association with content rendering
US10812612B2 (en) 2015-09-09 2020-10-20 Fastly, Inc. Execution of per-user functions at cache nodes
WO2017058276A1 (en) * 2015-09-29 2017-04-06 Fastly, Inc. Persistent edge state of end user devices at cache nodes
US11611628B2 (en) 2015-09-29 2023-03-21 Fastly, Inc. Persistent edge state of end user devices at network nodes
US10742754B2 (en) 2015-09-29 2020-08-11 Fastly, Inc. Persistent edge state of end user devices at cache nodes
US10346871B2 (en) * 2016-04-22 2019-07-09 Facebook, Inc. Automatic targeting of content by clustering based on user feedback data
US10068188B2 (en) 2016-06-29 2018-09-04 Visual Iq, Inc. Machine learning techniques that identify attribution of small signal stimulus in noisy response channels
US11556964B2 (en) 2016-08-03 2023-01-17 Mediamath, Inc. Methods, systems, and devices for counterfactual-based incrementality measurement in digital ad-bidding platform
US10977697B2 (en) 2016-08-03 2021-04-13 Mediamath, Inc. Methods, systems, and devices for counterfactual-based incrementality measurement in digital ad-bidding platform
US11170413B1 (en) 2016-08-03 2021-11-09 Mediamath, Inc. Methods, systems, and devices for counterfactual-based incrementality measurement in digital ad-bidding platform
US10467659B2 (en) 2016-08-03 2019-11-05 Mediamath, Inc. Methods, systems, and devices for counterfactual-based incrementality measurement in digital ad-bidding platform
US10846779B2 (en) 2016-11-23 2020-11-24 Sony Interactive Entertainment LLC Custom product categorization of digital media content
US10860987B2 (en) 2016-12-19 2020-12-08 Sony Interactive Entertainment LLC Personalized calendar for digital media content-related events
US11727440B2 (en) 2017-05-17 2023-08-15 Mediamath, Inc. Systems, methods, and devices for decreasing latency and/or preventing data leakage due to advertisement insertion
US10740795B2 (en) 2017-05-17 2020-08-11 Mediamath, Inc. Systems, methods, and devices for decreasing latency and/or preventing data leakage due to advertisement insertion
US10354276B2 (en) 2017-05-17 2019-07-16 Mediamath, Inc. Systems, methods, and devices for decreasing latency and/or preventing data leakage due to advertisement insertion
US10931991B2 (en) 2018-01-04 2021-02-23 Sony Interactive Entertainment LLC Methods and systems for selectively skipping through media content
US11348142B2 (en) 2018-02-08 2022-05-31 Mediamath, Inc. Systems, methods, and devices for componentization, modification, and management of creative assets for diverse advertising platform environments
US11810156B2 (en) 2018-02-08 2023-11-07 MediaMath Acquisition Corporation Systems, methods, and devices for componentization, modification, and management of creative assets for diverse advertising platform environments
US20190335327A1 (en) * 2018-04-27 2019-10-31 T-Mobile Usa, Inc. Partitioning network addresses in network cell data to address user privacy
US11580163B2 (en) * 2019-08-16 2023-02-14 Palo Alto Networks, Inc. Key-value storage for URL categorization
US20230108362A1 (en) * 2019-08-16 2023-04-06 Palo Alto Networks, Inc. Key-value storage for URL categorization
US11748433B2 (en) 2019-08-16 2023-09-05 Palo Alto Networks, Inc. Communicating URL categorization information
US11516277B2 (en) 2019-09-14 2022-11-29 Oracle International Corporation Script-based techniques for coordinating content selection across devices
US11514477B2 (en) 2019-09-23 2022-11-29 Mediamath, Inc. Systems, methods, and devices for digital advertising ecosystems implementing content delivery networks utilizing edge computing
US11182829B2 (en) 2019-09-23 2021-11-23 Mediamath, Inc. Systems, methods, and devices for digital advertising ecosystems implementing content delivery networks utilizing edge computing
CN111260341A (en) * 2020-05-06 2020-06-09 武汉中科通达高新技术股份有限公司 Traffic violation data auditing method, computer equipment and readable storage medium

Also Published As

Publication number Publication date
WO2001052462A3 (en) 2002-01-31
EP1252735A2 (en) 2002-10-30
WO2001052462A2 (en) 2001-07-19
JP2004504649A (en) 2004-02-12
US8661111B1 (en) 2014-02-25
AU2001217524A1 (en) 2001-07-24
EP1252735B1 (en) 2011-08-24
US9514479B2 (en) 2016-12-06
ATE522036T1 (en) 2011-09-15
EP1252735A4 (en) 2009-07-29
CA2396565A1 (en) 2001-07-19
JP5072160B2 (en) 2012-11-14
US20140122224A1 (en) 2014-05-01

Similar Documents

Publication Publication Date Title
US9514479B2 (en) System and method for estimating prevalence of digital content on the world-wide-web
US6804701B2 (en) System and method for monitoring and analyzing internet traffic
US8554804B2 (en) System and method for monitoring and analyzing internet traffic
Eirinaki et al. Web mining for web personalization
US7100111B2 (en) Method and system for optimum placement of advertisements on a webpage
US6466970B1 (en) System and method for collecting and analyzing information about content requested in a network (World Wide Web) environment
US11341510B2 (en) Determining client system attributes
US8639575B2 (en) Audience segment estimation
US20050086105A1 (en) Optimization of advertising campaigns on computer networks
WO2001025896A1 (en) System and method for monitoring and analyzing internet traffic
Jamalzadeh Analysis of clickstream data
ES2371404T3 (en) System and procedure to estimate the prevalence of digital content on the World-Wide-Web
Raju Online Visitor Classification and Unified Creation With Clickstream Data
Smith et al. Personalizing e-commerce with data mining
Dalal et al. Ch. 12. The promise and challenge of mining web transaction data
Dalal et al. The Promise and Challenge of Mining Web
FRHAN A Model of Website Usage Visualization Estimated on Clickstream Data with Apache Flume Using Improved Markov Chain Approximation
Dalal et al. Mining Gold from E-Commerce Transactions: Challenges

Legal Events

Date Code Title Description
AS Assignment

Owner name: CITIBANK, N.A., AS COLATERAL AGENT, NEW YORK

Free format text: SECURITY AGREEMENT;ASSIGNOR:NETRATINGS, INC.;REEL/FRAME:019817/0774

Effective date: 20070809

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment

Owner name: NETRATINGS, LLC, NEW YORK

Free format text: RELEASE (REEL 019817 / FRAME 0774);ASSIGNOR:CITIBANK, N.A.;REEL/FRAME:061671/0001

Effective date: 20221011